[4.1.0] Orange ZScript console warnings


Re: [4.1.0] Orange ZScript console warnings

by Rachael » Sat May 30, 2020 9:30 am

That seems like the kind of thing that would be useful if included in GZDoom itself, somehow, after proper review and/or revision if necessary - just an idea.

Re: [4.1.0] Orange ZScript console warnings

by phantombeta » Sat May 30, 2020 4:15 am

burp. Coded at around 6~7 AM. UTF-8 aware, tested with CJK code points. Public domain, though credits in the form of my name (Chronos "phantombeta" Ouroboros) and a link to the gist's page would be appreciated.
I hold no responsibility if the code does not work, breaks something, crashes, corrupts your hard drive, sets your computer on fire and/or kills your pet(s).

UTF-8 sucks. It's awful. Fuck user-inputted text. Fuck Unicode.
It's not very surprising why no one likes working with this shit.

Re: [4.1.0] Orange ZScript console warnings

by Lagi » Sat May 30, 2020 2:38 am

Thank You

Re: [4.1.0] Orange ZScript console warnings

by m8f » Sat May 30, 2020 2:34 am

I have an example here. This function compares two Unicode strings character by character and composes a colored string in which matching characters are green and non-matching characters are red.
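
The linked code isn't reproduced in this thread, so here is a rough sketch of what such a comparison could look like. It is not m8f's actual implementation: the class and function names are made up for illustration, and it assumes the usual "\cd" (green) and "\cg" (red) color escapes.

```zscript
// Hypothetical helper, not the code from the linked example.
class CompareSketch
{
	// Returns 'actual' with every code point colored green if it matches
	// the code point at the same position in 'expected', red otherwise.
	static String ColorizeDiff(String expected, String actual)
	{
		String result = "";
		int lenA = expected.Length(); // lengths are in bytes
		int lenB = actual.Length();
		int posA = 0; // byte offsets, advanced by GetNextCodePoint
		int posB = 0;

		while (posB < lenB)
		{
			int chB;
			[chB, posB] = actual.GetNextCodePoint(posB);

			int chA = -1; // no counterpart once 'expected' runs out
			if (posA < lenA)
			{
				[chA, posA] = expected.GetNextCodePoint(posA);
			}

			if (chA == chB)
			{
				result = result .. "\cd"; // green
			}
			else
			{
				result = result .. "\cg"; // red
			}
			result.AppendCharacter(chB);
		}
		return result;
	}
}
```

Something like Console.Printf("%s", CompareSketch.ColorizeDiff(a, b)) would then print the colored result. Characters left over in 'expected' when 'actual' is shorter are simply ignored in this sketch.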

Re: [4.1.0] Orange ZScript console warnings

by SanyaWaffles » Sat May 30, 2020 2:21 am

Then how do you use GetNextCodePoint and its related functions reliably to account for this? I've been looking at this for over an hour now, I cannot wrap my head around it, and I'm not finding many examples. The only mention I've seen is in this thread, and it gives no examples of how the functions are used.

Re: [4.1.0] Orange ZScript console warnings

by Graf Zahl » Sat May 30, 2020 2:16 am

No, that's not correct. Virtually everything posted here ignores that UTF-8 is a variable length encoding and for proper processing the code needs to deal with this fact.
This all will break as soon as you try to print the text character by character.

GetNextCodePoint's second return value is especially important: it is the incremented index, which points to the start of the next code point, not the next byte.
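
A minimal sketch of that iteration, written as a code-point-aware stand-in for Mid. The class and function names are hypothetical, and it assumes GetNextCodePoint returns the code point plus the byte index of the next one, as described above.

```zscript
// Hypothetical helper for illustration only.
class UnicodeSketch
{
	// Returns up to 'count' code points of 'str', starting at code point
	// index 'start' (both counted in characters, not bytes).
	static String CodePointMid(String str, int start, int count)
	{
		String result = "";
		int byteLen = str.Length(); // length in bytes
		int pos = 0;                // current byte offset
		int index = 0;              // current code point index

		while (pos < byteLen && index < start + count)
		{
			int ch;
			// The second return value replaces 'pos', so the loop always
			// lands on the start of the next code point.
			[ch, pos] = str.GetNextCodePoint(pos);

			if (index >= start)
			{
				result.AppendCharacter(ch);
			}
			index++;
		}
		return result;
	}
}
```

The key difference from a plain i++ loop is that the byte offset is never incremented by hand. UnicodeSketch.CodePointMid(str, 2, 3) should select the third through fifth characters regardless of how many bytes each one takes.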

Re: [4.1.0] Orange ZScript console warnings

by SanyaWaffles » Sat May 30, 2020 1:34 am

I've figured out how to properly do ByteAt based on what you said above.

Code:

String unichar = String.Format("%c", str.ByteAt(currentPos));

seems to be how it's done, with str being whatever string you want to work on.

All the Wiki says about GetNextCodePoint is "(Need more info)". It seems to return two ints. Looking at the source code, it returns the codepoint and the current position.

I tried writing some functions for these:

Code:

	String UniCharAt(String str, int pos)
	{
		String ret = String.Format("%c", str.ByteAt(pos));
		
		Console.PrintF(ret);
		
		return ret;
	}
	
	String UniLeft(String str, int len)
	{
		String find;
		
		for (int i = 0; i < len; i++)
		{
			find.AppendCharacter(str.GetNextCodePoint(i));
		}
		
		Console.PrintF(find);
		
		return find;
	}
	
	String UniMid(String str, int pos, int len)
	{
		String find;
		int max = pos + len;
		
		for (int i = pos; i < max; i++)
		{
			find.AppendCharacter(str.GetNextCodePoint(i));
		}
		
		Console.PrintF(find);
		
		return find;
	}

I tested it and it seems to work fine, but I'm sure there are some missing checks. I'm still working on the code, but I think this is a good starting point for anyone wanting to explore this.

Re: [4.1.0] Orange ZScript console warnings

by Graf Zahl » Sat May 30, 2020 12:06 am

Left and Mid are not a single bit safer than CharAt, they are just another way to ignore Unicode and create broken code that will fail if subjected to non-English text. The proper way to process a string character by character is to iterate over it with GetNextCodePoint, but that seems to be a *bit* too inconvenient for some people...
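
To see why byte positions stop lining up with characters, one can compare the two counts for the same string. This is only a sketch with a made-up class name; it assumes Length() reports bytes while CodePointCount() reports code points.

```zscript
// Hypothetical helper for illustration only.
class LengthSketch
{
	// Prints how many bytes and how many code points a string has.
	static void Report(String str)
	{
		Console.Printf("bytes: %d, code points: %d",
			str.Length(), str.CodePointCount());
	}
}
```

For pure ASCII text the two numbers are the same. As soon as the text contains non-ASCII characters, the byte count is larger, and a byte position passed to Left, Mid or CharAt can land in the middle of a character.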

Re: [4.1.0] Orange ZScript console warnings

by 3saster » Fri May 29, 2020 6:49 pm

Lagi wrote: Big mess

If you need help, you might want to speak like a normal person...

That being said, if the deprecation messages are anything to go by, the "safe" way is to replace

Code:

lumpString.CharAt(currentPos)

with

Code:

lumpString.Mid(currentPos, 1)

You might want to use a devbuild for development purposes, if only because the latest devbuilds feature deprecation messages that actually say what you should use instead. In this case, Left and Mid are recommended instead of CharAt.

Re: [4.1.0] Orange ZScript console warnings

by Graf Zahl » Fri May 29, 2020 3:46 am

Aside from the fact that you just ignored the entire motivation not to access bytes, this doesn't work because you need to specify the string you want to read from.
But your code still isn't Unicode safe and will only work if your text is guaranteed to be ASCII only.

These deprecations were made to clearly show that CharAt is not a safe function in a Unicode environment.

Re: [4.1.0] Orange ZScript console warnings

by Lagi » Fri May 29, 2020 2:03 am

Sorry, I think it's simple, but I'm defeated by the syntax.

I have this (it works, I just want to remove the messages from the startup log):
lumpString.CharAt(currentPos)
CharAt is now excommunicado, and ByteAt is halal.

So I came up with this :roll:
lumpString.Format("%c", ByteAt(currentPos) )
and my school teacher said: "How can you be so stupid":
Script error, "Bitter Heretic.pk3:zscript/pyw/parse.zsc" line 139:
Call to unknown function 'ByteAt'

Re: [4.1.0] Orange ZScript console warnings

by Graf Zahl » Fri May 29, 2020 12:37 am

At the moment you'll have to convert the character back to a string with String.Format("%c", ...). The main reason these were deprecated, though, is that with Unicode strings, algorithms that pick a string apart character by character won't work as expected and are generally discouraged.

Re: [4.1.0] Orange ZScript console warnings

by Lagi » Fri May 29, 2020 12:30 am

How do I replace CharAt with ByteAt?

CharAt returns a String, while ByteAt returns an int.
int ByteAt (int pos) const

(Need more info)

String CharAt (int pos) const (deprecated)

(Note: this has been deprecated in favor of ByteAt, CodePointCount and GetNextCodePoint.)
Returns the character at the specified position as string.

String s = "abcd";
String chrat = s.CharAt(1); // should be "b"

ByteAt doesn't work where a string is expected.

Re: [4.1.0] Orange ZScript console warnings

by _mental_ » Mon Apr 29, 2019 12:56 am

Matt wrote: What do we use in place of CharAt and ToLower?

ByteAt() and MakeLower().

Matt wrote: I just noticed that no warnings appear for HD's use of the ToLower() function. Is this different?

This depends on the ZScript version required for your mod.
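
A small sketch of the ToLower replacement. The class name is made up, and it assumes that ToLower() changes the string in place while MakeLower() returns a lowercased copy. As far as I understand it, deprecation warnings are only printed when the mod's declared ZScript version is at least the version in which the function was deprecated, which would explain why a mod declaring an older version sees none.

```zscript
// Hypothetical helper for illustration only.
class LowerSketch
{
	static String Lowered(String str)
	{
		// Deprecated form, which modifies the string in place:
		//   str.ToLower();
		// Replacement, which returns a new string:
		return str.MakeLower();
	}
}
```

Because MakeLower() returns a value instead of changing the string, existing calls like s.ToLower(); become s = s.MakeLower();.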

Re: [4.1.0] Orange ZScript console warnings

by Matt » Mon Apr 29, 2019 12:47 am

What do we use in place of CharAt and ToLower?

EDIT: I just noticed that no warnings appear for HD's use of the ToLower() function. Is this different?
