[3.2.0] Nonbreaking spaces are silent but deadly

Post by Matt »

viewtopic.php?f=3&t=58129

The actors HandgunHot and ShotHot are unrecognized, and attempts to summon or give them give an error that no such actor exists.

Trying to copypaste the actor name along with the nonbreaking space before the colon doesn't work:

Code:

summon HandgunHot 
gives me

Code:

Unknown item "HandgunHot"
with the nonbreaking space removed.

Post by _mental_ »

The main problem here is the use of UTF-8 encoding. While it's fine for comments to use an arbitrary encoding, the code itself should stay strictly within the ASCII character set.
Unfortunately, GZDoom has no notion of encoding for text lumps, and I don't think that silently assuming UTF-8 (or any other encoding) is a good idea.

Post by Graf Zahl »

GZDoom does not parse UTF-8. The only text format it can currently read is ISO-8859-1. So yes, a UTF-8 encoded file with non-breaking spaces will create a class name that contains not a non-breaking space but the bytes of its UTF-8 encoding.
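
To make the byte-level effect concrete, here is a small Python sketch (not GZDoom code). It assumes the class name was typed in a UTF-8 editor with a trailing non-breaking space; the sample name is taken from the report, and the exact source line is not shown in this thread.

```python
# Illustration only: how a UTF-8 non-breaking space looks to a reader
# that interprets every byte as ISO-8859-1 (Latin-1).
NBSP = "\u00a0"

# Assumed input: the class name as typed, followed by an NBSP.
typed_name = "HandgunHot" + NBSP

utf8_bytes = typed_name.encode("utf-8")         # what is stored in the file
seen_as_latin1 = utf8_bytes.decode("latin-1")   # what a Latin-1 reader sees

print(utf8_bytes[-2:])                          # the NBSP is stored as 0xC2 0xA0
print(len(typed_name), len(seen_as_latin1))     # 11 characters vs. 12
print(seen_as_latin1 == "HandgunHot")           # False: the names do not match
```

Read as ISO-8859-1, the two bytes become a visible "Â" followed by a non-breaking space, which is why the resulting name matches neither the plain name nor the name with a single NBSP.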

Post by _mental_ »

Now I'm a bit confused.

Let's assume that we have a DECORATE lump encoded in ISO 8859 with a non-breaking space before or after the class name.
Did I get it right that the space character must be part of the name? That is how it works at the moment, but it looks like a bug to me.

If we use one particular encoding and don't support any other, I think we should treat the 0xA0 character as whitespace.
Of course, this should be done only if there are no known issues with it.

Post by JPL »

If a fix isn't possible, would an error message suffice? If I use a visible Unicode character, it should be pretty obvious what my problem is; the fact that it's "silent but deadly" seems the worst aspect of this particular issue.
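
As an illustration of what such a diagnostic could look like outside the engine, here is a hedged Python sketch of a standalone check. It is not a GZDoom feature; the function name `find_non_ascii` and the sample lump contents (including the `Inventory` parent) are made up for the example.

```python
# Sketch of an external check (not part of GZDoom): report every byte
# outside the ASCII range so that invisible characters become visible.
def find_non_ascii(data: bytes):
    """Return (line, column, byte) for each non-ASCII byte; both 1-based."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col_no, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col_no, byte))
    return hits


# Hypothetical lump contents: a UTF-8 NBSP (0xC2 0xA0) follows the name.
sample = b"actor HandgunHot\xc2\xa0: Inventory\n{\n}\n"
for line_no, col_no, byte in find_non_ascii(sample):
    print(f"line {line_no}, column {col_no}: non-ASCII byte 0x{byte:02X}")
```

Note that this simple version also flags non-ASCII bytes inside comments, so its reports would need to be read with that in mind.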

Post by _mental_ »

All characters from '\0' to ' ' (space), except '\n' (newline), are treated as whitespace.
All other characters (non-breaking space or not) are handled like letters or digits.
So any "problematic" character is attached to the adjacent identifier.
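
The rule above can be sketched in a few lines of Python. This is a simplified illustration rather than GZDoom's actual scanner: the function name `split_tokens` is invented, the input line is hypothetical, and the special handling of '\n' is left out so that it simply acts as a separator.

```python
# Sketch of the splitting rule described above (not GZDoom's actual code):
# bytes 0x00..0x20 separate tokens; every other byte joins the current token.
def split_tokens(data: bytes):
    tokens, current = [], bytearray()
    for byte in data:
        if byte <= 0x20:           # '\0' through ' ' ends the current token
            if current:
                tokens.append(bytes(current))
                current.clear()
        else:                      # includes 0xA0 and all other high bytes
            current.append(byte)
    if current:
        tokens.append(bytes(current))
    return tokens


# Hypothetical ISO-8859-1 input with a non-breaking space (0xA0) after the name.
print(split_tokens(b"actor HandgunHot\xa0 : Inventory"))
```

The second token comes out as the name with the 0xA0 byte still attached, which is the behavior being discussed.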

Sorry, but I have no idea what error message you would want to get in this case.

Post by Gez »

Looks like it's relatively easy to make an nbsp by mistake on Mac since the input code is just alt+space. (On Windows it's Alt+255, which makes it very hard to type unwittingly.)

Post by Graf Zahl »

This is 'can't fix' because the DECORATE parser has to go out of its way to allow the kinds of weird class names some people cooked up over all these years. It simply treats everything that isn't a space, newline, tab or ':' as part of the class name. ZScript uses strict identifier syntax where this cannot happen. Identifiers can only be 0..9, a..z and '_'.
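
The contrast between the two rules can be sketched in Python. This is only an illustration under stated assumptions: the function names are invented, the strict pattern mirrors the character set quoted above (matched case-insensitively), and any further rules of the real ZScript grammar are not modeled.

```python
import re

# Strict identifier pattern, assumed here for illustration: ASCII letters,
# digits and '_' only (matched case-insensitively).
STRICT_IDENTIFIER = re.compile(r"[0-9a-z_]+", re.IGNORECASE | re.ASCII)

# Characters that end a class name under the permissive rule described above.
PERMISSIVE_STOP = " \n\t:"


def permissive_name(text: str) -> str:
    """Take everything up to the first space, newline, tab or ':'."""
    for index, char in enumerate(text):
        if char in PERMISSIVE_STOP:
            return text[:index]
    return text


def is_strict_identifier(name: str) -> bool:
    return STRICT_IDENTIFIER.fullmatch(name) is not None


# Hypothetical input: the class name followed by an NBSP and a colon.
name = permissive_name("HandgunHot\u00a0: Inventory")
print(repr(name))                          # the NBSP is kept in the name
print(is_strict_identifier(name))          # a strict rule rejects it
print(is_strict_identifier("HandgunHot"))  # the plain name is accepted
```

Under the permissive rule the stray character silently becomes part of the name, while a strict rule gives the parser a point at which it can reject the input.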