DoomEdNumb > 32768

Re: DoomEdNumb > 32768

by Quasar » Tue Jun 02, 2009 5:26 pm

OK so the consensus seems to be to just keep the grammar for string literals the same, and to define additional escape characters. I will work on this and post a 1.2 candidate specification in the DWF source ports forum.

Re: DoomEdNumb > 32768

by Graf Zahl » Tue Jun 02, 2009 3:51 pm

Let's just hope it never happens...

Wanna bet that some port gets it wrong eventually?

Re: DoomEdNumb > 32768

by CodeImp » Tue Jun 02, 2009 3:18 pm

\\ and \" are already in the UDMF specifications.

And yes, of course the tool that reads/writes the UDMF should perform the string escaping; it has always been that way.

But I don't understand what you mean by saying texture names should not use escape characters. They are just strings as well, and the same rules apply to them. No parser is going to treat them as anything other than strings like any other. So a texture named BLA"BLA must always be stored in the UDMF text as upper="BLA\"BLA"; unless we make some exceptional notation for them, but that would be dirty IMO. Of course, after a parser reads the UDMF text, it will unescape that string and return the value of the upper field normally as BLA"BLA. The same goes for textures with \ in their name, and if we decide to use these other escape characters, it also goes for textures with a newline in their name (laugh, but it is possible).
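A minimal Python sketch of the round trip described above, assuming only the two escapes already in the spec (\\ and \"). The names escape_udmf and unescape_udmf are hypothetical and not taken from any editor or port.

```python
def escape_udmf(value: str) -> str:
    """Return value as a quoted literal; backslashes are escaped before quotes."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def unescape_udmf(literal: str) -> str:
    """Reverse escape_udmf: strip the outer quotes and drop one escape level."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("not a quoted string literal")
    body = literal[1:-1]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\":
            if i + 1 >= len(body):
                raise ValueError("dangling backslash")
            out.append(body[i + 1])  # keep the escaped character as-is
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```

A texture name goes through this like any other string: the writer calls escape_udmf, the reader calls unescape_udmf, and the mapper only ever sees the plain name.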

Re: DoomEdNumb > 32768

by Graf Zahl » Tue Jun 02, 2009 2:52 pm

CodeImp wrote:Then I suggest extending the use of the \ character to do more of the same things the C language also does: output special characters.
\n for newline
\r for carriage return
\t for tab
\x00 for any character by ASCII hex code (for example \x44 for ASCII character 68; or we could just stick to decimal notation and maybe use c instead of x, for example \c44 for ASCII character 44)

I think these are enough - except \\ and \", of course. In the rare case one of the others is needed \x should be sufficient.
But IMO this should not be exposed to the mapper if possible, so the editor should do some automatic conversion of user input. There's just one thing to look out for: Texture names must *NOT* be filtered this way. Although I doubt that many people have actually used backslashes in their texture names, it's not something I'd rule out completely.

Re: DoomEdNumb > 32768

by CodeImp » Tue Jun 02, 2009 2:41 pm

Then I suggest extending the use of the \ character to do more of the same things the C language also does: output special characters.
\n for newline
\r for carriage return
\t for tab
\x00 for any character by ASCII hex code (for example \x44 for ASCII character 68; or we could just stick to decimal notation and maybe use c instead of x, for example \c44 for ASCII character 44)

That is really all; there aren't many more useful ones. Characters other than tabs and newlines/returns shouldn't cause any problems. We may not even need \x00.
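As a rough illustration of the list above, here is a Python sketch of a decoder for those escapes. It assumes \x is followed by exactly two hex digits (as in the \x00 and \x44 examples) and that unknown escapes are rejected; neither point is settled in this thread. The name decode_escapes is hypothetical.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"
SIMPLE_ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "\\": "\\", '"': '"'}


def decode_escapes(body: str) -> str:
    """Decode the text between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("dangling backslash")
        code = body[i + 1]
        if code in SIMPLE_ESCAPES:
            out.append(SIMPLE_ESCAPES[code])
            i += 2
        elif code == "x":
            digits = body[i + 2:i + 4]  # assumption: exactly two hex digits
            if len(digits) != 2 or any(d not in HEX_DIGITS for d in digits):
                raise ValueError("\\x needs two hex digits")
            out.append(chr(int(digits, 16)))
            i += 4
        else:
            raise ValueError("unknown escape \\" + code)
    return "".join(out)
```

With this, \x44 decodes to character 68, and the editor can apply the matching encoder to user input so the mapper never has to type the escapes by hand.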

Re: DoomEdNumb > 32768

by Graf Zahl » Tue Jun 02, 2009 2:22 pm

Multiline string literals are a very bad idea because they impose severe limitations on the parser. The parser I am currently using to parse UDMF sure can't handle this. It will be even more problematic with tools that just read the lump line by line, store it temporarily and write it back out again, like ZDBSP, for example. The more syntax it has to preserve, the harder it gets.

Let's stick to something that doesn't break the current simplicity of the format!

Re: DoomEdNumb > 32768

by CodeImp » Tue Jun 02, 2009 1:29 pm

All this is a moot point. If you don't feel like escaping " with a \ then you shouldn't write UDMF by hand in a plain text editor, because UDMF has a whole lot more single-character tokens you have to take care of.
The heredoc suggestion is breakable. Even with [char]@ as the token (which is a two-char token, by the way, and requires a more complex parser) I could come up with sequences that break it (and sooner or later someone might do this and it will break his whole map). Also, this requires any tool that writes UDMF to a file to look for a suitable token beforehand, which also makes it more complex.
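To make the writer-side cost concrete, here is a Python sketch of the search a tool would need under the @[char] ... [char]@ variant. It assumes the heredoc ends at the first occurrence of [char]@ and that any ASCII punctuation character except @ may serve as [char]; both are assumptions, since the variant was never specified. The name pick_delimiter is hypothetical.

```python
import string
from typing import Optional

# Assumption: any ASCII punctuation character other than '@' may be used.
CANDIDATES = string.punctuation.replace("@", "")


def pick_delimiter(payload: str) -> Optional[str]:
    """Return a character c such that c + '@' never occurs in payload.

    Returns None when every candidate collides, in which case the payload
    cannot be written verbatim at all.
    """
    for ch in CANDIDATES:
        if ch + "@" not in payload:
            return ch
    return None
```

The whole payload has to be scanned before the first byte of the field can be written, and a None result still has to be handled somehow.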

If you don't write UDMF by hand using a plain text editor then this is also a moot point, because any editor programmed to work with UDMF can (and should) be made to work with the string notation we already have (the " escaped with a \ ).

As I understand it (though I may be incorrect), the UDMF specs already say that newlines and tab characters (hard, unescaped) inside a string must be preserved. So no change would be needed.

The above is all logic, no personal opinion.
Now for my personal opinion: Stick with the C-style strings we have! They are well-known (every programmer knows them) and IMO easiest of all to parse. Sure, escaping " with \ is not how a common person today would write a thriller, but neither are you going to make a map by writing UDMF by hand.

Re: DoomEdNumb > 32768

by Quasar » Tue Jun 02, 2009 1:21 pm

CodeImp objects to the fact that you cannot include an "@ sequence in this sort of heredoc. When I suggested the ability to use @[any character] [any character]@, where [any character] is selected to avoid conflicting with any given two character sequence in the string, he didn't like that idea either.

He instead suggests that we extend UDMF string literals to multiline, and specify that such hard linebreaks encountered be embedded into the strings. This loses most of the utility of heredocs, however, because it requires all strings placed into them to have all \ and " characters escaped, requiring the insertion of text only through tools which understand the translation that has to be applied (this completely removes the ability to just freely paste the code into an existing UDMF map, for example).

So what do you think? Worth it or not?

Re: DoomEdNumb > 32768

by Quasar » Mon Jun 01, 2009 1:41 am

The UDMF grammar as currently defined does not allow line breaks inside string literals; that is one limitation. The other is that having to escape characters makes the input of complex strings difficult, and may require an extra processing step to get rid of them between parsing the string out of UDMF and passing it to the code that uses it.

The primary thing this will be useful for to Eternity is embedding scripts. Back when we worked on defining UDMF, you referred to the lack of the ability to contain such things as a shame, but this is a simple solution that will allow them, and anything else that won't fit into UDMF's strict grammar. As an important bonus, non-implementors don't have to concern themselves with the syntax within; they can just treat the field like any other string value.

If you are still curious, I can give you more details about what we have planned, but I would rather do it in private as we are trying to avoid overhyping our current plans for scripting until they have fully materialized.

Re: DoomEdNumb > 32768

by Graf Zahl » Mon Jun 01, 2009 1:06 am

Quasar wrote: To me it's not important that any EE specials get crammed into the 0-255 range so you pretty much have freedom over that guaranteed ;) EE is going to freeze the Doom map format, and will consider the Hexen map format deprecated for purposes of making new maps. UDMF is going to be the "one way to the light" as far as EE editing goes in the future. We participated in defining this spec in order to put away problems like ExtraData and all the old BOOM hacks, so I feel justified in encouraging its use in any way possible, even if that encouragement might be construed as forceful to some extent :)
Agreed. But it sure won't suit the people still using DeepSea, WadAuthor or (God forbid!) Zeth. Although I have to admit that in the long run they will be left behind. UDMF should become the map format of choice for ZDoom mapping, too. It would make a lot of things much easier.
While I have your attention about UDMF-related things, there is an extension to the syntax which I believe would be very useful in general for expansions that need syntax that's not covered by the UDMF grammar (which was intentionally kept to the bare minimum, as you already know). This extension, heredocs, was already implemented in Eternity's EDF parser some time ago, and works like this:

Code:

someblock 
{
   fieldname = 
   @" 
       Anything can be typed inside here. "" {} / \ * + () ; ; ;
       It is multilined and everything within is interpreted literally
       with no translation or syntactic evaluation.
   "@
}
The syntax which EE uses for heredocs was adopted from the Windows PowerShell language (IIRC anyway); languages which support them all have different syntax, but the @" "@ delimiters are what worked best in EDF. The important property of heredocs, as mentioned above, is that they support long, multiline literals that contain ordinarily reserved characters. They are also relatively easy to parse, requiring only a single character of look-ahead to check for the @ after any quotation mark found within. Other than that, their tokenization tends to be similar to, but simpler than, that needed by ordinary string literals.

Some things we are planning for EE could strongly benefit from the ability to embed such literals. If you think it's a good idea, we could talk to CodeImp and see if it can be put into a UDMF 1.2 specification. Lemme know what you think.
OK, but what's the purpose? Why do you need unquoted literals? Wouldn't it be just as good to put such things into quotes and use standard escape sequences like \n, \t or \" to get these characters in?

But sure, if you see some use in this, I could extend the parser so that fields defined like this are skipped properly.

Re: DoomEdNumb > 32768

by Quasar » Sun May 31, 2009 5:09 pm

What you said about special numbers and trying to stay within the range of 255 raises an important issue which I was touching upon in the previous post. If EE and ZDoom will cooperate on a higher level spec for establishing Hexen-based lines and some common extensions within UDMF, we'll need a way to decide where the extensions will fit in the future should we decide to share more common functionality beyond the initial specification.

To me it's not important that any EE specials get crammed into the 0-255 range so you pretty much have freedom over that guaranteed ;) EE is going to freeze the Doom map format, and will consider the Hexen map format deprecated for purposes of making new maps. UDMF is going to be the "one way to the light" as far as EE editing goes in the future. We participated in defining this spec in order to put away problems like ExtraData and all the old BOOM hacks, so I feel justified in encouraging its use in any way possible, even if that encouragement might be construed as forceful to some extent :)

While I have your attention about UDMF-related things, there is an extension to the syntax which I believe would be very useful in general for expansions that need syntax that's not covered by the UDMF grammar (which was intentionally kept to the bare minimum, as you already know). This extension, heredocs, was already implemented in Eternity's EDF parser some time ago, and works like this:

Code:

someblock 
{
   fieldname = 
   @" 
       Anything can be typed inside here. "" {} / \ * + () ; ; ;
       It is multilined and everything within is interpreted literally
       with no translation or syntactic evaluation.
   "@
}
The syntax which EE uses for heredocs was adopted from the Windows PowerShell language (IIRC anyway); languages which support them all have different syntax, but the @" "@ delimiters are what worked best in EDF. The important property of heredocs, as mentioned above, is that they support long, multiline literals that contain ordinarily reserved characters. They are also relatively easy to parse, requiring only a single character of look-ahead to check for the @ after any quotation mark found within. Other than that, their tokenization tends to be similar to, but simpler than, that needed by ordinary string literals.
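The one-character look-ahead can be sketched in Python roughly as follows. This is not Eternity's EDF code; read_heredoc is a hypothetical name, and the sketch assumes the heredoc ends at the first "@ pair with no escaping of any kind inside.

```python
from typing import Tuple


def read_heredoc(text: str, start: int) -> Tuple[str, int]:
    """Read an @" ... "@ heredoc that begins at text[start].

    Returns the literal body and the index just past the closing "@.
    """
    if text[start:start + 2] != '@"':
        raise ValueError('heredoc must start with @"')
    begin = start + 2
    i = begin
    while i < len(text):
        # A quote only ends the heredoc if the next character is '@'.
        if text[i] == '"' and i + 1 < len(text) and text[i + 1] == "@":
            return text[begin:i], i + 2
        i += 1
    raise ValueError("unterminated heredoc")
```

Everything between the delimiters comes back untouched, including quotes, braces, backslashes and line breaks.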

Some things we are planning for EE could strongly benefit from the ability to embed such literals. If you think it's a good idea, we could talk to CodeImp and see if it can be put into a UDMF 1.2 specification. Lemme know what you think.

Re: DoomEdNumb > 32768

by Graf Zahl » Sun May 31, 2009 3:51 pm

I was planning to go beyond 255 myself once the low numbers are filled up. But currently it's not even close and there haven't been too many new specials recently so for the time being I try to fit anything new into the existing 255 entries to have it accessible by binary Hexen format maps as well.

The compatibility is a major factor though. As much as I understand why you try to keep it, I think there will eventually come a time when the work required to keep it operational becomes disproportionate to the intended goal or stalls future development.

Re: DoomEdNumb > 32768

by Quasar » Sun May 31, 2009 3:33 pm

I am interested in maintaining compatibility where it is practical to do so, so I'll let you know as this develops. Within the scope of UDMF, having long integer ranges on stuff like special numbers could allow a Unicode-like approach to stuff like line specials where nobody has to be forced to step on anybody else's toes. I have all along thought that defining a higher-level specification for UDMF that sits on top of the basic spec for advanced ports so that they can have shared means of specifying features that already work the same way is a good idea.

I have no idea how EE's more stringent compatibility goals might affect my efforts to fold together the line systems of the various games, since this is in the high conceptual stage at the moment. I really kind of dread the nitty gritty details that are going to emerge as I get into the meat of it.

Re: DoomEdNumb > 32768

by Graf Zahl » Sun May 31, 2009 3:06 pm

:)

A bit too late though... We could have avoided a lot of trouble if he had decided to do it a year ago.

I sure hope he ensures that he uses the same line special numbers as ZDoom. Maybe then we can define something like a Boom v2 standard, based on Hexen line types and UDMF.

Re: DoomEdNumb > 32768

by Gez » Sun May 31, 2009 3:02 pm

Graf Zahl wrote:Same about line specials. I would have gotten rid of the old Doom-type specials altogether because IMO they serve no real purpose in an advanced format but had I done that we again would have gotten nothing. Remember, the only other engine that is about to support UDMF is Eternity which uses a completely different approach to handle both Doom and Hexen specials - unfortunately one that goes out of its way to avoid streamlining and unification of features.
It seems Quasar agreed it was a bad idea. :wink:
Quasar said:
Eternity's line special system is being rewritten to accommodate complete streamlining of Doom/Heretic/Strife with Hexen in a manner similar to ZDoom
