ZScript VM bug with very long function

Bugs that have been investigated and resolved somehow.

Moderator: GZDoom Developers

Forum rules
Please don't bump threads here if you have a problem - it will often be forgotten about if you do. Instead, make a new thread here.
Post Reply
User avatar
phantombeta
Posts: 2088
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

ZScript VM bug with very long function

Post by phantombeta »

The only thing I can think of is a stack overflow or silently running out of registers, but when I theorized it might be this on the Discord, dpJudas said neither should happen due to how the VM handles registers and the stack.
So this time, I have no idea what could be causing this. Sorry.

Steps to reproduce:
  • Launch GZDoom with this example WAD.
  • Bind the "Buy menu" key.
  • Press the "Buy menu" key.
  • Click "Upgrades", then "Informational upgrades".
The shop entries should have human-readable names, but instead show the inventory items' actor names. If you make it print the "name" variable inside the function that creates the shop's data classes*, it displays the correct string, but printing the same variable elsewhere displays the wrong string instead.
Removing a few string literals from "S7_ShopDataEventHandler.mainShop_Populate" (or changing them to blank strings) makes those entries display correctly, which means this is very likely a GZDoom bug.

(* "S7_ShopDataEventHandler.mainShop_Populate" @ /S7ZScript/RPG Systems/ShopData/Lucius.ZS)

Edit: Added the link to the example WAD. <_>
Last edited by phantombeta on Tue Sep 25, 2018 1:13 am, edited 2 times in total.
_mental_
 
 
Posts: 3812
Joined: Sun Aug 07, 2011 4:32 am

Re: ZScript VM bug with very long function

Post by _mental_ »

Didn't you forget to attach an example?
User avatar
phantombeta
Posts: 2088
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

Re: ZScript VM bug with very long function

Post by phantombeta »

Oooops. I blame that on it being nearly 4 AM here when I posted it.
_mental_
 
 
Posts: 3812
Joined: Sun Aug 07, 2011 4:32 am

Re: ZScript VM bug with very long function

Post by _mental_ »

It's a code generation bug. When number of constants in one function exceeds 256, registers are used to pass values to callee.
The same register is used for all such occurrences. To do a quick check I commented out this line which frees a used register. It became to work as expected.
Unfortunately the given case seems to require more complicated handling to do it properly.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: ZScript VM bug with very long function

Post by Graf Zahl »

This is more a design flaw of the VM's instruction set than anything else. Sadly one I discovered far too late. If it wasn't for the need to support 32 bit targets I'd just redesign its instruction set to act more sanely.
And the JIT compiler pretty much makes an instruction set redesign impossible, unless all of it was rewritten.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: ZScript VM bug with very long function

Post by dpJudas »

If you mean the 256 constant limit, which I assume is a limitation of how an opcode is packed, that can easily be changed for the JIT compiler.

Speaking of the JIT compiler, it is more less done (all instructions implemented). Unfortunately there seems to be some problem with the register allocator inside asmjit. It seems to mess up the state (spilled vs placed in physical register) when compiling StatusScreen.drawNum. I haven't been able to reproduce it in a test program yet so I can't report it to the asmjit author.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: ZScript VM bug with very long function

Post by Graf Zahl »

dpJudas wrote:If you mean the 256 constant limit, which I assume is a limitation of how an opcode is packed,

It is indeed. But I do not like how the VM instructions are packed in general. Why all these indirections to constants? Was it really necessary to model the VM after some RISC processor with limited data access capabilities. It the context it makes absolutely no sense at all - and then put a hard byte-size limit on the index for most instructions (and to top it off, never properly check for the limit being exhausted - I only noticed that after having written the entire code generator when writing a function with more than 255 constants and getting random crashes.)

As much as I appreciate the JIT stuff, I always had it planned to rewrite the instruction set once 32 bit targets could be dumped for good. Now that'll get a lot harder.
User avatar
Rachael
Posts: 13560
Joined: Tue Jan 13, 2004 1:31 pm
Preferred Pronouns: She/Her
Contact:

Re: ZScript VM bug with very long function

Post by Rachael »

As far as I know, dpJudas and Gutawer abandoned the JIT compiler because of a fatal error that could not be resolved without an update to asmjit directly.

It might be viable to ask them if they would like to abandon it completely that it would be possible to redo the VM from the ground up - if they might be willing to redo their work on the JIT. I imagine they'd favor progress of the engine over what may potentially be a dead end feature.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: ZScript VM bug with very long function

Post by Graf Zahl »

The code which is there will always be useful.

But for redoing the VM, the most important thing is not the instruction set, but to redo the calling convention, most importantly find a better way to deal with constructible local variables, like strings and arrays. There's some insane internal complexity here in the type system to make this all work and if it could be redone by moving these type dependencies out of the VM, I guess the rest will become a lot easier.

Unfortunately I still haven't had the right idea that both works and is compatible with the existing class interface for these two types.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: ZScript VM bug with very long function

Post by dpJudas »

Let's not throw out the baby with the bathwater. The register allocation bug in asmjit can be fixed, there's no need to delete/abandon the code.

I'm not sure what 32-bit targets has to do changing any of this. My stance has always been that once the JIT compiler supports all opcodes (which it does now, aside from bugs) then the JIT doesn't really have to follow the VM's calling convention. For example, if a function in ZScript is declared as 'int foobar(int, float)', then the JIT compiler could generate a plain x64 calling convention function taking and returning those types. When another jitted function calls foobar, it will generate a x64 call, not a VM call. Unlike the VM the JIT can generate the function signature at compile time.

There are only three places this strategy breaks down:

1) When it needs to call a native function
2) When native code calls the VM
3) Complex VM frame setup

For case #1 it already doesn't use the full VM convention. The native function expects a couple of arrays as input that the JIT itself can easily generate. For #2 it could generate some basic thunk functions that unpacks the VM data and calls a native x64 function. #2 could also be updated so that it expects the same input format as the native functions expects.

#3 can initially be solved by using the thunk function for complex stuff (strings and arrays, mostly). Later on it could allocate and free those objects on its own, once again reducing it to x64 convention calls.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: ZScript VM bug with very long function

Post by Graf Zahl »

For #1 nothing really prevents us from changing the setup to have both a real function with a defined signature, so that the JIT can use it, and the VM thunk for scenarios where a direct call cannot be done. This really shouldn't be any different than calling another JITed function. The big issue originally was that it passed strings by value, but since that was extremely costly with the thunk system I already changed it to pass strings by reference so that all function parameters are trivial types.
#2 is obvious. There's really no way around it.
#3 is the big issue, and it really can only be solved if the strings and arrays are being handled differently - this has actually been my main concern with the entire VM's design since day one. If this can be solved most of the complex stuff would just evaporate into nothing and the JIT could just allocate its stack frame on the real stack and zero it before starting the actual function. The main issue with letting the JIT code handle it is that it's not exception safe. And correct me if I am wrong, but I do not think that Asmjit can create exception-aware code.

Just be glad that I already removed some of the really ugly stuff that plagued the VM. The original design had a separate type tag for each pointer and this was some truly hideous stuff which ultimately served no valuable purpose except allowing some debug asserts to be more robust.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: ZScript VM bug with very long function

Post by dpJudas »

Graf Zahl wrote:The main issue with letting the JIT code handle it is that it's not exception safe. And correct me if I am wrong, but I do not think that Asmjit can create exception-aware code.
It cannot. What I'm currently doing is to have a try/catch in any lambda call to C++ which converts the exception into something the JIT code can handle. With C++11 we can store the caught exception and rethrow it after the JIT function(s) returned. Not perfect, but that's the price for using C++ exceptions in the VM.
User avatar
Gutawer
Posts: 469
Joined: Sat Apr 16, 2016 6:01 am
Preferred Pronouns: She/Her

Re: ZScript VM bug with very long function

Post by Gutawer »

Graf Zahl wrote:As much as I appreciate the JIT stuff, I always had it planned to rewrite the instruction set once 32 bit targets could be dumped for good. Now that'll get a lot harder.
Just to chime in on this, I've had fun doing the JIT stuff, so I wouldn't be opposed to helping rewrite it for a different set of VM instructions. So if you do rewrite the instruction set I'll be happy to help with re-implementing JIT based on it.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: ZScript VM bug with very long function

Post by Graf Zahl »

Post Reply

Return to “Closed Bugs [GZDoom]”