[4.0.0, possibly earlier] "Invalid Instruction: mov"

These bugs are planned to be resolved when they can be.

Post by Matt » Sat Apr 20, 2019 11:10 am

Seemingly totally at random, I get this popup window and the game fails to start.

Does anyone have any idea what this even means?

Post by phantombeta » Sat Apr 20, 2019 11:11 am

That means something went wrong when jitting a function into x86_64 code. Does it say anything else?

Post by Matt » Sat Apr 20, 2019 11:25 am

Not that I'm aware of. It seems to be completely random and I've never had it happen twice in a row if I try to run GZDoom again with exactly the same command line parameters.

EDIT: I suppose if it were completely random I'd have had a twofer by now...

Post by Caligari87 » Sat Apr 20, 2019 11:57 am

I got it as well a few days ago and was unable to reproduce it.

8-)

Post by Player701 » Wed Dec 04, 2019 12:53 am

Sorry for the bump. This bug apparently still exists as of GZDoom 4.2.4, because just a few days ago, during a routine test of my mod, GZDoom crashed immediately with the following error:

Invalid instruction: test <None>, <None>

EDIT: Here's another one I've got just now:
Invalid instruction: mov <None>, qword [rbp+16]

Since I resumed work on my project, I've also got a few address zero VM aborts, which, judging by the stack traces, happened in completely random places in my code where it was simply not possible for such an error to happen. This issue has already been reported here, but we couldn't see stack traces before. Now that I do see them, they don't tell me anything useful at all.

I've also had an unexpected JIT menu error once; it looks like it has already been reported here. Unfortunately, I don't have the exact error message, but if I get it again, I will post it in the corresponding thread.

It might be possible that all these issues are connected to each other somehow, since they all either are JIT-related or have started to appear since the JIT merge. However, they seem to be extremely rare, and there is no definite means to reproduce any of them reliably.

I can rig an automation script to repeatedly run GZDoom in a loop and leave it running overnight; if it crashes, the window with the error message will remain, and I will be able to report it here. This way, I can catch invalid instruction and address zero errors. However, I have no idea if the exact error messages/stack traces will be of any use to the developer team. What if I run a debug build of GZDoom instead? If it crashes like this, will it be possible to retrieve any useful debugging information from the process when it has already errored out?
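A minimal sketch of the overnight loop described above, written in C++ to match the engine's own language. The `runUntilFailure` and `makeLauncher` helpers are hypothetical, as is the assumption that a failed start shows up as a nonzero exit status; since the error appears as a popup window, a real harness may need to detect the failure some other way.

```cpp
#include <cstdlib>
#include <functional>
#include <string>

// Runs `launch` up to `maxRuns` times and returns the 1-based index of the
// first run that reported a nonzero status, or 0 if every run succeeded.
int runUntilFailure(const std::function<int()>& launch, int maxRuns) {
    for (int run = 1; run <= maxRuns; ++run) {
        if (launch() != 0) {
            return run;  // stop here so the failing state can be inspected
        }
    }
    return 0;
}

// Hypothetical launcher: starts one process through std::system and forwards
// its status. What that status means is implementation-defined.
std::function<int()> makeLauncher(const std::string& commandLine) {
    return [commandLine]() { return std::system(commandLine.c_str()); };
}
```

For example, `runUntilFailure(makeLauncher("gzdoom -file mymod.pk3"), 1000)` would stop at the first run that exits abnormally; the command line is only a placeholder.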

Post by phantombeta » Wed Dec 04, 2019 1:58 am

Player701 wrote:Sorry for the bump. This bug apparently still exists as of GZDoom 4.2.4, because just a few days ago, during a routine test of my mod, GZDoom crashed immediately with the following error:

Invalid instruction: test <None>, <None>

EDIT: Here's another one I've got just now:
Invalid instruction: mov <None>, qword [rbp+16]

This is kinda known. The bug isn't easily reproduced, so it's pretty hard to try to fix it. This is also very likely to be a bug in the AsmJit library itself, rather than in GZDoom's JIT code.

Player701 wrote:Since I resumed work on my project, I've also got a few address zero VM aborts, which, judging by the stack traces, happened in completely random places in my code where it was simply not possible for such an error to happen. This issue has already been reported here, but we couldn't see stack traces before. Now that I do see them, they don't tell me anything useful at all.

Unfortunately, without a stack trace, there's no way to tell if it might be the same bug. Please do post it if you get one.

Player701 wrote:It might be possible that all these issues are connected to each other somehow, since they all either are JIT-related or have started to appear since the JIT merge. However, they seem to be extremely rare, and there is no definite means to reproduce any of them reliably.

Extremely unlikely. The bug this thread was made to report is one where the JIT fails to output valid x86_64 code, while Nash's bug report seems to be possibly related to order of execution. That issue may already have existed before the JIT was added, but went unnoticed because the VM can be more lenient than the JIT when garbage data (especially pointers) is involved.
(Why this code generation bug happens is unknown, but it's known that the issue itself is pretty rare. Like I said above, it's most likely a bug in the AsmJit library itself.)

[Edit]: Moving this one to On Hold Bugs too so it doesn't get lost either.

Post by Player701 » Wed Dec 04, 2019 2:05 am

phantombeta wrote:
Player701 wrote:Since I resumed work on my project, I've also got a few address zero VM aborts, which, judging by the stack traces, happened in completely random places in my code where it was simply not possible for such an error to happen. This issue has already been reported here, but we couldn't see stack traces before. Now that I do see them, they don't tell me anything useful at all.

Unfortunately, without a stack trace, there's no way to tell if it might be the same bug. Please do post it if you get one.

I can do that, but so far the error has always happened in mod code, so I'm not sure if such a stack trace would be of any use. I can, however, guarantee that this is not a mod bug: 99.9% of the time it runs fine, and then once in a blue moon this VM abort happens. My code doesn't involve any kind of randomness that could result in such an error, and the exact place where it aborts is always different.

Post by dpJudas » Wed Dec 04, 2019 2:08 am

Those errors are most likely caused by an uninitialized variable, either in GZDoom's code or in asmjit. Unfortunately, it will be very difficult to track down, as asmjit doesn't detect invalid instructions at the insertion point. The call stack when it throws the exception will therefore be useless.

I am working on a compiler backend in a private project that I may open source to use it in GZD. That should solve the problem along with some issues I don't like about our current JIT implementation. However, that still isn't quite ready yet.
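To illustrate why late detection makes the call stack useless, here is a small sketch under stated assumptions. The `Emitter` and `Instr` types are hypothetical and are not asmjit's API; an empty operand stands in for the `<None>` operands seen in the error messages. It contrasts checking at insertion time with checking only when the recorded instructions are processed later.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction record; an empty operand models "<None>".
struct Instr {
    std::string mnemonic;
    std::string dst;
    std::string src;
    std::string origin;  // label recorded at emit time, e.g. the VM opcode being translated
};

class Emitter {
public:
    // Deferred mode: anything is accepted and only stored for later.
    void emit(Instr instr) { pending_.push_back(std::move(instr)); }

    // Eager mode: reject the instruction while the caller is still on the stack.
    void emitChecked(Instr instr) {
        if (!isValid(instr)) throw std::invalid_argument(describe(instr));
        pending_.push_back(std::move(instr));
    }

    // Late validation: by now the code that emitted the bad instruction has
    // returned, so only the recorded origin can identify it.
    std::string firstInvalid() const {
        for (const Instr& instr : pending_) {
            if (!isValid(instr)) return describe(instr);
        }
        return "";
    }

private:
    static bool isValid(const Instr& instr) {
        return !instr.dst.empty() && !instr.src.empty();
    }
    static std::string describe(const Instr& instr) {
        return "Invalid instruction: " + instr.mnemonic + " (emitted for " + instr.origin + ")";
    }
    std::vector<Instr> pending_;
};
```

With `emitChecked`, the exception is thrown from inside the code that built the bad operand; with `emit` followed by `firstInvalid`, the only remaining clue is whatever origin label was stored alongside the instruction.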

Post by Graf Zahl » Wed Dec 04, 2019 5:16 am

Now you really made me curious... ;)

Post by dpJudas » Wed Dec 04, 2019 5:55 am

I have a script compiler that I originally wrote to output to LLVM. Then due to the issues we also had when I used it for GZD I wrote my own backend that was more or less API compatible with the IRBuilder in LLVM. So it is like a mini-llvm with the same general compiler strategy.

In my current version of the IR backend I'm using asmjit for register allocation and lowering it to x64 opcodes, but I'm working on writing my own register allocator and x64 asm writer. Once I'm done with that I'll have a complete compiler backend that can JIT with no external dependencies.

For GZD this means I can actually do certain things that I kind of hacked asmjit into doing: providing unwind info to the OS. It will also allow me to actually code optimization passes and get rid of the 256 virtual register limit that asmjit has. I wish I could have added these things to asmjit, but its code is written in such a way that I have no idea how to even begin writing a proper PR for it.

Post by Graf Zahl » Wed Dec 04, 2019 6:30 am

dpJudas wrote:[...] but its code is written in such a way that I have no idea how to even begin writing a proper PR for it.


Welcome to the club! I got the same problem with a certain other project I'm working on (I guess you know what I mean ;)), it's also written in a way that makes it very, very hard to implement stuff in a sane manner.

Regarding JIT in general, it's really a shame that there's no way to create Visual Studio debugger info for scripted content - if that existed a lot more of the engine could be scriptified.

Post by Nash » Wed Dec 04, 2019 6:48 am

Something I failed to mention in the past: I noticed that these "mysterious errors" started popping up after the level rewrite was merged into master. And I mean the whole bunch: invalid instruction mov, unexplainable VM aborts, etc. I can say with 90% certainty that these things never happened before said merge.

Edit for clarification: I am not blaming the level rewrite, perhaps it's not related at all... but I remember clearly _when_ these started manifesting.

Post by dpJudas » Wed Dec 04, 2019 7:06 am

Nash wrote:Something I failed to mention in the past: I noticed that these "mysterious errors" started popping up after the level rewrite was merged into master. And I mean the whole bunch: invalid instruction mov, unexplainable VM aborts, etc. I can say with 90% certainty that these things never happened before said merge.

Edit for clarification: I am not blaming the level rewrite, perhaps it's not related at all... but I remember clearly _when_ these started manifesting.

I don't think the level rewrite can cause this. If anything, it was some other scripting backend related change during the same period that started it. The most likely cause of the error is that either A) the JitCompiler class receives a VM register index that is out of bounds, B) it fails to initialize an asmjit virtual register itself, or C) asmjit messes up its internal state.

We could add some validation for it in the JitCompiler, but I'd rather invest my time in my own IR backend. The unwind code in GZD is sort of a ticking time bomb in the sense that it's extremely low level, IMO should be done by asmjit, and doesn't seem likely to become a feature there unless I add it myself (which I can't).
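As a rough sketch of what such validation might look like for candidates A and B: the `RegisterTable` class and `RegCheck` enum below are hypothetical and are not taken from GZDoom's JitCompiler. The idea is simply to track which VM register indices have been given a virtual register and to check every use against that table.

```cpp
#include <cstddef>
#include <vector>

enum class RegCheck { Ok, OutOfBounds, Uninitialized };

// Hypothetical table mapping VM register indices to backend virtual registers.
class RegisterTable {
public:
    explicit RegisterTable(std::size_t count) : initialized_(count, false) {}

    // Marks a register as backed by a virtual register.
    // Returns false when the index lies outside the table.
    bool define(std::size_t index) {
        if (index >= initialized_.size()) return false;
        initialized_[index] = true;
        return true;
    }

    // Classifies a register use before any instruction is emitted for it.
    RegCheck check(std::size_t index) const {
        if (index >= initialized_.size()) return RegCheck::OutOfBounds;  // candidate A
        if (!initialized_[index]) return RegCheck::Uninitialized;        // candidate B
        return RegCheck::Ok;
    }

private:
    std::vector<bool> initialized_;
};
```

A check like this cannot catch candidate C, since corrupted state inside the library would not be visible from the caller's side.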

Graf Zahl wrote:Regarding JIT in general, it's really a shame that there's no way to create Visual Studio debugger info for scripted content - if that existed a lot more of the engine could be scriptified.

I'm actually not sure if that is impossible or not. Visual Studio is able to display the call stack for .net JIT code. The big question here is whether they implemented that in some .net specific way, or if it looks for an HMODULE header next to the function table. If it is the latter, then it might be possible to give the functions names and reference source files.

Post by Graf Zahl » Wed Dec 04, 2019 7:25 am

Nash wrote:Something I failed to mention in the past: I noticed that these "mysterious errors" started popping up after the level rewrite was merged into master. And I mean the whole bunch: invalid instruction mov, unexplainable VM aborts, etc. I can say with 90% certainty that these things never happened before said merge.

Edit for clarification: I am not blaming the level rewrite, perhaps it's not related at all... but I remember clearly _when_ these started manifesting.



I also do not believe that the level rewrite messed things up; it never interacts with the VM's innards. The only thing I know is that it revealed some major architectural issues with the event handling, but that's on a far higher level.

Post by phantombeta » Wed Dec 04, 2019 1:49 pm

Graf Zahl wrote:I also do not believe that the level rewrite messed things up, it never interacts with the VM's innards. The only thing I know is that it revealed some major architectural issues with the event handling but that's on a far higher level.

Not for the JIT errors, but it would explain the random VM abort at level start.

By the way, I think I may have identified that bloody menu error.
This line checks the previous opcode by doing "pc - 1". While usually there would be stuff before it, I'm guessing in some rare case (perhaps a function call with no arguments at the start of a function), it's the first thing in the code, so "pc - 1" ends up in random data, and that random data sometimes ends up being equal to OP_VTBL. It should probably have some check to see if "pc" is at the start of the code, I guess.
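A minimal sketch of the kind of guard suggested here. Only the name `OP_VTBL` comes from the post; the opcode values, the `VmInstr` layout, and the `previousOpIs` helper are hypothetical and do not reflect GZDoom's actual code.

```cpp
#include <cstdint>

// Hypothetical opcode values; only the name OP_VTBL appears in the thread.
enum Opcode : std::uint8_t { OP_NOP = 0, OP_VTBL = 1, OP_CALL = 2 };

// Hypothetical instruction layout, not GZDoom's real one.
struct VmInstr {
    std::uint8_t op;
};

// Returns true only if an instruction exists before `pc` and its opcode is `wanted`.
bool previousOpIs(const VmInstr* codeStart, const VmInstr* pc, std::uint8_t wanted) {
    if (pc == codeStart) {
        return false;  // nothing precedes the first instruction; never read pc - 1
    }
    return (pc - 1)->op == wanted;
}
```

With this guard, a call that happens to be the first instruction of a function is treated as having no preceding `OP_VTBL`, instead of comparing against whatever bytes lie before the code.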