Crash on level load with 3.8pre-251-g4eea540a3
Moderator: GZDoom Developers
Forum rules
Please don't bump threads here if you have a problem - it will often be forgotten about if you do. Instead, make a new thread here.
Please don't bump threads here if you have a problem - it will often be forgotten about if you do. Instead, make a new thread here.
Crash on level load with 3.8pre-251-g4eea540a3
I'm getting level load crashes while playing Blade of Agony with the current dev build (3.8pre-251-g4eea540a3), as well as in the last several dev builds.
Basically, when using +map to load a map from the command line, as soon as the map loads, the engine crashes. This happens pretty much every time in 64-bit, and about a quarter of the time in 32-bit.
I haven't been able to reproduce this with a standard IWAD, so I really haven't been able to narrow things down, other than that it seems to only occur when you use +map to load a level (and may have something to do with the double map loading that's going on right now with +map). This does only seem to happen when the player camera is at the player (e.g., a map with a cutscene intro will load fine, but as soon as the camera switches back to the player, the engine crashes.).
Can someone who can do something with the crash logs assist in tracking down what's going on here, please?
EDIT: OK, so it looks like the two crashes might be different. With the 64-bit build, I occasionally get a very similar crash address, etc. to the 32-bit crash, but that's not what's in the crash log attached here. Ugh.
Basically, when using +map to load a map from the command line, as soon as the map loads, the engine crashes. This happens pretty much every time in 64-bit, and about a quarter of the time in 32-bit.
I haven't been able to reproduce this with a standard IWAD, so I really haven't been able to narrow things down, other than that it seems to only occur when you use +map to load a level (and may have something to do with the double map loading that's going on right now with +map). This does only seem to happen when the player camera is at the player (e.g., a map with a cutscene intro will load fine, but as soon as the camera switches back to the player, the engine crashes.).
Can someone who can do something with the crash logs assist in tracking down what's going on here, please?
EDIT: OK, so it looks like the two crashes might be different. With the 64-bit build, I occasionally get a very similar crash address, etc. to the 32-bit crash, but that's not what's in the crash log attached here. Ugh.
- Attachments
-
- CrashReport - 3.8pre-251-g4eea540a3 (64-bit).zip
- 64-bit version
- (46.03 KiB) Downloaded 57 times
-
- CrashReport - 3.8pre-251-g4eea540a3.zip
- 32-bit version
- (23.5 KiB) Downloaded 56 times
Re: Crash on level load with 3.8pre-251-g4eea540a3
I played with this for a while and managed to reproduce two crashes. They both are related to BoAPlayer.ForcedHealthBar.
The first one is during serialization of the mentioned member.
The second is during execution of this line. A non-null value has been assigned from ForcedHealthBar before the crash.
In both cases the pointed object is not valid. Moreover, I didn't find where it was created, and it didn't seem to be an issue with GC.
So, I checked BoAPlayer.GetClosestForcedHealthBar() function when this value is obtained.
Considering that we have the case with a potentially uninitialized variable I assigned null to mo explicitly.
And both crashes were gone.
The question is what will we do with that? The problem with uninitialized local variables strikes again
The first one is during serialization of the mentioned member.
The second is during execution of this line. A non-null value has been assigned from ForcedHealthBar before the crash.
In both cases the pointed object is not valid. Moreover, I didn't find where it was created, and it didn't seem to be an issue with GC.
So, I checked BoAPlayer.GetClosestForcedHealthBar() function when this value is obtained.
Considering that we have the case with a potentially uninitialized variable I assigned null to mo explicitly.
And both crashes were gone.
The question is what will we do with that? The problem with uninitialized local variables strikes again
Re: Crash on level load with 3.8pre-251-g4eea540a3
Don't some compilers try to give a warning if you use uninitalized variables?
Maybe we should compile a complete list of those warnings and evaluate them case-by-case. I imagine most could do with either some sort of initialization, get marked as static, or get their warnings #pragma'd out?
Maybe we should compile a complete list of those warnings and evaluate them case-by-case. I imagine most could do with either some sort of initialization, get marked as static, or get their warnings #pragma'd out?
Re: Crash on level load with 3.8pre-251-g4eea540a3
It's about uninitialized variables in ZScript. And ZScript compiler gives no such warnings.Rachael wrote:Don't some compilers try to give a warning if you use uninitalized variables?
Also, I reverted my naive attempt to handle uninitialized variables a few days after it was committed.
Re: Crash on level load with 3.8pre-251-g4eea540a3
Ah, my mistake, then.
- Graf Zahl
- Lead GZDoom+Raze Developer
- Posts: 49071
- Joined: Sat Jul 19, 2003 10:19 am
- Location: Germany
Re: Crash on level load with 3.8pre-251-g4eea540a3
Normally all registers should be nulled when the function is entered. Did something change about that?
Re: Crash on level load with 3.8pre-251-g4eea540a3
Thanks so much, _mental_!
Should these have thrown a VM abort? Or is a full engine crash expected behavior here?
I'll at least be able to fix (work around?) this in our dev version... probably should scrub all of my variable declarations to be sure.
Something must have changed about initialization... there's at least one function (a simple iterator loop with an incrementing int variable) that has been untouched since our Chapter 2 release that suddenly started counting at 1 instead of 0 a while back (weeks ago, i think, but I'm not 100% sure when it started). Initializing the counter variable to 0 fixed it.
Should these have thrown a VM abort? Or is a full engine crash expected behavior here?
I'll at least be able to fix (work around?) this in our dev version... probably should scrub all of my variable declarations to be sure.
Something must have changed about initialization... there's at least one function (a simple iterator loop with an incrementing int variable) that has been untouched since our Chapter 2 release that suddenly started counting at 1 instead of 0 a while back (weeks ago, i think, but I'm not 100% sure when it started). Initializing the counter variable to 0 fixed it.
Re: Crash on level load with 3.8pre-251-g4eea540a3
I had no time to investigate it further, so I wrote everything I've got at that moment. I didn’t even try without JIT.Graf Zahl wrote:Normally all registers should be nulled when the function is entered. Did something change about that?
The thing is it should work out of the box. The idea is to implicitly initialize all variables to zeroes.AFADoomer wrote:Should these have thrown a VM abort? Or is a full engine crash expected behavior here?
Unfortunately, there are still a few flaws in the code generator. At the moment, the safest way to avoid such bugs is to set values to all variables before using them.
- phantombeta
- Posts: 2089
- Joined: Thu May 02, 2013 1:27 am
- Operating System Version (Optional): Windows 10
- Graphics Processor: nVidia with Vulkan support
- Location: Brazil
Re: Crash on level load with 3.8pre-251-g4eea540a3
To be honest, this could be quite easily fixed by making variable declarations implicitly initialize to zero if there's no explicit initialization in the variable declaration._mental_ wrote:The thing is it should work out of the box. The idea is to implicitly initialize all variables to zeroes.
Unfortunately, there are still a few flaws in the code generator. At the moment, the safest way to avoid such bugs is to set values to all variables before using them.
It's not like it even costs anything to initialize the registers where it matters, anyway - with the JIT compiler, zeroing a register is as simple as a MOV or XOR instruction for the x86 registers - and even with the SSE registers, it's just as cheap to zero a register.
I'm willing to make a PR for variable initialization if that would be okay with Graf.
Re: Crash on level load with 3.8pre-251-g4eea540a3
Certainly it's better than nothing. It's not only about registers though. We have stack variables too.
In general, I was thinking about zero initialization for variables that are accessed before assignment of their values. Implicit initialization is useless if it's not the case.
This requires sophisticated analysis of execution flow. I doubt that it will be available in the nearest future.
In general, I was thinking about zero initialization for variables that are accessed before assignment of their values. Implicit initialization is useless if it's not the case.
This requires sophisticated analysis of execution flow. I doubt that it will be available in the nearest future.
- phantombeta
- Posts: 2089
- Joined: Thu May 02, 2013 1:27 am
- Operating System Version (Optional): Windows 10
- Graphics Processor: nVidia with Vulkan support
- Location: Brazil
Re: Crash on level load with 3.8pre-251-g4eea540a3
Memory addresses can be initialized the same way in x86 too. The ZScript VM's bytcode is based on RISC, though, and there's no XOR opcode for memory addresses and no opcodes for moving constants to memory directly, so one would need to either do it by moving the initialization value to a register then back to memory or add new opcodes. (The latter is already done by the compound array initializers for moving constants to arrays)_mental_ wrote:Certainly it's better than nothing. It's not only about registers though. We have stack variables too.
So, if it'd be okay, I could make the PR for this anytime - it's not really particularly hard or complicated to add implicit initialization.
That would certainly be better, but I'm not sure if it's even feasible with ZScript - it simply compiles directly to the VM's bytecode instead of compiling to IR first. This is also the same thing that blocks adding an optimization pass to the compiler, I believe.In general, I was thinking about zero initialization for variables that are accessed before assignment of their values. Implicit initialization is useless if it's not the case.
This requires sophisticated analysis of execution flow. I doubt that it will be available in the nearest future.
- Graf Zahl
- Lead GZDoom+Raze Developer
- Posts: 49071
- Joined: Sat Jul 19, 2003 10:19 am
- Location: Germany
Re: Crash on level load with 3.8pre-251-g4eea540a3
I don't think we need two of these threads...
viewtopic.php?f=2&t=63365
viewtopic.php?f=2&t=63365