PDAs Still Causing a Crash

Bugs that have been investigated and resolved somehow.

Moderator: GZDoom Developers

Forum rules
Please don't bump threads here if you have a problem - it will often be forgotten about if you do. Instead, make a new thread here.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

Okay it wasn't the actual error I thought it was (what the PR attempted to fix). Running vk_memstats first and then click a couple of items and a bit scrolling brings the number of allocated objects from 230 to 15771! Doing a bit more testing to see exactly what is leaking and why.
User avatar
Enjay
 
 
Posts: 26534
Joined: Tue Jul 15, 2003 4:58 pm
Location: Scotland
Contact:

Re: PDAs Still Causing a Crash

Post by Enjay »

That's brilliant. Thanks very much for doing this.

Normally if it was a mod that I'd produced from scratch myself, I'd have been able to pare it down further to a smaller example and, hopefully, pinpoint the exact parts that were causing a problem for a minimal example. However, because I'm building my stuff on top of quite a complicated little SDK that, to be perfectly honest, I'm only grasping on to the edge of understanding, I can't do that. It's why I've been spamming the thread with whatever information I can just in case something I post is useful. So I really appreciate you taking a look.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

No worries, this is at least an easily reproducible test. I'm fairly certain at this point that it has to be F2DDrawer::AddShape that just keeps on creating more and more vertex buffers.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

This is the call stack for what keeps creating more and more buffers (it is the call stack of the 10000th's vertex buffer allocation):

Code: Select all

>	gzdoom.exe!VKBuffer::VKBuffer() Line 38	C++
 	gzdoom.exe!VKVertexBuffer::VKVertexBuffer() Line 47	C++
 	gzdoom.exe!VulkanFrameBuffer::CreateVertexBuffer() Line 412	C++
 	gzdoom.exe!F2DVertexBuffer::F2DVertexBuffer() Line 1100	C++
 	gzdoom.exe!TArray<F2DVertexBuffer,F2DVertexBuffer>::ConstructEmpty(unsigned int first, unsigned int last) Line 592	C++
 	gzdoom.exe!TArray<F2DVertexBuffer,F2DVertexBuffer>::Reserve(unsigned __int64 amount) Line 513	C++
 	gzdoom.exe!F2DDrawer::AddShape(FGameTexture * img, DShape2D * shape, DrawParms & parms) Line 635	C++
 	gzdoom.exe!DrawShape(F2DDrawer * drawer, FGameTexture * img, DShape2D * shape, VMVa_List & args) Line 275	C++
 	gzdoom.exe!AF__Screen_DrawShape(VMValue * param, int numparam, VMReturn * ret, int numret, const unsigned char * reginfo) Line 292	C++
 	gzdoom.exe!VMNativeFunction::NativeScriptCall(VMFunction * func, VMValue * params, int numparams, VMReturn * returns, int numret) Line 315	C++
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

One last update on this. I'm now 100% sure this is not a bug in the vulkan backend.

Something keeps creating new DShape2D objects. The constructor there keeps allocating new DShape2DBufferInfo objects, which in turn keeps allocating new F2DVertexBuffer objects. Either RefCountedPtr<DShape2DBufferInfo> is broken or the DShape2D objects are never destroyed.

Aren't all DObject objects garbage collected? That a garbage collected object holds a reference to a limited resource is rarely ending well. I tried to force a GC run but that didn't get it to release the buffers though.
User avatar
Nash
 
 
Posts: 17439
Joined: Mon Oct 27, 2003 12:07 am
Location: Kuala Lumpur, Malaysia
Contact:

Re: PDAs Still Causing a Crash

Post by Nash »

From dpJudas' description, I am now 99.99% convinced this is an issue with the ZForms 1.0 API, and will go away when I upgrade the PDA kit to ZForms 2.0.

I am currently very occupied with some releases coming up towards the end of October so it will only be in November when I can start taking a look at upgrade the PDA kit.

Note that ZForms 2.0 is very different from 1.0 and I may have to heavily rewrite the PDA kit - or rewrite it completely.

Overall not a pretty situation for everyone involved... but it has to be done.

I apologize for not following too closely with these last few reports. I think it's safe to conclude that these will all go away once I upgrade the kit.
User avatar
Enjay
 
 
Posts: 26534
Joined: Tue Jul 15, 2003 4:58 pm
Location: Scotland
Contact:

Re: PDAs Still Causing a Crash

Post by Enjay »

Thank you both very much for looking at this.

I know that I have never managed to get the crash when using OpenGL, but always get it within around 30 seconds with Vulkan. Quite what that means, I'm not qualified to say but if the problem does not lie with the Vulkan back end then wherever the problem lies, I guess it then creates a situation that the Vulkan backend is less tolerant of than OpenGL is? (Which, of course, does not have to mean anything wrong with the Vulkan backend.)

As I mentioned, something in one of the commits from either the 10th or the 11th of August is behind allowing this to happen but, again, I don't know enough to say if the commits themselves are to blame or merely that they exposed something that the PDA code was doing wrong all along and, up to that point, GZDoom had perhaps just tolerated it by chance?

I'm looking forward to the re-write of the PDA kit anyway (though I hope it will be reasonably easy to convert my PDAs across Image). I understand perfectly that you have other priorities Nash, so I look forward to it when it's done. I'll just keep a copy of GZDoom 4.6.1 in the project folder for this particular mod.

For what it's worth, I was also able to replicate the crash using the PDAs in Siren TC ( viewtopic.php?f=19&t=68686 ). It actually uses a slightly older version of the PDA starter kit (1.0) but is still crashes in a very similar way. The PDAs in that mod are less graphically intensive than mine and have shorter text entries too. Perhaps for one of those reasons it takes me longer (about 90 seconds) of reading and scrolling around PDA messages to get a crash, but the end result is the same. So I guess, that this will be something that could affect any project that is using the PDA starter kit.
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

Enjay wrote:I know that I have never managed to get the crash when using OpenGL, but always get it within around 30 seconds with Vulkan. Quite what that means, I'm not qualified to say but if the problem does not lie with the Vulkan back end then wherever the problem lies, I guess it then creates a situation that the Vulkan backend is less tolerant of than OpenGL is? (Which, of course, does not have to mean anything wrong with the Vulkan backend.)
Vulkan crashes here much earlier than OpenGL due to how memory is allocated in the two backends. For vulkan the vertex buffers are placed in memory already committed to the GPU, while OpenGL gives certain guarantees that effectively means OpenGL will swap the leaking vertex buffers into system memory. If you did scroll for a very very long time then eventually you'd run out of system memory and the OpenGL backend would crash as well.
Nash wrote:From dpJudas' description, I am now 99.99% convinced this is an issue with the ZForms 1.0 API, and will go away when I upgrade the PDA kit to ZForms 2.0.
Yes and no. Maybe ZForms 1.0 does something naughty, maybe it does not. By the end of the day GZDoom itself should not allow a mod to create over 64000 vertex buffers. There is a memory leak in the shape drawing code that really should be fixed. Unfortunately I do not know enough about that code to really fix it without investing a lot of time figuring out what it's doing. See my posts more as a message to the other GZDoom developers (ideally, whoever wrote the shape 2d drawer or made changes to it) where the source to this crash originates.
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: PDAs Still Causing a Crash

Post by Graf Zahl »

That entire Shape2D allocating vertex buffers thing surely turned out to be a big problem. Yes, it can make things faster, but it also requires a lot of programming discipline, which obviouly cannot be expected from high level scripters.
The main issue with the GC is that it only considers allocations for the object, but is totally unaware of an object hogging limited resources, so it takes a while to get triggered - and we already have an iisue with GC triggering when the game is paused.
I already changed the original setup of Shape2D to let OnDestroy delete all allocated buffers - but this apparently is not enough, because just "forgetting" the buffer seems to be a natural thing in a GC'd languaage...
I have no idea how to address this on the engine side
dpJudas
 
 
Posts: 3040
Joined: Sat May 28, 2016 1:01 pm

Re: PDAs Still Causing a Crash

Post by dpJudas »

I only know two ways. Both involve detaching the 1:1 relationship between the managed object and the resource.

First is to use a cache. The managed object holds the data itself (system memory, with a vector or similar). When the object is used it looks it up in the cache, where it gets the hardware vertex buffer. The cache then keeps an upper limit of how many hardware buffers that can be in use at the same time.

The other way is what UDB does, where I only allocate one vertex and index buffer for the entire thing. Each managed object then gets put into this shared one. I am not sure if this approach works for this case tho as the total memory used will be the same.
User avatar
Gutawer
Posts: 469
Joined: Sat Apr 16, 2016 6:01 am
Preferred Pronouns: She/Her

Re: PDAs Still Causing a Crash

Post by Gutawer »

The 2D drawer normally does keep one vertex buffer for the entirety of 2D drawing, but I changed this for Shape2Ds in specific because while it works great for quads and shapes of any mildly small size, Shape2Ds are not really bounded in the size they can grow to and it was absolutely murdering ZForms' performance for drawing tiled 9-sliced images (as these buffers can grow large as they effectively have to tile the inside of a texture - i.e. UV wrapping won't work) - it was also affecting the performance of some other heavy Shape2D-using code I had written (namely, a 2D dungeon tile minimap for one of Nash's projects). For these larger buffers it certainly makes sense to give them their own buffer and only do reuploads if the data itself changes, but perhaps there should be a vertex count check to also let Shape2Ds just append to the normal drawing list if they're small enough (wouldn't help in this case though).

EDIT: what's confusing me here is that all of the Shape2Ds being created in ZForms 1.0 are being created in the scope of a function (and there aren't that many per frame, perhaps 20 or so) and should have their OnDestroy called on scope exit, which should cause the vertex buffer to be dropped as soon as the drawer is done with it. If the vertex buffers aren't being dropped then I suppose the code to extend their lifetimes long enough is broken?
DitheredOutput
Posts: 82
Joined: Tue Nov 12, 2019 5:26 pm

Re: PDAs Still Causing a Crash

Post by DitheredOutput »

I'm only here to confirm that this issue is replicable in Siren, basically as Enjay has described it, and to thank those that are looking into it!
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
Lead GZDoom+Raze Developer
Posts: 49067
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: PDAs Still Causing a Crash

Post by Graf Zahl »

Ouch, this was really, really dumb.

10 points for the first person who can tell me why this cannot use NoVirtualRefCountedBase as base class...

Yes, you may guess correctly that the buffers pushed to the deferred deletion list were never destroyed so they happily accumulated until memory was exhausted - GC had no effect on these buffers at all.
User avatar
Enjay
 
 
Posts: 26534
Joined: Tue Jul 15, 2003 4:58 pm
Location: Scotland
Contact:

Re: PDAs Still Causing a Crash

Post by Enjay »

Well, I'd have never found that - I wouldn't have even known what to look for or where to find it, but I can certainly follow why it was a problem.

I've just got through a good few minutes (nearly 10) testing my PDAs and not a hint of a lock-up or crash. In fact, the menu inside the PDA seems to be more responsive than it was even with 4.6.1 and quitting the game after having had the PDAs open happens more quickly now too.

What's, more - I'm not clued in enough to know whether this constitutes a fix or whether it is merely a workaround, but this change also seems to have addressed this: viewtopic.php?f=2&t=70492

Running with the line menuactive = OnNoPause; commented out so that the game pauses while the PDAs are open (inserting that line was a workaround for the linked issue) and running with "stat gc", the garbage collector seems to be able to keep up now.

Thank you kindly.
User avatar
Nash
 
 
Posts: 17439
Joined: Mon Oct 27, 2003 12:07 am
Location: Kuala Lumpur, Malaysia
Contact:

Re: PDAs Still Causing a Crash

Post by Nash »

Does this mean I can postpone the upgrade work? ;)
Post Reply

Return to “Closed Bugs [GZDoom]”