PDAs Still Causing a Crash

Forum rules
Please don't bump threads here if you have a problem - it will often be forgotten about if you do. Instead, make a new thread here.

Post a reply

Smilies
:D :) :( :o :shock: :? 8-) :lol: :x :P :oops: :cry: :evil: :twisted: :roll: :wink: :geek: :ugeek: :!: :?: :idea: :arrow: :| :mrgreen: :3: :wub: >:( :blergh:
View more smilies

BBCode is OFF
Smilies are ON

Topic review
   

Expand view Topic review: PDAs Still Causing a Crash

Re: PDAs Still Causing a Crash

by Graf Zahl » Mon Oct 04, 2021 7:52 am

Enjay wrote:Well, I'd have never found that - I wouldn't have even known what to look for or where to find it, but I can certainly follow why it was a problem.
Yeah, it took me quite some time to find out what went wrong here. The entire idea to save the virtual destructor was a foolhardy idea from the start, because it renders the entire concept of an intrusive reference counter ad absurdum. You normally do not use such a thing with POD data, but with complex resources that require proper destruction when they are due to be deleted. And despite lack of an explicit destructor this struct had an implicit one so it needs to be virtual t be called through the base class's methods.

Enjay wrote: What's, more - I'm not clued in enough to know whether this constitutes a fix or whether it is merely a workaround, but this change also seems to have addressed this: viewtopic.php?f=2&t=70492 .
No, that was a different commit. It came out of trying to fix this one but turned out not to help at all for the PDA case.

Nash wrote:Does this mean I can postpone the upgrade work? ;)
Likely. The GC is now far more aggressive when the game is paused and it now also properly takes down the vertex buffers. Enjay's PDAs now call the GC every few seconds so they never have a chance to accumulate too much memory.

Re: PDAs Still Causing a Crash

by Enjay » Mon Oct 04, 2021 7:29 am

:lol: Well, it certainly means I don't need an immediate fix to the PDA Starter Kit.

Re: PDAs Still Causing a Crash

by Nash » Mon Oct 04, 2021 7:22 am

Does this mean I can postpone the upgrade work? ;)

Re: PDAs Still Causing a Crash

by Enjay » Mon Oct 04, 2021 6:59 am

Well, I'd have never found that - I wouldn't have even known what to look for or where to find it, but I can certainly follow why it was a problem.

I've just got through a good few minutes (nearly 10) testing my PDAs and not a hint of a lock-up or crash. In fact, the menu inside the PDA seems to be more responsive than it was even with 4.6.1 and quitting the game after having had the PDAs open happens more quickly now too.

What's, more - I'm not clued in enough to know whether this constitutes a fix or whether it is merely a workaround, but this change also seems to have addressed this: viewtopic.php?f=2&t=70492

Running with the line menuactive = OnNoPause; commented out so that the game pauses while the PDAs are open (inserting that line was a workaround for the linked issue) and running with "stat gc", the garbage collector seems to be able to keep up now.

Thank you kindly.

Re: PDAs Still Causing a Crash

by Graf Zahl » Sun Oct 03, 2021 5:11 am

Ouch, this was really, really dumb.

10 points for the first person who can tell me why this cannot use NoVirtualRefCountedBase as base class...

Yes, you may guess correctly that the buffers pushed to the deferred deletion list were never destroyed so they happily accumulated until memory was exhausted - GC had no effect on these buffers at all.

Re: PDAs Still Causing a Crash

by DitheredOutput » Wed Sep 29, 2021 5:25 pm

I'm only here to confirm that this issue is replicable in Siren, basically as Enjay has described it, and to thank those that are looking into it!

Re: PDAs Still Causing a Crash

by Gutawer » Wed Sep 29, 2021 5:08 am

The 2D drawer normally does keep one vertex buffer for the entirety of 2D drawing, but I changed this for Shape2Ds in specific because while it works great for quads and shapes of any mildly small size, Shape2Ds are not really bounded in the size they can grow to and it was absolutely murdering ZForms' performance for drawing tiled 9-sliced images (as these buffers can grow large as they effectively have to tile the inside of a texture - i.e. UV wrapping won't work) - it was also affecting the performance of some other heavy Shape2D-using code I had written (namely, a 2D dungeon tile minimap for one of Nash's projects). For these larger buffers it certainly makes sense to give them their own buffer and only do reuploads if the data itself changes, but perhaps there should be a vertex count check to also let Shape2Ds just append to the normal drawing list if they're small enough (wouldn't help in this case though).

EDIT: what's confusing me here is that all of the Shape2Ds being created in ZForms 1.0 are being created in the scope of a function (and there aren't that many per frame, perhaps 20 or so) and should have their OnDestroy called on scope exit, which should cause the vertex buffer to be dropped as soon as the drawer is done with it. If the vertex buffers aren't being dropped then I suppose the code to extend their lifetimes long enough is broken?

Re: PDAs Still Causing a Crash

by dpJudas » Wed Sep 29, 2021 2:42 am

I only know two ways. Both involve detaching the 1:1 relationship between the managed object and the resource.

First is to use a cache. The managed object holds the data itself (system memory, with a vector or similar). When the object is used it looks it up in the cache, where it gets the hardware vertex buffer. The cache then keeps an upper limit of how many hardware buffers that can be in use at the same time.

The other way is what UDB does, where I only allocate one vertex and index buffer for the entire thing. Each managed object then gets put into this shared one. I am not sure if this approach works for this case tho as the total memory used will be the same.

Re: PDAs Still Causing a Crash

by Graf Zahl » Tue Sep 28, 2021 11:47 pm

That entire Shape2D allocating vertex buffers thing surely turned out to be a big problem. Yes, it can make things faster, but it also requires a lot of programming discipline, which obviouly cannot be expected from high level scripters.
The main issue with the GC is that it only considers allocations for the object, but is totally unaware of an object hogging limited resources, so it takes a while to get triggered - and we already have an iisue with GC triggering when the game is paused.
I already changed the original setup of Shape2D to let OnDestroy delete all allocated buffers - but this apparently is not enough, because just "forgetting" the buffer seems to be a natural thing in a GC'd languaage...
I have no idea how to address this on the engine side

Re: PDAs Still Causing a Crash

by dpJudas » Tue Sep 28, 2021 8:06 pm

Enjay wrote:I know that I have never managed to get the crash when using OpenGL, but always get it within around 30 seconds with Vulkan. Quite what that means, I'm not qualified to say but if the problem does not lie with the Vulkan back end then wherever the problem lies, I guess it then creates a situation that the Vulkan backend is less tolerant of than OpenGL is? (Which, of course, does not have to mean anything wrong with the Vulkan backend.)
Vulkan crashes here much earlier than OpenGL due to how memory is allocated in the two backends. For vulkan the vertex buffers are placed in memory already committed to the GPU, while OpenGL gives certain guarantees that effectively means OpenGL will swap the leaking vertex buffers into system memory. If you did scroll for a very very long time then eventually you'd run out of system memory and the OpenGL backend would crash as well.
Nash wrote:From dpJudas' description, I am now 99.99% convinced this is an issue with the ZForms 1.0 API, and will go away when I upgrade the PDA kit to ZForms 2.0.
Yes and no. Maybe ZForms 1.0 does something naughty, maybe it does not. By the end of the day GZDoom itself should not allow a mod to create over 64000 vertex buffers. There is a memory leak in the shape drawing code that really should be fixed. Unfortunately I do not know enough about that code to really fix it without investing a lot of time figuring out what it's doing. See my posts more as a message to the other GZDoom developers (ideally, whoever wrote the shape 2d drawer or made changes to it) where the source to this crash originates.

Re: PDAs Still Causing a Crash

by Enjay » Tue Sep 28, 2021 7:13 pm

Thank you both very much for looking at this.

I know that I have never managed to get the crash when using OpenGL, but always get it within around 30 seconds with Vulkan. Quite what that means, I'm not qualified to say but if the problem does not lie with the Vulkan back end then wherever the problem lies, I guess it then creates a situation that the Vulkan backend is less tolerant of than OpenGL is? (Which, of course, does not have to mean anything wrong with the Vulkan backend.)

As I mentioned, something in one of the commits from either the 10th or the 11th of August is behind allowing this to happen but, again, I don't know enough to say if the commits themselves are to blame or merely that they exposed something that the PDA code was doing wrong all along and, up to that point, GZDoom had perhaps just tolerated it by chance?

I'm looking forward to the re-write of the PDA kit anyway (though I hope it will be reasonably easy to convert my PDAs across Image). I understand perfectly that you have other priorities Nash, so I look forward to it when it's done. I'll just keep a copy of GZDoom 4.6.1 in the project folder for this particular mod.

For what it's worth, I was also able to replicate the crash using the PDAs in Siren TC ( viewtopic.php?f=19&t=68686 ). It actually uses a slightly older version of the PDA starter kit (1.0) but is still crashes in a very similar way. The PDAs in that mod are less graphically intensive than mine and have shorter text entries too. Perhaps for one of those reasons it takes me longer (about 90 seconds) of reading and scrolling around PDA messages to get a crash, but the end result is the same. So I guess, that this will be something that could affect any project that is using the PDA starter kit.

Re: PDAs Still Causing a Crash

by Nash » Tue Sep 28, 2021 6:37 pm

From dpJudas' description, I am now 99.99% convinced this is an issue with the ZForms 1.0 API, and will go away when I upgrade the PDA kit to ZForms 2.0.

I am currently very occupied with some releases coming up towards the end of October so it will only be in November when I can start taking a look at upgrade the PDA kit.

Note that ZForms 2.0 is very different from 1.0 and I may have to heavily rewrite the PDA kit - or rewrite it completely.

Overall not a pretty situation for everyone involved... but it has to be done.

I apologize for not following too closely with these last few reports. I think it's safe to conclude that these will all go away once I upgrade the kit.

Re: PDAs Still Causing a Crash

by dpJudas » Tue Sep 28, 2021 6:16 pm

One last update on this. I'm now 100% sure this is not a bug in the vulkan backend.

Something keeps creating new DShape2D objects. The constructor there keeps allocating new DShape2DBufferInfo objects, which in turn keeps allocating new F2DVertexBuffer objects. Either RefCountedPtr<DShape2DBufferInfo> is broken or the DShape2D objects are never destroyed.

Aren't all DObject objects garbage collected? That a garbage collected object holds a reference to a limited resource is rarely ending well. I tried to force a GC run but that didn't get it to release the buffers though.

Re: PDAs Still Causing a Crash

by dpJudas » Tue Sep 28, 2021 5:55 pm

This is the call stack for what keeps creating more and more buffers (it is the call stack of the 10000th's vertex buffer allocation):

Code: Select all

>	gzdoom.exe!VKBuffer::VKBuffer() Line 38	C++
 	gzdoom.exe!VKVertexBuffer::VKVertexBuffer() Line 47	C++
 	gzdoom.exe!VulkanFrameBuffer::CreateVertexBuffer() Line 412	C++
 	gzdoom.exe!F2DVertexBuffer::F2DVertexBuffer() Line 1100	C++
 	gzdoom.exe!TArray<F2DVertexBuffer,F2DVertexBuffer>::ConstructEmpty(unsigned int first, unsigned int last) Line 592	C++
 	gzdoom.exe!TArray<F2DVertexBuffer,F2DVertexBuffer>::Reserve(unsigned __int64 amount) Line 513	C++
 	gzdoom.exe!F2DDrawer::AddShape(FGameTexture * img, DShape2D * shape, DrawParms & parms) Line 635	C++
 	gzdoom.exe!DrawShape(F2DDrawer * drawer, FGameTexture * img, DShape2D * shape, VMVa_List & args) Line 275	C++
 	gzdoom.exe!AF__Screen_DrawShape(VMValue * param, int numparam, VMReturn * ret, int numret, const unsigned char * reginfo) Line 292	C++
 	gzdoom.exe!VMNativeFunction::NativeScriptCall(VMFunction * func, VMValue * params, int numparams, VMReturn * returns, int numret) Line 315	C++

Re: PDAs Still Causing a Crash

by dpJudas » Tue Sep 28, 2021 5:49 pm

No worries, this is at least an easily reproducible test. I'm fairly certain at this point that it has to be F2DDrawer::AddShape that just keeps on creating more and more vertex buffers.

Top