Is Vulkan now considered fully functional?

Thu Sep 09, 2021 5:32 am

I know that this is probably something that I should know already, but somehow I think I may have missed the final memo. ;)

Is the Vulkan renderer now considered to be complete, mature, fully functional and as bug-free as reasonably practical (i.e. more or less the same status as OpenGL is considered to have in GZDoom)?

The reason that I ask is that there do seem to be one or two threads still hanging around with queries about difficulties with Vulkan, its memory usage and so on.

I have pretty much been sticking with OpenGL for my playing, partially because of these reports and because OpenGL has never really given me any problems to date (so I didn't see the need to change). However, I was messing around with a map and loading it with (probably too many) dynamic lights and I noticed that, after the addition of the lights, certain areas of the map would render really quite slowly.* So, out of interest, I switched to Vulkan and the slowdowns went away.

So, I'm just trying to find out if Vulkan should be working in as problem-free a way as OpenGL has up to this point for me.

*I usually run with vsync and see a constant 60fps in most maps. However in a complicated map with lots of 3D floors, placed in an open outdoor area and with lots of attenuated lights interacting with the floors, fps was dropping to the mid-high 20s. The map in question was my old Inca HQ from my Burghead mod. I was loading it with dynamic lights just to see if I could. From several points, when looking across the building and, even worse, when standing at certain points within the building, I was getting these very noticeable slowdowns - but only with OpenGL.

Re: Is Vulkan now considered fully functional?

Thu Sep 09, 2021 6:54 am

My only issue with Vulkan in GZDoom is the double memory usage (because Intel GPUs use the system RAM as graphics memory), which sometimes triggers stack smashing protection in larger projects (for example: Total Chaos).

I also have an issue with the texture modes. Any mode that enables mipmaps causes the entire map to be rendered with bilinear/trilinear filtering. Yet under OpenGL I can use mipmaps and nearest neighbor just fine. (vkQuake doesn't have this problem, though.)

Re: Is Vulkan now considered fully functional?

Mon Sep 13, 2021 12:12 pm

That double allocation definitely is a problem, but the way it was all done isn't easy to fix. It'll probably require a significant rewrite of the texture manager to properly detect whether the copies are still needed. This was definitely one of the biggest fuckups in the spec: the backend cannot easily notify the calling code that a resource is no longer in use and can be removed.

Aside from that the biggest issue is that on NVidia the memory allocator totally loses it when VRAM is full.
On NVidia the advantage is minor anyway, I can barely detect any performance differences and OpenGL overall just is less 'temperamental' and probably the better choice.
Of course on AMD and Intel things are different.

No idea about the texture filtering. I have no problems with it. I wouldn't consider it unreasonable that it's an Intel driver bug.

Re: Is Vulkan now considered fully functional?

Mon Sep 13, 2021 12:38 pm

About the double allocation, that could be solved by calling fb->WaitForCommands(false) at the end of VkHardwareTexture::CreateTexture. This will make it wait for the texture upload to finish before continuing, releasing the transfer buffers immediately.

The catch is that it will stall the GPU pipeline (potentially creating more stutter if the texture is loaded during play). The OpenGL backend currently uses glTexImage2D, which effectively does the same for large textures. I could imagine the OpenGL drivers have some threshold for when they will copy the data on the spot to avoid such a stall, though. A similar thing could be implemented by only calling WaitForCommands once enough data has been queued for the (temporary) double allocation to become a problem: keep a simple integer counter that is reset every frame, and call WaitForCommands when it exceeds something like, let's say, 64 megabytes.
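A minimal sketch of that counter idea, just to make it concrete. This is not the code from the PR: UploadBudget and its method names are made up for illustration, and the callback merely stands in for something like fb->WaitForCommands(false).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical helper, not GZDoom code: counts the bytes of transfer
// (staging) data queued since the last wait or since the frame started.
class UploadBudget
{
public:
    // 'waitForCommands' stands in for a call like fb->WaitForCommands(false).
    UploadBudget(std::size_t limitBytes, std::function<void()> waitForCommands)
        : mLimit(limitBytes), mWait(std::move(waitForCommands))
    {
        assert(mWait); // a callback must be supplied
    }

    // Call after queuing a texture upload of 'bytes' bytes.
    // Returns true if the limit was exceeded and a wait was forced.
    bool OnUploadQueued(std::size_t bytes)
    {
        mPending += bytes;
        if (mPending <= mLimit)
            return false;
        mWait();      // stall until the uploads finish so the transfer buffers can go
        mPending = 0;
        return true;
    }

    // The counter is reset every frame, as described in the post.
    void OnFrameEnd() { mPending = 0; }

    std::size_t PendingBytes() const { return mPending; }

private:
    std::size_t mLimit;
    std::size_t mPending = 0;
    std::function<void()> mWait;
};
```

With a 64 MB limit, two 40 MB uploads in the same frame would trigger exactly one wait (on the second upload), while a frame that only uploads a few small textures never stalls.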

Edit: Submitted a PR. It was slightly more complicated than this due to the renderstate already using bound resources when texture uploads are triggered.

Re: Is Vulkan now considered fully functional?

Mon Sep 13, 2021 2:36 pm

Is there no way to query a synchronization object so that you can check all pending ones periodically and release the textures that have finished uploading?
I mean, that's what I'd provide for such an asynchronous workload because there needs to be a way to know if something is complete without explicitly having to wait for it.

Re: Is Vulkan now considered fully functional?

Mon Sep 13, 2021 5:03 pm

Yes, but unfortunately the upload doesn't actually start until you submit the command buffer with the transfer command. You could create a sync object and a command buffer per texture, but I could imagine it would be very inefficient to submit so many small command buffers (*). It also wouldn't surprise me if the sync objects are backed by Windows event handles. To do it 100% correctly you even need to submit it to a different command queue called the 'copy queue' so that the transfers are done via DMA. The graphics queue only does the copy using the CPU. (OpenGL probably does a bit of both based on some heuristics involving the size of the texture and how many it saw you upload so far.)

To track all this you need a full subsystem (a dedicated class handling this problem). It can't be done with just a few functions like the Vulkan backend does it today. My PR is a compromise that sacrifices a bit of upload performance over needing 1000 more lines of pure Vulkan fun! Since textures are only uploaded once anyway it will probably not affect users much. If GZDoom were a 100 GB modern game title my PR wouldn't be good enough, but for GZDoom's mods it hopefully is. Either way, the PR cuts the memory usage in half, so I still think it's a better solution than what was there before.
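For reference, the periodic check asked about earlier would look roughly like the sketch below. It is only the bookkeeping half of such a subsystem, and everything in it (PendingUploads, the two callbacks) is hypothetical rather than GZDoom code. In real Vulkan code the "is it done?" query would typically wrap vkGetFenceStatus, which returns VK_SUCCESS once the fence is signaled and VK_NOT_READY otherwise. Here it is a plain callback so the sketch stays self-contained. Keep in mind the fence can only signal after the command buffer containing the transfer has been submitted.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical tracker, not GZDoom code. Each pending upload pairs a
// non-blocking "has the GPU finished?" query with a release callback.
class PendingUploads
{
public:
    using IsDone = std::function<bool()>;   // e.g. wraps vkGetFenceStatus(...) == VK_SUCCESS
    using Release = std::function<void()>;  // e.g. frees the transfer buffer

    void Add(IsDone isDone, Release release)
    {
        assert(isDone && release);
        mPending.push_back({std::move(isDone), std::move(release)});
    }

    // Call periodically (for example once per frame). Never blocks.
    // Returns how many uploads were released by this call.
    std::size_t Poll()
    {
        std::size_t released = 0;
        std::vector<Entry> stillPending;
        for (auto& entry : mPending)
        {
            if (entry.isDone())
            {
                entry.release();
                ++released;
            }
            else
            {
                stillPending.push_back(std::move(entry));
            }
        }
        mPending = std::move(stillPending);
        return released;
    }

    std::size_t Count() const { return mPending.size(); }

private:
    struct Entry
    {
        IsDone isDone;
        Release release;
    };
    std::vector<Entry> mPending;
};
```

This covers only the easy part. Deciding how to batch textures into command buffers, when to submit them, and which queue to use is where the extra lines mentioned above would go.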

*) The OpenGL backend is sometimes faster than Vulkan and sometimes slower due to differences in when the OpenGL heuristics submit a command buffer versus the fixed draw call count hardcoded in the Vulkan backend. Submit buffers too often and OpenGL is faster, submit buffers too rarely and OpenGL is faster, submit with just the right amount of draw calls and Vulkan wins.

Re: Is Vulkan now considered fully functional?

Tue Sep 14, 2021 12:20 am

Yeah, I had some fun looking at NVRHI's texture code. Which really makes me wonder what they were smoking when designing this crap. Low level and all, I consider an API worthless if there is no way to find out the optimal values for such things.
In this regard it reminds me a lot of J2ME's sound extensions. That spec was full of 'canbe's and 'maybe's and 'shouldbe's but had zero facility to query any of them, so trial and error was the only way to handle it, which resulted in one build per device instead of a universal one for all.

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 6:12 am

Well, yesterday, with an older git build (prior to dpJudas' change), I deliberately tried to create a situation that would cause one of these memory allocation issues. I did eventually manage to do it by loading nuts.wad, swwmgz and lights.pk3, freezing the game at the console, opening the far room so that all monsters could be seen, spawning a whole bunch (several thousand) of BFG balls (using a key bound to doing that), unfreezing the game, firing a shot and then using the massacre cheat. That reduced GZDoom to a crawl (less than 1 frame per minute by my guess, but I could still occasionally hear a sound, so I knew it wasn't hung) and after about 6 or 7 minutes, GZDoom aborted, telling me that it couldn't allocate [number] of vertexes. I was recording the session in a log file but I accidentally restarted the game with the same parameters and overwrote it. It was definitely a big number though: over a million, possibly 11 million and something, I think.

Anyway, today, with a newer build (g4.7.0pre-195-g3acc5a272), I have tried to recreate the same conditions. I guess that it could just be that I haven't spawned enough BFG balls or created such a stressful situation yet (I should have made a savegame before typing massacre yesterday), but I haven't managed to get a slowdown as severe as I got yesterday, and I haven't managed to get GZDoom to abort either.

The above only slowed to an unplayable frame rate for about 10-15 seconds and then I could start running around the map again and, as you can see, things were still interactive enough for me to take a screenshot.

So, in this one, very unscientific, test, there does seem to be an improvement.

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 6:31 am

In general, to run into memory problems you need to either have extremely low end hardware, or be using very large mods mixed with texture upscalers, supersampling, all that stuff. Otherwise GZDoom just doesn't use that much memory.

To illustrate, playing something like Frozen Time uses less than 64 MB of texture memory. I had to artificially reduce my memory limit for uploads when testing the PR to even be able to test it! Most Doom mods just don't use high resolution textures as it clashes with the low poly count. Upscalers are more the cause of texture memory issues than the mods are.

Note that your vertex count test is the same behavior for both OpenGL and Vulkan. The persistently bound vertex buffers are the same sizes and you really have to be doing something completely unreasonable (like your test was demonstrating :p) for it to become any issue at all. If you want to experience the out of memory conditions actually affecting users then simply find the biggest mod around (BoA prolly will suffice) and play it with supersampling on a large monitor and with the most extreme upscaler settings. That way you can easily end up using gigabytes of memory on textures.
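To put rough numbers on the upscaler point, here is a back-of-the-envelope helper. It assumes uncompressed 4-byte RGBA texels, which is an assumption for illustration (the actual formats are not stated here), and TextureBytes is a made-up name.

```cpp
#include <cstdint>

// Bytes needed for one texture after upscaling each dimension by 'scale',
// assuming 4 bytes per texel (an assumption, see above). If 'withMipmaps'
// is set, the full mip chain down to 1x1 is included.
std::uint64_t TextureBytes(std::uint64_t width, std::uint64_t height,
                           std::uint64_t scale, bool withMipmaps)
{
    std::uint64_t w = width * scale;
    std::uint64_t h = height * scale;
    std::uint64_t total = 0;
    for (;;)
    {
        total += w * h * 4;
        if (!withMipmaps || (w == 1 && h == 1))
            break;
        w = (w > 1) ? w / 2 : 1;  // each mip level halves both dimensions
        h = (h > 1) ? h / 2 : 1;
    }
    return total;
}
```

Under that assumption a 128x128 texture is 65536 bytes (64 KiB) without mipmaps. Upscaling it 4x in each dimension gives 512x512, or 1048576 bytes (1 MiB): sixteen times the memory for the same texture. Multiply that across every texture in a big mod and the gigabytes add up quickly.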

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 6:34 am

Try Total Chaos, and turn on texture HQ resize upscalers, all post processing enabled, AND video supersampling. ;)

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 6:58 am

Total Chaos never even worked for me on my system with OpenGL - the texture management simply isn't designed to handle this much data.

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 8:20 am

I'd forgotten about Total Chaos. The only issue I seem to have with it is noticeable loading times when seeing new areas for the first time (I haven't checked to see if caching would help). Once the data was loaded though, it ran well enough most of the time (occasional detectable drops in frame rate but most of the time working well enough).

Anyway, I just tried it for about 15 minutes with all the texture resizing values enabled and maxed out and... it was rather uneventful really. It all just worked. :)

Re: Is Vulkan now considered fully functional?

Sun Sep 19, 2021 8:39 am

I guess it depends on the amount of VRAM. Back when I tried it I still had my old 1GB GeForce 550 Ti, which clearly did not have enough.