The new OpenGL ES Renderer... WOW!

dpJudas
 
 
Posts: 2940
Joined: Sat May 28, 2016 1:01 pm

Re: The new OpenGL ES Renderer... WOW!

Post by dpJudas »

Never mind about the shadowContribution thing. I managed to confuse myself while launching GZDoom. It is still slow with it commented out. :(
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

Yeah, I noticed.

BTW, what tipped it over in 3.4 vs. 3.3.2 was that flat rendering was changed from per-subsector to per-sector. Normally this won't make much of a difference, but MAP18, with its large sector that contains over 100 lights and covers a large part of the screen, is a real edge case for this.

I think the GLES renderer is cheating and not rendering all lights. I was running another test with Waterlabs GZD and there it looks like a lot of lights are just missing. In MAP18 that's barely noticeable.
bLUEbYTE
Posts: 148
Joined: Fri Nov 15, 2019 4:28 am
Graphics Processor: Intel with Vulkan/Metal Support
Location: Australia

Re: The new OpenGL ES Renderer... WOW!

Post by bLUEbYTE »

Oh, so it's the shadowmap
I have gl_light_shadowmap=false in my ini.
So I guess for the dynamic light stuff a bit of code optimization is needed then
Sounds good! I'm just glad I accidentally triggered this :)
Been testing some more and it seems to be related to the "pipelining" part. It is the Finish that is slow
Interesting you should mention that. I've always noticed that the 'finish' part was always taking a significant chunk of the render time.

I have these pseudo-benchmark results from the starting view of Demonastery. Will provide that file below just for funzies, but please note that this is from years ago, probably from v4.2-v4.4 era of GZDoom, with older drivers on Windows 10. Same hardware though. I had annotated the file with driver versions etc. at the beginning of each block.

Graf, dpJudas, emile_b, Rachael... Let me know if you'd like me to test specific things. I'm at your service.
Last edited by bLUEbYTE on Fri Feb 17, 2023 1:33 am, edited 1 time in total.
Professor Hastig
Posts: 101
Joined: Mon Jan 09, 2023 2:02 am
Graphics Processor: nVidia (Modern GZDoom)

Re: The new OpenGL ES Renderer... WOW!

Post by Professor Hastig »

I played around with it a bit, too, and did some test edits of the light definitions. Removing the dynamic lights from the health and armor bonus items gave a significant speed boost to MAP18.

So what's going to be the solution for this? I see two things that could be done: first, add a light limiter the user can set, and second, offer an optimized lights.pk3 for low end systems that omits a few faint lights that don't add much but may cost a lot of performance. These bonuses in particular often come in large numbers, so removing the light definitions for those may help a lot already.
Another thing I noticed is that when attenuated lights were added the radii of all affected items were increased by a factor of 1.5, so would it make sense to un-attenuate the lights in such a low end setup to reduce the radius again and let them affect fewer elements?
Professor Hastig
Posts: 101
Joined: Mon Jan 09, 2023 2:02 am
Graphics Processor: nVidia (Modern GZDoom)

Re: The new OpenGL ES Renderer... WOW!

Post by Professor Hastig »

bLUEbYTE wrote: Fri Feb 17, 2023 1:26 am Interesting you should mention that. I've always noticed that the 'finish' part was always taking a significant chunk of the render time.
Once you understand how a GPU works it is quite obvious why so much time often ends up in there.

Depending on your GPU's performance the renderer may still be able to submit its draw calls faster than the GPU is able to process them. So normally, when the CPU is finished the GPU still has some work left, and before presenting the current frame the CPU needs to wait for the GPU. This wait time is what will show up in the 'finish' part.
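
To make the explanation above concrete, here is a minimal sketch of that timing relationship. It is a toy model, not how GZDoom measures anything: it assumes the GPU starts working as soon as the CPU starts submitting and that both run fully in parallel. The function name `frame_timing` and its parameters are made up for illustration.

```python
def frame_timing(cpu_submit_ms: float, gpu_work_ms: float) -> dict:
    """Toy model of one frame where CPU submission and GPU work overlap.

    Assumption: both start at the same moment, so the CPU only has to
    wait for whatever GPU work is still outstanding once it is done.
    """
    # Time the CPU spends idle before it can present the frame.
    finish_wait_ms = max(0.0, gpu_work_ms - cpu_submit_ms)
    # Total frame time as seen from the CPU side.
    frame_ms = cpu_submit_ms + finish_wait_ms
    return {"finish_wait_ms": finish_wait_ms, "frame_ms": frame_ms}


# GPU-bound frame: most of the frame shows up as 'finish' wait.
gpu_bound = frame_timing(cpu_submit_ms=4.0, gpu_work_ms=10.0)

# CPU-bound frame: the GPU is already done, so there is nothing to wait for.
cpu_bound = frame_timing(cpu_submit_ms=10.0, gpu_work_ms=4.0)
```

In this model a large 'finish' value does not mean the finish call itself is expensive, only that the GPU was the slower of the two.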
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

I wouldn't do a different lights.pk3. A better option would be to give lights a priority and then in performance mode disable low priority lights like the bonuses or other pickups that can appear in larger numbers.

Disabling attenuation and shrinking the radius would also be an option.
The third one, of course, is what GLES does, i.e. limit the number of lights per surface.

That way it can be better tuned to the actual hardware.
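
The options listed above (a priority cutoff plus a per-surface light cap) could be combined roughly as in the sketch below. This is only an illustration of the idea, not GZDoom code: the `Light` class, the `priority` field, and `select_lights` are all hypothetical, and the 2D distance test stands in for whatever the real renderer does to build a surface's light list.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Light:
    name: str
    x: float
    y: float
    radius: float
    priority: int  # hypothetical field: higher means more important


def select_lights(lights, px, py, min_priority=0, max_lights=None):
    """Pick the lights to keep for a surface point at (px, py).

    Lights below min_priority or out of range are dropped; the rest are
    ordered by priority (then distance) and cut off at max_lights.
    """
    def distance(light):
        return math.hypot(light.x - px, light.y - py)

    candidates = [
        light for light in lights
        if light.priority >= min_priority and distance(light) <= light.radius
    ]
    # Most important first; ties are broken by the nearest light.
    candidates.sort(key=lambda light: (-light.priority, distance(light)))
    return candidates if max_lights is None else candidates[:max_lights]


lights = [
    Light("torch", 0.0, 0.0, 100.0, priority=2),
    Light("armor_bonus_1", 10.0, 0.0, 50.0, priority=0),
    Light("armor_bonus_2", 20.0, 0.0, 50.0, priority=0),
]

# "Performance mode": drop low priority pickups entirely.
important_only = select_lights(lights, 0.0, 0.0, min_priority=1)

# GLES-style cap: keep at most two lights for this surface.
capped = select_lights(lights, 0.0, 0.0, max_lights=2)
```

Both knobs could be exposed to the user separately, which matches the point about tuning to the actual hardware.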
dpJudas
 
 
Posts: 2940
Joined: Sat May 28, 2016 1:01 pm

Re: The new OpenGL ES Renderer... WOW!

Post by dpJudas »

Shrinking the radius will have limited effect on MAP18. The reason is that the entire floor in the courtyard is one big surface that shares the same light list. Even though the shader's light code attempts to early out based on distance, it still has to examine the full list of lights for every pixel it draws.
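
A small CPU-side sketch of the point above: with one shared light list, shrinking the radius reduces how many lights get fully shaded, but not how many list entries each pixel has to walk. This is a simplified stand-in for the shader loop, not the actual GZDoom shader; `shade_cost` and `radius_scale` are names invented for the example.

```python
import math


def shade_cost(pixels, lights, radius_scale=1.0):
    """Count work done when every pixel walks one shared light list.

    pixels: iterable of (x, y) positions.
    lights: iterable of (x, y, radius) tuples.
    Returns (examined, evaluated): list entries visited vs. lights that
    passed the distance early-out and would be fully shaded.
    """
    examined = 0
    evaluated = 0
    for px, py in pixels:
        for lx, ly, radius in lights:
            examined += 1  # the distance test itself is always paid
            if math.hypot(lx - px, ly - py) > radius * radius_scale:
                continue  # early out: too far away to contribute
            evaluated += 1
    return examined, evaluated


pixels = [(float(x), 0.0) for x in range(10)]
lights = [(0.0, 0.0, 3.0), (9.0, 0.0, 3.0)]

full_radius = shade_cost(pixels, lights)             # (20, 8)
half_radius = shade_cost(pixels, lights, 0.5)        # (20, 4)
```

The first number stays at pixels times lights in both runs, which is the per-pixel cost that a smaller radius cannot remove.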
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

That's correct for this place, but on lower end hardware the size reduction may actually help in other places. The goal here should be to give users of low end hardware some different means to reduce GPU load for different kinds of maps. In a map like Waterlabs GZD which is overstuffed with dynamic lights the effect will be very different already.
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

I did some more experimentation and I think I now know the cause of the performance problem - at least on NVidia.

Doing branching depending on SSBO contents does not work. The shader will always execute all parts, i.e. run the entire lighting logic on every pixel for every light - including the very costly spotlight code, even though there are no spot lights present.

This also explains why most of my tests yesterday did not provide any results - I mainly removed code depending on global uniforms, not the buffer content.

Just switching back to uniform buffers makes things go 2-3x faster on MAP18.
I wonder what's up on Intel then. At least on OpenGL it should have defaulted to uniform buffers.
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

Pushed a change to master, OpenGL only so far. I need feedback on how this fares on other hardware before continuing here. Anyone interested should compare the upcoming devbuild with the 4.10 release.

EDIT: Tried to add it to Vulkan, too, but it looks like the entire infrastructure for using uniform buffers there has never been implemented. So I'm going to wait with this until there's some feedback.
phantombeta
Posts: 2051
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

Re: The new OpenGL ES Renderer... WOW!

Post by phantombeta »

Graf Zahl wrote: Sat Feb 18, 2023 3:49 am I did some more experimentation and I think I now know the cause of the performance problem - at least on NVidia.

Doing branching depending on SSBO contents does not work. The shader will always execute all parts, i.e. run the entire lighting logic on every pixel for every light - including the very costly spotlight code, even though there are no spot lights present.

This also explains why most of my tests yesterday did not provide any results - I mainly removed code depending on global uniforms, not the buffer content

Just switching back to uniform buffers makes things go 2-3x faster on MAP18.
What was it using SSBOs for? SSBOs are generally slower, and I believe the way they work makes it harder for the GPU to determine if threads will diverge, so they're more likely to go with the "run both branches and mask the results" way of branching and all the register pressure that ensues. And I don't believe they can ever be treated as "constant" values, even if indexed with a constant value, so that's another way to get that kind of branching because of SSBOs. UBOs, on the other hand, can be treated as "constant" by the GPU (AFAIK even array access in a UBO can, as long as the index is constant or "constant"), and IIRC Nvidia can even specialize the shader on the uniforms.
Graf Zahl
Lead GZDoom+Raze Developer
Posts: 48657
Joined: Sat Jul 19, 2003 10:19 am
Location: Germany

Re: The new OpenGL ES Renderer... WOW!

Post by Graf Zahl »

The buffer for the light data was an SSBO because UBOs have a size limit. In most situations the performance difference was not noticeable because overall processing time of lights is low. I guess MAP18 is the only place where this could be tested effectively, but since this is still at 140 fps on a GeForce 1060 there was no way to notice it without explicitly disabling vsync and switching on the FPS display.
phantombeta wrote: and IIRC Nvidia can even specialize the shader on the uniforms.

I sincerely hope they don't. Recompiling shaders for these things leads to those infamous microstutters that are impossible to control on the application side.
phantombeta
Posts: 2051
Joined: Thu May 02, 2013 1:27 am
Operating System Version (Optional): Windows 10
Graphics Processor: nVidia with Vulkan support
Location: Brazil

Re: The new OpenGL ES Renderer... WOW!

Post by phantombeta »

Graf Zahl wrote: Sat Feb 18, 2023 3:38 pm I sincerely hope they don't. Recompiling shaders for these things leads to those infamous microstutters that are impossible to control on the application side.
I imagine they're specialized at a fairly low level. If it knows it's a constant value, it can have it already set up to specialize it without recompiling the shaders. After all, if it knows the value is gonna be constant over a whole warp at execution time, it can assign registers and such based on the knowledge that only one of the branches can ever be taken per warp. It might have specialized instructions on the ISA for that, or rely on how a warp can elide an unused branch if it can determine all threads will go with the same branch, or it might even just remove the branch from the shader assembly before execution.
bLUEbYTE
Posts: 148
Joined: Fri Nov 15, 2019 4:28 am
Graphics Processor: Intel with Vulkan/Metal Support
Location: Australia

Re: The new OpenGL ES Renderer... WOW!

Post by bLUEbYTE »

Just did some performance testing on MAP18.

First of all, compared to my initial benchmark in the original post, I have changed some settings in the hope of getting more accurate results:

FPS limit: changed to unlimited from 120 FPS
Vsync: turned off

4.10.0, OpenGL ES: ~ 200 FPS
4.10.0, OpenGL: ~ 70 FPS
devbuild 087050c20, OpenGL: ~105 FPS

So I can confirm a significant 50% boost in performance with the devbuild. Furthermore, I can also confirm the 'skipped' lights with the GLES backend: 12 of the armor bonuses did not show dynamic lights. These were the ones closer to the closet in the center, on two of the four lines, with 6 on each 'line' missing their lights. I am not sure if it's particular to my setup, but none of the health bonuses had lights around them regardless of backend.
Last edited by bLUEbYTE on Sat Feb 18, 2023 7:08 pm, edited 2 times in total.
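
For anyone comparing the numbers in the post above, FPS is easier to reason about as frame time. The small helper below only redoes the arithmetic on the figures quoted there (~70, ~105 and ~200 FPS); the function names are made up for the example.

```python
def frame_ms(fps: float) -> float:
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def speedup(old_fps: float, new_fps: float) -> float:
    """Relative speedup of new_fps over old_fps (1.5 means 50% faster)."""
    return new_fps / old_fps


release_gl = frame_ms(70)     # roughly 14.3 ms per frame
devbuild_gl = frame_ms(105)   # roughly 9.5 ms per frame
release_gles = frame_ms(200)  # 5 ms per frame

gain = speedup(70, 105)       # 1.5, i.e. the 50% boost mentioned above
```

Seen this way, the devbuild saves a bit under 5 ms per frame on OpenGL in this scene, while GLES is still about 4.5 ms per frame ahead of it.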
dpJudas
 
 
Posts: 2940
Joined: Sat May 28, 2016 1:01 pm

Re: The new OpenGL ES Renderer... WOW!

Post by dpJudas »

bLUEbYTE wrote: Sat Feb 18, 2023 5:57 pm I am not sure if it's particular to my setup, but none of the health bonuses had lights around them regardless of backend.
Are you sure? They can be kind of hard to notice on the ground texture. Try typing "gl_texture 0" in the console.
