dpJudas wrote:
So why haven't I attempted to implement an actual production version of such a system? First, a few years back GZDoom's renderer backend was a lot messier than it is today. Secondly, in order to do this, the mesh building code in GZDoom needs to be improved in such a way that it can output to some kind of mesh builder. And lastly, the system needs reliable reports from the playsim part of things about when sectors invalidate and in what way it happened. All those things happen in areas I'm not too familiar with.
Adding the tracking code shouldn't be too hard - most of the sector and line structures are sufficiently encapsulated - although it may require a bit of work on the ZScript side so that it recognizes such protected properties and can update a set of change flags.
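To illustrate what such change flags could look like, here is a minimal sketch. It assumes every write goes through a setter; `TrackedSector`, `SectorChange`, `FrameChanges` and `CollectChanges` are hypothetical names made up for this example and are not GZDoom's actual structures.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical change flags; names are illustrative, not GZDoom's.
enum SectorChange : std::uint32_t {
    CHANGE_NONE       = 0,
    CHANGE_FLOORZ     = 1u << 0,
    CHANGE_CEILINGZ   = 1u << 1,
    CHANGE_LIGHTLEVEL = 1u << 2,
};

// Hypothetical sector whose properties can only be written through setters,
// so every actual modification leaves a flag behind.
class TrackedSector {
public:
    void SetFloorZ(double z)   { if (z != floorz_)   { floorz_ = z;   dirty_ |= CHANGE_FLOORZ; } }
    void SetCeilingZ(double z) { if (z != ceilingz_) { ceilingz_ = z; dirty_ |= CHANGE_CEILINGZ; } }
    void SetLightLevel(int l)  { if (l != light_)    { light_ = l;    dirty_ |= CHANGE_LIGHTLEVEL; } }

    // Returns the accumulated flags and resets them for the next frame.
    std::uint32_t ConsumeChanges() {
        std::uint32_t d = dirty_;
        dirty_ = CHANGE_NONE;
        return d;
    }

private:
    double floorz_ = 0.0;
    double ceilingz_ = 128.0;
    int light_ = 255;
    std::uint32_t dirty_ = CHANGE_NONE;
};

// What the renderer would have to do this frame, split by cost:
// height changes invalidate the cached mesh, light changes only touch buffer data.
struct FrameChanges {
    std::vector<int> rebuildMesh;
    std::vector<int> updateLight;
};

FrameChanges CollectChanges(std::vector<TrackedSector>& sectors) {
    FrameChanges out;
    for (int i = 0; i < static_cast<int>(sectors.size()); ++i) {
        const std::uint32_t d = sectors[i].ConsumeChanges();
        if (d & (CHANGE_FLOORZ | CHANGE_CEILINGZ)) out.rebuildMesh.push_back(i);
        if (d & CHANGE_LIGHTLEVEL)                 out.updateLight.push_back(i);
    }
    return out;
}
```

The split between "rebuild the mesh" and "only update light data" is the point of the sketch: untouched sectors cost nothing per frame, and cheap changes never trigger a geometry rebuild.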
I don't really think that it's feasible without a persistently mapped storage buffer for the more volatile map info like light levels or light colors, but for modern hardware that shouldn't be a problem.
dpJudas wrote:
Anyway, so it isn't that you couldn't make significant performance improvements. However, it does require a much more planned strategy than simply grabbing other old tech like Quake and porting it to Doom. The renderers of both those engines were built around assumptions that don't apply or don't scale to today's GPU based rendering. Even GZDoom still roughly renders by following the original formulae. Graf made it as fast as it goes - any further (significant) speed improvement requires a different approach.
The thing I got the biggest boost out of was partially multithreading the BSP traversal with the draw list setup, which brought a 20% performance improvement. I think a similar gain could be achieved by multithreading the render data generation - but a more radical approach may be preferable, as you outlined. But it should be clear that this might help on modern and fast hardware but it'd backfire miserably on weaker hardware limited by pixel fill rate. On my brother's laptop, for example, Frozen Time is rendered at 25 fps, both with OpenGL and Vulkan, with an AMD GPU - on this system the GPU cannot even work fast enough to get throttled by AMD's high draw call overhead.
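For readers who want a picture of what overlapping the traversal with the draw list setup means, here is a rough sketch under simplifying assumptions. One thread walks a toy tree and hands leaf indices to a queue while the calling thread turns them into draw list entries. `Node`, `WorkQueue`, `Traverse` and `BuildDrawList` are hypothetical names for this example only; this is not how GZDoom's code is actually structured.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for a BSP node: leaves carry a subsector index.
struct Node {
    int subsector = -1;          // >= 0 marks a leaf
    int front = -1, back = -1;   // child node indices for inner nodes
};

// Minimal blocking queue that hands subsector indices from the traversal
// thread to the thread that sets up the draw list.
class WorkQueue {
public:
    void Push(int v) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(v); }
        cv_.notify_one();
    }
    void Close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_one();
    }
    // Blocks until an item is available; returns false once closed and drained.
    bool Pop(int& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<int> q_;
    bool closed_ = false;
};

// Simplified walk: always visits the front child first. A real renderer
// picks the side facing the viewer and clips against what is already drawn.
void Traverse(const std::vector<Node>& nodes, int index, WorkQueue& out) {
    const Node& n = nodes[index];
    if (n.subsector >= 0) { out.Push(n.subsector); return; }
    Traverse(nodes, n.front, out);
    Traverse(nodes, n.back, out);
}

// Runs the traversal on a worker thread while this thread builds the list.
std::vector<int> BuildDrawList(const std::vector<Node>& nodes) {
    std::vector<int> drawList;
    if (nodes.empty()) return drawList;

    WorkQueue queue;
    std::thread walker([&] { Traverse(nodes, 0, queue); queue.Close(); });

    int sub = 0;
    while (queue.Pop(sub))
        drawList.push_back(sub);   // stand-in for the real per-subsector setup
    walker.join();
    return drawList;
}
```

Because there is a single producer and a single consumer, the draw list keeps the traversal order. Any gain depends on the per-subsector setup being expensive enough to outweigh the queue's locking overhead.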
And while your approach is certainly workable, it suffers from one problem, and that is the automap - the way this is handled requires traversing the BSP, so it either has to be done periodically in a worker thread or a different means needs to be invented.
But yeah, should I feel some motivation again to work on the renderer, I'd certainly try such a modern approach. If it can assume it has all the buffer features in the world, a lot of the volatile data can be stored in there to reduce setup costs and make the whole thing a lot easier.