Multithreading Doom's playsim


Postby MartinHowe » Tue Mar 23, 2021 7:00 am

Prompted by finally considering what kind of CPU (by cores/threads) to buy for a new computer, I remember Graf saying ages ago that the playsim in Doom-engine games has to be deterministic; also that while some of the rendering can be multithreaded (especially given the recent vertically sliced software-renderer experiment by one of the devs), most of the playsim cannot.

For demos and multiplayer, I guess that's a given; but theoretically, if demos are not needed (I never watch them nor record them) and for SP only, does the playsim still need to be deterministic, and if so, why? Going beyond that, could a nondeterministic playsim even have one thread per class of thinker, or even one per thinker, with a semi- or completely asynchronous architecture? (And would it still feel like Doom :p)

Please note this is a theoretical question; I'm not angling towards an eventual feature request or anything. I suspect that, if doable, it would require a huge rewrite of the internal architecture in any case.
User avatar
MartinHowe
In space, no-one can hear you KILL an ALIEN
 
Joined: 11 Aug 2003
Location: Waveney, United Kingdom

Re: Multithreading Doom's playsim

Postby Graf Zahl » Tue Mar 23, 2021 10:38 am

The main problem here is that each ticked actor in the playsim can freely alter other actors. Now imagine two actors modifying something at the same time, like a monster dying from a projectile hit while also ticking itself and deciding to attack. Since the order of execution is undefined, the attack can occur right in the middle of the death setup, which means you can end up with a zombie or, worse, crash the game because one thread deletes what the other needs.

You have to carefully synchronize these things, which will essentially nullify all the advantages.
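
A minimal sketch of the kind of synchronization described above, in Python rather than the engine's C++, with illustrative names (this is not GZDoom's actual actor API): a per-actor lock makes the death setup atomic, so a concurrent tick can never interleave its attack decision with the death and produce a "zombie".

```python
import threading

class Actor:
    """Toy stand-in for a playsim actor; field names are hypothetical."""
    def __init__(self):
        self.health = 100
        self.state = "idle"
        self.lock = threading.Lock()  # per-actor lock: the synchronization Graf describes

    def take_damage(self, dmg):
        with self.lock:               # death setup runs atomically...
            self.health -= dmg
            if self.health <= 0:
                self.state = "dead"

    def tick_attack(self):
        with self.lock:               # ...so a concurrent tick cannot interleave with it
            if self.state != "dead":  # a dead actor must never start an attack
                self.state = "attacking"

actor = Actor()
t1 = threading.Thread(target=actor.take_damage, args=(150,))
t2 = threading.Thread(target=actor.tick_attack)
t1.start(); t2.start()
t1.join(); t2.join()
# Whichever thread wins the lock, the actor ends up cleanly dead, never a
# half-dead "zombie" mid-attack.
assert actor.state == "dead"
```

The cost is exactly the problem named above: every interaction between actors now serializes on these locks, so the threads spend their time waiting rather than working.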
User avatar
Graf Zahl
Lead GZDoom+Raze Developer
 
Joined: 19 Jul 2003
Location: Germany

Re: Multithreading Doom's playsim

Postby Caligari87 » Tue Mar 23, 2021 11:30 am

Hypothetically speaking, if there were a class of actor that had just read-only access to the gamesim/level data (and likewise were invisible to the gamesim), could those actors be segregated to run on a separate thread? Obviously these actors could be created by the main gamesim but would basically instantly disappear from it.

I'm imagining the primary use case would be visual-only stuff like effects / particles which end up being a big part of the think time for many advanced mods. Synchronization isn't important for things like that, and indeed can actually be a detriment if thinker iterators have to run over them looking for more important things.

8-)
User avatar
Caligari87
I'm just here for the community
User Accounts Assistant
 
Joined: 26 Feb 2004
Location: Salt Lake City, Utah, USA
Discord: Caligari87#3089

Re: Multithreading Doom's playsim

Postby Graf Zahl » Tue Mar 23, 2021 12:43 pm

Only if these actors were invisible to everything else and ran on their own stat queue that'd also have to be invisible/inaccessible to everything else.
But there's one other gotcha - the garbage collector is not capable of dealing with multithreading.

Re: Multithreading Doom's playsim

Postby Caligari87 » Tue Mar 23, 2021 1:37 pm

Graf Zahl wrote:Only if these actors were invisible to everything else and ran on their own stat queue that'd also have to be invisible/inaccessible to everything else.

Sounds reasonable, that's basically what I'm imagining. Separate everything, completely isolated actors that otherwise behave as normal for scripting purposes but are invisible to the main gamesim and can only read from it, not influence it.

The garbage collector sounds like a definite blocker and I'm not familiar with how much effort it'd take to make it multi-core aware, if even possible. Maybe run two GCs if that's even a thing?

8-)

Re: Multithreading Doom's playsim

Postby Blzut3 » Tue Mar 23, 2021 2:23 pm

MartinHowe wrote:theoretically, if demos are not needed (I never watch them nor record them) and for SP only

Demos are actually useful for performance comparisons and debugging as well. For example, a stubborn-to-reproduce bug might be possible to capture in a demo versus manually repeating the actions over and over again. I know this isn't the point of your thread, but people associating demos only with entertainment is kind of a pet peeve of mine.
MartinHowe wrote:does the playsim still need to be deterministic and if so why?

While a non-deterministic playsim can be done and can feel like Doom (see Zandronum net play), that's not really the core reason Doom's playsim is difficult to multithread. As Graf says, you'd still need to synchronize writes to actor data even if you don't care what order things happen in (e.g. two actors shoot a third; one kills it, the other only inflicts pain; you still need to ensure that the death takes precedence).

Now, if you remove the need to feel like Doom, it probably is possible to make a deterministic multithreaded playsim. Have all actors independently decide, based on a read-only snapshot of the world, what their "move" is for the tick (this "move" would include any damage inflicted on other actors), but instead of actually writing out the changes, they would be placed onto a per-thread queue. Then, once everything has been decided, one thread commits the changes, dropping any conflicts. I'm not sure how well this model would actually scale, though. But it's easy to see how it would drastically impact the gameplay, since, for example, two shotgun blasts to the same actor would both hit the desired target, wasting any overflow damage, instead of one potentially blowing through.
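A sketch of that decide-then-commit model in Python, with made-up field names (a real engine would fill each queue from a worker thread): phase one reads only a frozen snapshot and writes nothing, phase two applies the queued moves on a single thread and drops damage aimed at actors that already died this tick, so overflow damage is wasted exactly as described.

```python
from copy import deepcopy

def decide_moves(snapshot, actor_ids):
    # Phase 1: each actor decides its "move" from a read-only snapshot.
    # Nothing is written to the world here, so actors can run in parallel.
    moves = []
    for aid in actor_ids:
        actor = snapshot[aid]
        target = actor["target"]
        if target is not None and snapshot[target]["health"] > 0:
            moves.append(("damage", target, 120))  # e.g. one shotgun blast
    return moves

def commit(world, queues):
    # Phase 2: a single thread applies every queued move, dropping conflicts:
    # damage aimed at an actor that already died this tick is discarded.
    for queue in queues:
        for kind, target, amount in queue:
            if kind == "damage" and world[target]["health"] > 0:
                world[target]["health"] -= amount

world = {
    1: {"health": 100, "target": 3},    # two shooters...
    2: {"health": 100, "target": 3},
    3: {"health": 100, "target": None}, # ...aiming at the same victim
}
snapshot = deepcopy(world)
q1 = decide_moves(snapshot, [1])  # in a real engine, one queue per worker thread
q2 = decide_moves(snapshot, [2])
commit(world, [q1, q2])
# The first blast kills actor 3; the second is dropped, so its damage is
# wasted instead of blowing through to another target.
assert world[3]["health"] == -20
```
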

Overall, though, if one were creating a Doom-like engine from scratch, I do think there's a lot that could be done differently from GZDoom to increase performance before even reaching for multithreading. One of the things I've heard of other engines doing is splitting actors into multiple objects in memory, based on which values are used together, so that CPU cache-line utilization is higher and iteration is more predictable. For example, you might have the x/y coordinates of an actor as an object, so that eight of them can fit into a cache line when trying to find actors within a radius. It would certainly be possible to retrofit something like this into GZDoom, but it would be a lot of work to do right.
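A sketch of that split in Python (names and layout are illustrative, and the cache-line win itself only materializes in a language with flat memory, such as C++): positions live in their own contiguous arrays, separate from the bulky per-actor state, so a radius query streams through only the bytes it needs.

```python
from array import array

# Structure-of-arrays layout: x and y coordinates packed contiguously,
# apart from the per-actor state they would normally sit next to.
xs = array("d", [0.0, 3.0, 10.0, 4.0])
ys = array("d", [0.0, 4.0, 10.0, 0.0])
other_state = [{"health": 100, "flags": 0} for _ in xs]  # untouched by the query

def actors_in_radius(cx, cy, r):
    # Iterating two packed arrays touches far less memory than walking full
    # actor objects, which is where the cache-line utilization comes from.
    r2 = r * r
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if (x - cx) ** 2 + (y - cy) ** 2 <= r2]

assert actors_in_radius(0.0, 0.0, 5.0) == [0, 1, 3]
```

The trade-off debated below is real: each conceptual actor is now scattered across several containers that must be kept in sync.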
Blzut3
Pronounced: B-l-zut
 
 
 
Joined: 24 Nov 2004
Github ID: Blzut3
Operating System: Debian-like Linux (Debian, Ubuntu, Mint, etc) 64-bit
Graphics Processor: ATI/AMD with Vulkan Support

Re: Multithreading Doom's playsim

Postby Graf Zahl » Tue Mar 23, 2021 2:34 pm

Blzut3 wrote:Overall, though, if one were creating a Doom-like engine from scratch, I do think there's a lot that could be done differently from GZDoom to increase performance before even reaching for multithreading. One of the things I've heard of other engines doing is splitting actors into multiple objects in memory, based on which values are used together, so that CPU cache-line utilization is higher and iteration is more predictable. For example, you might have the x/y coordinates of an actor as an object, so that eight of them can fit into a cache line when trying to find actors within a radius. It would certainly be possible to retrofit something like this into GZDoom, but it would be a lot of work to do right.



I think the drawbacks of such an approach will outweigh the benefits in the end.
Blood did its actors like that, and calling the result a structural mess would be understating the problem.
This can only work if you compromise other things, like scalability. There are also only a few places where you need to read just one or two variables from an actor, and accessing actors in batches is also rare, so the time savings are doubtful.

Re: Multithreading Doom's playsim

Postby Blzut3 » Tue Mar 23, 2021 3:14 pm

Graf Zahl wrote:so the time savings are doubtful.

Given that I first heard about the technique from a Microsoft employee talking about optimizing for the Xbox 360, and have since heard of a few people using it in practice, I think it's fair to say it has been proven to work. That said, I would definitely agree that current programming languages aren't designed for this kind of thing, which means that from a hobby "I'm going to maintain this code forever" standpoint it's probably worth taking the performance hit of having clear, easy-to-maintain code. From a retrofitting standpoint I certainly wouldn't be surprised if getting a significant performance uplift from it would require a huge overhaul, so there's that too. Backwards compatibility would probably get in the way all the time as well. Lots of reasons not to do it, but it would definitely be a thing to think about if someone wanted to make a Doom-like 2.5D engine from scratch.

Re: Multithreading Doom's playsim

Postby Graf Zahl » Tue Mar 23, 2021 3:24 pm

Xbox 360? How much CPU cache does that have?
Also, people using it in practice does not mean much. I have seen all kinds of pointless optimizations 'in practice', but on closer inspection few of them had any benefit.
Realistically, in GZDoom it won't help at all, because the majority of current performance issues are related to mod scripting, where such techniques would fall flat on their face.

It's the kind of peephole optimization that ultimately stands in the way of algorithmic changes that may bring a lot more gains, but it's also the kind of stuff tinkerers may jump onto.

Re: Multithreading Doom's playsim

Postby MartinHowe » Tue Mar 23, 2021 4:05 pm

Blzut3 wrote:Demos are actually useful for performance comparisons and debugging as well. For example, a stubborn-to-reproduce bug might be possible to capture in a demo versus manually repeating the actions over and over again. I know this isn't the point of your thread, but people associating demos only with entertainment is kind of a pet peeve of mine.

Fascinating; I never realised there was a practical use for demos. This makes sense.

Blzut3 wrote:Have all actors independently decide, based on a read-only snapshot of the world, what their "move" is for the tick (this "move" would include any damage inflicted on other actors), but instead of actually writing out the changes, they would be placed onto a per-thread queue. Then, once everything has been decided, one thread commits the changes, dropping any conflicts.

This is not quite what I would have imagined, and is definitely more sophisticated than the 'nearly asynchronous with mutexes and such' approach I would have thought of; I have only once in my varied IT career written multithreaded code, and that was using helper stuff in .NET, so all this is quite an eye-opener for me.

On a general note, regarding motivation, I've been putting off buying a newer computer for ages as my current rig works fine for most things; playing mostly 1990s maps and a few more modern ones, you'd think dual 2008-era quad-core Xeons, 16GB of quad-channel server-grade memory, and a GF 550 Ti, on Linux with no MICROS~1 bloat to waste CPU cycles on, would have been enough. Then, this Sunday, I tried to play Planisphere 2 :p

Looks like that lovely Ryzen 9 3950X I was thinking of buying, with 16 cores, won't help me that much against the Spawn of Hell :(

Re: Multithreading Doom's playsim

Postby Graf Zahl » Tue Mar 23, 2021 4:27 pm

MartinHowe wrote:On a general note, regarding motivation, I've been putting off buying a newer computer for ages as my current rig works fine for most things; playing mostly 1990s maps and a few more modern ones, you'd think dual 2008-era quad-core Xeons, 16GB of quad-channel server-grade memory, and a GF 550 Ti, on Linux with no MICROS~1 bloat to waste CPU cycles on, would have been enough. Then, this Sunday, I tried to play Planisphere 2 :p


Didn't you know? Even for Doom you cannot have enough CPU power! Some of the larger maps are quite performance hungry.
Off topic: Micros~1 bloat does not affect GZDoom, the only bloat we have to contend with is AMD's lousy OpenGL performance.

Re: Multithreading Doom's playsim

Postby Blzut3 » Tue Mar 23, 2021 11:25 pm

MartinHowe wrote:This is not quite what I would have imagined, and is definitely more sophisticated than the 'nearly asynchronous with mutexes and such' approach I would have thought of; I have only once in my varied IT career written multithreaded code, and that was using helper stuff in .NET, so all this is quite an eye-opener for me.

I haven't had much excuse to write a whole lot of multithreaded code, but I feel like you might be approaching the thought from the wrong angle. That is, you seem to be thinking only in terms of "how do I start more threads" rather than redefining the problem so that it becomes "embarrassingly parallel," or at least as close to it as you can get. Mutexes are effectively a necessary evil, and if you want code to go fast you need to avoid them. If you were to take the naive approach of parallelizing the playsim and mutexing every actor, you'd burn a lot of time waiting on mutex locks. This can easily result in code which uses more CPU percentage but actually performs no additional work, or worse.
MartinHowe wrote:Looks like that lovely Ryzen 9 3950X I was thinking of buying, with 16 cores, won't help me that much against the Spawn of Hell

Even as someone who loves playing with old hardware and making it do things it really shouldn't be doing, my advice is to go mid-range and upgrade twice as often. Since it's not really something that shows up in spec sheets, people forget that the platform architecture as a whole evolves over the years, not to mention new instruction sets and IPC improvements. Obviously there's a limit to how far down the product stack this advice applies, but when it comes to the 3950X I do think it's in the territory of: if having 16 cores is not going to save you time today, don't buy it.

My brother garbage-picked a 2006 dual Opteron 285 (quad-core) system with 16GB of RAM. It was funny, since with a GTX 960 4GB it would often, when running games, get into a state where both CPU and GPU utilization were in single digits. Presumably the system buses were just too slow at carting data around, probably not helped by games not being NUMA-aware. That's a problem your socket 771 system doesn't have, but on the other hand, with your CPUs being native dual-core dies you effectively have a quad-socket system, which isn't doing the FSB any favors. He upgraded to a Haswell Celeron (2C/2T) and it ran circles around the Opterons in most things he did.

Or how about when my first-gen Core i5, at the time seven years old, couldn't handle video capture at 4:4:4 despite nominally having enough PCIe bandwidth for the card, while my Dad's Kaveri system did the same without breaking a sweat? Granted, I was just waiting for Zen to upgrade anyway.

By the way, you do know that "server-grade memory" means slower, with higher latency, right? Granted, the modules usually have a lot of overclocking headroom, but most server boards don't allow taking advantage of that.

Re: Multithreading Doom's playsim

Postby Graf Zahl » Wed Mar 24, 2021 12:57 am

Blzut3 wrote:
MartinHowe wrote:Looks like that lovely Ryzen 9 3950X I was thinking of buying, with 16 cores, won't help me that much against the Spawn of Hell

Even as someone who loves playing with old hardware and making it do things it really shouldn't be doing, my advice is to go mid-range and upgrade twice as often. Since it's not really something that shows up in spec sheets, people forget that the platform architecture as a whole evolves over the years, not to mention new instruction sets and IPC improvements. Obviously there's a limit to how far down the product stack this advice applies, but when it comes to the 3950X I do think it's in the territory of: if having 16 cores is not going to save you time today, don't buy it.


It really depends on what you need. Nine years ago I had the choice of going mid-range or spending €200 more on a more powerful CPU. So, knowing my gaming habits, I merely bought a low mid-range GPU but invested a bit more into the CPU. I still run that system; admittedly it has problems decoding 4K videos, but since I cannot display them, who cares?

But I am slowly reaching the stage where an upgrade may make sense, because for many tasks the 4-core CPU and 8 GB of RAM won't cut it anymore. Still, getting 9 years of life out of a computer surely isn't bad at all! With a weaker CPU I might have had to upgrade 4 years ago already.

Re: Multithreading Doom's playsim

Postby Apeirogon » Wed Mar 24, 2021 1:58 am

Graf Zahl wrote:I think the drawbacks of such an approach will outweigh the benefits in the end.

Is it? The CPU needs to cache less data per "operation" -> fewer cache misses -> more operations per second -> Optimization with a capital O. Or am I missing something?

And yes, I don't think a game from 1997 is a good example of why this approach is bad in 2021.
User avatar
Apeirogon
I have a strange sense of humour
 
Joined: 12 Jun 2017

Re: Multithreading Doom's playsim

Postby Graf Zahl » Wed Mar 24, 2021 4:22 am

Yes, you're missing something here, namely that such an approach is very, very hostile toward code readability and refactorability. It's the classic case of a peephole optimization that may make things faster for the moment but ultimately leads to more complex and bloated code, because you no longer track one object per actor but several, and to be efficient you need to store them in static arrays. The loss of productivity and maintainability will inevitably take its toll here.
