QZDoom - ZDoom with True-Color (Version 1.3.0 released!)
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I tried playing with the dynamic lights, but your code looks weird (though it's still the best one).
At best I managed to gain 25+ fps (mostly in vanilla Doom maps), but then I tested the glitchland map that I made, and while your code did 56 fps O_O mine only reached 6 ;--;
EDIT: Actually no, the code that I have is quite slow with my map O_o
In any case, I tested two things.
Test one:
replacing
float dist = dist2 * _mm_cvtss_f32(_mm_rsqrt_ss(_mm_load_ss(&dist2)));
with:
float dist = ((int)dist2 >> 5);
Yeah, we lose precision here, but it's kind of similar to the original one.
Test two:
replacing
float dist = dist2 * _mm_cvtss_f32(_mm_rsqrt_ss(_mm_load_ss(&dist2)));
with:
float dist = dist2 / 34.0;
This gives roughly the same result as the previous test (the shift divides by 32), but we do not lose the fractional part.
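For reference, here is a minimal sketch of the three variants being compared, written as standalone functions. The function names are hypothetical (they do not come from QZDoom), and the SSE path falls back to std::sqrt on targets without SSE. The original expression works because x * (1/sqrt(x)) equals sqrt(x); the two replacements do not compute a square root at all, they only scale dist2 by a constant.

```cpp
#include <cassert>
#include <cmath>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE_RSQRT 1
#endif

// Original approach: sqrt(x) computed as x * (1 / sqrt(x)).
// dist2 == 0 would give 0 * inf = NaN, so the input must be positive.
inline float dist_rsqrt(float dist2)
{
    assert(dist2 > 0.0f);
#ifdef HAVE_SSE_RSQRT
    // _mm_rsqrt_ss returns a fast hardware estimate of 1 / sqrt(x).
    return dist2 * _mm_cvtss_f32(_mm_rsqrt_ss(_mm_load_ss(&dist2)));
#else
    return dist2 * (1.0f / std::sqrt(dist2));
#endif
}

// Test one: truncate to int, then shift right by 5 (divide by 32).
inline float dist_shift(float dist2)
{
    return static_cast<float>(static_cast<int>(dist2) >> 5);
}

// Test two: a plain divide by a constant, which keeps the fractional part.
inline float dist_divide(float dist2)
{
    return dist2 / 34.0f;
}
```

For dist2 = 1156 (a true distance of 34), the shift variant returns 36 and the divide variant returns 34.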
Last edited by ibm5155 on Sun Jan 01, 2017 6:42 pm, edited 1 time in total.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
It sounds like an issue with cache misses. How big is your table?
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Well, before I try anything more complex, I was just looking at how well the code would run with just a simple bit shift...
I need a better map for testing dynamic lights (and only that).
Maybe a really big lookup table (losing a bit of distance precision) would help with performance and also with precision compared with the tests that I did.
OK, I made a test wad and wow, dpJudas' code rocks: mine reached 100 fps while his code was at 125 fps O_O
Just to compare:
dpJudas
Spoiler:
Ibm5155
Spoiler:
Last edited by ibm5155 on Sun Jan 01, 2017 7:02 pm, edited 1 time in total.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I recommend using Unloved for dynlight tests.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
If you like toying around with this, try seeing what speed you get with the fast inverse square root algorithm.
About using a non-linear attenuation (dist2 * (1/34.0f)): this is faster, of course. Like affine texturing, it also gets increasingly wrong the bigger the point light is. So there's a trade-off there: speed vs. quality.
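The trade-off can be made concrete with a little arithmetic: dist2 * (1/34.0f) equals dist * (dist / 34), so the approximation matches the true distance only at dist = 34 and is off by a factor of dist / 34 everywhere else. A minimal sketch, with hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Constant-multiply stand-in for the distance, as quoted in the post.
inline float approx_dist(float dist2)
{
    return dist2 * (1.0f / 34.0f);
}

// Ratio of the approximation to the true distance.
// Algebraically this is dist / 34, so the error grows with distance.
inline float approx_ratio(float dist)
{
    assert(dist > 0.0f);
    const float dist2 = dist * dist;
    return approx_dist(dist2) / std::sqrt(dist2);
}
```

So a point 17 units from the light is treated as half as far away as it really is, and a point 68 units away as twice as far.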
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I just replaced the dist2 >> 5 with dist2 * (dist2 * (1/34.0f)), and it was faster, but not as fast as your original code O_o (you won by 10 fps).
EDIT: I just tested the fast inverse square root, and it was as fast as the bit shift method that I used (so, 100 fps in my case).
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Actually, I meant your "float dist = dist2 / 34.0". I just rewrote it as "dist = dist2 * (1/34.0f)" to get rid of the divide.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I just tested with a 600 KB table of valid dist values and the result is BAD.
Spoiler: So far it's the worst code I've tested. I bet a cookie the fault is its big data size, so there will be a lot of cache misses and we'll need to access memory more than normal.
So far here are the results:
Linear Divide code: 112 fps
Bit Shift code: 100 fps (why: it needs to convert a float to an integer every time to do the bit shift)
dpJudas code: 98 fps (118 fps on the Eruanna build and 98 on the build that I'm compiling)
Big array code: 89 fps (why: I bet on a lot of cache misses with this code, and it also converts from float to integer to find the array index)
EDIT: I have added the testing wad here
EDIT 2: Hmm, OK, I don't know why, but the build that I'm compiling gets quite different fps from the latest Eruanna build. Now both the bit shift and linear divide code have the highest fps (but of course with lower quality).
To make the "Big array code" I did this:
Code:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

float Q_rsqrt(float number)
{
    int32_t i; // must be exactly 32 bits; "long" is 64 bits on many platforms
    float x2, y;
    const float threehalfs = 1.5F;
    x2 = number * 0.5F;
    y = number;
    memcpy(&i, &y, sizeof i);            // reinterpret the float's bits as an integer
    i = 0x5f3759df - (i >> 1);           // magic-constant initial guess for 1/sqrt
    memcpy(&y, &i, sizeof y);            // back to float
    y = y * (threehalfs - (x2 * y * y)); // 1st Newton iteration
    // y = y * (threehalfs - (x2 * y * y)); // 2nd iteration, this can be removed
    return y;
}

int main(void)
{
    const int count = 65535;
    FILE *X = fopen("sqrt_array.h", "w"); // output file name is an assumption
    if (!X)
        return 1;
    fprintf(X, "float SQRT_ARRAY[%d] = {", count);
    for (int j = 0; j < count; j++)
    {
        // sqrt(j) approximated as j * (1 / sqrt(j))
        fprintf(X, "%f,", j * Q_rsqrt((float)j));
    }
    fprintf(X, "};");
    fclose(X);
    return 0;
}
and in the renderer code I just did dist = SQRT_ARRAY[(int)dist2];
LAST EDIT D:
dpJudas software code: 103 fps (true color)
AMD HD8870m: 97 fps
Intel HD 4000: 58 fps
Now the question is: why the hell am I complaining about the fps of the dynamic light code when the software renderer is now FASTER THAN MY OWN GPU WITH THE OPENGL RENDERER?!
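As a rough illustration of why the table approach struggles, the sketch below builds an equivalent table in memory and does a clamped lookup. The names make_sqrt_table and table_dist are hypothetical, and it fills the table with std::sqrt instead of the generator program in the post. With 65535 float entries the table occupies about 256 KiB, which is larger than a typical L1 data cache, and an unclamped SQRT_ARRAY[(int)dist2] would read out of bounds for dist2 >= 65535.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t kTableSize = 65535; // same entry count as in the post

// Build the table once: entry i holds sqrt(i).
inline std::vector<float> make_sqrt_table()
{
    std::vector<float> table(kTableSize);
    for (std::size_t i = 0; i < kTableSize; ++i)
        table[i] = std::sqrt(static_cast<float>(i));
    return table;
}

// Clamped lookup: truncates dist2 to an index and never reads out of bounds.
inline float table_dist(const std::vector<float>& table, float dist2)
{
    assert(!table.empty());
    if (!(dist2 > 0.0f))
        return table.front(); // covers zero, negatives and NaN
    const std::size_t last = table.size() - 1;
    const std::size_t idx = dist2 >= static_cast<float>(last)
                                ? last
                                : static_cast<std::size_t>(dist2);
    return table[idx];
}
```

The float-to-integer conversion mentioned in the results is still there (the static_cast), and the clamp adds a branch on top of it, so this is meant to show the memory footprint and bounds issue rather than to be a faster variant.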
- InsanityBringer
- Posts: 3386
- Joined: Thu Jul 05, 2007 4:53 pm
- Location: opening the forbidden box
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I'm trying the paletted software renderer, since I need something to replace ZDoom with while it remains dead in the water, but it's performing much worse for me than it ever has with ZDoom. In particular, transparency just murders the framerate, and I'm noticing some issues with decals not rendering correctly (usually with some columns not drawing at all). Any idea what could be up? If it helps any, I'm running an AMD FX 6300 for my CPU and an R7 260X for my GPU.
EDIT: I was able to tame the framerate some by switching the software canvas back to Direct3D, but I'm still observing a small bit of lag and the decal problem. I'll report the decals thing as a bug I guess, but I found it weird no one else has mentioned it.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Right now there are two modes for the transparency drawers. They should be in the "Display" options - Transparency Render Mode - Precise or Classic. Also - make sure you deactivate dynamic lights. With these two options, speed should be pretty close to original ZDoom.
- InsanityBringer
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I'm not seeing these options in the display options menu. Are they not present in the 1.2.0 release?
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Oh - I thought you'd be using a dev build. No, they're not present in the release builds because I released on an old Git commit. It should not be slower, at all, then. What happens when you run GZDoom 2.3.0 in software mode?
Also - what is your CPU?
- InsanityBringer
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
I can try test builds later. For now, I tried putting GZDoom in software rendering mode and I didn't notice any abnormal slowdown with transparency (at least nothing worse than plain ZDoom). Does Multithreaded drawing in the truecolor options have an effect on the paletted renderer? When that's off, I get the slowdowns when looking at transparency, but switching that on makes it run faster, even in paletted mode. I also switched off lights, but I'm not sure if that has an effect since I'm not loading lights.pk3
My CPU is an AMD FX 6300.
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Yes, when dpJudas backported the multithreaded rendering to ZDoom, he used the same variable name that QZDoom now shares for its Truecolor rendering.
Which means that should probably be removed from the Truecolor menu now.
- Graf Zahl
- Lead GZDoom+Raze Developer
- Posts: 49073
- Joined: Sat Jul 19, 2003 10:19 am
- Location: Germany
Re: QZDoom - ZDoom with True-Color (Version 1.2.0 released!)
Can it be that some of the RGB666 drawers are unintentionally being used somewhere?