ZDBSP 1.9: Obviously ZDBSP 1.8 Wasn't the End
-
randi
- Site Admin
- Posts: 7749
- Joined: Wed Jul 09, 2003 10:30 pm
ZDBSP 1.9 is out and fixes a crash bug that could sometimes occur on maps with unused sectors. There is also support for SSE1 without SSE2, so users with Pentium 3s or Athlon XPs will see improved performance from this version. People with SSE2-compatible processors will see a minor speedup as well.
Update: It seems I uploaded the Visual C++ version of ZDBSP instead of the GCC version by mistake. To rectify this, I have reuploaded it as ZDBSP 1.9.1. This is exactly the same as the old ZDBSP 1.9 but built with a different compiler that manages to make it faster.
Last edited by randi on Thu Jun 29, 2006 11:17 am, edited 1 time in total.
-
Enjay
- Posts: 27512
- Joined: Tue Jul 15, 2003 4:58 pm
- Location: Scotland
-
Wills
- Posts: 1446
- Joined: Mon Jan 10, 2005 7:01 pm
- Location: The Well of Wishes
-
randi
- Site Admin
- Posts: 7749
- Joined: Wed Jul 09, 2003 10:30 pm
-
Phobus
- Posts: 5984
- Joined: Thu May 05, 2005 10:56 am
- Location: London
-
Bio Hazard
- Posts: 4019
- Joined: Fri Aug 15, 2003 8:15 pm
- Location: ferret ~/C/ZDL $
-
NiGHTMARE
- Posts: 3463
- Joined: Sat Jul 19, 2003 8:39 am
-
Hirogen2
- Posts: 2033
- Joined: Sat Jul 19, 2003 6:15 am
- Operating System Version (Optional): Tumbleweed x64
- Graphics Processor: Intel with Vulkan/Metal Support
- Location: Central Germany
nodebuild_classify_sse1.cpp from zdbsp r228 exists, but is empty.
Maybe use an #include "classifyline.c"? (It does not look as good, but it saves the copy-and-pasting.)
Code:
// You may notice that this function is identical to ClassifyLine2.
// The reason it is SSE2 is because this file is explicitly compiled
// with SSE2 math enabled, but the other files are not.
Do you actually use a win32 gcc to compile zdbsp? Or does MSVC have an equivalent to gcc's -msse/-msse2?
-
Alterworldruler
- Posts: 622
- Joined: Mon Dec 19, 2005 7:31 am
-
randi
- Site Admin
- Posts: 7749
- Joined: Wed Jul 09, 2003 10:30 pm
Hirogen2 wrote: Do you actually use a win32 gcc to compile zdbsp?
Yes, because GCC produced the fastest code, but you can set per-file optimization settings in VC++ as well. That's why the SSE optimization is present in ZDoom's internal nodebuilder as well. Writing a hand-optimized version of the routine that uses vector math might afford an appreciable speedup, but I'm pretty happy with the way it is now. If I do write a hand-optimized version, it will be for purely academic interest and not because I expect a huge speedup. That's what happened with the backpatching I added in this version.
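For anyone wondering how identical source can be built with different per-file math settings and then picked at run time, here is a rough sketch. It is not ZDBSP's code: every name in it (SimdLevel, SideOfPoint_*, SelectSideFn) is made up for illustration, and the CPU feature level is passed in by hand where a real program would query CPUID. In a real build each variant would sit in its own source file, compiled with its own flags (for example gcc's -msse or -msse2, or the per-file /arch setting in VC++). Here they share one file so the example stands alone.

```cpp
// Hypothetical feature levels; a real program would detect these via CPUID.
enum class SimdLevel { None, SSE1, SSE2 };

// Signature shared by all variants: the sign of the cross product tells
// which side of the line through (px,py) with direction (dx,dy) the
// point (x,y) lies on. Returns +1, -1, or 0 when the point is on the line.
using SideFn = int (*)(double px, double py, double dx, double dy,
                       double x, double y);

// The one body every variant shares.
static inline int SideImpl(double px, double py, double dx, double dy,
                           double x, double y)
{
    double s = (y - py) * dx - (x - px) * dy;
    return (s > 0.0) - (s < 0.0);
}

// Assumption: in a real project each of these lives in a separate .cpp
// file built with different floating-point code generation flags.
// The source text is the same; only the generated machine code differs.
int SideOfPoint_x87(double px, double py, double dx, double dy,
                    double x, double y)
{
    return SideImpl(px, py, dx, dy, x, y);
}

int SideOfPoint_sse1(double px, double py, double dx, double dy,
                     double x, double y)
{
    return SideImpl(px, py, dx, dy, x, y);
}

int SideOfPoint_sse2(double px, double py, double dx, double dy,
                     double x, double y)
{
    return SideImpl(px, py, dx, dy, x, y);
}

// Pick the best variant once at startup and call through the pointer
// afterwards, so the hot loop never re-checks the CPU.
SideFn SelectSideFn(SimdLevel level)
{
    switch (level) {
    case SimdLevel::SSE2: return SideOfPoint_sse2;
    case SimdLevel::SSE1: return SideOfPoint_sse1;
    default:              return SideOfPoint_x87;
    }
}
```

A caller would do something like `SideFn side = SelectSideFn(detectedLevel);` once and then call `side(...)` inside the loop. The indirect call has a small cost of its own, so whether this pays off depends on how much work each call does.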
-
entryway
- Posts: 59
- Joined: Tue Aug 30, 2005 1:04 pm
-
Bio Hazard
- Posts: 4019
- Joined: Fri Aug 15, 2003 8:15 pm
- Location: ferret ~/C/ZDL $
-
entryway
- Posts: 59
- Joined: Tue Aug 30, 2005 1:04 pm
-
randi
- Site Admin
- Posts: 7749
- Joined: Wed Jul 09, 2003 10:30 pm
That looks like a pretty nice speedup from the Intel compiler. I'm curious, though, did you build your own ZDBSP 1.9 with GCC or use the one I provided? I accidentally uploaded a Visual C++ build instead of a GCC one, which I discovered after seeing your post and wanting to see what time I got for zdoomcmp1.wad. I noticed ZDBSP 1.9 was slower than ZDBSP 1.8, which I knew couldn't be right, and that was when I discovered my mistake.
Anyway, here are my times (1.6 GHz Pentium M):
version 1.8: 2.34 seconds
version 1.9 (VC++ build): 2.64 seconds
version 1.9 (GCC build): 2.30 seconds
If the Intel version is still faster than the GCC version, I'd be interested in seeing its assembler output for the ClassifyLine functions. Since the nodebuilder spends over half its time in that one function, it seems the best place to optimize.
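For readers who haven't looked at a nodebuilder, here is a minimal sketch of the kind of work such a routine does, assuming ClassifyLine performs the usual BSP test of deciding whether a seg lies entirely on one side of a candidate partition line or has to be split. This is not the ZDBSP source: the types, the names, the return values, and the epsilon are all invented for illustration.

```cpp
// Hypothetical result type; the real routine's return values may differ.
enum class SegSide { Positive, Negative, Split, Colinear };

// A partition line as a point plus a direction vector.
struct Line { double x, y, dx, dy; };

// A seg as two endpoints.
struct Seg { double x1, y1, x2, y2; };

// Signed cross product of the partition direction with the vector from
// the partition origin to (x,y). Its sign says which side the point is on.
static double Cross(const Line& part, double x, double y)
{
    return (y - part.y) * part.dx - (x - part.x) * part.dy;
}

// Classify a seg against a partition line. Endpoints closer to the line
// than eps (in cross-product units) are treated as lying on it.
SegSide ClassifySegSketch(const Line& part, const Seg& seg,
                          double eps = 1e-9)
{
    double s1 = Cross(part, seg.x1, seg.y1);
    double s2 = Cross(part, seg.x2, seg.y2);

    bool anyPositive = (s1 > eps) || (s2 > eps);
    bool anyNegative = (s1 < -eps) || (s2 < -eps);

    if (anyPositive && anyNegative) return SegSide::Split;    // endpoints straddle the line
    if (anyPositive)                return SegSide::Positive; // wholly on the positive side
    if (anyNegative)                return SegSide::Negative; // wholly on the negative side
    return SegSide::Colinear;                                 // both endpoints on the line
}
```

Each call is only a handful of multiplies, subtracts, and compares, but a nodebuilder evaluates a test like this for many seg and partition pairs, which fits with the routine dominating the run time and with floating-point code generation making a measurable difference.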