ZDBSP 1.9: Obviously ZDBSP 1.8 Wasn't the End
ZDBSP 1.9 is out and fixes a crash bug that could sometimes occur on maps with unused sectors. There is also support for SSE1 without SSE2, so users with Pentium 3s or Athlon XPs will receive improved performance from this version. People with SSE2-compatible processors will see a minor speedup as well.
Update: It seems I uploaded the Visual C++ version of ZDBSP instead of the GCC version by mistake. To rectify this, I have reuploaded it as ZDBSP 1.9.1. This is exactly the same as the old ZDBSP 1.9 but built with a different compiler that manages to make it faster.
Last edited by randi on Thu Jun 29, 2006 11:17 am, edited 1 time in total.
@Wills: Don't jinx it!
@randy: Was this linked with ace's bug report, by any chance?
EDIT: Heh, never mind, just saw the closed bug thread.
Thanks again for all your time and effort, Randy.
Last edited by Phobus on Mon Jun 26, 2006 3:30 pm, edited 1 time in total.
- Bio Hazard
- Posts: 4019
- Joined: Fri Aug 15, 2003 8:15 pm
- Location: ferret ~/C/ZDL $
- Contact:
- Hirogen2
- Posts: 2033
- Joined: Sat Jul 19, 2003 6:15 am
- Graphics Processor: Intel with Vulkan/Metal Support
- Location: Central Germany
- Contact:
nodebuild_classify_sse1.cpp from zdbsp r228 exists, but is empty.
Maybe use an #include "classifyline.c"? (It does not look as good, but it saves the copy-and-pasting.)
// You may notice that this function is identical to ClassifyLine2.
// The reason it is SSE2 is because this file is explicitly compiled
// with SSE2 math enabled, but the other files are not.
Do you actually use a win32 gcc to compile zdbsp? Or does MSVC have an equivalent to gcc's -msse/-msse2?
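For what it's worth, the #include idea can be sketched in a single file. In a real tree the shared body would live in one file (classifyline.c in the example above) and each per-instruction-set .cpp would include it while being compiled with its own flags, such as gcc's -msse or -msse2. Here a macro stands in for the include so the sketch compiles on its own. The macro name DEFINE_SIDE, the three function names, and the side-of-line math are all made up for illustration and are not ZDBSP's code.

```cpp
#include <cassert>

// Stand-in for the shared function body that would sit in one included file.
// Everything here is hypothetical; it only illustrates the sharing trick.
#define DEFINE_SIDE(NAME)                                              \
    int NAME(double px, double py, double dx, double dy,               \
             double x, double y)                                       \
    {                                                                  \
        /* A zero-length partition direction has no sides. */          \
        assert(dx != 0.0 || dy != 0.0);                                \
        /* Cross product of the direction and the offset to (x, y). */ \
        double s = dx * (y - py) - dy * (x - px);                      \
        /* +1 = left of the directed line, -1 = right, 0 = on it. */   \
        return (s > 0.0) - (s < 0.0);                                  \
    }

// In a real build each of these would be its own .cpp file, compiled with
// different math flags, so the same source yields different machine code.
DEFINE_SIDE(SidePlain)
DEFINE_SIDE(SideSSE1)
DEFINE_SIDE(SideSSE2)
```

All three functions behave identically at the source level; only the compiler flags applied to each file would differ, which is the point the quoted comment makes.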
- Alterworldruler
- Posts: 622
- Joined: Mon Dec 19, 2005 7:31 am
Hirogen2 wrote: Do you actually use a win32 gcc to compile zdbsp?
Yes, because GCC produced the fastest code, but you can set per-file optimization settings in VC++ as well. That's why the SSE optimization is present in ZDoom's internal nodebuilder as well. Writing a hand-optimized version of the routine that uses vector math might afford an appreciable speedup, but I'm pretty happy with the way it is now. If I do write a hand-optimized version, it will be for purely academic interest and not because I expect a huge speedup. That's what happened with the backpatching I added in this version.
- Bio Hazard
- Posts: 4019
- Joined: Fri Aug 15, 2003 8:15 pm
- Location: ferret ~/C/ZDL $
- Contact:
That looks like a pretty nice speedup from the Intel compiler. I'm curious, though, did you build your own ZDBSP 1.9 with GCC or use the one I provided? I accidentally uploaded a Visual C++ build instead of a GCC one, which I discovered after seeing your post and wanting to see what time I got for zdoomcmp1.wad. I noticed ZDBSP 1.9 was slower than ZDBSP 1.8, which I knew couldn't be right, and that was when I discovered my mistake.
Anyway, here are my times (1.6 GHz Pentium M):
version 1.8: 2.34 seconds
version 1.9 (VC++ build): 2.64 seconds
version 1.9 (GCC build): 2.30 seconds
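To put the times above in relative terms, here is a small helper; the function name percentSlower is just something picked for this sketch. With these numbers the VC++ build of 1.9 comes out about 14.8% slower than the GCC build, and the GCC build of 1.9 comes out about 1.7% faster than 1.8.

```cpp
#include <cassert>

// Percentage by which `seconds` exceeds `baseline`.
// A negative result means `seconds` is the faster of the two.
double percentSlower(double seconds, double baseline) {
    assert(baseline > 0.0);
    return (seconds / baseline - 1.0) * 100.0;
}
```

For example, percentSlower(2.64, 2.30) compares the VC++ build against the GCC build, and percentSlower(2.30, 2.34) compares the GCC build of 1.9 against 1.8.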
If the Intel version is still faster than the GCC version, I'd be interested in seeing its assembler output for the ClassifyLine functions. Since the nodebuilder spends over half its time in that one function, it seems the best place to optimize.
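For readers who have not looked at a nodebuilder, this is roughly the kind of work a line-classification routine does: decide whether a segment lies entirely on one side of a partition line, crosses it, or sits on it. The sketch below is a generic version under my own assumptions, not the actual ClassifyLine code; the names Side, Point, sideValue, classifySegment and the eps tolerance are all hypothetical.

```cpp
#include <cassert>
#include <cmath>

enum class Side { Left, Right, Spanning, Colinear };

struct Point { double x, y; };

// Cross product of the partition direction and the offset from its origin
// to p. Positive means p is left of the directed line, negative means right.
double sideValue(Point origin, Point dir, Point p) {
    return dir.x * (p.y - origin.y) - dir.y * (p.x - origin.x);
}

// Classify segment a-b against the partition line through `origin` along `dir`.
// eps is an assumed tolerance in the units of the cross product.
Side classifySegment(Point origin, Point dir, Point a, Point b,
                     double eps = 1e-9) {
    assert(dir.x != 0.0 || dir.y != 0.0);
    const double sa = sideValue(origin, dir, a);
    const double sb = sideValue(origin, dir, b);
    if (std::fabs(sa) <= eps && std::fabs(sb) <= eps) return Side::Colinear;
    if (sa >= -eps && sb >= -eps) return Side::Left;
    if (sa <= eps && sb <= eps) return Side::Right;
    return Side::Spanning;  // endpoints are on opposite sides
}
```

A routine like this is called for every candidate partition against every segment, and it is almost all floating-point multiplies and compares, which fits with it being sensitive to how the compiler generates math code.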