by Rachael » Thu Oct 26, 2017 9:56 am
So after researching this (it had been on my to-do list for a while and I only just got around to it), I found that r_all.cpp and poly_all.cpp are already compiled with unsafe math enabled (it's implied by the "-ffast-math" GCC flag, which is set in the root CMakeLists.txt file). More precisely, the flag is applied to every file in the ${FASTMATH_SOURCES} list in CMake, and the flags are set for both compilers to make this happen. That list actually covers a large number of the files in the source.
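If you want to confirm whether a particular file really is being built with fast math, here is a minimal sketch. It relies on the predefined macros __FAST_MATH__ (GCC and Clang, set by -ffast-math) and _M_FP_FAST (MSVC, set by /fp:fast). The function name fast_math_mode is made up for illustration and is not something from the GZDoom source.

```cpp
#include <cstring>

// Hypothetical helper: reports how the translation unit containing it
// was compiled. Drop it into the .cpp file you want to inspect.
const char* fast_math_mode() {
#if defined(__FAST_MATH__)
    // GCC and Clang define this macro when -ffast-math is in effect.
    return "fast (GCC/Clang -ffast-math)";
#elif defined(_M_FP_FAST)
    // MSVC defines this macro when /fp:fast is in effect.
    return "fast (MSVC /fp:fast)";
#else
    // No fast-math macro was found for this translation unit.
    return "strict";
#endif
}
```

Because the flags are applied per file, the answer can differ between two .cpp files in the same build, so the helper has to be compiled as part of the file you are curious about.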
This actually explains why software renderer bugs are so hard to diagnose in release mode: when fast math is enabled, a lot of the safety guarantees go away (hence the "unsafe" label). The compiler is allowed to assume that NaNs and infinities never occur and that floating-point operations never trap, so things like division by zero are no longer handled predictably. It also explains why the software renderer behaves differently on ARM and i686, and hence why some software renderer bugs are ARM-specific. The math executed in these functions will, in fact, break IEEE 754 conformance, and what happens at that point really is compiler- and processor-specific.
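To make the IEEE point concrete, here is a small sketch of two things that are well-defined under strict IEEE 754 rules but that fast math is allowed to change: the order of additions, and the classic "x != x" NaN test. The function names are hypothetical and are not taken from the renderer.

```cpp
#include <cmath>

// Adds in the order written: (a + b) + c.
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// Adds in the other grouping: a + (b + c).
double sum_right_to_left(double a, double b, double c) {
    return a + (b + c);
}

// Classic NaN test: under IEEE 754 a NaN never compares equal to itself.
bool is_nan_by_self_compare(double x) {
    return x != x;
}

// Under strict IEEE 754 evaluation:
//   sum_left_to_right(1e20, -1e20, 1.0) is 1.0 (the big terms cancel first)
//   sum_right_to_left(1e20, -1e20, 1.0) is 0.0 (the 1.0 is rounded away)
//   is_nan_by_self_compare(std::nan("")) is true
// With fast math the compiler may treat both groupings as interchangeable
// and may assume NaN never occurs, so none of these results is guaranteed.
```

Which answer you actually get with fast math depends on the compiler, the optimization level, and the target, which is consistent with a bug showing up on one architecture and not another.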
So yeah, it really was my fault for enabling unsafe math across the board. That should never have been done, and it turns out it's already enabled where it's actually needed (in performance-critical code, including the renderer), so nothing further needs to be done here. :)