Okay, I fixed the new version of BestColor by using doubles instead of uint32_t's. We're working with values up to roughly 256 to the power of 4; while the largest one (255^4) technically fits inside a uint32_t, it does not leave room for much else.
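To put numbers on the headroom remark, here is a small sketch. It assumes the distance is a sum of squared per-channel differences taken after each 0-255 value has been squared (a gamma of 2.0), which is my reading of the "power of 4" figure and not something copied from the diff. The names MaxChannelTerm and kUint32Max are made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: the largest value one channel can contribute when
// both inputs are squared first (assumed gamma of 2.0) and the difference
// is squared again: (255^2 - 0)^2 = 255^4.
constexpr std::uint64_t MaxChannelTerm()
{
    return (255ull * 255ull) * (255ull * 255ull);
}

constexpr std::uint64_t kUint32Max = std::numeric_limits<std::uint32_t>::max();

// A single channel fits in 32 bits, but only just.
static_assert(MaxChannelTerm() <= kUint32Max, "one channel fits in uint32_t");

// Summing red, green and blue would no longer fit.
static_assert(3 * MaxChannelTerm() > kUint32Max, "three channels overflow uint32_t");
```

Under that assumption, one channel alone uses almost the whole 32-bit range, so a wider type such as double is needed as soon as the channels are added together.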
For now, I kept all code for comparison using CVars, but that's not going to be suitable for a pull request. If I do such a thing, I will most likely be removing these CVars and there will be only one algorithm available.
Now - onto the implementation:
I like it, but it does have problems of its own. Mostly - it gets the primary colors correct, but when there are colors that are really distant from the palette (have higher/lower saturations than what's available) it doesn't seem to always pick the best "looking" color.
The best place to try this out is Doom 1, E3M3: do "warp -800 150" with "gl_lightmode 8". Pretty much all of E3M7 shows it too.
Here's the diff with changes, along with a pre-compiled exe for others to try (since devbuilds are behind):
https://mega.nz/#!9JkjABwZ!Fh8ci7vKUPZc ... drE4T7Qi5k (reuploaded, was missing gzdoom.pk3 which also changed since latest devbuild)
And here's just the diff:
To test this implementation, please use
gzdoom +set gl_palette_tonemap_algorithm 2 +set r_colormatcher_algorithm 1
This only needs to be done once - the CVars will save to your config.
If you change r_colormatcher_algorithm, the tables do not get rebuilt automatically in either Software or GL mode. A "restart" ccmd will fix that. gl_palette_tonemap_algorithm does not have this problem and takes effect immediately. These CVars are in place for comparison and notation purposes.
With all that said - maybe there's a happy medium somewhere? A slightly lower gamma ramp, if available, maybe?
Also - I really do not like the idea of setting a "max" distance. The reason I initialize the distance on the first iteration of the loop is that it lets me alter the algorithm any way I choose, with no maximum other than what the number types themselves support. Notice how even you had to set a new max after putting in the gamma ramp? If we fix the v_palette.cpp code, I really would prefer the first iteration to initialize the numbers, rather than having it done outside the loop as before. I think the code looks cleaner, and it is more flexible. Correct me if I am wrong.
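As a rough illustration of the "initialize on the first iteration" idea, here is a minimal sketch. It is not the code from the diff: SketchColor and NearestPaletteIndex are made-up names, the distance is a plain sum of squared differences in doubles, and it assumes count is at least 1.

```cpp
#include <cstdint>

// Hypothetical, simplified palette entry (not the engine's own type).
struct SketchColor
{
    std::uint8_t r, g, b;
};

// Returns the index of the entry in pal[first .. first+count-1] that is
// closest to (r, g, b). The best distance is seeded by the first entry
// examined, so no preset "max" constant is needed.
int NearestPaletteIndex(const SketchColor* pal, int first, int count,
                        int r, int g, int b)
{
    int bestIndex = first;
    double bestDist = 0.0;  // overwritten on the first iteration

    for (int i = first; i < first + count; ++i)
    {
        const double dr = double(r) - pal[i].r;
        const double dg = double(g) - pal[i].g;
        const double db = double(b) - pal[i].b;
        const double dist = dr * dr + dg * dg + db * db;

        if (i == first || dist < bestDist)
        {
            bestDist = dist;
            bestIndex = i;
            if (dist == 0.0)
                break;  // exact match, nothing can beat it
        }
    }
    return bestIndex;
}
```

Because nothing outside the loop depends on the scale of dist, the distance formula can be swapped (for example to a gamma-weighted one) without touching any sentinel value.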
One more thing - if we do put in a floating-point exponent gamma ramp (such as 2.2 or 1.4 or whatever) - you can still retain the processing speed simply by applying the exponent to each value from 0 to 255 up front and storing the results in an array, since these are the only numbers you are going to use anyway. Then you only have to read the array to get the proper results, rather than recalculating them 768k times. That will speed up processing tremendously. Doom's original gamma correction system used hard-coded precalculated arrays for exactly this reason.
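The lookup-table idea could look something like the sketch below. BuildGammaTable and GammaDistance are hypothetical names, and rescaling the result back to the 0-255 range is just one possible choice, not something taken from the existing code.

```cpp
#include <array>
#include <cmath>

// Precomputes (i / 255)^gamma, rescaled to 0..255, for every possible
// 8-bit channel value. std::pow runs 256 times in total.
std::array<double, 256> BuildGammaTable(double gamma)
{
    std::array<double, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[i] = std::pow(i / 255.0, gamma) * 255.0;
    return table;
}

// Squared distance between two colors in the gamma-adjusted space.
// Only table reads happen here, no std::pow calls.
double GammaDistance(const std::array<double, 256>& table,
                     int r1, int g1, int b1,
                     int r2, int g2, int b2)
{
    const double dr = table[r1] - table[r2];
    const double dg = table[g1] - table[g2];
    const double db = table[b1] - table[b2];
    return dr * dr + dg * dg + db * db;
}
```

The table would be built once per gamma value (say, BuildGammaTable(2.2)) and then reused for every color comparison while the match tables are generated.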