by RuscoIstar » Wed Dec 22, 2021 11:15 pm
The "Software" fuzz style is great, especially when the effect is scaled up. Although we will never have a perfect recreation of the "Vanilla Doom" or software look on the hardware-accelerated engine, it comes really close. There is one problem, though: it is so "expensive" on less capable hardware (iGPU and mobile) that on legacy builds the fuzz setting defaults to shadow. (I know there are other technical reasons to pick shadow too, as explained at viewtopic.php?p=1112677#p1112677, but I digress.)
It always seemed curious to me that the same effect, on the same old hardware, is trivially cheap in software mode, so I took a quick look at the code found in https://github.com/coelckers/gzdoom/blob/master/wadsrc/static/shaders/glsl/fuzz_software.fp and I think I know what's holding it up.
// Fuzz effect as rendered by the software renderer
#define FUZZTABLE 50
#define FUZZ_RANDOM_X_SIZE 100
#define FRACBITS 16
#define fixed_t int
int fuzz_random_x_offset[FUZZ_RANDOM_X_SIZE] = int[]
(
75, 76, 21, 91, 56, 33, 62, 99, 61, 79,
95, 54, 41, 18, 69, 43, 49, 59, 10, 84,
94, 17, 57, 46, 9, 39, 55, 34,100, 81,
73, 88, 92, 3, 63, 36, 7, 28, 13, 80,
16, 96, 78, 29, 71, 58, 89, 24, 1, 35,
52, 82, 4, 14, 22, 53, 38, 66, 12, 72,
90, 44, 77, 83, 6, 27, 48, 30, 42, 32,
65, 15, 97, 20, 67, 74, 98, 85, 60, 68,
19, 26, 8, 87, 86, 64, 11, 37, 31, 47,
25, 5, 50, 51, 23, 2, 93, 70, 40, 45
);
int fuzzoffset[FUZZTABLE] = int[]
(
6, 11, 6, 11, 6, 6, 11, 6, 6, 11,
6, 6, 6, 11, 6, 6, 6, 11, 15, 18,
21, 6, 11, 15, 6, 6, 6, 6, 11, 6,
11, 6, 6, 11, 15, 6, 6, 11, 15, 18,
21, 6, 6, 6, 6, 11, 6, 6, 11, 6
);
vec4 ProcessTexel()
{
vec2 texCoord = vTexCoord.st;
vec4 basicColor = getTexel(texCoord);
// Ideally fuzzpos would be an uniform and differ from each sprite so that overlapping demons won't get the same shade for the same pixel
int next_random = int(abs(mod(timer * 35.0, float(FUZZ_RANDOM_X_SIZE))));
int fuzzpos = (/*fuzzpos +*/ fuzz_random_x_offset[next_random] * FUZZTABLE / 100) % FUZZTABLE;
int x = int(gl_FragCoord.x);
int y = int(gl_FragCoord.y);
fixed_t fuzzscale = (200 << FRACBITS) / uViewHeight;
int scaled_x = (x * fuzzscale) >> FRACBITS;
int fuzz_x = fuzz_random_x_offset[scaled_x % FUZZ_RANDOM_X_SIZE] + fuzzpos;
fixed_t fuzzcount = FUZZTABLE << FRACBITS;
fixed_t fuzz = ((fuzz_x << FRACBITS) + y * fuzzscale) % fuzzcount;
float alpha = float(fuzzoffset[fuzz >> FRACBITS]) / 32.0;
return vec4(0.0, 0.0, 0.0, basicColor.a * alpha);
}
The first mod operation seems to keep the shader updating at the correct framerate and is thus not easily negotiable, but the modulus operators that follow are only there to wrap array indices and coordinates. While testing my project Demake Shaders I found that this can become a bottleneck on low-powered devices. On shader hardware the % operator seemingly becomes more expensive the bigger the divisor is (for example, the 64-level dithering algorithm I have been using is way slower than the 16-level version).
One solution I haven't been able to implement yet in Demake Shaders (due to health issues) is to use a Bayer matrix texture with precomputed values baked in for the error dispersion, since texture lookups on the GPU are way faster than array lookups (as reported by PixelEater in viewtopic.php?p=1067486#p1067486 ). That got me thinking: wouldn't such an approach also help with the fuzz effect slowdown? The textures would need to cycle and "animate" to replace the offset (on the x axis and over time), sure, so it would take more memory, but that is the nature of precomputing: trading memory usage for execution time.
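To make the Bayer idea concrete, here is a minimal Python sketch of how such a texture could be baked offline. It uses the standard recursive ordered-dither construction; the function names (`bayer_matrix`, `bayer_texture_bytes`) are made up for illustration, and I am assuming the "16 levels" and "64 levels" variants correspond to 4x4 and 8x8 matrices.

```python
def bayer_matrix(n):
    """Return the n x n Bayer (ordered dither) index matrix.

    n must be a power of two. Values are a permutation of 0 .. n*n-1.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    m = [[0]]
    size = 1
    while size < n:
        # Each doubling step tiles the previous matrix four times:
        # [[4m + 0, 4m + 2],
        #  [4m + 3, 4m + 1]]
        top = [[4 * v for v in row] + [4 * v + 2 for v in row] for row in m]
        bottom = [[4 * v + 3 for v in row] + [4 * v + 1 for v in row] for row in m]
        m = top + bottom
        size *= 2
    return m


def bayer_texture_bytes(n):
    """Flatten the matrix row by row into one byte per texel (0..255).

    Assumption: the result is uploaded as a single-channel texture and
    compared against the pixel brightness in the shader.
    """
    levels = n * n
    return bytes((v * 256) // levels for row in bayer_matrix(n) for v in row)


if __name__ == "__main__":
    for row in bayer_matrix(4):
        print(row)
```

With a texture like this, the wrap-around that `%` does in the shader could in principle be left to the sampler's repeat mode. Whether that is actually faster on a given iGPU or mobile chip is something I have not measured.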
I'd do the experiment myself, but I am not familiar enough with what exactly this shader is doing. When I look at the fuzz effect in the Unity port (for a high-res "vanilla" example) I see a repeating pattern in the animated noise. Would it be possible to generate such a texture beforehand (something like an animated error-diffusion pattern, I mean)? It doesn't need to be exact, just convincing.
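As a starting point for that experiment, here is a rough Python port of the integer math in `ProcessTexel()`, together with a precomputed table that gives the same result. Going by my reading of the quoted code only (not tested in GZDoom), the pattern repeats every 100 virtual columns and every 50 virtual rows, and `fuzzpos` (which works out to 0..49) just shifts it vertically, so a single 100x50 table plus a per-frame row offset might be enough, without separate animation frames. The names `shader_alpha`, `build_fuzz_table` and `table_alpha` are made up for this sketch, and Python integers do not overflow like GLSL ints do.

```python
FUZZTABLE = 50
FUZZ_RANDOM_X_SIZE = 100
FRACBITS = 16

# Both tables are copied from the shader quoted above.
FUZZ_RANDOM_X_OFFSET = [
    75, 76, 21, 91, 56, 33, 62, 99, 61, 79,
    95, 54, 41, 18, 69, 43, 49, 59, 10, 84,
    94, 17, 57, 46, 9, 39, 55, 34, 100, 81,
    73, 88, 92, 3, 63, 36, 7, 28, 13, 80,
    16, 96, 78, 29, 71, 58, 89, 24, 1, 35,
    52, 82, 4, 14, 22, 53, 38, 66, 12, 72,
    90, 44, 77, 83, 6, 27, 48, 30, 42, 32,
    65, 15, 97, 20, 67, 74, 98, 85, 60, 68,
    19, 26, 8, 87, 86, 64, 11, 37, 31, 47,
    25, 5, 50, 51, 23, 2, 93, 70, 40, 45,
]
FUZZOFFSET = [
    6, 11, 6, 11, 6, 6, 11, 6, 6, 11,
    6, 6, 6, 11, 6, 6, 6, 11, 15, 18,
    21, 6, 11, 15, 6, 6, 6, 6, 11, 6,
    11, 6, 6, 11, 15, 6, 6, 11, 15, 18,
    21, 6, 6, 6, 6, 11, 6, 6, 11, 6,
]


def shader_alpha(x, y, fuzzpos, view_height):
    """Port of the per-pixel math in ProcessTexel(), for non-negative inputs."""
    fuzzscale = (200 << FRACBITS) // view_height
    scaled_x = (x * fuzzscale) >> FRACBITS
    fuzz_x = FUZZ_RANDOM_X_OFFSET[scaled_x % FUZZ_RANDOM_X_SIZE] + fuzzpos
    fuzzcount = FUZZTABLE << FRACBITS
    fuzz = ((fuzz_x << FRACBITS) + y * fuzzscale) % fuzzcount
    return FUZZOFFSET[fuzz >> FRACBITS] / 32.0


def build_fuzz_table():
    """table[row][col] holds the alpha for fuzzpos == 0 (50 rows, 100 columns)."""
    return [
        [FUZZOFFSET[(FUZZ_RANDOM_X_OFFSET[col] + row) % FUZZTABLE] / 32.0
         for col in range(FUZZ_RANDOM_X_SIZE)]
        for row in range(FUZZTABLE)
    ]


def table_alpha(table, x, y, fuzzpos, view_height):
    """Same result as shader_alpha(), read from the precomputed table."""
    fuzzscale = (200 << FRACBITS) // view_height
    col = ((x * fuzzscale) >> FRACBITS) % FUZZ_RANDOM_X_SIZE
    # fuzzpos only moves the lookup down by that many rows.
    row = (((y * fuzzscale) >> FRACBITS) + fuzzpos) % FUZZTABLE
    return table[row][col]
```

On the GPU side, the table would presumably become a small single-channel texture with repeat wrapping, with `fuzzpos` passed in as a row offset. The sketch only checks that the numbers match; it says nothing about whether the texture path is actually faster on weak hardware.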