11-18-2007:
The following test:
z > (d-1)/m + 1
can be reformulated much more efficiently (assuming m > 0, so that multiplying through by m preserves the inequality) as:
(z - 1)*m > d - 1
This removes the division and exposes the "- 1" computations so that they can be more easily folded away.
2-7-08:
*** Inverted Depth
Assuming z and d range from 0..1, the quantities "z - 1" and "d - 1" will be negative.
It is advantageous to invert the depth as follows:
(1 - z)*m < 1 - d
This way "1 - z" and "1 - d" are positive values ranging from 0..1. The value "1 - d" can then
easily be stored in a clamped RGB texture. The projection matrix used to generate both
"z" and "d" can be used to calculate these inverted/offset depth values directly, without any
instruction cost in a shader.
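One way the projection matrix could absorb the inversion, sketched in Python under the assumption that the matrix already yields a depth of z/w in 0..1: replacing the z row with (w row - z row) makes the divided result equal to 1 - z/w. The matrix values and the names EXAMPLE_PROJ, invert_depth_row and depth are illustrative only and are not taken from any actual code.

```python
def mat_vec(m, v):
    # Multiply a 4x4 row-major matrix by a 4-component column vector.
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def invert_depth_row(m):
    # Replace the z row with (w row - z row), so that z'/w = 1 - z/w.
    out = [row[:] for row in m]
    out[2] = [m[3][c] - m[2][c] for c in range(4)]
    return out

def depth(m, point):
    # Depth after the perspective divide.
    x, y, z, w = mat_vec(m, point)
    return z / w

# Hypothetical projection: near = 1, far = 9, view-space z pointing forward,
# producing depth 0 at the near plane and 1 at the far plane.
EXAMPLE_PROJ = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.125, -1.125],
    [0.0, 0.0, 1.0, 0.0],
]
```

For the point (0, 0, 3, 1), depth(EXAMPLE_PROJ, ...) gives 0.75 and depth(invert_depth_row(EXAMPLE_PROJ), ...) gives 0.25, so the inverted value comes straight out of the matrix.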
*** Fragment Programs for Shadow Test
Optimal instruction sequences for implementing the SSSM test WITHOUT depth peeling, with the depth
stored in the R channel and the mask stored in the G channel, might look like:
TEX vals, fragment.texcoord[M], texture[N], 2D;
MAD diff, fragment.texcoord[M].z, vals.y, -vals.x;
CMP shadowed, diff, vals.y, 0;
or equivalently:
TEX vals, fragment.texcoord[M], texture[N], 2D;
MUL z, fragment.texcoord[M].z, vals.y;
SLT shadowed, z, vals.x;
MUL shadowed, shadowed, vals.y;
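A rough Python emulation of the two sequences above, useful for checking that they compute the same value. Here z stands for fragment.texcoord[M].z, d for vals.x and m for vals.y. The function names are made up for this sketch, and CMP is modeled as "first operand < 0 selects the second operand, otherwise the third".

```python
def shadow_mad_cmp(z, d, m):
    diff = z * m - d                 # MAD diff, z, vals.y, -vals.x
    return m if diff < 0 else 0.0    # CMP shadowed, diff, vals.y, 0

def shadow_mul_slt(z, d, m):
    zm = z * m                       # MUL z, z, vals.y
    passed = 1.0 if zm < d else 0.0  # SLT shadowed, z, vals.x
    return passed * m                # MUL shadowed, shadowed, vals.y
```

Both return the mask value m when z*m < d and 0 otherwise. The first form simply folds the comparison and the final multiply into a single CMP.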
Note that when shadowing transparent objects, where the actual shadowing factor differs from the mask/filter
metric (stored in the G channel), one can simply store the shadowing factor in either the B or A channel and
access it at no additional cost (here from the A channel) as:
TEX vals, fragment.texcoord[M], texture[N], 2D;
MAD diff, fragment.texcoord[M].z, vals.y, -vals.x;
CMP shadowed, diff, vals.w, 0;
If the SSSM test is done WITH depth peeling, using the B channel to store the receiver depth,
an optimized sequence might look like:
TEX vals, fragment.texcoord[M], texture[N], 2D;
MUL z, fragment.texcoord[M].z, vals.y;
SLT shadowed.xz, z, vals;
MAD_SAT shadowed, shadowed.x, vals.y, -shadowed.z;
Note that in the above sequence, if the current "z" value is below the receiver depth, the fact that
it failed (represented as 1) is subtracted from the shadow value and the result is clamped back into the
range 0..1, so that the resulting "shadowed" value is always 0 outside the receiver boundary.
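The depth-peeled sequence can be emulated in Python along the same lines. This is only a sketch: z is fragment.texcoord[M].z, d is vals.x, m is vals.y and r is the receiver depth in vals.z, all in the inverted convention, and the function names are hypothetical.

```python
def saturate(x):
    # Models the _SAT modifier: clamp into 0..1.
    return min(max(x, 0.0), 1.0)

def shadow_peeled(z, d, m, r):
    zm = z * m                            # MUL z, z, vals.y
    passed = 1.0 if zm < d else 0.0       # SLT shadowed.x, z, vals.x
    beyond = 1.0 if zm < r else 0.0       # SLT shadowed.z, z, vals.z
    return saturate(passed * m - beyond)  # MAD_SAT shadowed, shadowed.x, vals.y, -shadowed.z
```

For example, shadow_peeled(0.5, 0.5, 0.5, 0.125) keeps the mask value 0.5, while raising r to 0.375 puts the fragment past the receiver and forces the result to 0.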
If, when using depth peeling, one wants a smooth fade rather than a harsh cut-off at the introduced
shadow map boundary, one might smooth out the boundary using an instruction sequence like:
TEX vals, fragment.texcoord[M], texture[N], 2D;
MAD_SAT diff.xz, -fragment.texcoord[M].z, vals.y, vals;
CMP shadowed, -diff.x, vals.y, 0;
MAD_SAT shadowed, -smoothscale, diff.z, shadowed;
Note that the "smoothscale" value is a constant scale factor controlling the distance over which the
smoothing occurs. With some trickery in this particular sequence, the "z*m < d" test is transformed
into "saturate(d - z*m) > 0", with any negative values being clamped to 0 and still allowing the test
to fail normally. Rewriting the test this way gives access to the difference between the receiving
surface and the currently tested surface, appropriately clamped to 0 should the currently tested
surface be equal to or in front of the receiving surface. Once appropriately scaled, this can
be subtracted from the shadowing value to give a smooth transition to no shadowing.
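A small Python emulation of the smoothed sequence may make the fade easier to see. As before this is a sketch with made-up names: z is fragment.texcoord[M].z, d is vals.x, m is vals.y and r is the receiver depth in vals.z.

```python
def saturate(x):
    # Models the _SAT modifier: clamp into 0..1.
    return min(max(x, 0.0), 1.0)

def shadow_smooth(z, d, m, r, smoothscale):
    diff_x = saturate(-z * m + d)          # MAD_SAT diff.x: saturate(d - z*m)
    diff_z = saturate(-z * m + r)          # MAD_SAT diff.z: distance past the receiver
    shadowed = m if -diff_x < 0 else 0.0   # CMP shadowed, -diff.x, vals.y, 0
    return saturate(-smoothscale * diff_z + shadowed)  # MAD_SAT final fade
```

With z = m = d = 0.5 and smoothscale = 4, moving r from 0.125 to 0.3125 to 0.5 takes the result from 0.5 to 0.25 to 0, rather than dropping to 0 in one step.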
*** Fragment Programs for Applying Shadows Against Combined Lightmaps
If using the resulting "shadowed" value to mask off a lightmap containing multiple combined lights,
then a simple way to apply this value, given a constant shadow light level ("shadowambient") assigned
to the shadowed area might look like:
SUB_SAT darken, light, shadowambient;
MAD light, shadowed, -darken, light;
This ensures that if the lighting value is actually lower than the value "shadowambient", the
lighting value is used instead, so that shadows don't accidentally brighten very dark areas. A single
"LRP" instruction, by contrast, might cause this accidental brightening. An equivalent
but slower (according to informal testing) sequence accomplishing the same thing might be:
MIN shadowambient, light, shadowambient;
LRP light, shadowed, shadowambient, light;
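The difference between these sequences and a bare LRP can be checked with a short Python sketch. The function names are hypothetical, and the values are assumed to lie in 0..1 so that the SUB_SAT clamp only ever acts at the low end.

```python
def saturate(x):
    # Models the _SAT modifier: clamp into 0..1.
    return min(max(x, 0.0), 1.0)

def apply_sub_mad(light, shadowed, shadowambient):
    darken = saturate(light - shadowambient)   # SUB_SAT darken, light, shadowambient
    return shadowed * -darken + light          # MAD light, shadowed, -darken, light

def apply_min_lrp(light, shadowed, shadowambient):
    floor = min(light, shadowambient)                  # MIN shadowambient, light, shadowambient
    return shadowed * floor + (1.0 - shadowed) * light # LRP light, shadowed, shadowambient, light

def apply_plain_lrp(light, shadowed, shadowambient):
    # A single LRP with no guard against brightening dark areas.
    return shadowed * shadowambient + (1.0 - shadowed) * light
```

With light = 0.125 and shadowambient = 0.25 in full shadow, the first two functions leave the light at 0.125, while apply_plain_lrp raises it to 0.25.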