This is a simplified Unreal Lumen GI implementation (SimLumen) based on Unreal’s Lumen GI. We have implemented most of Unreal Lumen’s features.
To perform fast ray tracing, SimLumen builds the mesh SDFs offline using the embree library. We also precompute a global low resolution SDF of the whole scene, which is used in surface cache ray tracing and screen probe voxel ray tracing.
SimLumen builds mesh cards offline in order to capture material attributes (normal, albedo) at run time. Mesh cards store the capture direction as well as the camera capture frustum. Since the meshes in our test example are simple and most are boxes, we generate only 6 cards for each mesh. Each direction corresponds to a mesh card. At run time, SimLumen captures the mesh attributes and copies them into a global surface cache material attributes atlas.
The surface cache describes the lighting of the scene. It contains 5 parts: surface cache material attributes, surface cache direct lighting, surface cache indirect lighting, surface cache combined final lighting and voxelized scene lighting.
With the global surface cache material attributes (normal, albedo and depth), SimLumen computes the direct lighting for each pixel in the surface cache atlas.
What’s more, we have implemented infinite bounce lighting similar to Unreal Lumen. At first, we voxelize the scene. Each voxel has 6 directions. For each direction, we perform a mesh SDF trace and store the hit mesh index and hit distance in the voxel visibility buffer. Then, we inject the surface cache final lighting into each voxel whose trace hit a mesh.
With the voxelized lighting, we compute the surface cache indirect lighting in the surface cache atlas space. First, SimLumen places probes every 4x4 pixels in the atlas space. Next, we trace rays into the voxelized scene via the global SDF and sample the radiance from the hit voxels. In order to denoise the trace result, SimLumen filters the radiance atlas and converts it into spherical harmonics. By integrating the probes around each pixel, we obtain the surface cache indirect lighting.
The surface cache final lighting is computed by combining surface direct and indirect lighting.
As we have SDFs to trace the scene quickly, as well as a surface cache that describes the scene lighting, we are able to perform the screen space probe trace.
SimLumen uses importance sampling to reduce the trace noise. The PDF of the sampling function contains two parts: the BRDF PDF and the lighting PDF. The BRDF PDF is stored in spherical harmonic form: we project the BRDF PDF of the pixels around the probe into spherical harmonics, provided a pixel is not rejected by the plane depth weight. We use the previous frame’s screen radiance result to estimate the lighting PDF by reprojecting the probe into the previous screen space, since we do not have information about the light sources in the current frame. To improve performance, SimLumen employs structured importance sampling by reassigning the unimportant samples to those with a higher PDF.
Each probe traces 64 rays into the scene. SimLumen implements a hybrid GI similar to Unreal Lumen. Probes whose distance from the camera is less than 100 meters trace the scene by mesh SDF and sample the radiance from the surface cache atlas. The other probes use the global SDF to trace the scene and sample the radiance from the voxel lighting.
After that, we perform two additional passes to denoise the results. In the first pass, we filter the radiance with a uniform weight. Then, in the second pass, we convert the radiance into spherical harmonics and transform the SH into bordered octahedron form. This is useful for hardware bilinear sampling in the following pass.
We finally obtain the indirect lighting by integrating the probes around the current screen pixel and sampling the octahedron-form SH with a linear sampler.
Fast Ray Tracing
SimLumen uses signed distance fields to trace rays against the meshes. The SDF is a uniform mesh representation. What’s more, it is differentiable, which means we can calculate the normal at the ray hit position. SimLumen precomputes the mesh SDFs and the scene’s global SDF. GI near the camera employs mesh SDFs to accelerate ray-mesh intersection. The global SDF is used for distant GI and the surface cache indirect lighting calculation.
Build Mesh SDF Offline
The mesh SDF volumes have different sizes based on each mesh’s bounding box. To pack these mesh SDFs into a single 3D texture, we split each volume into bricks. The brick size is the same for all mesh SDFs.
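As a sketch of this packing scheme (the brick size and atlas dimensions below are assumptions for illustration, not values taken from SimLumen), a linear brick index can be converted into a texel offset inside the global brick texture like this:

```cpp
#include <cassert>

// Hypothetical layout constants (assumptions, not SimLumen values): bricks of
// BRICK_SIZE^3 voxels are packed into a global 3D texture that is
// ATLAS_BRICKS_X x ATLAS_BRICKS_Y bricks wide and deep, growing along Z.
constexpr int BRICK_SIZE     = 8;
constexpr int ATLAS_BRICKS_X = 16;
constexpr int ATLAS_BRICKS_Y = 16;

struct BrickOffset { int x, y, z; };

// Convert a linear brick index into the texel offset of that brick inside
// the global brick texture, so a mesh can record where its bricks live.
BrickOffset BrickIndexToTexelOffset(int brick_index) {
    int bx =  brick_index % ATLAS_BRICKS_X;
    int by = (brick_index / ATLAS_BRICKS_X) % ATLAS_BRICKS_Y;
    int bz =  brick_index / (ATLAS_BRICKS_X * ATLAS_BRICKS_Y);
    return { bx * BRICK_SIZE, by * BRICK_SIZE, bz * BRICK_SIZE };
}
```

Because every brick has the same size, a single recorded offset per mesh is enough to locate all of its bricks at runtime.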
SimLumen uses Embree to calculate the mesh SDF, the same way Unreal does. For each voxel, we trace 512 ray samples to intersect with the mesh and find the closest hit position among these samples. A voxel is considered inside the mesh if more than 25% of the 512 ray samples hit a triangle backface; in that case we store the negative of the closest hit distance.
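The sign rule can be sketched as follows (a minimal illustration of the 25% backface heuristic described above, not the actual SimLumen code):

```cpp
#include <cassert>

// A voxel counts as inside the mesh when more than 25% of its ray samples
// hit a triangle backface; in that case the stored distance is negated.
float SignedClosestDistance(float closest_hit_distance,
                            int backface_hits, int total_samples) {
    bool inside = backface_hits * 4 > total_samples; // > 25% backface hits
    return inside ? -closest_hit_distance : closest_hit_distance;
}
```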
At runtime, we copy and upload each mesh’s brick texture into the global mesh SDF brick texture and record the brick texture offset. With the global brick texture and the per-mesh brick offset, we can look up the SDF value at any position within a mesh’s volume.
GI near the camera uses mesh SDFs to trace rays. In this case, we collect the meshes that could be intersected along the ray direction. Then, we perform an SDF trace for each mesh and keep the closest hit position.
At first, SimLumen transforms the ray from world space into SDF volume space. If the ray intersects the volume bounding box, we trace the ray from the intersection position. When the sphere trace exceeds 64 steps, or the sampled distance is smaller than one voxel size, we have reached the mesh surface and stop the sphere trace.
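The sphere trace loop above can be sketched on the CPU as follows, using an analytic sphere SDF in place of the brick texture sampling (the function names here are illustrative, not SimLumen’s):

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  Add(Vec3 a, Vec3 b)  { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
static Vec3  Mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
static float Length(Vec3 a) { return std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z); }

// Analytic SDF of a unit sphere at the origin, standing in for the brick
// texture sampling in this CPU sketch.
static float SphereSDF(Vec3 p) { return Length(p) - 1.0f; }

// Sphere tracing as described above: advance the ray by the sampled
// distance; stop after 64 steps, or report a hit once the sampled distance
// drops below one voxel size.
bool SphereTrace(Vec3 origin, Vec3 dir, float voxel_size, float& hit_t) {
    float t = 0.0f;
    for (int step = 0; step < 64; ++step) {
        float d = SphereSDF(Add(origin, Mul(dir, t)));
        if (d < voxel_size) { hit_t = t; return true; } // reached the surface
        t += d; // safe step: no surface is closer than d
    }
    return false; // exhausted the step budget without hitting the surface
}
```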
At the hit position, the normal is obtained from the SDF gradient, estimated by central differences over six neighboring samples:

```hlsl
float3 CalculateMeshSDFGradient(float3 sample_volume_position, SMeshSDFInfo mesh_sdf_info)
{
    float R = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x + voxel_offset, sample_volume_position.y, sample_volume_position.z), mesh_sdf_info);
    float L = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x - voxel_offset, sample_volume_position.y, sample_volume_position.z), mesh_sdf_info);

    float F = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x, sample_volume_position.y + voxel_offset, sample_volume_position.z), mesh_sdf_info);
    float B = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x, sample_volume_position.y - voxel_offset, sample_volume_position.z), mesh_sdf_info);

    float U = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x, sample_volume_position.y, sample_volume_position.z + voxel_offset), mesh_sdf_info);
    float D = SampleDistanceFieldBrickTexture(float3(sample_volume_position.x, sample_volume_position.y, sample_volume_position.z - voxel_offset), mesh_sdf_info);

    float3 gradient = float3(R - L, F - B, U - D);
    return gradient;
}
```
Below is an SDF ray-tracing visualization. In this example, the ray-tracing direction of each cube is determined by its offset from the center cube. RGB colors represent the normal at the hit position: X is shown in red, Y in green and Z in blue.
Global SDF
The global signed distance field is a low resolution SDF of the whole scene. We precompute the global SDF offline. The global SDF differs from the mesh SDFs in that mesh SDFs are stored in brick textures with fixed z-dimensions, whereas the global SDF is stored in a size-scalable volume texture based on the scene bounding box.
Surface Cache
The mesh SDF trace allows us to determine the hit position of a ray; however, it does not provide information about the material attributes (albedo, normal, etc.) at the hit position. Unreal Lumen uses mesh cards to capture these material attributes at runtime.
Mesh cards describe the material attribute capture information. They can be generated offline. However, material attribute capture must be performed at runtime, since occlusion between scene meshes can’t be determined offline.
SimLumen Card Generation
In Unreal, a mesh may have many mesh cards depending on mesh complexity. SimLumen simplifies mesh card generation: it generates a fixed number of mesh cards (6 directions) based on the mesh bounding box.
We calculate the capture position and direction from the bounding box directly. The mesh card depth is determined by software raytracing. For each texel in the mesh card, we trace a ray in the mesh card direction, calculate the ray-mesh intersection and find the furthest intersection distance.
SimLumen captures the mesh card at runtime. After the mesh card capture is completed, we copy these mesh card attribute textures into a global card atlas.
Below is a visualization of the mesh card’s normal and albedo for a scene:
Surface Cache Lighting
Unreal Lumen implements infinite bounce lighting by combining surface cache lighting and voxel lighting. Here is the surface cache flow graph:
Step 1: Calculate direct lighting with surface cache attributes (albedo / normal).
Step 2: Combine direct lighting and indirect lighting together. In the first frame, the indirect lighting is black, since no light has been injected into the scene voxels yet.
Step 3: Inject the combined light into the scene voxel.
Step 4: Calculate indirect lighting used in the next frame.
Surface Cache Direct Lighting
The world position of a mesh card pixel is calculated from the card rotation matrix and the card depth.
The voxel visibility buffer stores the hit mesh index and hit distance in the x/y/z directions. As persistent data, it is only updated when mesh positions change. For each voxel, SDF traces are performed along the x/y/z directions against the meshes that may intersect its center. If the ray from the voxel center along the x/y/z direction hits a mesh, we store the mesh index in the voxel. This is used in the following light injection pass.
Lighting Injection
We can obtain the intersected mesh and intersection position directly from the voxel visibility buffer. After that, we transform the hit world position into mesh card space and calculate the final light atlas UV. The final light is the combination of the direct light and indirect light. However, the indirect lighting remains black until the second frame, as it depends on the voxel lighting.
The first step is to calculate the radiance of the current surface cache pixel by tracing the global SDF, then filter the result and store the radiance in a radiance atlas. We then convert the radiance into SH, which allows us to perform probe interpolation to reduce lighting noise. The irradiance over a unit area is the integral of the cosine-weighted radiance over the hemisphere:

E(p) = ∫Ω L(p,i) max(0, n⋅i) di

The integrand can be split into two parts: the lighting function and the diffuse transfer function:

E(p) = ∫Ω L(i)⋅H(i) di
L(i) is reconstructed from the SH coefficients and basis functions:

L(i) ≈ Σₖ lₖBₖ(i)
We project the radiance onto the basis functions to get the SH coefficients in the Convert SH pass:
If we project both the lighting and transfer functions into SH coefficients, then orthogonality guarantees that the integral of their product equals the dot product of their coefficients:

E(p) = Σₖ lₖhₖ,  where H(i) = max(0, n⋅i)
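This coefficient dot product can be verified numerically. The sketch below (plain C++, not SimLumen code) projects a constant lighting function L(i) = 1 and the clamped-cosine transfer function into 2-band SH and checks that the dot product of the coefficients recovers the irradiance of a fully lit hemisphere, E = π:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

double IrradianceFromSH() {
    const double PI = 3.14159265358979;
    // 2-band (4-coefficient) real spherical harmonic basis
    auto basis = [](double theta, double phi, double* y) {
        double x  = std::sin(theta) * std::cos(phi);
        double yy = std::sin(theta) * std::sin(phi);
        double z  = std::cos(theta);
        y[0] = 0.282095;       // Y00
        y[1] = 0.488603 * yy;  // Y1-1
        y[2] = 0.488603 * z;   // Y10
        y[3] = 0.488603 * x;   // Y11
    };
    double l[4] = {}, h[4] = {};
    const int N = 256;
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            double theta = PI * (i + 0.5) / N;
            double phi   = 2.0 * PI * (j + 0.5) / N;
            double dw = std::sin(theta) * (PI / N) * (2.0 * PI / N); // solid angle element
            double y[4];
            basis(theta, phi, y);
            double L = 1.0;                            // constant lighting
            double H = std::max(0.0, std::cos(theta)); // clamped cosine, n = +z
            for (int k = 0; k < 4; ++k) {
                l[k] += L * y[k] * dw; // project lighting into SH
                h[k] += H * y[k] * dw; // project transfer into SH
            }
        }
    }
    // Orthogonality: the integral of L*H equals dot(l, h)
    return l[0]*h[0] + l[1]*h[1] + l[2]*h[2] + l[3]*h[3];
}
```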
The irradiance of a pixel is calculated in the integrate pass:
In this pass, SimLumen filters the radiance atlas to reduce noise. We sample the radiance around the current texel and accumulate the weighted samples. The radiance sample weights in Unreal Lumen depend on a number of factors, including the texel’s world space plane and the distance between the planes.
filtered radiance atlas:
Convert To SH
The radiance atlas is still noisy after filtering, since we only have 16 samples per probe. We solve this problem by converting each tile’s radiance into two-band SH, which allows us to interpolate the probes more easily.
Finally, we sample the probes around the current pixel and calculate their weights based on the atlas position. Then we accumulate the SH weights and the weighted SH, evaluate the basis functions with the current pixel’s world normal, and dot the basis functions with the accumulated SH. Dividing by the total sum of SH weights gives the final radiance value for the current pixel.
We place a probe in screen space for every 8x8 pixels and use octahedron mapping to map the screen coordinates into spherical coordinates. Each probe traces 64 rays into the scene.
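The standard octahedron mapping (the general technique referenced here, not code lifted from SimLumen) folds the unit sphere onto the [0,1]² square and back:

```cpp
#include <cassert>
#include <cmath>

struct Float2 { float x, y; };
struct Float3 { float x, y, z; };

static float SignNotZero(float v) { return v >= 0.0f ? 1.0f : -1.0f; }

// Map a unit direction onto the [0,1]^2 octahedron square.
Float2 OctEncode(Float3 n) {
    float l1 = std::fabs(n.x) + std::fabs(n.y) + std::fabs(n.z);
    float u = n.x / l1, v = n.y / l1;
    if (n.z < 0.0f) { // fold the lower hemisphere over the diagonals
        float ou = u;
        u = (1.0f - std::fabs(v))  * SignNotZero(ou);
        v = (1.0f - std::fabs(ou)) * SignNotZero(v);
    }
    return { u * 0.5f + 0.5f, v * 0.5f + 0.5f };
}

// Map an octahedron square coordinate back to a unit direction.
Float3 OctDecode(Float2 e) {
    float u = e.x * 2.0f - 1.0f, v = e.y * 2.0f - 1.0f;
    Float3 n = { u, v, 1.0f - std::fabs(u) - std::fabs(v) };
    if (n.z < 0.0f) { // undo the lower-hemisphere fold
        float ox = n.x;
        n.x = (1.0f - std::fabs(n.y)) * SignNotZero(ox);
        n.y = (1.0f - std::fabs(ox))  * SignNotZero(n.y);
    }
    float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    return { n.x / len, n.y / len, n.z / len };
}
```

The mapping is invertible, which is what lets a probe store its 64 ray directions as an 8x8 tile of texels.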
Importance Sampling
The result is too noisy if we employ uniform sampling rather than importance sampling. Without importance sampling vs. with importance sampling:
In the importance sampling step, we search for the ray directions oriented toward the light sources and the world normal. That is to say, we perform importance sampling for the BRDF (fs) term and the incoming radiance (Li) term:
(1/N) Σₖ₌₁ᴺ Lᵢ(lₖ) fₛ(lₖ→v) cos(θₗ) / pₖ
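As a minimal illustration of this estimator (a sketch assuming constant incoming radiance and cosine-weighted sampling, not SimLumen code): with p(l) = cos(θ)/π, the cosine terms cancel and the estimate of the hemisphere integral of cos(θ) converges to π:

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Monte Carlo estimate of E = Integral Li(l) cos(theta) dl over the
// hemisphere, with Li = 1 and the cosine-weighted PDF p(l) = cos(theta)/pi.
double EstimateIrradianceCosineSampled(int num_samples) {
    const double PI = 3.14159265358979;
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    double sum = 0.0;
    for (int k = 0; k < num_samples; ++k) {
        double cos_theta = std::sqrt(1.0 - uni(rng)); // in (0,1], avoids pdf = 0
        double pdf = cos_theta / PI;                  // p(l) = cos(theta)/pi
        sum += 1.0 * cos_theta / pdf;                 // Li * cos(theta) / p(l)
    }
    return sum / num_samples;
}
```

Matching the sampling PDF to the integrand removes variance entirely in this idealized case; in practice the BRDF and lighting PDFs only approximate the integrand, so the noise is reduced rather than eliminated.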
BRDF PDF
In this step, SimLumen generates the three-band spherical harmonic coefficients for the BRDF. We sample the screen pixels around the screen probe and compute each pixel’s influence weight on the probe. If the weight is over the threshold, we convert the BRDF to SH and accumulate it, then write the result to the BRDF SH buffer.
The pixel’s normal may lie in a different plane from the probe’s. Therefore, we compute the plane depth weight for the given pixel and reject pixels that fail the depth weight test. The results of the valid pixels are stored in a group shared array.
After that, we perform a parallel reduction to accumulate these SH factors. Finally, the first nine threads store the 9 SH coefficients in the output BRDF SH buffer.
The light source directions in the current frame are unknown. To estimate them, we assume the lighting changes only slightly between frames and reuse the previous frame’s lighting result. We then reproject the probe into the previous frame’s screen position, find the corresponding direction texel, and calculate the lighting PDF from its luminance.
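A sketch of deriving the lighting PDF from luminance (the helper names and the normalization convention below are assumptions for illustration):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct RGB { float r, g, b; };

// Rec. 709 luma weights
float Luminance(RGB c) { return 0.2126f * c.r + 0.7152f * c.g + 0.0722f * c.b; }

// Each texel's PDF is its luminance, normalized so the PDF averages to 1
// over the probe's texels: bright directions get proportionally more rays.
std::vector<float> BuildLightingPDF(const std::vector<RGB>& texels) {
    float total = 0.0f;
    for (const RGB& c : texels) total += Luminance(c);
    std::vector<float> pdf(texels.size(), 0.0f);
    if (total <= 0.0f) return pdf; // no lighting information: leave PDF at zero
    for (size_t i = 0; i < texels.size(); ++i)
        pdf[i] = Luminance(texels[i]) * static_cast<float>(texels.size()) / total;
    return pdf;
}
```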
Unreal Lumen adopts a mechanism called structured importance sampling to reassign the unimportant samples to the important directions.
The first step is to calculate the PDF of the probe’s 8x8 texels. We obtain the world ray direction from the pixel’s UV within the group using the equi-area spherical mapping algorithm. The BRDF PDF in this direction is evaluated from the BRDF PDF SH computed in the last pass.
```hlsl
FThreeBandSHVector direction_sh = SHBasisFunction3(world_cone_direction);
float pdf = max(DotSH3(brdf, direction_sh), 0);

float light_pdf = light_pdf_tex.Load(int3(dispatch_thread_idx.xy, 0));
bool is_pdf_no_culled_by_brdf = pdf >= MIN_PDF_TRACE;

float light_pdf_scaled = light_pdf * PROBE_SIZE_2D * PROBE_SIZE_2D;
pdf *= light_pdf_scaled;
if (is_pdf_no_culled_by_brdf)
{
    pdf = max(pdf, MIN_PDF_TRACE);
}
```
We perform a GPU sort from low to high PDF to find the ray directions that need refinement.
We refine the rays in groups of three. If the maximum PDF among the three rays is less than the minimum PDF threshold, these samples are discarded and reassigned to refinement. Ray refinement is similar to mip mapping: double the coordinates of the ray being refined, and compute the local coordinates from the ray indices (0,1), (1,1) and (1,0) respectively. Local coordinate (0,0) is the position of the ray corresponding to the refine group.
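The coordinate doubling can be sketched like this (the helper names are hypothetical):

```cpp
#include <cassert>

struct RayCoord { int x, y; };

// The refined ray's mip-1 coordinate is doubled; the refined ray keeps
// local offset (0,0), and the three reassigned samples take the local
// offsets (1,1), (1,0) and (0,1), exactly like descending one mip level.
void SubdivideRay(RayCoord mip1, RayCoord out_mip0[4]) {
    const RayCoord local[4] = { { 0, 0 }, { 1, 1 }, { 1, 0 }, { 0, 1 } };
    for (int i = 0; i < 4; ++i)
        out_mip0[i] = { mip1.x * 2 + local[i].x, mip1.y * 2 + local[i].y };
}
```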
```hlsl
if (merge_idx == 0)
{
    InterlockedAdd(num_rays_to_subdivide, 1);
}
```
Screen Space Probe Trace
Unreal Lumen is a hybrid global illumination solution consisting of four GI methods: SSGI, mesh SDF trace, global SDF trace and cube map. Unreal Lumen performs screen space GI first by sampling the previous scene color. When the SSGI fails to hit, Unreal Lumen performs a mesh SDF trace and samples the surface cache final lighting atlas. The mesh SDF trace is only performed for probes near the camera (positions within a 40m radius of the camera). For probes further away, Unreal Lumen performs a global SDF trace and samples the scene voxel lighting. If all of these methods fail, Unreal Lumen falls back to sampling the cube map.
In SimLumen, we only perform two trace methods: sampling the surface cache final lighting by mesh SDF trace and sampling the scene voxel lighting by global SDF trace.
Screen Space Probe Mesh SDF Trace
We perform mesh SDF trace if the probe’s distance to the camera is less than 100m.
We sample 64 directions for each screen space probe. The sample directions are obtained from the structured importance sampling table, which stores the ray coordinates. The refined rays’ coordinates (mip 0) range from (0,0) to (16,16), and the other rays’ coordinates (mip 1) range from (0,0) to (8,8). We divide the ray coordinates by the mip size to transform them into the range (0,1), then obtain the mapped direction from the mapped UV by EquiAreaSphericalMapping.
We calculate the hit world position if the ray hits the scene SDFs, then sample the surface cache card based on the world position. Unreal Lumen samples the surface cache cards three times based on the ray direction and the normal at the hit position. Since our meshes are not too complex, SimLumen samples only once, based on the dominant ray direction.
The surface cache card index is calculated from the mesh card start index and the card direction offset index.
With the mesh card index, we can get the surface cache atlas UV and sample the final lighting.
Probes further than 100m from the camera use the voxel lighting trace, which is similar to surface cache sampling. There are several differences between them: these probes trace the scene using the low resolution global SDF rather than individual mesh SDFs, and they sample voxel lighting rather than surface cache lighting.
Voxel lighting is sampled three times based on the ray trace direction. The sample weights are computed by the dot product result between the ray trace direction and voxel face direction.
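The per-face weighting can be sketched as follows (a simplified illustration; the face indexing convention 0..5 = +X, −X, +Y, −Y, +Z, −Z is an assumption, not SimLumen’s layout):

```cpp
#include <cassert>
#include <cmath>

struct Dir { float x, y, z; };

// For each axis, pick the voxel face the ray direction points at; its
// weight is the dot product of the direction with that face's normal,
// which for an axis-aligned face is just the absolute component value.
void VoxelFaceWeights(Dir d, int face_index[3], float weight[3]) {
    const float c[3] = { d.x, d.y, d.z };
    for (int axis = 0; axis < 3; ++axis) {
        face_index[axis] = axis * 2 + (c[axis] >= 0.0f ? 0 : 1); // +face or -face
        weight[axis] = std::fabs(c[axis]); // max(dot(dir, face_normal), 0)
    }
}
```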
Since the samples may be refined, we perform an additional composition pass to accumulate the ray samples. Each compute group processes one probe (8x8 samples).
We filter the screen space radiance to denoise the result. In Unreal Lumen, the radiance filter weight is a combination of an angle weight and a depth weight. We use a simple uniform weight instead, because our scene is simple and most of the meshes are cubes.
To denoise the radiance result further, we convert the radiance into spherical harmonics, which act as a low-pass filter. After that, we transform the SH into an octahedron representation. The reason for this additional pass is to enable the hardware bilinear filter in the next integration pass. To avoid sampling a neighboring probe’s result, we add a one-pixel border around each probe tile.
For each screen pixel, we sample 5 screen probes and accumulate the samples. After dividing the result by the sum of the weights, we obtain the final screen indirect lighting: