Parallel Shader Compilation for Ray Tracing Pipeline States

In ray tracing, a single pipeline state object (PSO) can contain any number of shaders. That number can grow large, depending on scene content and the ray types handled by the PSO, and the construction cost of the state object can increase significantly as a result. The DXR API makes it possible to distribute part of the creation work across multiple threads by using collections. A collection is an ID3D12StateObject with type D3D12_STATE_OBJECT_TYPE_COLLECTION.

Shader source code compiled to DXIL libraries remains in a hardware-agnostic format after compilation. When a PSO is constructed, the DXIL libraries given as input are compiled to the hardware-native format. It is possible to create collection state objects ahead of PSO creation that already hold shader code in hardware-native format. They are created like PSOs, with ID3D12Device5::CreateStateObject(), but they have a different state object type. These collections can then be given as inputs to the PSO creation call instead of the DXIL libraries, using subobjects of type EXISTING_COLLECTION to define the inputs. The final PSO creation call becomes cheaper because the shader code is already in hardware-native format.

Multiple threads can be used for state object creation, as shown in Figure 1. One collection can store one or more shaders that are compiled from one or more DXIL libraries. Each collection is created by a single thread, but because many shaders can be used in one PSO, the related collection creation work can be distributed across multiple threads. Additionally, one collection can potentially be used in multiple PSOs, so it can be a good idea to cache created collections for reuse.
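The fan-out described above can be sketched in portable C++. This is only an illustration of the threading pattern, not D3D12 code: Collection, CreateCollection(), LinkPipeline(), and BuildPipelineParallel() are hypothetical stand-ins for the collection state object, the per-collection ID3D12Device5::CreateStateObject() call, and the final PSO creation call.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a collection state object. In a real renderer this
// would wrap an ID3D12StateObject of type D3D12_STATE_OBJECT_TYPE_COLLECTION.
struct Collection {
    std::vector<std::string> shaders;  // shaders held by this collection
};

// Hypothetical stand-in for the expensive CreateStateObject() call that builds
// one collection. Each call runs on a single thread.
Collection CreateCollection(std::vector<std::string> shaderNames) {
    return Collection{std::move(shaderNames)};
}

// Hypothetical stand-in for the final, cheaper PSO creation call that takes the
// finished collections as inputs. Returns the number of shaders in the PSO.
std::size_t LinkPipeline(const std::vector<Collection>& collections) {
    std::size_t count = 0;
    for (const Collection& c : collections) {
        count += c.shaders.size();
    }
    return count;
}

// Starts one asynchronous task per collection, waits for all of them, and
// then links the results into a single pipeline.
std::size_t BuildPipelineParallel(
        const std::vector<std::vector<std::string>>& shaderGroups) {
    assert(!shaderGroups.empty());

    std::vector<std::future<Collection>> pending;
    pending.reserve(shaderGroups.size());
    for (const auto& group : shaderGroups) {
        // std::launch::async requests a separate thread for each collection.
        pending.push_back(
            std::async(std::launch::async, CreateCollection, group));
    }

    std::vector<Collection> collections;
    collections.reserve(pending.size());
    for (auto& task : pending) {
        collections.push_back(task.get());  // blocks until the task is done
    }
    return LinkPipeline(collections);
}
```

In a real application, a job system or thread pool would likely replace std::async, and the finished collections could be stored in a cache keyed by their shader set so that several PSOs can share them.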

Figure 1. Illustration of how work related to building a ray tracing pipeline state can be distributed to multiple threads by using collection state objects.

To allow shader code to be compiled to native format during collection creation, the collections must define, with subobjects, most of the state that would otherwise be defined in the final PSO. A RAYTRACING_SHADER_CONFIG subobject must be defined. All shaders must have their root signatures fully defined with GLOBAL_ROOT_SIGNATURE and LOCAL_ROOT_SIGNATURE subobjects. Additionally, a HIT_GROUP subobject must be defined for intersection, any-hit, and closest-hit shaders. Note that the RAYTRACING_SHADER_CONFIG subobjects in all collections and in the PSO itself must match in a PSO creation call. A RAYTRACING_PIPELINE_CONFIG subobject does not need to be defined in collections. For best performance from collections, avoid the state object flags ALLOW_LOCAL_DEPENDENCIES_ON_EXTERNAL_DEFINITIONS and ALLOW_EXTERNAL_DEPENDENCIES_ON_LOCAL_DEFINITIONS. They may prevent shader code from being compiled to native format, or they may increase memory consumption.
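Because a mismatch in RAYTRACING_SHADER_CONFIG between collections and the PSO is easy to introduce when collections are built independently, a small check before the final creation call can help. The sketch below is an assumption-laden illustration: ShaderConfig is a hypothetical struct that mirrors the two fields of D3D12_RAYTRACING_SHADER_CONFIG so the code compiles without the Windows SDK, and ConfigsMatch() is a hypothetical helper, not part of the API.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of D3D12_RAYTRACING_SHADER_CONFIG
// (MaxPayloadSizeInBytes and MaxAttributeSizeInBytes).
struct ShaderConfig {
    std::uint32_t maxPayloadSizeInBytes;
    std::uint32_t maxAttributeSizeInBytes;
};

// Compares two shader configs field by field.
bool SameConfig(const ShaderConfig& a, const ShaderConfig& b) {
    return a.maxPayloadSizeInBytes == b.maxPayloadSizeInBytes &&
           a.maxAttributeSizeInBytes == b.maxAttributeSizeInBytes;
}

// Returns true only when every collection uses the same shader config as
// the PSO that will consume it.
bool ConfigsMatch(const ShaderConfig& pso,
                  const std::vector<ShaderConfig>& collections) {
    for (const ShaderConfig& c : collections) {
        if (!SameConfig(pso, c)) {
            return false;
        }
    }
    return true;
}
```

Running such a check in debug builds, before handing the collections to the final PSO creation call, surfaces the mismatch at the point where it is easiest to diagnose.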

Conclusion

If ray tracing pipeline state object creation time grows large in your application, collection state objects are worth a look. In the best cases, distributing the work across multiple concurrent threads through collections substantially reduces the time spent in PSO creation.
