Video Series: Practical Real-Time Ray Tracing With RTX

RTX introduces an exciting and fundamental shift in the way lighting systems work in games and applications. In this video series, NVIDIA Engineers Martin-Karl Lefrancois and Pascal Gautron help you get started with real-time ray tracing. You’ll learn how data and rendering are managed, how acceleration structures and shaders work, and what new components are needed for your pipeline. We’ll also include key slides from the presentation on which this video series is based.

Part 1: Ray Tracing: An Overview	Part 2: Data and Rendering
Part 3: RTX Acceleration Structures	Part 4: Ray Tracing Shaders
Part 5: Ray Tracing Pipeline	Part 6: Additional Help

These videos are rich with information, but don’t worry about jotting things down while you watch; we’ve taken the notes for you. You’ll find the “key things” from every clip presented as bullets. We strongly recommend you watch the videos before digging into the bullets, though, to ensure you are getting the proper context.

Part 1: Ray Tracing: An Overview (3:15 min)

In this video, Martin-Karl offers a quick primer on the difference between rasterization and ray tracing.

Key things from part 1

Ray tracing is a fundamentally different rendering process than rasterization, shown in figure 1.

Sending a ray from the eye to the pixel — Figure 1. Instead of the triangle being projected on the screen, we take the position of the eye, send a ray through the pixel, and try to find the triangle underneath.

Further rays can be traced to compute the shading for that pixel.
When you trace a ray, it hits the closest triangle and returns that to you.
You don’t have to sort anything out yourself; the trace simply returns the closest triangle along that ray.
What happens when you have a lot of triangles in your scene? How can all of that be processed quickly? You need an acceleration structure, shown in figure 2. You start with a big bounding box around all of the objects in the scene. An algorithm splits that box, and repeats the process until each box contains just a few triangles. Then, you only need to test the ray against those triangles.

Ray tracing acceleration structure diagram — Figure 2. The construction of this acceleration structure is provided by the RTX API.
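To make the box-splitting idea concrete, here is a minimal CPU-side sketch. It is not how RTX builds its acceleration structures (the API does that for you); the median split along the longest axis, the `maxLeafSize` parameter, and all type names are assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Vec3 = std::array<float, 3>;

struct Triangle {
    Vec3 v[3];  // three corner positions
};

// Axis-aligned bounding box that starts out empty (inverted).
struct Aabb {
    Vec3 lo{{1e30f, 1e30f, 1e30f}};
    Vec3 hi{{-1e30f, -1e30f, -1e30f}};

    void grow(const Vec3& p) {
        for (std::size_t i = 0; i < 3; ++i) {
            lo[i] = std::min(lo[i], p[i]);
            hi[i] = std::max(hi[i], p[i]);
        }
    }
};

struct BvhNode {
    Aabb box;
    std::unique_ptr<BvhNode> left, right;  // both null for a leaf
    std::vector<Triangle> triangles;       // filled only in leaves
    bool isLeaf() const { return !left && !right; }
};

inline float centroid(const Triangle& t, std::size_t axis) {
    return (t.v[0][axis] + t.v[1][axis] + t.v[2][axis]) / 3.0f;
}

// Recursively split the box until it holds at most maxLeafSize triangles.
// maxLeafSize is assumed to be at least 1.
inline std::unique_ptr<BvhNode> buildBvh(std::vector<Triangle> tris,
                                         std::size_t maxLeafSize = 4) {
    std::unique_ptr<BvhNode> node(new BvhNode());
    for (const Triangle& t : tris)
        for (const Vec3& p : t.v) node->box.grow(p);

    if (tris.size() <= maxLeafSize) {
        node->triangles = std::move(tris);
        return node;
    }

    // Pick the longest axis of the box (an assumption, other heuristics exist).
    std::size_t axis = 0;
    float longest = -1.0f;
    for (std::size_t i = 0; i < 3; ++i) {
        const float extent = node->box.hi[i] - node->box.lo[i];
        if (extent > longest) { longest = extent; axis = i; }
    }

    // Median split: half of the triangles go to each child.
    auto mid = tris.begin() + static_cast<std::ptrdiff_t>(tris.size() / 2);
    std::nth_element(tris.begin(), mid, tris.end(),
                     [axis](const Triangle& a, const Triangle& b) {
                         return centroid(a, axis) < centroid(b, axis);
                     });

    node->left = buildBvh(std::vector<Triangle>(tris.begin(), mid), maxLeafSize);
    node->right = buildBvh(std::vector<Triangle>(mid, tris.end()), maxLeafSize);
    return node;
}

inline std::size_t countLeaves(const BvhNode& n) {
    return n.isLeaf() ? 1 : countLeaves(*n.left) + countLeaves(*n.right);
}
```

A ray would then be tested against a node’s box first, and only descend into children (and eventually triangles) whose box it actually crosses.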

Part 2: Data and Rendering (11:14 min)

Martin-Karl explains what comprises data in real-time ray tracing, and makes clear how acceleration structures, the pipeline and the binding tables work together.

Key things from part 2

Graphics programs include UI and interactions, an engine update, data, and rendering. We are specifically interested in the data and rendering components for ray tracing.
A raster scene includes buffers of vertices and indices that contain all of the triangles in your scene, as well as your vertex and fragment shaders, as shown in figure 3.

diagram of what data makes up a raster scene — Figure 3. Raster scene construction.

Together, they will help to draw your scene. In ray tracing, you have to convert the buffers of vertices and indices into acceleration structures. Likewise, the vertex and fragment shaders must be converted into a different type of shading system. In a rasterizer, the geometry and the shaders are separate. In a ray tracer, you have to combine these things.
The Bottom Level Acceleration Structure (BLAS) and Top Level Acceleration Structure (TLAS) represent two parts. Why is the structure split in this way? Let’s consider an example with a city, a car, and a truck, shown in figure 4.

Figure 4. Splitting the acceleration structure into top and bottom levels improves performance.

One acceleration structure holds the city. You place all your buildings inside that. This is all static; you want to render and ray trace that piece very rapidly.
Another acceleration structure holds the car. In this example, two instances use it because the same car can be a different color in the scene.
Finally, let’s add a truck using one instance.
You can easily rebuild the top level. Cars can move throughout the city, and you don’t have to rebuild the entire system.
You can rebuild in the bottom level. If one structure must adjust – say, a car crashes – you can make that change without having to change the other structures.
You want to minimize the number of bottom structures for performance reasons. Tracing a ray through two overlapping BLAS requires doing twice the work to find the closest intersection point… it’s important to have that separation.
Let’s take a look at the ray tracing pipeline, seen in figure 5.

ray tracing pipeline diagram — Figure 5. The Ray Tracing Pipeline

The pipeline consists of a set of shaders, as outlined in figure 6.
You start with a pixel that goes to your ray generation shader. That’s where you decide where the ray starts and in which direction you shoot it, a process called ray generation. It is performed on a per-pixel basis: the shader is called for every single pixel you have prepared.
Then, the ray goes to traversal, which calls the intersection shader. There is a built-in one for triangles (which can be overridden).

ray tracing shader relationship diagram — Figure 6. Ray tracing shader architecture

There is also an any hit shader. This is built into the pipeline, but you can override it. Consider, for example, a tree with leaves whose shape is defined by an alpha texture. You want the system to go through all the leaves until the ray really hits something. The any hit shader tests the alpha, and a closest hit is generated only when the body of the leaf is really touched, not when the ray just touches the area outside the leaf.
You can also use this for shadow rays.
The closest hit shader comes into play when you actually touch the object. Closest hit holds the code for the shading. You can also trace new rays from there, for example a shadow ray.
The miss shader kicks in when you don’t touch anything. You totally miss all the objects inside your scene. This would be your environment shader, for example.
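The way these shader stages hand off to one another can be sketched on the CPU. This is only a toy model of the control flow, not the RTX implementation: the ray travels along one axis, each primitive is reduced to a depth and an alpha value, and every name (`traceRay`, `Primitive`, the 0.5 alpha threshold) is an assumption made for the example.

```cpp
#include <vector>

struct Ray { float tMin; float tMax; };  // toy ray travelling along +z from z = 0
struct Primitive { int id; float depth; float alpha; };
struct Payload { int hitId = -1; float hitT = 0.0f; bool missed = false; };

// Intersection stage: reports the hit distance if the primitive lies in range.
inline bool intersect(const Ray& ray, const Primitive& p, float& t) {
    if (p.depth < ray.tMin || p.depth > ray.tMax) return false;
    t = p.depth;
    return true;
}

// Any hit stage: returning false ignores the candidate (for example, the
// transparent part of an alpha-textured leaf) and traversal continues.
inline bool anyHit(const Primitive& p) { return p.alpha >= 0.5f; }

// Closest hit stage: shading code would live here; we just record the hit.
inline void closestHit(const Primitive& p, float t, Payload& payload) {
    payload.hitId = p.id;
    payload.hitT = t;
}

// Miss stage: nothing was hit, e.g. sample the environment.
inline void miss(Payload& payload) { payload.missed = true; }

// Stand-in for traversal: visits candidates, keeps the closest accepted hit,
// then invokes exactly one of closestHit or miss.
inline Payload traceRay(const Ray& ray, const std::vector<Primitive>& scene) {
    Payload payload;
    const Primitive* closest = nullptr;
    float closestT = ray.tMax;
    for (const Primitive& p : scene) {
        float t = 0.0f;
        if (!intersect(ray, p, t) || t > closestT) continue;
        if (!anyHit(p)) continue;  // candidate rejected, keep looking
        closest = &p;
        closestT = t;
    }
    if (closest) closestHit(*closest, closestT, payload);
    else miss(payload);
    return payload;
}
```

In this sketch a nearly transparent primitive in front of an opaque one is skipped by `anyHit`, so the opaque one behind it becomes the closest hit.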

All together now

The diagram in figure 7 shows the possibilities when you fully use the ray tracing shader pipeline.

ray tracing shader flow diagram — Figure 7. This diagram shows one possible data flow using ray tracing shaders.

You have a TLAS (Top Level Acceleration Structure) and a BLAS (Bottom Level Acceleration Structure).
The pipeline is where you find your compiled shaders and where you declare all your shaders.
The shader binding table will bind the elements of your shaders.
Together, they all maintain a complex relationship, but not too complex, as shown in the assembly diagram in figure 8.

RTX ray tracing assembly diagram — Figure 8. This assembly view shows the relationship between the ray tracing pipeline and the shader binding table. This is what you have to do in DX12.

When it comes to rendering, ray tracing requires just one call, DispatchRays. From there, the result goes to a UAV and then to the render target.

ray tracing rendering call flow — Figure 9. The rendering side of ray tracing uses just a single call.

Part 3: RTX Acceleration Structures (8:04 min)

Pascal now provides a deeper look into what happens when you try to take your raster-based application and make it work with ray tracing.

Key things from part 3

While the focus of this video series is on DirectX 12, the fundamentals all carry over to Vulkan. Read our blog post about how Vulkan ray tracing works with RTX for more details.

Regarding acceleration structures

Separate the scene into bottom-level instances (BLAS)
Generate a bottom-level acceleration structure for each instance
Fewer BLAS is better.
Keep dynamic objects in their own BLAS.
Use refitting for dynamic objects
Then generate the top-level acceleration structure (TLAS)

How do we build the BLAS?

Start from a descriptor, as shown in figure 10.
You’ll be able to re-use the data used in your raster-based application. Typically, you can point to your vertex and the index buffer, and access the exact same data. You can describe your objects with whatever ranges corresponded to each object in the raster-based application.

BLAS setup — Figure 10. Setting up bottom-level acceleration structures

You can put together different objects in a BLAS and locate them using a transform buffer which will bake them in one acceleration structure.
The triangles will be internally transformed and put in the right place inside the acceleration structure.
We build another descriptor that will give us some information on what our BLAS will be. We need to define whether we want to be able to update our structure, as figure 11 shows.

More BLAS setup slide — Figure 11. More BLAS setup requirements

Obtain pre-build information

Determine the size of the resulting acceleration structure.
The scratch data size describes how much memory the acceleration structure builder requires for the process of building. You need to allocate this memory.
DX12 has no hidden allocation; everything must be done explicitly.
The scratch space is only used during the build. Afterwards, you can deallocate it.
Once you have allocated the scratch space, you can re-use all the descriptors you had. You can create another descriptor with an update flag for optional refitting. Finally, your BLAS can be built, which happens quickly on the GPU.
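As a rough illustration of the sizing step, the sketch below turns pre-build information into the two buffer sizes you would allocate. The `PrebuildInfo` struct is a hypothetical stand-in that mirrors the size fields D3D12 reports, and the 256-byte alignment is assumed to match the D3D12 acceleration structure alignment constant; check both against the API headers before relying on them.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the pre-build information returned by the API.
struct PrebuildInfo {
    std::uint64_t resultDataMaxSizeInBytes;      // final acceleration structure
    std::uint64_t scratchDataSizeInBytes;        // temporary memory for a build
    std::uint64_t updateScratchDataSizeInBytes;  // temporary memory for a refit
};

struct BufferSizes {
    std::uint64_t result;   // lives as long as the acceleration structure
    std::uint64_t scratch;  // can be released once the build has finished
};

// Assumed alignment for acceleration structure buffers.
constexpr std::uint64_t kAccelerationStructureAlignment = 256;

constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// If the structure may be refitted later, one scratch buffer large enough
// for both the build and the update can be shared.
inline BufferSizes computeBufferSizes(const PrebuildInfo& info, bool allowUpdate) {
    const std::uint64_t scratch =
        allowUpdate ? std::max(info.scratchDataSizeInBytes,
                               info.updateScratchDataSizeInBytes)
                    : info.scratchDataSizeInBytes;
    return BufferSizes{
        alignUp(info.resultDataMaxSizeInBytes, kAccelerationStructureAlignment),
        alignUp(scratch, kAccelerationStructureAlignment)};
}
```

Sharing one scratch buffer between build and refit is a design choice of this sketch; separate buffers work just as well if memory is not a concern.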

Build the TLAS

It’s like handling a scene graph, only you have two levels.
Each instance has an ID – something that describes where to find the shader that corresponds to the object, as you can see in figure 12.

TLAS setup diagram and code — Figure 12. Top level acceleration structure setup

Again, we have a transform, this time in the TLAS. If we want to move a complete bottom level around in the world, we can use this and have very fast re-fits without having to touch the actual geometry.
You follow the same principle that you did with the BLAS, but instead of geometry, we’ll be getting instanced information.
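The two-level idea, using the city, car, and truck example from part 2, might look like the following on the CPU. All types here are hypothetical stand-ins rather than D3D12 structures: an instance simply references a BLAS by index and carries its own transform and ID, so moving a car only touches the top level.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Row-major 3x4 transform, a layout assumed for this sketch.
using Transform3x4 = std::array<float, 12>;

inline Transform3x4 translation(float x, float y, float z) {
    return Transform3x4{{1.0f, 0.0f, 0.0f, x,
                         0.0f, 1.0f, 0.0f, y,
                         0.0f, 0.0f, 1.0f, z}};
}

struct Blas { std::string name; };  // stand-in for built geometry

struct Instance {
    std::size_t blasIndex;     // which bottom-level structure to use
    Transform3x4 transform;    // where it sits in the world
    std::uint32_t instanceId;  // lets shaders find per-object data
};

struct Scene {
    std::vector<Blas> blas;
    std::vector<Instance> tlas;
    int tlasRefits = 0;
    int blasRebuilds = 0;
};

// One city, one car BLAS used by two instances, and one truck.
inline Scene makeCityScene() {
    Scene s;
    s.blas.push_back(Blas{"city"});
    s.blas.push_back(Blas{"car"});
    s.blas.push_back(Blas{"truck"});
    s.tlas.push_back(Instance{0, translation(0.0f, 0.0f, 0.0f), 0});
    s.tlas.push_back(Instance{1, translation(4.0f, 0.0f, 0.0f), 1});
    s.tlas.push_back(Instance{1, translation(8.0f, 0.0f, 0.0f), 2});
    s.tlas.push_back(Instance{2, translation(12.0f, 0.0f, 0.0f), 3});
    return s;
}

// Moving an object only changes its instance transform; no geometry is rebuilt.
inline void moveInstance(Scene& s, std::size_t index, float x, float y, float z) {
    Transform3x4& t = s.tlas[index].transform;
    t[3] = x;
    t[7] = y;
    t[11] = z;
    ++s.tlasRefits;
}
```

After `moveInstance(scene, 1, 5.0f, 0.0f, 2.0f)`, only the first car’s transform changes and the `blasRebuilds` counter stays at zero.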

Part 4: Ray Tracing Shaders (7:50 min)

Pascal is going to provide a deeper look into what happens when you try to take your raster-based application and make it work with ray tracing.

Key things from part 4

A ray payload is a structure passed from one shader to another.
It all happens under the hood in RTX.
A smaller payload is a lot better!
The new DirectX compiler allows you to give semantics to your shaders.
You can compile several shaders together, and still know which shaders are useful for what purpose (Figure 13).

Ray tracing shaders in code — Figure 13. Example of ray tracing shaders working in concert

The dispatch ray index is like the Thread ID in CUDA; it identifies which thread is currently running and the dimensions of the image.
You describe a ray with origin, direction, and minimum and maximum distance between which we look for intersections.
Figure 14 shows the new instruction you need to call, TraceRay, which takes a few parameters.

TraceRay call showing parameters — Figure 14. Now we finally trace the ray using the `TraceRay` call.

The first is the TLAS.
Raymask allows you to mask out some objects. For example, if you decide an object does not cast shadows, the ray mask can be used in conjunction with the instance mask to prevent the ray from intersecting the object.
Apply a few offsets. The first offset identifies which shader to use for a given object. The second offset describes where in the list of shaders we should start.
We can have several miss shaders. One may look at the environment, one may return and say, “nothing’s visible”, etc.
Finally, we pass our ray and our payload (Figure 15). This payload will directly come back. As soon as we call TraceRay we can assume the whole ray tracing has happened and the payload is filled.
You can write the results of your shading directly in the output buffer (Figure 15).

Writing out the ray tracing result diagram — Figure 15. Writing the output buffer
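The ray mask mentioned above works as a bitwise test against each instance’s mask. A small sketch of that test follows; the bit assignments (`kVisibleToCamera`, `kCastsShadow`) are hypothetical conventions chosen for this example, not values defined by the API.

```cpp
#include <cstdint>

// Hypothetical bit assignments for this example.
constexpr std::uint8_t kVisibleToCamera = 0x01;
constexpr std::uint8_t kCastsShadow = 0x02;

// An instance is considered for intersection only if its mask shares
// at least one bit with the mask passed along with the ray.
constexpr bool instanceVisibleToRay(std::uint8_t instanceMask,
                                    std::uint8_t rayMask) {
    return (instanceMask & rayMask) != 0;
}
```

With these conventions, an object whose instance mask is only `kVisibleToCamera` shows up for camera rays traced with `kVisibleToCamera`, but is skipped by shadow rays traced with `kCastsShadow`, so it casts no shadow.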

Avoid recursions! Let the raygen do the heavy lifting. Flattening the recursive ray tracing into a loop in the ray generation results in much less stack management.
In raster, you just project your triangles on screen, then interpolate your attributes. RTX gives you the index of the triangle you intersect. You then need to fetch all the attributes yourself and interpolate them, as the code example in figure 16 shows. Note that you need to be able to access the vertex and index buffer of your geometry with whatever layout you decided to have.

Closest hit code sample image — Figure 16. Closest hit code sample. The primitive index tells you which triangle has been hit.
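The fetch-and-interpolate step can be written out on the CPU as follows. This is a sketch under assumptions: the `Vertex` layout is invented for the example, and the two barycentric values `u` and `v` are taken to be the weights of the second and third vertex, with the first vertex receiving `1 - u - v`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec3 = std::array<float, 3>;

// Hypothetical vertex layout; a real application uses its own.
struct Vertex {
    Vec3 position;
    Vec3 color;
};

// Look up the three vertices of the hit triangle through the index buffer,
// then blend their colors with the barycentric weights of the hit point.
inline Vec3 interpolateColor(const std::vector<Vertex>& vertices,
                             const std::vector<std::uint32_t>& indices,
                             std::uint32_t primitiveIndex, float u, float v) {
    const std::size_t base = static_cast<std::size_t>(primitiveIndex) * 3;
    const Vertex& v0 = vertices[indices[base + 0]];
    const Vertex& v1 = vertices[indices[base + 1]];
    const Vertex& v2 = vertices[indices[base + 2]];

    const float w0 = 1.0f - u - v;  // weight of the first vertex
    Vec3 out{{0.0f, 0.0f, 0.0f}};
    for (std::size_t i = 0; i < 3; ++i)
        out[i] = w0 * v0.color[i] + u * v1.color[i] + v * v2.color[i];
    return out;
}
```

The same pattern applies to normals, texture coordinates, or any other per-vertex attribute you store.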

Closest hit shaders can also shoot rays, e.g. shadows.
You access the primitive index, which yields which triangle has been hit during the rendering; then we can interpolate.
Then you write the payload, and the shader is finished.
To avoid recursion, you can have your ray generation carry a slightly bigger payload. The hit will return its hit information (for example, “I hit this triangle at that coordinate”), then the ray generation can generate another ray from there, and then continue, weighting the contribution of the second bounce, and so on. With this process, we end up with less stack management and far less memory traffic.
The final type of shader is the miss shader.
The miss shader writes directly into the payload, typically returning a fixed value, which can be anything (figure 17).

image of miss shader code example — Figure 17. Miss shader code example
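The loop-based alternative to recursion can be sketched as below. It is a simplified CPU model, not shader code: `traceOnce` is a hypothetical callback standing in for one TraceRay call plus its closest hit or miss shader, and the payload is reduced to an emitted value and a reflectance. A real payload would also carry what is needed to spawn the next ray, such as the hit position and normal.

```cpp
#include <functional>

// What one trace reports back to the ray generation stage.
struct HitInfo {
    bool hit;           // false when the miss shader ran
    float emitted;      // light contributed at this bounce
    float reflectance;  // how much of the next bounce is kept
};

// Hypothetical stand-in for a single TraceRay call at a given bounce.
using TraceFn = std::function<HitInfo(int bounce)>;

// Ray generation drives all bounces in a loop, so no shader ever needs
// to call TraceRay recursively.
inline float shadePixel(const TraceFn& traceOnce, int maxBounces) {
    float radiance = 0.0f;
    float throughput = 1.0f;  // weight of the current bounce
    for (int bounce = 0; bounce < maxBounces; ++bounce) {
        const HitInfo info = traceOnce(bounce);
        radiance += throughput * info.emitted;
        if (!info.hit) break;            // the ray left the scene
        throughput *= info.reflectance;  // weight the following bounce
    }
    return radiance;
}
```

For instance, two bounces off surfaces with reflectance 0.5 followed by a miss that returns an environment value of 1.0 produce a result of 0.25.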

Part 5: Ray Tracing Pipeline (8:04 min)

Pascal provides a deep dive into the structure of a ray tracing pipeline, breaking down the key components.

Key things from part 5

Now that all shaders have been defined, let’s look at how to assemble that into something that can be rendered. (You are effectively creating an executable of the ray tracing process).
The ray tracing pipeline in DX12 is made up of a series of sub-objects, illustrated in figure 18. For example, you could have a sub-object for different shaders, a sub-object for how to assemble the shaders together, and so on.

D3D 12 ray tracing subobjects diagram — Figure 18. D3D12 ray tracing uses sub-objects which consist of various shaders in useful configurations

The first sub-object to look at is the libraries (figure 19).
- You provide your code to the DirectX shader compiler (DXC), which outputs a DXIL library. That can be used as a sub-object.
- You’ll do that for all the shaders we have.

DirectX libraries used as sub-objects — Figure 19. Libraries as a sub-object

A hit group describes everything that can happen on one surface for one given type of shader. It includes the intersection shader, the any hit shader, and the closest hit shader, shown in figure 20.
These combine to give us all the code we need.
It’s important to note the intersection and any hit shaders. Built-in versions exist to intersect triangles (and to do nothing in the case of an any hit).
Leave the intersection and any hit shaders set to nullptr when possible.

Hit group illustration — Figure 20. Hit groups comprise shaders involved with intersection of objects

Another sub-object is the shader configuration, which describes the size of the payload you want to use. Shader configurations define sizes of the attributes used for intersections.
Keep those as small as possible; the built-in intersection shader returns 2 floats.
Associations in DXR associate shaders with a payload and attribute properties. You need to do this explicitly. Figure 21 illustrates how this can be performed.

Association sub-object configuration diagram — Figure 21. Configuring an association sub-object

Each shader in DXR contains a root signature that describes all the resources that will be accessed, shown in figure 22.
Each shader used needs its own root signature.
The root signature also goes through an association object.

Root signature association illustration — Figure 22. Root signature association

A pipeline configuration decides how many bounces you can make (see the pipeline configuration figure below).
Avoid recursion by flattening into a loop in raygen.

Pipeline configuration diagram and code — Figure 22. The pipeline configuration enables you to specify maximum ray bounces

Now let’s examine the shader binding table, which associates the geometry with the shaders we will execute, outlined in figure 23.
The shader binding table has a number of entries, a descriptor for the shader, and all the pointers to the external resources.
It needs to follow the exact layout of the root signature you provided.
Each shader type requires a fixed size entry.

Shader binding table layout diagram — Figure 23. The shader binding table points to external resources, among other things
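Because every entry for a given shader type has the same size, that size is driven by the shader with the most resources. The sketch below computes it under stated assumptions: a 32-byte shader identifier and 32-byte record alignment (believed to match the D3D12 constants, but verify against the headers), with the per-shader root argument byte counts supplied by the caller.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Assumed to match the D3D12 shader identifier size and record alignment.
constexpr std::uint32_t kShaderIdentifierSize = 32;
constexpr std::uint32_t kShaderRecordAlignment = 32;

constexpr std::uint32_t alignUp(std::uint32_t value, std::uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// rootArgumentBytes holds, for each shader of one type (for example, all hit
// groups), the number of bytes of resource pointers and constants its root
// signature requires. Every entry is padded to fit the largest one.
inline std::uint32_t shaderTableEntrySize(
    const std::vector<std::uint32_t>& rootArgumentBytes) {
    std::uint32_t largest = 0;
    for (std::uint32_t bytes : rootArgumentBytes)
        largest = std::max(largest, bytes);
    return alignUp(kShaderIdentifierSize + largest, kShaderRecordAlignment);
}
```

A table of hit groups needing 8 and 16 bytes of arguments would therefore use 64-byte entries for both, which is what makes indexing by a fixed stride possible.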

The descriptor setup determines how the shader binding table is interpreted, as shown in figure 24.
You need to indicate where the ray generation shader can be found.
You must define the size of one entry for the ray generation shader, the hit groups, and the miss shaders.
You must also provide the dimensions of the image to render.

Descriptor setup code and diagram image — Figure 24. Descriptor setup code example

Now that we’ve rendered our first image, let’s think about shadows.

Here are some simple shader examples for shadow rays, shown in figure 25.
If we hit something it’s “true”.
If we do not, it’s “false”.

Shadow ray code examples — Figure 25. Simple shader code examples for adding shadow rays

In our original closest hit shader, we need to add another trace ray, as you can see in figure 26.
This time we will offset our hit group to say, “I want the second hit group for the object I’m going to hit, and the second miss, also.”

Closest hit shadow ray example diagram — Figure 26. Closest hit shadow ray example
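How the offset selects “the second hit group for the object” can be illustrated with the index arithmetic described in the DXR specification, reproduced here from memory as a sketch. The struct and function names are hypothetical; the four terms correspond to the ray offset and geometry multiplier passed to TraceRay, the geometry index within the BLAS, and the per-instance offset stored in the TLAS.

```cpp
#include <cstdint>

// Hypothetical bundle of the values that take part in the lookup.
struct HitGroupLookup {
    std::uint32_t rayContribution;       // ray type offset given to TraceRay
    std::uint32_t geometryMultiplier;    // stride between geometries
    std::uint32_t geometryIndex;         // which geometry in the BLAS was hit
    std::uint32_t instanceContribution;  // per-instance offset from the TLAS
};

// Index of the hit group entry in the shader binding table.
inline std::uint32_t hitGroupIndex(const HitGroupLookup& l) {
    return l.rayContribution + l.geometryMultiplier * l.geometryIndex +
           l.instanceContribution;
}

// Address of that entry, given the table start and the fixed entry size.
inline std::uint64_t hitGroupRecordAddress(std::uint64_t tableStart,
                                           std::uint64_t strideInBytes,
                                           const HitGroupLookup& l) {
    return tableStart + strideInBytes * hitGroupIndex(l);
}
```

With two hit groups per object (primary first, shadow second), an object whose instance offset is 2 resolves to entry 2 for a primary ray (offset 0) and entry 3 for a shadow ray (offset 1).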

Part 6: Additional help with ray tracing (3:47 min)

Martin and Pascal provide guidance on next steps, and detail a range of supporting materials that will help you on your way towards adopting real-time ray tracing in your applications.

NVIDIA will be continuing to build out a “Helper’s Toolbox” for RTX. Additional resources include:

DXR Blog
Raytracing links:
- More on NVIDIA RTX
- Getting started ray tracing tutorial
- GameWorks ray tracing overview
Resources:
DevTech:
- Pascal Gautron: pgautron@nvidia.com
- Martin-Karl Lefrancois: mlefrancois@nvidia.com
