Turing Variable Rate Shading in VRWorks

NVIDIA Turing GPUs enable a new, easily implemented rendering technique, Variable Rate Shading (VRS). VRS increases rendering performance and quality by applying varying amounts of processing power to different areas of the image. VRS works by changing the number of pixels that can be processed by a single pixel shader operation. A single pixel shading operation can now be applied to a block of pixels, allowing applications to effectively vary the shading rate in different areas of the screen.

Background

Visual fidelity in modern VR games continues to improve with each new release. Those improvements demand more from the GPU. The minimum requirement of 90 FPS for an immersive VR experience only adds to this challenge. At the same time, VR headset manufacturers continue to drive technological innovation, bringing new headsets with the ever-increasing display resolutions needed to provide a more realistic VR experience, as figure 1 shows.

VR HMD Resolutions Turing GPU VRS
Figure 1. Virtual reality headset native resolutions continue to climb

Balancing all these factors becomes extremely challenging if the VR game decides to render everything at native resolution. Higher resolution plus increased visual fidelity means games struggle to maintain the sweet spot of 90 FPS. This unfortunately means sacrificing a visual effect or two just to meet the frame rate budget. Another drawback of the traditional rendering approach is that it wastes precious pixel shading cycles due to inherent “overshading”. In other words, the game spends time rendering pixels whose detail is eventually thrown away, whether because of the physical lens distortion at the edges in VR headsets or because of how human eyes perceive detail in peripheral vision. Finally, games always want more performance headroom so that they can improve the visual experience. Ultimately, we need a smarter way to render games.

One of the ways in which games can tackle the pixel shading load is by rendering at reduced resolution and then upscaling to the resolution required by the VR headset, such as in-game resolution scale factors or options exposed in VR runtimes, as shown in figure 1. But this approach has its own drawbacks:

  1. The resolution of the entire frame is reduced from the start. Any crucial part of the scene that actually demands higher quality rendering immediately suffers, such as character details in focus in the center of the scene or text the player needs to be able to read. Customizing the rendered resolution or detail of specific regions is extremely tricky or even impossible in certain cases.
  2. When the game renders at reduced resolution, rasterization also happens at that lower resolution. This creates highly jagged object edges (aliasing) and, in the worst case, can compromise the visibility of smaller objects entirely, resulting in shimmering and flickering.
Turing VRWorks VRS reduced resolution rendering
Figure 1: Reduced Resolution Rendering followed by upscaling

Given the frame rate budget, developers also want techniques that help improve visual quality and eliminate common rendering anomalies. One of the most common examples is aliasing, which can be mitigated by antialiasing techniques such as multisampling or supersampling. However, antialiasing can also hurt performance, resulting in wasted rendering, and it offers little customizability.

Previous NVIDIA architectures also provide techniques to minimize excessive pixel shading: Multi-Resolution Shading (MRS) and Lens-Matched Shading (LMS). MRS is more suited for applications that need limited flexibility in terms of pixel shading patterns. LMS caters to problems arising from headset lens distortions. Also note that rasterization happens at lower resolution in both of these features, so they require a separate upscaling pass.

Introducing Variable Rate Shading

The new NVIDIA Turing architecture enables a new way to optimize the pixel shading load by using variable rate shading (VRS) (figure 2).

  • VRS reduces excessive pixel shading load
  • VRS allows precisely customizing shading rates within the frame
  • VRS selectively allows improving visual quality with supersampling
  • VRS preserves edges and visibility of the objects
  • VRS works at screen space making it simple to integrate into applications
VRS rendering flow
Figure 2. Variable rate shading architecture

The Technology: Shading Rates

VRS performs rasterization at native resolution. Instead of executing the pixel shader once per pixel, VRS can dynamically change the shading rate during actual pixel shading in one of two ways:

  • Coarse Shading. Execution of the Pixel Shader once per multiple raster pixels and copying the same shade to those pixels
  • Super Sampling. Execution of the Pixel Shader more than once per single raster pixel

The configurations in which the pixel shader executions can be controlled (shading rates) include:

  • Coarse Shading: 1×1, 1×2, 2×1, 2×2, 2×4, 4×2, 4×4
  • Supersampling: 2x, 4x, 8x

Let us take an example of 1×2 with 4 samples per raster pixel. In this case, the pixel shader will be executed once for the group, as shown in figure 3.

Turing VRWorks VRS
Figure 3. Subpixel view of 1×2 coarse shading rate

 

Here, the pixel shader executes once. By default, the attribute is evaluated at the center of the larger “coarse” pixel of size 1×2. The overall evaluated output replicates to all the samples covered by the primitive. If all the samples are covered, both the pixels receive the exact same shade evaluated at the center. You can also perform the interpolation at the centroid of the coarse pixel. Centroid interpolation samples at locations mathematically computed to be in between other samples but somewhere within the covered area of the pixel. It may require extrapolation of end points from a pixel center. This generally improves antialiasing quality, especially if a pixel is partially covered and the pixel center is not covered.

Similarly, we can visualize other shading rates such as 2×1, 2×2, shown in figure 4.

Turing VRWorks VRS 2x1 and 2x2 coarse shading coverage
Figure 4. Subpixel view of 2×1 and 2×2 coarse shading rates

Coverage as coarse as 4×4 with 1 sample per raster pixel is possible. Figure 5 shows the pixel being evaluated at the center of the large area of 4 pixels by 4 pixels, with all the covered samples getting the same shade. If all the samples are covered, this theoretically achieves pixel shading of 16 pixels in the same time as just a single pixel!

VRWorks Turing VRS 4x4 coverage
Figure 5. Subpixel view of 4×4 coarse shading rate
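
To make the arithmetic behind these shading rates concrete, here is a minimal sketch that counts pixel shader invocations for a fully covered 16×16 block of raster pixels. The `ShadingRate` struct and `InvocationsPerTile` helper are hypothetical names used only for illustration; they are not part of the VRS API.

```cpp
#include <cassert>

// Edge length, in raster pixels, of the block we count invocations for.
constexpr int kTileSize = 16;

// Hypothetical description of a shading rate, for illustration only.
// A coarse rate covers coarseW x coarseH raster pixels with one invocation;
// supersampling runs `samples` invocations for a single raster pixel.
struct ShadingRate {
    int coarseW;  // 1, 2 or 4
    int coarseH;  // 1, 2 or 4
    int samples;  // 1 for coarse shading; 2, 4 or 8 for supersampling
};

// Pixel shader invocations needed for a fully covered 16x16 block.
inline int InvocationsPerTile(ShadingRate r) {
    assert(r.coarseW > 0 && r.coarseH > 0 && r.samples > 0);
    assert(kTileSize % r.coarseW == 0 && kTileSize % r.coarseH == 0);
    return (kTileSize / r.coarseW) * (kTileSize / r.coarseH) * r.samples;
}
```

With these definitions, 1×1 shading needs 256 invocations for the block, 2×2 needs 64, and 4×4 needs 16, which is the 16-to-1 reduction described above. 4x supersampling goes the other way and needs 1024.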

NVIDIA Variable Rate Shading API

The variable rate shading API provides an interface for applications to set up the feature in a flexible way for different use cases. It consists of these enabling steps:

  • App creates a regular Texture2D, the shading rate resource
  • App calls NvAPI to create a new custom type of shading rate resource view
  • App populates this texture with shading rate pattern
  • App programs shading rate lookup table and enables variable shading rate mode
  • App sets this shading rate resource view

The program needs a shading rate surface — a Texture2D resource — which maps to the render target in screen space. Every 16×16 tile of pixels maps to a shading rate entry in the shading rate surface. This entry decides the shading rate for that 16×16 tile.
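
As a rough sketch of this mapping, the helpers below compute how many entries a shading rate surface would need for a given render target and which entry controls a given pixel. The names `SurfaceSize`, `ShadingRateSurfaceSize` and `EntryIndexForPixel` are hypothetical, and rounding up for partial edge tiles is an assumption rather than documented behavior.

```cpp
#include <cassert>

// From the text: one shading rate entry per 16x16 tile of pixels.
constexpr int kTilePixels = 16;

struct SurfaceSize {
    int width;   // entries per row
    int height;  // number of rows
};

// Hypothetical helper: size of the shading rate surface for a render target.
// Rounds up (an assumption) so partial tiles at the edges still get an entry.
inline SurfaceSize ShadingRateSurfaceSize(int rtWidth, int rtHeight) {
    assert(rtWidth > 0 && rtHeight > 0);
    return { (rtWidth + kTilePixels - 1) / kTilePixels,
             (rtHeight + kTilePixels - 1) / kTilePixels };
}

// Row-major index of the surface entry that controls pixel (x, y).
inline int EntryIndexForPixel(int x, int y, SurfaceSize s) {
    assert(x >= 0 && y >= 0);
    return (y / kTilePixels) * s.width + (x / kTilePixels);
}
```

For example, a 1920×1080 render target would need a 120×68 surface under these assumptions.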

A shading rate lookup table adds an additional level of flexibility. The lookup table translates the entry in the shading rate surface to the actual shading rate. This look up table supports per-viewport programmability.

One key advantage of this programmability is that the game can keep the same shading rate pattern on the surface. It only needs to swap out the look-up table to quickly change the aggressiveness of the coarse shading. Application requirements determine order and values filled in this look-up table.

One example could be that the application wants to restrict itself to shading rates 1×1 and 2×2. Then it can just fill the table with the same value starting at index 0 and filling in the rest of the table, as you can see in figure 6. The logic to populate the Shading Rate Surface is very simple and involves only filling 1’s where it needs to perform coarse shading, leaving others as 0.

Another example could be an application choosing 4x supersampling along with 2×2, 2×1 and 1×2 shading rates. In that case application only fills in those values in any order per the requirement.

VRWorks Turing VRS SRS LUT RT
Figure 6. Shading rate look-up table and shading rate resource view usage
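
The indirection through the look-up table can be sketched in plain C++ as follows. This is only an illustration: the `Rate` enum, the table size of 16, and the helper names are assumptions, not the NvAPI types. It also assumes one reading of the first example above, where surface value 0 means 1×1 and any other value means the coarse rate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the API's shading rate values.
enum class Rate : std::uint8_t { Shade1x1, Shade2x2, Shade4x4 };

// Assumed table size for this sketch; check the API headers for the real limit.
constexpr std::size_t kLutEntries = 16;
using ShadingRateLut = std::array<Rate, kLutEntries>;

// Index 0 keeps full-rate shading; every other index maps to `coarse`.
inline ShadingRateLut MakeTwoLevelLut(Rate coarse) {
    ShadingRateLut lut;
    lut.fill(coarse);
    lut[0] = Rate::Shade1x1;
    return lut;
}

// Translate one shading rate surface entry into an actual shading rate.
inline Rate Resolve(const ShadingRateLut& lut, std::uint8_t surfaceEntry) {
    assert(surfaceEntry < lut.size());
    return lut[surfaceEntry];
}
```

Swapping `MakeTwoLevelLut(Rate::Shade2x2)` for `MakeTwoLevelLut(Rate::Shade4x4)` makes every tile marked 1 shade more coarsely without touching the surface itself, which is the kind of quick adjustment the look-up table is meant to allow.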

The following pseudo-code shows how the APIs are used to set up variable rate shading:

// Application programming to set up Variable Rate Shading

// Create the shading rate surface
Srs = CreateTexture2D ();

// Create a view of the shading rate surface
Srrv = NvAPI_D3D11_CreateShadingRateResourceView (Srs);

// Populate Variable Shading pattern as required
PopulateShadingRatePattern (Srs);

// Set up the shading rate look-up table
NvAPI_D3D11_RSSetViewportsPixelShadingRates ();

// Bind the shading rate surface view to the graphics pipeline
NvAPI_D3D11_RSSetShadingRateResourceView (Srrv);

Loop
   {
      if (required_based_on_use_case)
      {
         UpdateShadingRatePattern();
      }
      Draw();
   }

Variable rate shading currently supports D3D11, OpenGL, and Vulkan. Support for D3D12 is coming soon.

Use Case: Gaze-tracked Foveated Rendering

When looking around, the human eye can recognize extreme details in the scene only in a narrow region. The peripheral region normally only grasps fast-moving content with little detail.

This enables another major advantage of a customizable shading rate pattern: the application can program it based on eye gaze. Eye gaze data acquired from any eye-tracking hardware interface can be consumed without latency loss. This enables applications to exploit the behavior of the human eye and perform foveated rendering (lowering rendering detail in the periphery of the image). Figure 7 shows the content at the eye gaze (marked as a “+” in the figure) rendered at maximum quality while the shading rate is reduced at the periphery.

VRWorks Turing VRS foveated rendering example
Figure 7. Shading rate pattern for gaze-tracked foveated rendering

The shading rate surface can be updated based on gaze position on a per-frame basis without calling any of the variable rate shading initialization routines again.

...

// Init routines

Loop
   {
      // Query the latest gaze position from the eye tracker
      GetGazePosition(&GazePos);
      UpdateShadingRatePattern(GazePos);
      Draw();
   }
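
What the pattern update might compute on the CPU can be sketched as below. This is a hypothetical implementation, not the one in the VRWorks samples: `BuildFoveatedPattern`, the three-level layout (0 = finest, 1 = medium, 2 = coarsest look-up table index), and the radii in tiles are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side fill of the shading rate surface around a gaze point.
// Each entry is a look-up table index: 0 = finest, 1 = medium, 2 = coarsest.
// The gaze position and both radii are expressed in tiles, not pixels.
std::vector<std::uint8_t> BuildFoveatedPattern(int tilesX, int tilesY,
                                               float gazeTileX, float gazeTileY,
                                               float innerRadius,
                                               float outerRadius) {
    assert(tilesX > 0 && tilesY > 0 && innerRadius <= outerRadius);
    std::vector<std::uint8_t> pattern(static_cast<std::size_t>(tilesX) * tilesY);
    for (int y = 0; y < tilesY; ++y) {
        for (int x = 0; x < tilesX; ++x) {
            // Squared distance from the tile center to the gaze point.
            const float dx = (x + 0.5f) - gazeTileX;
            const float dy = (y + 0.5f) - gazeTileY;
            const float d2 = dx * dx + dy * dy;
            std::uint8_t entry = 2;
            if (d2 <= innerRadius * innerRadius) {
                entry = 0;
            } else if (d2 <= outerRadius * outerRadius) {
                entry = 1;
            }
            pattern[static_cast<std::size_t>(y) * tilesX + x] = entry;
        }
    }
    return pattern;
}
```

The resulting array would then be uploaded into the shading rate surface each frame. Because only the surface contents change, the look-up table and resource view set up during initialization stay as they are.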

Use Case: Lens-Optimized Shading

Even though VRS provides the ability to customize the shading rate pattern as frequently as the application wants, developers may choose to derive a static pattern. For example, the pattern may be based on the lens characteristics of the targeted VR HMD. An example of one such shading rate pattern could be as follows, shown in figure 8:

VRWorks Turing VRS lens-optimized shading rate pattern
Figure 8. Lens-optimized shading rate pattern for the two stereo viewports. The dark blue region in the center of the screen is the highest shading rate (1×1), other colors near the periphery are different coarse shading rates.
// Init routines

...
GetLensCharacteristics(&LensCharacteristic);
UpdateShadingRatePattern(LensCharacteristic);
...
Loop
   {
      // no need to update shading rate pattern per frame
      Draw();
   }
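
A static pattern of this kind could be generated once at startup along the following lines. The sketch assumes a side-by-side stereo render target with each lens center at the middle of its half, and it replaces real lens characteristics with two illustrative normalized radii. `BuildLensPattern` and its parameters are hypothetical names, not part of the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical static pattern for a side-by-side stereo render target.
// Each eye's half is treated as its own viewport; the lens center is assumed
// to sit in the middle of that half (a real HMD may offset it).
// Entries are look-up table indices: 0 = finest, 1 = medium, 2 = coarsest.
std::vector<std::uint8_t> BuildLensPattern(int tilesX, int tilesY,
                                           float fineLimit, float mediumLimit) {
    assert(tilesX > 0 && tilesX % 2 == 0 && tilesY > 0);
    assert(fineLimit <= mediumLimit);
    const int eyeTilesX = tilesX / 2;
    std::vector<std::uint8_t> pattern(static_cast<std::size_t>(tilesX) * tilesY);
    for (int y = 0; y < tilesY; ++y) {
        for (int x = 0; x < tilesX; ++x) {
            const int localX = x % eyeTilesX;  // column within this eye's half
            // Normalized offset from the eye center: 0 at center, 1 at the edge.
            const float nx = ((localX + 0.5f) / eyeTilesX) * 2.0f - 1.0f;
            const float ny = ((y + 0.5f) / tilesY) * 2.0f - 1.0f;
            const float r2 = nx * nx + ny * ny;
            std::uint8_t entry = 2;
            if (r2 <= fineLimit * fineLimit) {
                entry = 0;
            } else if (r2 <= mediumLimit * mediumLimit) {
                entry = 1;
            }
            pattern[static_cast<std::size_t>(y) * tilesX + x] = entry;
        }
    }
    return pattern;
}
```

Since the pattern depends only on the headset, it can be uploaded once and reused for every frame, matching the loop above.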

Use Case: SuperSampling

Variable rate shading also allows shading with more detail than native shading by using supersampling. Traditionally, supersampling could only be applied to the entire render target. The ability to supersample selectively sets variable rate shading apart from standard supersampling. You can perform supersampling selectively at the center of the scene (supersampled foveated rendering) or based on detecting the crucial content in the scene, such as content adaptive shading for rendering high-quality text. Figure 9 shows the shading rate surface populated so that text is shaded in much higher detail by supersampling while the surroundings are shaded coarsely.

Figure 9. Supersampling used for rendering higher quality text. Legend: Blue / Cold → Detailed Shading; Red / Hot → Coarse Shading

Performance Considerations

Variable rate shading operates primarily on the pixel shader load of a scene. The more computationally complex the pixel shaders used, the greater the potential gain. The type of content in the scene also plays a role. For example, scenes that contain greater numbers of large primitives may also see performance improvements using VRS. So real-world performance depends on numerous factors, including pixel shader complexity and type of content. Figure 10 shows potential performance gains using a synthetically-generated pixel load*.

NVIDIA VRWorks Turing VRS performance gains chart
Figure 10. VRS performance gain from simulated pixel shader load. *These values are captured from the VRWorks sample shader using the static lens-optimized shading at the balanced preset. The pixel shader load factor in the graph denotes the lighting loop count in the base pass pixel shader to control the load.

Try Out Variable Rate Shading

NVIDIA Variable Rate Shading is simple to integrate and substantially benefits pixel shading within applications. Learn more about variable rate shading on the VRS developer page.

The NVIDIA Variable Rate Shading interface APIs require NVIDIA graphics driver revision R410 or above. The VRWorks Graphics SDK 3.0 release includes the API and sample applications along with programming guides for NVIDIA developers. We are also working on VRS plugins for major game engines, so stay tuned for more information!

If you’re not currently an NVIDIA developer and want to check out VRWorks, signing up is easy — just click on the “join” button at the top of the main NVIDIA Developer Page.
