Ray Tracing, Your Questions Answered: Types of Ray Tracing, Performance On GeForce GPUs, and More

Real-Time Ray Tracing is the biggest leap in computer graphics in years, bringing realistic lighting, shadows and effects to games, enhancing image quality, gameplay and immersion. At this year’s Game Developer’s Conference, ray tracing was everywhere: Unreal and Unity announced and released engine integrations, developers presented and attended ray tracing panels in droves, and studios demonstrated their latest ray-traced game builds, whilst also sharing insights on how ray tracing can also assist with game development.

Powering all of these experiences and games were GeForce RTX GPUs, which include RT Cores - dedicated hardware designed to deliver the performance needed to run high-fidelity real-time ray traced experiences at high resolutions. September 1st, 2020 Update: We have now unveiled GeForce RTX 30 Series GPUs, with 2nd Generation RT Cores and a raft of other enhancements, that can improve performance by up to 2x. Learn more about the hardware and new RTX games here.

Since the release of our GeForce RTX GPUs we’ve continued to optimize and enhance ray tracing, and have helped developers further improve ray tracing performance in their games. Combined, these efforts have not only accelerated ray tracing performance for all GeForce RTX gamers, but have now also made it possible for gamers with GeForce GTX 1060 6GB and higher GPUs to test drive basic DirectX Raytracing (DXR) effects via a new Game Ready Driver.

Laptops with equivalent Pascal and Turing GPUs can also enable DXR

When we first announced this major update to our DXR support, there were numerous questions from the community about DXR technology and performance. The deep dive below aims to answer the community’s top questions, but if you have additional questions please ask us here.

GeForce RTX and GTX Performance

When we announced the introduction of DXR for GeForce GTX GPUs last month, many of you asked how these GPUs would perform with ray tracing enabled. In short, it varies greatly depending on the game’s general level of performance, the rendering resolution, the game settings, the types of ray tracing used by the game, and the selected ray-tracing quality level.

To understand what’s happening, and how it influences performance, a quick crash course in ray tracing is required.

 

Ray tracing introduces several new workloads (tasks) for the GPU to perform. The first is determining which triangles (geometric primitives) in the game scene that rays will intersect. A tree-based ray tracing acceleration structure called a Bounding Volume Hierarchy, or BVH, is used to calculate where the rays and triangles intersect.

The BVH reduces the number of ray / primitive intersection tests required, but is still a very computationally intensive process. After BVH traversal and other ray tracing operations are completed, a denoising algorithm is applied to improve the visual quality of the resulting image. Denoising also permits fewer total rays to be cast, allowing ray tracing to be performed in real-time at playable framerates. (For more details on basic ray tracing mechanics, please refer to Appendix D of the NVIDIA Turing GPU Architecture Whitepaper.)

The Turing architecture used in GeForce RTX GPUs was designed from the start for DXR-type workloads. Pascal, on the other hand, was launched in 2016 and was designed for DirectX 12. As such, Pascal and older-generation GPUs were not designed like Turing.

RT Cores on GeForce RTX GPUs provide dedicated hardware to accelerate BVH traversal and ray / triangle intersection calculations, dramatically accelerating the ray tracing process. On GeForce GTX hardware, these calculations are performed on the programmable shader cores, a resource shared with many other graphics functions of the GPU.

To see what this means in practice, let’s examine one frame of gameplay from Metro Exodus, which features ray-traced global illumination:

The graphs look at GPU utilization on Pascal; Turing with RT cores disabled (via a special software setting); and Turing with RT cores and DLSS enabled.

On Pascal-architecture GPUs, we see that ray tracing and all other graphics rendering tasks are handled by FP32 Pascal shader cores. This takes longer to perform, meaning the gamer encounters a lower framerate. The Turing architecture introduced INT32 Cores that operate simultaneously alongside FP32 Cores, helping reduce frame time. And with RT Cores and Tensor Cores, execution time shrinks significantly, translating to 2-3x faster in-game performance.

A close-up of a Metro Exodus frame on a GeForce RTX GPU, using RT Cores and Tensor Cores

There are many ways ray tracing can be implemented in games, from reflections to shadows, lighting (global illumination), ambient occlusion, caustics, and even full scene path tracing. And each type has unique performance requirements.

Ray-Traced Reflections

Before the introduction of real-time videogame ray tracing, the best reflection tech available was Screen Space Reflections (SSR). Developers could combine SSR with cubemaps and other techniques to build a more convincing reflection system, though no combination of tech could overcome SSR’s main issues: an inability to reflect off-screen detail and detail viewed at a low angle, and an inability to reflect occluded on-screen detail.

Ray-traced reflections address all those issues, and more, to generate high-resolution, full-scene, real-time reflections that reflect details in front of, behind, above and below the player or camera; details the player couldn’t see with previous techniques.

 

BattlefieldTM V was the first game to launch with ray-traced reflections, and in DICE’s implementation reflections are selectively applied to reflective surfaces such as water and glass.

Rays are cast towards reflective surfaces visible from the player’s camera, rather than from the player’s camera across the entire scene. This greatly reduces the number of rays that have to be cast, which increases performance.

Battlefield V 2560x1440 Ultra DXR performance on GeForce RTX and GeForce GTX GPUs

Click above for 4K and 1920x1080 performance

Advanced Ray-Traced Reflections

If a developer wishes to take ray-traced reflections a step further, they can trace additional reflections in more-complex scenes, and render reflections on curved or imperfect surfaces. Doing so rapidly increases the ray count, tracing complexity and shading difficulty, because light rays now scatter in different directions and with multiple bounces.

A classic example of advanced reflections is the infinitely reflecting hall of mirrors effect, which is impossible to achieve with traditional screen-space reflection techniques.

 

To construct scenes with these effects, Advanced Ray-Traced Reflections are required, greatly increasing the ray tracing workload through the use of additional bounced rays at each intersection point.

The Atomic Heart and Reflections RTX Tech Demos, and 3DMark Port Royal, tested below, feature ray traced reflections, and highlight the benefits of the dedicated RT Cores in attaining smooth framerates.

Ray-Traced Shadows

Video game shadows have come a long way in recent years, adding many features that have brought them closer to replicating the look of shadows in the real world. Until the advent of real-time ray-tracing, our own Hybrid Frustum Traced Shadows (HFTS) technique got closest, but ultimately it and other rasterized techniques were a collection of cleverly-programmed tricks to try and emulate what a shadow should look like.

Developers needed to carefully balance the properties of their traditional shadow maps to maximize accuracy, detail and view distances, whilst avoiding shadow aliasing, shadow acne (erroneous self-shadowing) and shadow detachment (inability to fully connect with the base of the shadow caster to ground it).

Ultimately, there was always a trade-off or visible issue that reduced immersion. But with ray tracing, those issues are no longer a concern. Instead, we cast rays across a scene to realistically account for characters, objects and foliage that block light, resulting in lifelike shadows. And beyond the addition of accurate shadows, we can for the first time support large complex interactions and real-time translucent shadowing at a level of detail far beyond what was previously possible.

 

3DMark Port Royal, the Atomic Heart RTX Tech Demo, Control, the Justice RTX Tech Demo, Shadow of the Tomb Raider, and the Reflections RTX Tech demo all feature ray-traced shadows.

In Shadow of the Tomb Raider, ray tracing introduces five shadow techniques that weren’t feasible to render with traditional techniques.

On “Medium”, the lowest ray tracing preset, there is a minimal amount of ray tracing, applied only to Point Lights that are occasionally encountered in the world, such as candles. For this reason, we recommend “High”, as additional ray-traced shadows are rendered throughout all levels of the game, having a continually-noticeable impact on realism.

The difficulty of ray tracing shadows increases with the number and type of the light sources, and also the complexity of the objects causing the shadow. Even for a single point, light sources at different angles must be traced separately to show the right shadow: the more angles and lights, the more rays needed.

The performance of ray-traced shadows in Shadow of the Tomb Raider, using the High DXR settings, are demonstrated in the charts below:

Shadow of the Tomb Raider 2560x1440 High DXR performance on GeForce RTX and GeForce GTX GPUs

Click above for 4K and 1920x1080 performance

Ray-Traced Global Illumination

The real-time ray tracing capabilities of GeForce RTX GPUs also enable us to more-accurately model the effect of light bouncing off of surfaces in a scene, giving developers the power to add Ray-Traced Indirect Diffuse Global Illumination to their games.

If you’re unfamiliar with the term “Global Illumination”, it describes the process of computing all the light interactions in a scene, including the indirect contributions resulting from light bouncing off from one surface to another. Before now, it was commonly achieved with precomputed lightmaps, Image-Based Light Probes, Spherical Harmonics, and Reflective Shadowmaps, plus artist-placed lights to help force illumination where the aforementioned techniques fail.

These techniques had several shortcomings, the biggest of which being that dynamic lighting failed to bounce or illuminate beyond the area that the light hit.

For example, imagine a dark room with bright light shining through a window. With traditional techniques, everything that is directly hit by the light is illuminated, but the illuminated areas themselves do not bounce light, and do not illuminate surrounding game elements, when in reality they would.

With ray tracing, we now have the ability to more accurately model the dynamic indirect diffuse lighting reflected by one or more indirect bounces off of surfaces in the scene. This enables developers to craft dynamic scenes with more-realistic indirect lighting that updates in real-time as lighting changes and events occur in the game world.

In other words, the light bounces naturally, illuminating and brightening surrounding detail. And if the sun moves or the window shades open, the room’s lighting realistically changes; enabling you to see the room in an entirely new light.

Metro Exodus was the first game to integrate real-time ray-traced global illumination technology, enabling developer 4A Games to craft dynamic scenes with more-realistic indirect diffuse lighting that updates in real-time as lighting changes, and events occur in the game world.

In Metro Exodus, rays are cast per pixel, making ray-traced global illumination more performance-intensive than limited ray-traced reflections, where rays can be cast more-selectively. The cost is worth it though, as affirmed by tech site DigitalFoundry, who stated that, “ray tracing provides some simply spectacular 'next level' moments.”

For GeForce GTX GPUs, the extra workload of RTGI results in a sizeable performance impact:

Metro Exodus 2560x1440 Ultra DXR performance on GeForce RTX and GeForce GTX GPUs

Click above for 4K and 1920x1080 performance

Ray-Traced Caustics

Caustics refers to the focusing of light rays reflected or refracted by a curved surface or object, or the projection of that envelope of rays on another surface. This is commonly seen when the waves on a water surface bend and focus light into changing patterns of bright and dark areas. Rendering them in games has required shortcuts and approximations to achieve aesthetically-pleasing real-time results, but these were never accurate or realistic.

With DXR ray tracing, developers can finally render realistic caustics, and in Justice developer NetEase has done just that (for further details, check out the free, sponsored Justice and Caustic GDC sessions here).

 

In addition to caustics, the Justice RTX tech demo also features ray-traced reflections and shadows, greatly enhancing the appearance of its scenes, as you can see in the video above.

Justice NVIDIA RTX Tech Demo 2560x1440 DXR performance on GeForce RTX and GeForce GTX GPUs

Click above for 4K and 1920x1080 performance

Real Time Ray Tracing: Momentum Continues

There’s far more to come in the world of ray tracing, this year and in the future - Unreal Engine and Unity have added ray tracing tech and tools, enabling developers worldwide to craft ray-traced experiences; Remedy Entertainment is adding ray-tracing to the upcoming Control; the eagerly-anticipated Vampire: The Masquerade - Bloodlines 2 will utilize ray-tracing when released in 2020; we’re working on  ray-tracing enhancements for the classic Quake II; and numerous developers are actively integrating ray tracing to games launching this year, next year, and in the years beyond. For word on those, and everything else ray-traced, stay tuned to GeForce.com.

In the meantime, be sure to download and try NVIDIA RTX Technology demos for Atomic Heart, Justice and Reflections for further examples of how ray tracing can enhance games.

We hope this comprehensive article answered all your questions about DXR support on GeForce GPUs. If not, please ask your follow-ups here.