Real-Time Ray Tracing Arrives
We test three of the new NVIDIA Quadro RTX GPUs.
June 1, 2019
At SIGGRAPH 2018, NVIDIA CEO Jensen Huang took the stage to unveil his company’s new Turing graphics processing unit (GPU) architecture and to announce three initial Turing-based products: the NVIDIA Quadro RTX 8000, Quadro RTX 6000 and Quadro RTX 5000 GPUs. According to NVIDIA, Turing is the first GPU architecture capable of bringing artificial intelligence (AI) and real-time ray tracing to design professionals.
The new Turing streaming multiprocessor (SM) architecture includes up to 4,608 compute unified device architecture (CUDA) cores to accelerate complex simulation of real-world physics; new RT cores to enable real-time ray tracing of objects and environments with physically accurate shadows, reflections, refractions and global illumination; and new Turing Tensor cores to accelerate deep neural network training and inference, which are critical to powering AI-enhanced rendering, products and services.
“Turing is NVIDIA’s most important innovation in computer graphics in more than a decade,” said Huang during his SIGGRAPH announcement. “Hybrid rendering will change the industry, opening up amazing possibilities that enhance our lives with more beautiful designs, richer entertainment and more interactive experiences. The arrival of real-time ray tracing is the Holy Grail of our industry.”
New Technologies
By combining two fundamentally new technologies (ray-tracing acceleration with the RT core and deep learning with the Tensor core), NVIDIA has delivered real-time ray tracing nearly 10 years sooner than anyone had anticipated. The new RTX boards also incorporate ultra-fast Samsung GDDR6 memory to support more complex designs, massive architectural datasets and up to 8K movie content.
After announcing these new technologies and unveiling the three new GPUs at SIGGRAPH, NVIDIA then introduced the next member of the family, the RTX 4000, in November at Autodesk University.
All four new RTX boards are full-height, full-length boards and feature 3D stereo support via a stereo connector. The new GPUs are also the first to include a VirtualLink connector.
VirtualLink is an open industry standard developed to meet the connectivity requirements of current and next-generation virtual reality headsets. Essentially a USB Type-C port, VirtualLink delivers the power, display and data that virtual reality headsets require through a single USB-C connector. This simplifies the setup of VR headsets by replacing multiple connection cables with a single lightweight cable.
All four boards can drive up to four 4K displays at 120Hz, four 5K displays at 60Hz or two 8K displays at 60Hz. You can also add an NVIDIA Quadro Sync II board to connect as many as four Quadro GPUs in a single workstation to support up to 16 displays or projectors per system, or use two Quadro Sync II boards to connect eight Quadro GPUs to support video walls with up to 32 displays.
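The three display configurations listed here work out to roughly the same total pixel throughput, which a quick calculation makes visible. The sketch below is illustrative only: the pixel dimensions for 4K, 5K and 8K are assumed (3840x2160, 5120x2880 and 7680x4320), since the article gives only the marketing names, and the function name is hypothetical.

```python
# Assumed pixel dimensions; the article names the modes but not the resolutions.
RESOLUTIONS = {
    "4K": (3840, 2160),
    "5K": (5120, 2880),
    "8K": (7680, 4320),
}

# (mode, number of displays, refresh rate in Hz) as listed in the article.
CONFIGS = [("4K", 4, 120), ("5K", 4, 60), ("8K", 2, 60)]


def pixels_per_second(mode: str, displays: int, refresh_hz: int) -> int:
    """Total pixels refreshed per second across all displays in a configuration."""
    width, height = RESOLUTIONS[mode]
    return width * height * refresh_hz * displays


for mode, displays, hz in CONFIGS:
    rate = pixels_per_second(mode, displays, hz)
    print(f"{displays} x {mode} @ {hz}Hz: {rate / 1e9:.2f} billion pixels/second")
```

Under these assumed resolutions, the four-display 4K and two-display 8K configurations come out identical, and the 5K configuration slightly lower.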
The Four Boards in Detail
With 4,608 CUDA cores, 576 Tensor cores and 72 RT cores, the RTX 8000 and RTX 6000 are essentially the same, the only difference being the amount of discrete memory. The Quadro RTX 8000 ($5,500 estimated market price) includes 48GB of GDDR6 memory while the RTX 6000 ($4,000) includes 24GB. With a 384-bit interface, both boards can deliver a memory bandwidth of up to 672 GB/second. Both can achieve 16.3 trillion single-precision floating-point operations per second (16.3 TFLOPS) and trace as many as 10 billion rays per second.
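As a rough sanity check on the throughput figure, peak single-precision TFLOPS is conventionally computed as CUDA cores x 2 x clock rate, where the factor of 2 counts a fused multiply-add as two operations. The sketch below inverts that relation to estimate the clock rate the quoted figures imply. The two-operations-per-clock convention is an assumption not stated in the article, the TFLOPS figure is read as 16.3 x 10^12 operations per second, and the function name is hypothetical.

```python
def implied_clock_ghz(tflops: float, cuda_cores: int, ops_per_core_per_clock: int = 2) -> float:
    """Clock rate (GHz) implied by a peak-TFLOPS figure and a CUDA core count.

    Assumes each core retires one fused multiply-add per clock, counted as
    two floating-point operations (an assumption, not from the article).
    """
    ops_per_second = tflops * 1e12  # tera = 10**12
    return ops_per_second / (cuda_cores * ops_per_core_per_clock) / 1e9


# Figures quoted in the article for the Quadro RTX 8000 and RTX 6000.
clock = implied_clock_ghz(tflops=16.3, cuda_cores=4608)
print(f"Implied clock: {clock:.2f} GHz")
```

Under that assumption, the quoted figures imply a clock of roughly 1.77 GHz, a derived estimate rather than a published specification.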
The Quadro RTX 8000 and RTX 6000 are dual-slot boards. Their 295-watt power consumption means that they also require 14-pin auxiliary power connections. Both provide four DisplayPort 1.4 connectors as well as a VirtualLink connector. You can also use NVIDIA NVLink to scale performance by linking two RTX 8000 or RTX 6000 GPUs.
With an estimated selling price of $2,300, the NVIDIA Quadro RTX 5000 costs less than the previous-generation P5000. It offers the same 16GB of memory and features the same 256-bit memory interface as its predecessor but uses the new Samsung GDDR6 memory rather than the previous generation’s GDDR5X memory, thus increasing the memory bandwidth from 288 GB/second to 448 GB/second. With 3,072 CUDA cores, 384 Tensor cores and 48 RT cores, the Quadro RTX 5000 can perform at 11.2 TFLOPS and trace 8 billion rays per second.
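The bandwidth jump follows from the memory's per-pin data rate, since peak bandwidth in GB/second equals bus width in bits times the per-pin rate in Gbit/second, divided by 8. The sketch below inverts that relation using only the bandwidth and bus-width figures quoted above. The function names are hypothetical, and the resulting data rates are derived estimates rather than published specifications.

```python
def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbit/s) implied by peak bandwidth and bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


def percent_increase(old: float, new: float) -> float:
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


BUS_WIDTH_BITS = 256  # same interface on both boards, per the article

p5000_rate = implied_data_rate_gbps(288, BUS_WIDTH_BITS)    # GDDR5X
rtx5000_rate = implied_data_rate_gbps(448, BUS_WIDTH_BITS)  # GDDR6

print(f"P5000:    {p5000_rate:.1f} Gbit/s per pin")
print(f"RTX 5000: {rtx5000_rate:.1f} Gbit/s per pin")
print(f"Bandwidth increase: {percent_increase(288, 448):.0f}%")
```

With the bus width unchanged, the roughly 56% bandwidth gain comes entirely from the faster memory.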
The RTX 5000 is also a dual-slot board, and its 265-watt power consumption also requires a 14-pin auxiliary power connection. Like the RTX 8000 and 6000, the Quadro RTX 5000 provides four DisplayPort 1.4 connections as well as a VirtualLink connector. The RTX 5000 also includes an NVLink connector to link two RTX 5000 boards.
The RTX 4000 is a single-slot board, with an estimated street price of $900. While it includes the same 8GB of memory as the P4000 and features the same 256-bit memory interface, the GDDR6 memory in the new board allows the RTX 4000 to achieve a memory bandwidth of up to 416 GB/second. The new GPU includes 2,304 CUDA cores, 288 Tensor cores and 36 RT cores, enabling it to perform at 7.1 TFLOPS and trace 6 billion rays per second.
With its 160-watt power consumption, the RTX 4000 requires an eight-pin auxiliary power connection. Although the GPU provides only three DisplayPort 1.4 connections, you can use the USB-C VirtualLink connector to drive a fourth display. The RTX 4000 does not include an NVLink connector, but when combined with a single Quadro Sync II card, up to four RTX 4000 GPUs can be seamlessly synchronized inside a workstation to drive a 16-display video wall from a single system.
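Comparing the core counts quoted for the boards shows a fixed ratio between the three core types, consistent with each Turing streaming multiprocessor pairing a set of CUDA and Tensor cores with one RT core. The sketch below computes those ratios from the article's figures. Treating the RT core count as a proxy for the SM count is an assumption made for illustration, and the data structure and function name are hypothetical.

```python
from typing import NamedTuple, Tuple


class Board(NamedTuple):
    name: str
    cuda_cores: int
    tensor_cores: int
    rt_cores: int


# Core counts as quoted in the article.
BOARDS = [
    Board("Quadro RTX 8000/6000", 4608, 576, 72),
    Board("Quadro RTX 5000", 3072, 384, 48),
    Board("Quadro RTX 4000", 2304, 288, 36),
]


def cores_per_rt_core(board: Board) -> Tuple[int, int]:
    """CUDA and Tensor cores per RT core (taken here as a proxy for one SM)."""
    return board.cuda_cores // board.rt_cores, board.tensor_cores // board.rt_cores


for board in BOARDS:
    cuda, tensor = cores_per_rt_core(board)
    print(f"{board.name}: {cuda} CUDA and {tensor} Tensor cores per RT core")
```

All three configurations give the same 64:8:1 ratio, so the boards differ in how many of these units they carry rather than in how each unit is built.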
Testing the New NVIDIA RTX Boards
Three of these new NVIDIA GPUs—the Quadro RTX 6000, RTX 5000 and RTX 4000—recently arrived in our office. We quickly set about the task of testing these new boards, using SPECviewperf version 13 (spec.org) in an @Xi workstation (xicomputer.com) equipped with an Intel Core i9-9920X 12-core CPU (with all of its cores over-clocked to 4.3GHz) and 32GB of memory, running Windows 10 Pro.
We also retested the two previous-generation NVIDIA Quadro graphics boards—the Quadro P5000 and P4000—in this same workstation, so you can make direct performance comparisons. We did not have a Quadro P6000 board available to perform a similar comparison.
We also performed additional tests using SolidWorks 2019 and SolidWorks Visualize. Our results provide convincing proof of the performance improvements achieved by this new generation of NVIDIA GPUs. Each new board outperformed the next higher level of the previous generation. For example, the NVIDIA Quadro RTX 4000 clearly surpassed the performance of the previous-generation Quadro P5000.
Of course, like all other NVIDIA Quadro boards, the Quadro RTX 6000, RTX 5000 and RTX 4000 are fully certified with most CAD and digital content creation applications. All of the boards in the Quadro line use the same unified video driver and support 64-bit versions of Microsoft Windows 10, 8.1 and 7; Windows Server 2008, 2012, 2016 and 2019; and Linux, Solaris x86 and FreeBSD. The boards are also backed by a three-year warranty.
As we have seen time and time again, NVIDIA has introduced new GPUs that deliver unprecedented levels of performance. But this time, the company also set the stage for dramatic improvements in rendering, visualization and simulation performance once the new boards are fully supported by popular software applications.
About the Author
David Cohn is a consultant and technical writer based in Bellingham, WA, and has been benchmarking PCs since 1984. He is a Contributing Editor to Digital Engineering, the former senior content manager at 4D Technologies, and the author of more than a dozen books. Email at [email protected] or visit his website at www.dscohn.com.