CUDA Spotlight: CUDA-Accelerated Adventures

This week’s Spotlight is on Cyrille Favreau and Christophe Favreau, brothers who leverage GPU computing in different ways, with equally compelling results.

Cyrille, a technical architect by day, uses CUDA in his free time to pursue his interest in visualization technologies. His projects include building a real-time ray-tracing engine and molecule visualizer, and exploring fractal theory.

Christophe, a professional photographer and videographer, is passionate about sailing and nature. GPUs help him produce beautiful work as he travels the world.

Following is an excerpt from our interview (you can read the complete Spotlight here).

NVIDIA: Cyrille, when did you first start using GPUs?
Cyrille: As a technical architect, I am always looking for new solutions to problems. In 2009, I discovered CUDA and GPU computing and that took me to a whole new world. I could see that massively parallel architectures were about to shake the foundations of traditional programming.

NVIDIA: How have you used CUDA to pursue your passion for visualization projects?
Cyrille: My ray-tracing engine, called SoL-R for Speed of Light Ray-tracer, and my molecule visualizer initially started as simple learning projects to help me understand GPUs. But programming on GPUs became so exciting that I kept adding new functionality. I have a number of projects in the pipeline now, such as coupling SoL-R with the Oculus Rift virtual reality headset, and exploring fractal mathematics applied to financial data.

NVIDIA: Tell us more about SoL-R.
Cyrille: SoL-R is built on a multi-session architecture that allows multiple simultaneous users to interact with the same instance of the engine. Each session is designed to use a different stream on the GPU, taking full advantage of NVIDIA Hyper-Q technology. Concurrent kernel execution is an important GPU feature that the SoL-R engine uses extensively to maximize device occupancy and deliver optimal performance.
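The per-session stream design described above can be sketched with the CUDA runtime API. This is an illustrative sketch only; the kernel and variable names are assumptions, not SoL-R source. Kernels launched into distinct streams may run concurrently on the device, which is what Hyper-Q enables on Kepler-class GPUs.

```cuda
#include <cuda_runtime.h>

// Stand-in for a per-session render kernel (not SoL-R's actual kernel).
__global__ void renderSession(float* frame, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) frame[i] = 0.0f;  // placeholder for per-pixel ray-tracing work
}

int main() {
    const int kSessions = 4;
    const int kPixels = 1 << 20;
    cudaStream_t streams[kSessions];
    float* frames[kSessions];

    for (int s = 0; s < kSessions; ++s) {
        cudaStreamCreate(&streams[s]);                    // one stream per session
        cudaMalloc(&frames[s], kPixels * sizeof(float));  // one frame buffer per session
    }

    // Launches in distinct streams are independent, so Hyper-Q can
    // schedule them concurrently instead of serializing them.
    for (int s = 0; s < kSessions; ++s)
        renderSession<<<(kPixels + 255) / 256, 256, 0, streams[s]>>>(frames[s], kPixels);

    for (int s = 0; s < kSessions; ++s) {
        cudaStreamSynchronize(streams[s]);
        cudaFree(frames[s]);
        cudaStreamDestroy(streams[s]);
    }
    return 0;
}
```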

In ray-tracing, recursion is key. In the current implementation of the SoL-R engine, parallelism is done at the pixel level. With dynamic parallelism, I am hoping to take parallelism down to the ray level. I expect this new design to increase GPU occupancy and improve scalability. Dynamic parallelism also helps build acceleration structures efficiently. By subdividing the 3D space into smaller volumes (bounding boxes), the number of costly ray-object intersections computed by each ray can be dramatically decreased.

NVIDIA: What advice would you like to share with other developers?
Cyrille: I started programming on the CUDA platform on a GeForce GTX 260, and then moved to a GTX 480. I now use a Quadro K6000 on my desktop and a GT 650M on my laptop. Testing on different cards is essential to make sure that kernel launch parameters are tuned appropriately for each device.

GeForce and Tesla devices differ in core clock speeds and core counts, and the NVIDIA Visual Profiler is an extremely powerful tool for validating that my algorithms are suitable for both types of cards.

Back in the early days of CUDA, debugging on the GPU was not possible. The arrival of NVIDIA Nsight, with its ability to debug on a single GPU, has dramatically sped up development time and made the technology easier to adopt.

One of the challenges of designing real-time programs is striking a balance between data-structure simplicity and performance. For example, SoL-R uses memory indices extensively, and as the code grows and the algorithms become more complex, this can be a source of invalid memory accesses (the SoL-R main kernel is ~3,000 lines of code). The CUDA Memory Checker makes it much easier to identify these bugs.

My experience as a technical architect has taught me that the ease with which a library can be integrated into an existing project is sometimes as important as the features the library offers. For this reason, the SoL-R engine implements a subset of the OpenGL API, making it easy to switch instantly from rasterization to ray-tracing. Still, a number of OpenGL features are used behind the scenes through the OpenGL-CUDA interop. In short, CUDA provides a coherent set of features which work together as a whole to get the best out of GPUs.

NVIDIA: Christophe, tell us about your photography business.
Christophe: It is a very exciting but also very challenging environment. The digital era has completely redefined the rules and computer work is now a big part of the process….

NVIDIA: How do GPUs help you produce great photos?
Christophe: As a professional photographer, I am always on the move, and having constant access to high-end workstations is just not possible. The development of GPUs, along with the ability of industry software to leverage GPU architectures, changed my life. Since I work with very high definition images, I need significant computing power. For many years, interactivity was not even thinkable, but today I can pan, zoom, rotate, and apply filters on very large files on a laptop that I can take everywhere. I am currently using Photoshop CS5 on a Windows laptop equipped with an NVIDIA GeForce GT 650M. The user experience is amazing.

Read the full interview. Read more CUDA Spotlights.


About Calisa Cole

Calisa joined NVIDIA in 2003 and currently focuses on marketing related to CUDA, NVIDIA's parallel computing architecture. Previously she ran Cole Communications, a PR agency for high-tech startups. She majored in Russian Studies at Wellesley and earned an MA in Communication from Stanford. Calisa is married and the mother of three boys. Her favorite non-work activities are fiction writing and playing fast games of online Scrabble.