CUDA Spotlight: CUDA-Accelerated Adventures

This week’s Spotlight is on Cyrille Favreau and Christophe Favreau, brothers who leverage GPU computing in different ways, with equally compelling results.

Cyrille, a technical architect by day, uses CUDA in his free time to pursue his interest in visualization technologies. His projects include building a real-time ray-tracing engine and molecule visualizer, and exploring fractal theory.

Christophe, a professional photographer and videographer, is passionate about sailing and nature. GPUs help him produce beautiful work as he travels the world.

NVIDIA: Cyrille, when did you first start using GPUs?
Cyrille: As a technical architect, I am always looking for new solutions to problems. In 2009, I discovered CUDA and GPU computing and that took me to a whole new world. I could see that massively parallel architectures were about to shake the foundations of traditional programming.

NVIDIA: How have you used CUDA to pursue your passion for visualization projects?
Cyrille: My ray-tracing engine, called SoL-R for Speed of Light Ray-tracer, and my molecule visualizer initially started as simple learning projects to help me understand GPUs. But programming on GPUs became so exciting that I kept adding new functionality. I have a number of projects in the pipeline now, such as coupling SoL-R with the Oculus Rift virtual reality headset, and exploring fractal mathematics applied to financial data.

NVIDIA: Tell us more about SoL-R.
Cyrille: SoL-R is built on a multi-session architecture that allows multiple simultaneous users to interact with the same instance of the engine. Each session is designed to use a different stream on the GPU, taking full advantage of NVIDIA Hyper-Q technology. Concurrent kernel execution is an important GPU feature that the SoL-R engine uses extensively to maximize device occupancy and deliver optimal performance.

In ray-tracing, recursion is key. In the current implementation of the SoL-R engine, parallelism is applied at the pixel level. With dynamic parallelism, I am hoping to take parallelism down to the ray level. I expect this new design to increase GPU occupancy and improve scalability. Dynamic parallelism also helps build acceleration structures efficiently. By subdividing the 3D space into smaller volumes (bounding boxes), the number of costly ray-object intersections computed for each ray can be dramatically decreased.
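To illustrate why bounding boxes reduce the intersection work, here is a minimal sketch of the classic "slab" test for a ray against an axis-aligned box. It is written in plain C++ so it runs on the CPU, and it is not taken from SoL-R: the names `Vec3`, `Ray`, `Aabb`, `slab`, and `hitsBox` are hypothetical. The assumption is that a ray-tracer would run a cheap test like this first and only evaluate the costly ray-object intersections for objects inside boxes the ray actually hits.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

struct Ray {
    Vec3 origin;
    Vec3 dir;  // direction; does not need to be normalized
};

struct Aabb {
    Vec3 lo;  // minimum corner
    Vec3 hi;  // maximum corner
};

// Clips the ray's parameter interval [tMin, tMax] against one pair of
// axis-aligned planes (a "slab"). Returns false once the interval is empty.
inline bool slab(float o, float d, float lo, float hi, float& tMin, float& tMax) {
    if (d == 0.0f) {
        // Ray is parallel to this slab: it can only hit if it starts inside.
        return o >= lo && o <= hi;
    }
    float t0 = (lo - o) / d;
    float t1 = (hi - o) / d;
    if (t0 > t1) std::swap(t0, t1);
    tMin = std::max(tMin, t0);
    tMax = std::min(tMax, t1);
    return tMin <= tMax;
}

// True if the ray hits the box at or in front of its origin.
inline bool hitsBox(const Ray& r, const Aabb& b) {
    assert(b.lo.x <= b.hi.x && b.lo.y <= b.hi.y && b.lo.z <= b.hi.z);
    float tMin = 0.0f;  // ignore intersections behind the origin
    float tMax = std::numeric_limits<float>::infinity();
    return slab(r.origin.x, r.dir.x, b.lo.x, b.hi.x, tMin, tMax)
        && slab(r.origin.y, r.dir.y, b.lo.y, b.hi.y, tMin, tMax)
        && slab(r.origin.z, r.dir.z, b.lo.z, b.hi.z, tMin, tMax);
}
```

A rejected box costs a handful of comparisons, so every object it contains is skipped without running its own intersection routine. In a CUDA build, functions like these would typically be marked `__device__` and called per thread.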

NVIDIA: What advice would you like to share with other developers?
Cyrille: I started programming on the CUDA platform on a GeForce GTX260, and then moved to a GTX480. I now use a Quadro K6000 on my desktop and a GT650M on my laptop. Testing on different cards is essential to make sure that kernel runtime parameters are properly utilized.
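As a rough illustration of what "kernel runtime parameters" can involve, the sketch below computes a 2D launch configuration for one thread per pixel and shrinks the block when it exceeds a device's per-block thread limit. It is plain C++ with hypothetical names (`Dim2`, `ceilDiv`, `clampBlock`, `gridFor`) and is not SoL-R code. The assumption is that the limit would be read at run time from the `maxThreadsPerBlock` field of `cudaDeviceProp`, since it differs between GPU generations.

```cpp
#include <cassert>

struct Dim2 { unsigned x, y; };

// Ceiling division: number of blocks of size d needed to cover n items.
inline unsigned ceilDiv(unsigned n, unsigned d) {
    assert(d > 0);
    return (n + d - 1) / d;
}

// Halves the larger block dimension until the block fits the device limit.
inline Dim2 clampBlock(Dim2 block, unsigned maxThreadsPerBlock) {
    assert(block.x > 0 && block.y > 0 && maxThreadsPerBlock > 0);
    while (block.x * block.y > maxThreadsPerBlock) {
        if (block.x >= block.y && block.x > 1) {
            block.x /= 2;
        } else {
            block.y /= 2;
        }
    }
    return block;
}

// Grid dimensions so that every pixel of a width x height image is covered
// by at least one thread (edge blocks may be partially idle).
inline Dim2 gridFor(unsigned width, unsigned height, Dim2 block) {
    return Dim2{ceilDiv(width, block.x), ceilDiv(height, block.y)};
}
```

Because edge blocks can overshoot the image, a kernel launched with this grid would still need to check that its pixel coordinates fall inside the image before writing.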

GeForce and Tesla devices have different approaches in terms of core speed and numbers, and the NVIDIA Visual Profiler is an extremely powerful tool to validate that my algorithms are suitable for both types of cards.

Back in the early days of CUDA, debugging on the GPU was not possible. The arrival of NVIDIA Nsight, and with it the ability to debug on a single GPU, dramatically speeds up development and makes it easier to adopt the technology.

One of the challenges of designing real-time programs is to maintain a compromise between data structure simplicity and performance. For example, SoL-R uses memory indices extensively, and as the code develops and the algorithms become more complex this can be a source of invalid memory accesses (the SoL-R main kernel is ~3,000 lines of code). The CUDA Memory Checker makes it much easier to identify these bugs.
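The kind of bug described here can be sketched with a small index-based scene layout. This is a hypothetical example in plain C++, not SoL-R's data structures: `Primitive`, `Material`, `Scene`, `materialFor`, and `assertSceneIndices` are invented names. It assumes primitives refer to materials by integer index, so a stale or corrupted index becomes an out-of-bounds read unless it is checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Material { float reflectivity; };
struct Primitive { int materialIndex; };  // index into Scene::materials

struct Scene {
    std::vector<Primitive> primitives;
    std::vector<Material> materials;
};

// Checked lookup: returns nullptr instead of reading out of bounds.
inline const Material* materialFor(const Scene& s, std::size_t primitiveIndex) {
    if (primitiveIndex >= s.primitives.size()) return nullptr;
    const int m = s.primitives[primitiveIndex].materialIndex;
    if (m < 0 || static_cast<std::size_t>(m) >= s.materials.size()) return nullptr;
    return &s.materials[static_cast<std::size_t>(m)];
}

// Debug-time validation of every stored index, e.g. before uploading the
// scene to the GPU. Compiled out when NDEBUG is defined.
inline void assertSceneIndices(const Scene& s) {
    for (std::size_t i = 0; i < s.primitives.size(); ++i) {
        assert(materialFor(s, i) != nullptr);
    }
}
```

Validating indices once on the host keeps the per-pixel code simple, while a memory checker remains useful for catching the accesses that slip past such checks.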

My experience as a technical architect has taught me that the ease with which a library can be integrated into an existing project is sometimes as important as the features the library offers. For this reason, the SoL-R engine implements a subset of the OpenGL API, making it easy to switch instantly from rasterization to ray-tracing. Still, a number of OpenGL features are used behind the scenes through the OpenGL-CUDA interop. In short, CUDA provides a coherent set of features which work together as a whole to get the best out of GPUs.

NVIDIA: Christophe, tell us about your photography business.
Christophe: It is a very exciting but also very challenging environment. The digital era has completely redefined the rules and computer work is now a big part of the process….

NVIDIA: How do GPUs help you produce great photos?
Christophe: As a professional photographer, I am always on the move, and having constant access to high-end workstations is just not possible. The development of GPUs, along with the ability of industry software to leverage GPU architectures, changed my life. Since I work with very high definition images, I need significant computing power. For many years, interactivity was not even thinkable, but today I can pan, zoom, rotate, and apply filters on very large files on a laptop that I can take everywhere. I am currently using Photoshop CS5 on a Windows laptop equipped with an NVIDIA GeForce 650M. The user experience is amazing. When I am shooting a sailing event, as soon as the races are over I can transfer the pictures to my laptop and immediately rework them as necessary. And since my customers are still on site, this is clearly a huge commercial advantage.

2012 Nespresso International 18 Skiff Regatta (Courtesy Christophe Favreau)

NVIDIA: Do you also use GPU computing for videography?
Christophe: Yes. In the past, I used to spend ages converting files from one format to another. Real-time editing was a huge productivity problem. The constraints were so high that I gave up this activity at one point.

Now, thanks to GPUs and Adobe Premiere Pro 5.5, video editing has become a genuine pleasure. The ability to process high definition sequences interactively has brought me back to video editing.

NVIDIA: Cyrille, what’s next for you?
Cyrille: For my SoL-R project, my intention is to make it available to the scientific community in the near future. SoL-R is designed to use the same API as OpenGL and, when completed, will allow an instant switch from rasterization to ray-tracing. There is still a lot to do but the general architecture is settled, and it’s only a question of time until it becomes a production-ready library. Stay tuned for more info on this.

Additionally, my website has become an incredible channel for me to connect with people who share my passion for visualization and high-performance computing. I plan to continue posting on my site to make it an interesting and valuable source of information.

In recognition of Rob Farber’s famous Dr. Dobb’s series, I created the LinkedIn group Supercomputing for the Masses. It now has more than 700 members! The purpose is to promote the use of GPUs in the IT industry. I’d like to see the group continue to grow and spark discussions.

NVIDIA: Christophe, what’s on the horizon for you?
Christophe: In addition to my photographic work, I am working with a programmer to create an immersive virtual gallery of my portfolio. This application, which will leverage Oculus Rift, can exist only because the underlying GPU is powerful enough to produce real-time OpenGL rendering on a laptop.