matlab_logo

Prototyping Algorithms and Testing CUDA Kernels in MATLAB

This guest post by Daniel Armyr and Dan Doherty from MathWorks describes how you can use MATLAB to support your development of CUDA C and C++ kernels. You will need MATLAB, Parallel Computing Toolbox™, and Image Processing Toolbox™ to run the code. You can request a trial of these products at www.mathworks.com/trial. For a more detailed description of this workflow, refer to the MATLAB for CUDA Programmers webinar and associated demo files.

NVIDIA GPUs are becoming increasingly popular for large-scale computations in image processing, financial modeling, signal processing, and other applications—largely due to their highly parallel architecture and high computational throughput. The CUDA programming model lets programmers exploit the full power of this architecture by providing fine-grained control over how computations are divided among parallel threads and executed on the device. The resulting algorithms often run much faster than traditional code written for the CPU.

While algorithms written for the GPU are often much faster, the process of building a framework for developing and testing them can be time-consuming. Many programmers write CUDA kernels integrated into C or Fortran programs for production. For this reason, they often use these languages to iterate on and test their kernels, which requires writing significant amounts of “glue code” for tasks such as transferring data to the GPU, managing GPU memory, initializing and launching CUDA kernels, and visualizing kernel outputs. This glue code is time-consuming to write and may be difficult to change if, for example, you want to run the kernel on different input data or visualize kernel outputs using a different type of plot.

Using an image white balancing example, this article describes how MATLAB® supports CUDA kernel development by providing a language and development environment for quickly evaluating kernels, analyzing and visualizing kernel results, and writing test harnesses to validate kernel results. Continue reading