In 2012 alone, over 8.7 billion ARM-based chips were shipped worldwide. Many developers of GPU-accelerated applications are planning to port their applications to ARM platforms, and some have already started. I recently chatted about this with John Stone, the lead developer of VMD, a high performance (and CUDA-accelerated) molecular visualization tool used by researchers all over the world. But first … some exciting news.
To help developers working with ARM-based computing platforms, we are excited to announce the public availability of the CUDA Toolkit version 5.5 Release Candidate (RC) with support for the ARM CPU architecture. This latest release of the CUDA Toolkit includes support for the following features and functionality on ARM-based platforms.
- The CUDA C/C++ compiler (nvcc), debugging tools (cuda-gdb and cuda-memcheck), and the command-line profiler (nvprof). (Support for the NVIDIA Visual Profiler and NSight Eclipse Edition to come; for now, I recommend capturing profiling data with nvprof and viewing it in the Visual Profiler.)
- Native compilation on ARM CPUs, for fast and easy application porting.
- Fast cross-compilation on x86 CPUs, which reduces development time for large applications by enabling developers to compile ARM code on faster x86 processors, and then deploy the compiled application on the target computer.
- GPU-accelerated libraries including CUFFT (FFT), CUBLAS (linear algebra), CURAND (random number generation), CUSPARSE (sparse linear algebra), and NPP (NVIDIA Performance primitives for signal and image processing).
- Complete documentation, code samples, and more to help developers quickly learn how to take advantage of GPU-accelerated parallel computing on ARM-based systems.