Software development is as much about writing code fast as it is about writing fast code, and central to rapid development is software reuse and portability. When building heterogeneous applications, developers must be able to share code between projects, platforms, compilers, and target architectures. Ideally, libraries of domain-specific code should be easily retargetable.
In this post I’ll talk about Hemi, a simple open-source C++ header library that simplifies writing portable CUDA C/C++ code. In the screenshot below, both columns show a simple Black-Scholes code written to be compilable with either NVCC or a standard C++ host compiler, and also runnable on either the CPU or a CUDA GPU. The right column is written using Hemi’s macros and smart heterogeneous Array container class, hemi::Array. Using Hemi, the length and complexity of this code is reduced by half. Continue reading