The ArrayFire library is a high-performance software library with a focus on portability and productivity. It supports highly tuned, GPU-accelerated algorithms using an easy-to-use API. ArrayFire wraps GPU memory into a simple “array” object, enabling developers to process vectors, matrices, and volumes on the GPU using high-level routines, without having to get involved with device kernel code.

## ArrayFire Capabilities

ArrayFire is an open source C/C++ library, with language bindings for R, Java and Fortran. ArrayFire has a range of functionality, including

- standard math functions;
- image processing functions;
- accelerated core algorithms: reduction, scan, sort etc.;
- statistics;
- set operations;
- BLAS linear algebra routines;
- FFT algorithms;
- and more.

ArrayFire has three back ends to enable portability across many platforms: CUDA, OpenCL and CPU. It even works on embedded platforms like NVIDIA’s Jetson TK1.

In a past post about ArrayFire we demonstrated the ArrayFire capabilities and how you can increase your productivity by using ArrayFire. In this post I will tell you how you can use ArrayFire to exploit various kind of parallelism on NVIDIA GPUs. Continue reading