CUDACasts Episode #12: Programming GPUs using CUDA Python

So far in the CUDA Python mini-series on CUDACasts, I have introduced the @vectorize decorator and CUDA libraries, two different methods for accelerating code using NVIDIA GPUs. In today’s CUDACast, I’ll demonstrate how to use the NumbaPro compiler from Continuum Analytics to write CUDA Python code that runs on the GPU.

In CUDACast #12, we’ll continue using the Monte Carlo options pricing example, and I’ll show how to write the step function in CUDA Python rather than using the @vectorize decorator. In addition, using the nvprof command-line profiler, we’ll measure the speed-up achieved by writing the code explicitly in CUDA.
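
To make the idea of an explicitly written step function concrete, here is a minimal CPU-only sketch in plain Python and NumPy. It is not the code from the video: it assumes the step is the standard geometric Brownian motion update, and the function name, parameter names, and values are all hypothetical. Each loop index stands in for the one array element that a single GPU thread would handle in a hand-written kernel.

```python
import math
import numpy as np


def step_reference(last, dt, c0, c1, noises, out):
    """CPU stand-in for a per-thread kernel.

    Each loop index i plays the role of one GPU thread, which would
    read last[i] and noises[i] and write out[i].
    """
    for i in range(last.shape[0]):
        out[i] = last[i] * math.exp(c0 * dt + c1 * noises[i])


# Hypothetical parameters, chosen only for illustration.
rng = np.random.default_rng(0)
n_paths = 8
volatility, interest, dt = 0.2, 0.02, 1.0 / 252

# Drift and diffusion coefficients of the assumed GBM update.
c0 = interest - 0.5 * volatility ** 2
c1 = volatility * math.sqrt(dt)

prices = np.full(n_paths, 100.0)           # current price on every path
noises = rng.standard_normal(n_paths)      # one normal draw per path
next_prices = np.empty_like(prices)        # output buffer, like a device array

step_reference(prices, dt, c0, c1, noises, next_prices)
```

In an actual CUDA Python kernel, the loop would disappear: each thread would compute its own index from its block and thread position and guard against running past the end of the array. A sequential reference like this one is still useful for checking the GPU result.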

To request a topic for a future episode of CUDACasts, or if you have any other feedback, please use the contact form or leave a comment to let us know.


About Mark Ebersole

As CUDA Educator at NVIDIA, Mark Ebersole teaches developers and programmers about the NVIDIA CUDA parallel computing platform and programming model, and the benefits of GPU computing. With more than ten years of experience as a low-level systems programmer, Mark has spent much of his time at NVIDIA as a GPU systems diagnostics programmer, a role in which he developed a tool to test, debug, validate, and verify GPUs from pre-emulation through bringup and into production. Before joining NVIDIA, he worked for IBM developing Linux drivers for the IBM iSeries server. Mark holds a BS degree in math and computer science from St. Cloud State University. Follow @cudahamster on Twitter.
  • Tim

    Hello Mark,

    I really enjoy the CUDACasts. I was wondering if the script you’re using is available anywhere. Copying it from the video can be difficult. In particular, when I try to run my version of your code, I get an error at the d_next.copy_to_host call saying that the device strides and host strides don’t match.