CUDACasts Episode 18: CUDA 6.0 Unified Memory

CUDA 6 introduces Unified Memory, which dramatically simplifies memory management for GPU computing. Now you can focus on writing parallel kernels when porting code to the GPU, and memory management becomes an optimization.

The CUDA 6 Release Candidate is now publicly available. In today’s CUDACast, I will show you some simple examples showing how easy it is to accelerate code on the GPU using Unified Memory in CUDA 6, and how powerful Unified Memory is for sharing C++ data structures between host and device code. If you’re interested in looking at the code in detail, you can find it in the Parallel Forall repository on GitHub. You can also check out the great Unified Memory post by Mark Harris.

To suggest a topic for a future episode of CUDACasts, or if you have any other feedback, please use the contact form or leave a comment below.


About Mark Ebersole

As CUDA Educator at NVIDIA, Mark Ebersole teaches developers and programmers about the NVIDIA CUDA parallel computing platform and programming model, and the benefits of GPU computing. With more than ten years of experience as a low-level systems programmer, Mark has spent much of his time at NVIDIA as a GPU systems diagnostics programmer in which he developed a tool to test, debug, validate, and verify GPUs from pre-emulation through bringup and into production. Before joining NVIDIA, he worked for IBM developing Linux drivers for the IBM iSeries server. Mark holds a BS degree in math and computer science from St. Cloud State University. Follow @cudahamster on Twitter
  • Sash


  • Amir

    Hi Mark, I read somewhere that Maxwell GPUs can directly access system main memory. But I couldn’t find how this access is performed or any benchmarking about it. Do you know any document about it?