The introduction of Unified Memory in CUDA makes it practical, for the first time, to move huge bodies of general C++ code up to the GPU and to write and run complex code systems entirely on the GPU with minimal CPU governance. In theory this is a big leap, but it is not without some new challenges.
Alex extends the example I provided in my post Unified Memory in CUDA 6 to make it portable between the CPU and GPU, with a switch to select managed memory or host memory allocation. He also touches on an approach to making the member functions of the class portable (using __host__ __device__; see my post about Hemi for more ideas on this topic).
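To make the idea concrete, here is a minimal sketch of what such a portable class might look like. It is not Alex's code: the names `Managed`, `Vec`, `HD`, and the `USE_MANAGED_MEMORY` switch are hypothetical, chosen only for illustration. The sketch assumes that the allocation switch is a compile-time macro and that the portable member functions are marked through a macro that expands to `__host__ __device__` only under nvcc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// HD is a hypothetical helper macro: under nvcc it marks a function as
// callable from both CPU and GPU; under a plain C++ compiler it vanishes.
#ifdef __CUDACC__
#include <cuda_runtime.h>
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical base class that routes allocation through managed memory
// when USE_MANAGED_MEMORY is defined (and nvcc is compiling), and through
// ordinary host memory otherwise.
struct Managed {
    static void* alloc(std::size_t bytes) {
        if (bytes == 0) bytes = 1;  // avoid a null result for empty requests
        void* p = nullptr;
#if defined(__CUDACC__) && defined(USE_MANAGED_MEMORY)
        if (cudaMallocManaged(&p, bytes) != cudaSuccess) throw std::bad_alloc();
#else
        p = std::malloc(bytes);
        if (!p) throw std::bad_alloc();
#endif
        return p;
    }

    static void release(void* p) {
#if defined(__CUDACC__) && defined(USE_MANAGED_MEMORY)
        cudaFree(p);
#else
        std::free(p);
#endif
    }

    // Objects created with new land in the same kind of memory as their data.
    static void* operator new(std::size_t bytes) { return alloc(bytes); }
    static void operator delete(void* p) { release(p); }
};

// Illustrative class whose storage and member functions work on CPU or GPU.
class Vec : public Managed {
public:
    explicit Vec(std::size_t n)
        : n_(n), data_(static_cast<float*>(alloc(n * sizeof(float)))) {
        for (std::size_t i = 0; i < n_; ++i) data_[i] = 0.0f;
    }
    ~Vec() { release(data_); }

    // Copying is disabled so the buffer is never freed twice.
    Vec(const Vec&) = delete;
    Vec& operator=(const Vec&) = delete;

    HD std::size_t size() const { return n_; }

    HD float& operator[](std::size_t i) {
        assert(i < n_);
        return data_[i];
    }

    HD float sum() const {
        float total = 0.0f;
        for (std::size_t i = 0; i < n_; ++i) total += data_[i];
        return total;
    }

private:
    std::size_t n_;
    float* data_;
};
```

Built with an ordinary C++ compiler, this uses `malloc` and runs on the CPU only. Built with nvcc and `-DUSE_MANAGED_MEMORY`, the same source allocates with `cudaMallocManaged`, so a pointer to a `Vec` can be passed to a kernel, which can then call `size()`, `operator[]`, and `sum()` directly. In that case, remember to call `cudaDeviceSynchronize()` after the kernel launch before touching the data on the CPU again.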
Overall it looks like Alex had a very positive experience with Unified Memory: “Using this approach I ported several thousand lines of C++ code and half a dozen objects to CUDA 6.0 in a couple days.” I expect many programmers to have similar good experiences in the future.