Develop on your Notebook with GeForce, Deploy on Tesla

There’s a new post over on the NVIDIA Corporate Blog by my colleague Mark Ebersole about the latest line of laptops powered by new GeForce 700-series GPUs. As Mark explains, the GeForce 700 series (GT 730M, GT 735M, and GT 740M), powered by the low-power GK208 GPU, has the latest compute features of the Tesla K20 (powered by the GK110 GPU), including:

CUDA Dynamic Parallelism, which enables the CUDA runtime API inside device code, so that threads running on the GPU can launch other kernels, call cudaMemcpy, create streams and events, and synchronize the device;
Hyper-Q for CUDA Streams, which improves the efficiency and performance of concurrent kernels running on a single GPU;
the SHFL warp shuffle instruction, which enables threads in the same warp to communicate directly; and
up to 255 registers per thread (increased from 63 in the Fermi architecture), which can reduce bottlenecks caused by spilling registers to off-chip memory.
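
To make the warp shuffle feature concrete, here is a minimal sketch of a warp-level sum that passes values between threads without going through shared memory. The kernel and helper names (sumOneWarp, warpReduceSum) are made up for illustration, and the sketch assumes a recent CUDA toolkit, where the shuffle intrinsic is spelled __shfl_down_sync and takes a lane mask; older toolkits exposed the same SHFL instruction as __shfl_down without the mask.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: sums the values held by the 32 threads of one warp
// using register-to-register shuffles instead of shared memory.
__inline__ __device__ int warpReduceSum(int val) {
    // At each step, every lane adds the value held by the lane `offset`
    // positions above it. After the last step, lane 0 holds the full sum.
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        val += __shfl_down_sync(0xffffffff, val, offset);
    }
    return val;
}

// Hypothetical kernel: launched with exactly one warp (32 threads).
__global__ void sumOneWarp(const int *in, int *out) {
    int sum = warpReduceSum(in[threadIdx.x]);
    if (threadIdx.x == 0) {
        *out = sum;  // only lane 0 has the complete result
    }
}

int main() {
    const int n = 32;
    int h_in[n];
    int h_out = 0;
    for (int i = 0; i < n; ++i) {
        h_in[i] = i + 1;  // 1, 2, ..., 32
    }

    int *d_in = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, n * sizeof(int), cudaMemcpyHostToDevice);

    sumOneWarp<<<1, n>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);

    printf("warp sum = %d\n", h_out);  // 1 + 2 + ... + 32 = 528

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The same source should build unchanged for a GeForce laptop GPU or a Tesla board; only the architecture flag passed to nvcc would differ, if at all.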

The availability of the latest GPU architecture on low-cost, highly portable laptops makes it possible to develop CUDA code that uses the latest performance features for deployment on high-end Tesla GPUs. Check out Mark’s blog post for more information.
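
If you want to confirm that a given laptop GPU exposes the same feature set as the deployment target, one simple approach is to query the compute capability at run time. The sketch below assumes that the features listed above correspond to compute capability 3.5 or higher, which is what GK110 and GK208 report to my knowledge; treat the threshold as an assumption to check against the CUDA documentation for your toolkit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, dev);
        if (err != cudaSuccess) {
            printf("device %d: %s\n", dev, cudaGetErrorString(err));
            continue;
        }

        // Assumed threshold: compute capability 3.5 for Dynamic Parallelism,
        // Hyper-Q, and the larger per-thread register limit.
        bool atLeast35 = prop.major > 3 || (prop.major == 3 && prop.minor >= 5);

        printf("device %d: %s, compute capability %d.%d, %s\n",
               dev, prop.name, prop.major, prop.minor,
               atLeast35 ? "meets the 3.5 threshold" : "below the 3.5 threshold");
    }
    return 0;
}
```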