CUDACasts

Oct 13, 2014

CUDACasts Episode 21: Porting a simple OpenCV sample to the Jetson TK1 GPU

In the previous CUDACasts episode, we saw how to flash your Jetson TK1 to the latest release of Linux4Tegra, and install both the CUDA toolkit and OpenCV SDK....

1 MIN READ

Aug 28, 2014

CUDACasts Episode 20: Getting started with Jetson TK1 and OpenCV

The Jetson TK1 development kit has fast become a must-have for mobile and embedded parallel computing due to the amazing level of performance packed into such a...

1 MIN READ

May 01, 2014

CUDACasts Episode 19: CUDA 6 Guided Performance Analysis with the Visual Profiler

One of the main reasons for accelerating code on an NVIDIA GPU is to increase application performance. This is why it's important to use the best tools...

1 MIN READ

Mar 13, 2014

CUDACasts Episode 18: CUDA 6.0 Unified Memory

CUDA 6 introduces Unified Memory, which dramatically simplifies memory management for GPU computing. Now you can focus on writing parallel kernels when porting...

1 MIN READ

Feb 20, 2014

CUDACasts Episode 17: Unstructured Data Lifetimes in OpenACC 2.0

The OpenACC 2.0 specification focuses on increasing programmer productivity by addressing limitations of OpenACC 1.0. Previously, programmers were required to...

1 MIN READ

Feb 06, 2014

CUDACasts Episode 16: Thrust Algorithms and Custom Operators

Continuing the Thrust mini-series (see Part 1), today's episode of CUDACasts focuses on a few of the algorithms that make Thrust a flexible and powerful...

1 MIN READ

Jan 29, 2014

CUDACasts Episode 15: Introduction to Thrust

Whenever I hear about a developer interested in accelerating his or her C++ application on a GPU, I make sure to tell them about Thrust. Thrust is a parallel...

1 MIN READ

Jan 19, 2014

CUDACasts Episode 14: Racecheck Analysis with CUDA 5.5

The key to the power of GPUs is their thousands of parallel processors that execute threads. Anyone who has worked with even a handful of threads knows how easy it...

1 MIN READ

Dec 19, 2013

CUDACasts Episode 13: Clock, Power, and Thermal Profiling with Nsight Eclipse Edition

In the world of high-performance computing, it is important to understand how your code affects the operating characteristics of your hardware. For example, if your...

1 MIN READ

Dec 10, 2013

CUDACasts Episode #12: Programming GPUs using CUDA Python

So far in the CUDA Python mini-series on CUDACasts, I introduced you to using the @vectorize decorator and CUDA libraries, two different methods for...

1 MIN READ

Oct 29, 2013

CUDACasts Episode #11: GPU Libraries for CUDA Python

In the previous episode of CUDACasts I introduced you to NumbaPro, the high-performance Python compiler from Continuum Analytics, and demonstrated how to...

1 MIN READ

Sep 23, 2013

CUDACasts Episode #10: Accelerate Python on GPUs

This week's CUDACast continues the Parallel Forall Python theme kicked off in last week's post by Mark Harris, demonstrating exciting new support for CUDA...

1 MIN READ

Sep 09, 2013

CUDACasts Episode #9: Explore GPU device memory with Nsight Eclipse Edition

Visual tools offer a very efficient method for developing and debugging applications. When working on massively parallel codes built on the CUDA Platform, this...

1 MIN READ

Sep 02, 2013

CUDACasts Episode #8: Accelerate FFTW Apps with CUFFT 5.5

GPU libraries provide an easy way to accelerate applications without writing any GPU-specific code. With the new CUDA 5.5 version of the NVIDIA CUFFT Fast...

1 MIN READ

Aug 22, 2013

CUDACasts Episode #7: nvidia-smi Accounting

The NVIDIA System Management Interface, nvidia-smi, is a command-line interface to the NVIDIA Management Library, NVML. nvidia-smi provides Linux system...

1 MIN READ

Aug 09, 2013

CUDACasts Episode #6: CUDA on ARM with CUDA 5.5

In CUDACast #5, we saw how to use the new NVIDIA RPM and Debian packages to install the CUDA toolkit, samples, and driver on a supported Linux OS with a...

1 MIN READ