Figure 4: MMTI and trainable HoG pedestrian/vehicle detectors extract dynamic obstacles from HD video at runtime

Low-Power Sensing and Autonomy With NVIDIA Jetson TK1

Figure 1: simple TK1 block diagram

NVIDIA’s Tegra K1 (TK1) is the first ARM system-on-chip (SoC) with integrated CUDA.  With 192 Kepler GPU cores and four ARM Cortex-A15 cores delivering a total of 327 GFLOPS of compute performance, TK1 can process large volumes of data with CUDA while typically drawing less than 6W of power (including the SoC and DRAM).  This brings game-changing performance to low-SWaP (size, weight, and power) and small form factor (SFF) applications in the sub-10W domain, while supporting a developer-friendly Ubuntu Linux software environment that feels more like a desktop than an embedded SoC.  Tegra K1 is plug-and-play and can stream from high-bandwidth peripherals, sensors, and network interfaces via built-in USB 3.0 and PCIe gen2 x4/x1 ports.  TK1 is geared for sensor processing and offers additional hardware-accelerated functionality that runs asynchronously to CUDA, such as H.264 encoding and decoding engines and dual MIPI CSI-2 camera interfaces with image signal processors (ISPs).  Many exciting embedded applications for TK1 leverage its natural ability as a media processor and as a low-power platform for quickly integrating devices and sensors.
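
As a rough sanity check on the headline numbers, the sketch below estimates peak single-precision throughput from the core count and derives a performance-per-watt figure. The 192 cores, 327 GFLOPS, and 6W values come from the text above; the 852 MHz GPU clock and the convention of counting one fused multiply-add as two floating-point operations are assumptions made for illustration, and the function names are hypothetical. The result is a theoretical peak, not a measured figure.

```cpp
#include <cassert>

// Theoretical peak throughput in GFLOPS:
//   cores x FLOPs per core per cycle x clock in GHz.
double peak_gflops(int cores, double flops_per_core_per_cycle, double clock_ghz) {
    assert(cores > 0 && flops_per_core_per_cycle > 0.0 && clock_ghz > 0.0);
    return cores * flops_per_core_per_cycle * clock_ghz;
}

// Efficiency in GFLOPS per watt for a given power draw.
double gflops_per_watt(double gflops, double watts) {
    assert(watts > 0.0);
    return gflops / watts;
}

// Values used in the discussion:
//   cores     = 192    (from the text)
//   FLOPs     = 2.0    (ASSUMPTION: one FMA per core per cycle, counted as 2 FLOPs)
//   clock_ghz = 0.852  (ASSUMPTION: GPU clock, not stated in the text)
//   watts     = 6.0    (upper bound on typical draw, from the text)
```

Under these assumptions, `peak_gflops(192, 2.0, 0.852)` comes out at roughly 327 GFLOPS, and `gflops_per_watt` on that value at 6W gives roughly 54 GFLOPS per watt. Since the text says typical draw is below 6W, while real workloads rarely reach theoretical peak, this is best read as an order-of-magnitude estimate.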

GPU acceleration is particularly well-suited for data-parallel tasks like imaging, signal processing, autonomy, and machine learning, and Tegra K1 extends these capabilities into the sub-10W domain.  Code portability is now maintained from NVIDIA’s high-end Tesla HPC accelerators and the GeForce and Quadro discrete GPUs all the way down to the low-power TK1.  A full build of the CUDA 6 toolkit is available for TK1, including samples, math libraries such as cuFFT, cuBLAS, and NPP, and NVIDIA’s NVCC compiler.  Developers can compile CUDA code natively on TK1 or cross-compile from a Linux development machine.  Because the same CUDA libraries and development tools are available on both, CUDA applications can move between discrete GPUs and Tegra with little effort.  OpenCV4Tegra and NVIDIA’s VisionWorks toolkit are also available.  Additionally, the Ubuntu 14.04 repository is rich in pre-built packages for the ARM architecture, minimizing time spent tracking down and building dependencies.  In many instances applications can simply be recompiled for ARM with little modification, as long as the source is available and doesn’t explicitly use x86-specific instructions such as SSE or AVX, or x86 assembly.  The ARM counterpart is NEON, the SIMD extension for Cortex-A series CPUs.
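
To illustrate the point about x86-specific instructions, here is a minimal sketch of one common way to keep SIMD code portable between x86 and ARM: guard each instruction set behind preprocessor checks and always provide a scalar fallback. The function name `add_arrays` and the `HAVE_NEON`/`HAVE_SSE` macros are hypothetical and chosen for illustration; the intrinsics are the standard ones from `arm_neon.h` and `xmmintrin.h`. This is one possible pattern, not a description of how any particular TK1 application is written.

```cpp
#include <cassert>
#include <cstddef>

// Pick a SIMD path at compile time; fall through to scalar code otherwise.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
  #include <arm_neon.h>
  #define HAVE_NEON 1
#elif defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
  #include <xmmintrin.h>
  #define HAVE_SSE 1
#endif

// Element-wise sum of two float arrays: out[i] = a[i] + b[i].
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(n == 0 || (a && b && out));
    std::size_t i = 0;
#if defined(HAVE_NEON)
    // ARM path: process four floats per iteration with NEON.
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#elif defined(HAVE_SSE)
    // x86 path: process four floats per iteration with SSE (unaligned loads).
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail, and the complete fallback when no SIMD path is compiled in.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

Code structured this way recompiles for ARM without source changes, whereas code that calls SSE intrinsics unconditionally would fail to build. On 32-bit ARM compilers, NEON typically also has to be enabled with a flag such as `-mfpu=neon`.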

Jetson TK1: Mobile Embedded Supercomputer Takes CUDA Everywhere

Today, cars are learning to see pedestrians and road hazards; robots are becoming higher functioning; complex medical diagnostic devices are becoming more portable; and unmanned aircraft are learning to navigate autonomously. As a result, the computational requirements for these devices are increasing exponentially, while their size, weight, and power limits continue to decrease. Aimed at these and other embedded parallel computing applications, last week at the 2014 GPU Technology Conference NVIDIA announced an awesome new developer platform called Jetson TK1.

Jetson TK1 is a tiny but full-featured computer designed for development of embedded and mobile applications. Jetson TK1 is exciting because it incorporates Tegra K1, the first mobile processor to feature a CUDA-capable GPU. Jetson TK1 brings the capabilities of Tegra K1 to developers in a compact, low-power platform that makes development as simple as developing on a PC.

Jetson TK1 is aimed at two groups of people. The first is OEMs, including robotics, avionics, and medical device companies, that would like to develop new products using Tegra K1 SoCs and need a development platform that makes it easy to write software for these products. Once these companies are ready to move to production, they can work with one of our board partners to design the exact board that they need for their product. The second group is the large number of independent developers, researchers, makers, and hobbyists who would like a platform that enables them to create amazing technology such as robots, security devices, or anything that needs substantial parallel computing or computer vision in a small, flexible, and low-power platform. For this group, Jetson TK1 offers the size and adaptability of Raspberry Pi or Arduino, with the computational capability of a desktop computer. We’re excited to see what developers create with Jetson TK1!

Tegra K1

Tegra K1 is NVIDIA’s latest mobile processor.  It features a Kepler GPU with 192 cores, ...


If the Virtual Zapato Fits, Wear It! (GPU-Accelerated Augmented Reality)

This week’s Spotlight is on Néstor Gómez, CEO of Artefacto Estudio in Mexico City.

Artefacto Estudio is a developer of interactive applications and games. The company’s projects include a real-time virtual shoe fitting kiosk that allows people to “try on” shoes using augmented reality powered by Microsoft Kinect and GPU computing (see the video).

The following is an excerpt from our interview (you can read the complete Spotlight here).

NVIDIA: Néstor, tell us a bit about Artefacto Estudio.
Néstor: Artefacto is an independent development studio. We integrate solutions using cutting-edge technologies like Microsoft Kinect, Oculus Rift and Leap Motion.

NVIDIA: How did you become involved in the shoe industry?
Néstor: An ad agency, Kempertrautmann, was seeking a technology partner to work on a prototype for a virtual shoe fitting exhibit for Goertz, the German shoe company.

NVIDIA: Tell us about the prototype you created for Goertz.