DRIVE PX Application Development Using Nsight Eclipse Edition

Figure 1. NVIDIA DRIVE PX 2 AI Car Computer.

NVIDIA DRIVE™ PX is the AI car computer designed to enable OEMs, tier 1 suppliers, startups and research institutions to accelerate the development of self-driving car systems. NVIDIA DriveWorks is a Software Development Kit (SDK) for DRIVE PX that includes a number of open-source reference samples, development tools and library modules targeting autonomous driving applications.

You can customize these samples or develop your own applications on your Linux host machine, and then execute them either on the host or on DRIVE PX, after properly configuring your development environment. Figure 2 shows the common compilation and execution flow.

This blog post illustrates how to configure NVIDIA Nsight Eclipse Edition to enable the entire host- and cross-compilation process. Nsight Eclipse is a full-featured integrated development environment (IDE) powered by the Eclipse platform. It provides an all-in-one integrated environment to edit, build, debug and profile CUDA C/C++ applications. By following the instructions in this post, you will learn to import, compile, run and debug a DriveWorks project – both on a host machine and remotely on a DRIVE PX 2 – directly from the Nsight IDE, exploiting the original sample makefiles.

Figure 2. Compilation and deployment process for DriveWorks applications.

Preliminary Steps

Nsight Eclipse supports a rich set of commercial and free plugins, and is included in the CUDA Toolkit for Linux and Mac. This blog post assumes you have already run NVIDIA DriveInstall on your host machine. NVIDIA DriveInstall automatically installs the following items on both host and target systems:

• CUDA Toolkit,
• DriveWorks,
• Library dependencies of the above.

You will be making a copy of the original DriveWorks sample folder to avoid overwriting the original source code.  In this example, you will import just the DriveNet sample from DriveWorks, and you won’t need the other samples.

Before proceeding, make sure that the DriveNet sample runs correctly: execute the following commands in a terminal window, both on your host and on the DRIVE PX 2.

cd /usr/local/driveworks/bin
./sample_drivenet

In addition, make sure “manual” host compilation and DRIVE PX 2 cross-compilation run successfully. On your host, execute the following:

cd /usr/local/driveworks
sudo cp -r samples samples-original
cd samples-original
sudo mkdir build-host
cd build-host
sudo cmake ..
sudo make -j
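A note on the last command: GNU make treats -j without a number as "no limit on the number of parallel jobs", which can exhaust memory on a smaller host. The following is a minimal sketch of a bounded alternative, assuming a Linux host where nproc or getconf is available; make_jobs_flag is a hypothetical helper name, not part of DriveWorks.

```shell
# Print a bounded -j flag for make, based on the number of online CPU cores.
# Falls back to getconf, and finally to a single job, if nproc is unavailable.
make_jobs_flag() {
    cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    echo "-j${cores}"
}

# Usage (from a build folder that already contains a generated Makefile):
#   sudo make "$(make_jobs_flag)"
```

The same flag can be typed by hand (for example, sudo make -j8) anywhere this post uses sudo make -j.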

Now cross-compile for your DRIVE PX 2. Look for the VibranteSDK folder on your host, note its complete path, and substitute that path for <V4L_SDK_PATH> in the following commands.

cd /usr/local/driveworks/samples-original
sudo mkdir build-target
cd build-target        
sudo cmake -DCMAKE_BUILD_TYPE=Release \
-DCMAKE_TOOLCHAIN_FILE=/usr/local/driveworks/samples-original/cmake/Toolchain-V4L.cmake \
-DVIBRANTE_PDK:STRING=<V4L_SDK_PATH>/vibrante-t186ref-linux .. 
sudo make -j
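If you are not sure where the Vibrante SDK was installed, a search for the vibrante-t186ref-linux folder can recover the path. This is only a sketch: find_v4l_sdk_path is a hypothetical helper, the search root you pass is an assumption about where DriveInstall placed the SDK on your machine, and the depth limit of 4 is arbitrary.

```shell
# Search a directory tree for the vibrante-t186ref-linux folder and print the
# path of its parent, which is the value to substitute for <V4L_SDK_PATH>.
# $1 is the folder to search (defaults to the home directory).
find_v4l_sdk_path() {
    search_root=${1:-$HOME}
    pdk_dir=$(find "$search_root" -maxdepth 4 -type d \
        -name 'vibrante-t186ref-linux' 2>/dev/null | head -n 1)
    if [ -z "$pdk_dir" ]; then
        echo "vibrante-t186ref-linux not found under $search_root" >&2
        return 1
    fi
    dirname "$pdk_dir"
}

# Usage:
#   V4L_SDK_PATH=$(find_v4l_sdk_path "$HOME") && echo "$V4L_SDK_PATH"
```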

If the above operations ran smoothly, you can keep reading to see how to configure Nsight Eclipse to automate them. First, make a copy of the existing DriveWorks source code folder on your host. In a host terminal window, type:

cd /usr/local/driveworks 
sudo cp -r samples samples-nsight

Next, in the samples-nsight/src folder, keep only these folders: common, dnn, drivenet. Since the structure of the samples folder has changed, you need to modify the CMakeLists.txt file (as sudo), replacing the line that starts with set(SAMPLES common; … with set(SAMPLES common;dnn;drivenet).
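The two manual steps above can also be scripted. The sketch below rests on two assumptions: the set(SAMPLES …) call sits on a single line, and you pass the path of the CMakeLists.txt file that contains it. trim_samples is a hypothetical helper name. It deletes folders, so run it only on the samples-nsight copy.

```shell
# Keep only common, dnn and drivenet inside the given src folder, then rewrite
# the set(SAMPLES ...) line in the given CMakeLists.txt file.
# $1: path to the src folder, $2: path to the CMakeLists.txt file to patch.
trim_samples() {
    src_dir=$1
    cmake_file=$2
    for entry in "$src_dir"/*; do
        [ -d "$entry" ] || continue          # leave plain files untouched
        case "$(basename "$entry")" in
            common|dnn|drivenet) ;;          # samples to keep
            *) rm -rf "$entry" ;;            # every other sample folder
        esac
    done
    # A .bak copy of the original file is kept next to the patched one.
    sed -i.bak 's/^set(SAMPLES .*/set(SAMPLES common;dnn;drivenet)/' "$cmake_file"
}

# Usage (in a root shell, because the folder lives under /usr/local):
#   trim_samples /usr/local/driveworks/samples-nsight/src <path to CMakeLists.txt>
```

If the set(SAMPLES …) call spans several lines in your DriveWorks version, edit the file by hand instead.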

Now it’s time to run Nsight! As the sample working directory is in /usr, it is necessary to open Nsight Eclipse as root.

sudo /usr/local/cuda/bin/nsight

Create a new CUDA C/C++ project: “File > New > CUDA C/C++ project”. Write “DriveNet” as “Project name”, uncheck “Use default location” and browse to /usr/local/driveworks/samples-nsight.

Select “Empty Project” as “project type”, and leave the remaining options as default. See Figure 3.

Figure 3. Creating a new CUDA C/C++ project.

Do not specify a “Target System” for now. Select the “DriveNet” project in the left column and add a folder called build-host. You will create a Make target for each desired configuration (Host and Target).

Host Environment Configuration

You now have to specify a target for the host. Click on the “Window > Show View > Make Target” menu command. It should appear on the right, next to the “Outline” window.

Figure 4. The Create Make Target configuration window.

Select the build-host folder: CMake will use this folder as its working directory. Right click on it and select “New” from the menu. The “Create Make Target” window will appear. Set all options as shown in Figure 4: enter build-host as “Target name,” uncheck “Same as the target name,” empty the “Make target” field and uncheck “Use builder settings.” Finally, as “Build command,” make sure to write (including dots):

sudo cmake -G "Unix Makefiles" ..

After that, click “OK”. Now, you need to set the Eclipse builder up in order to run the Makefile that will be generated by CMake.

Right click on the “DriveNet” project, then select “Properties.” In the left column, select “Build.” Click on “Manage Configurations” and then on “New”. In the new window that pops up, type build-host as “Name”, and select “Release: CUDA Toolkit 8.0” as “Default configuration”. After clicking on “OK”, remove all other pre-defined settings. This creates a Release-type configuration for the host machine. Later, you will create a Debug-type configuration for the target instead.

After that, uncheck “Use default build command” and specify sudo make -j in the “Build command” field. Click on “Workspace…” and navigate to the correct path for the “Build directory” field: ${workspace_loc: /DriveNet/build-host} (as Figure 5 shows).

Figure 5. The build-host configuration window.

You are now ready to build the project. In the “Make Target” window, double click on “build-host.” If there are no errors, build the project by first selecting the build-host configuration with “Project > Build Configurations > Set Active”, and then clicking on “Build” with the “Project > Build Project” command.

The DriveNet sample source file can be found in “src > drivenet > drivenet > main.cpp.” The sample_drivenet executable is in “build-host > src > drivenet > drivenet > sample_drivenet”. Right click on it and select “Run as > Local C/C++ application” to launch it on your host machine.

Solving Include Warnings

Figure 6. Possible include warnings in Nsight Eclipse.

If Nsight Eclipse shows include warnings related to the DriveWorks SDK (see Figure 6), you can specify the path to the DriveWorks header files.

Right click on the “DriveNet” project, then select “Properties.” In the left column, select “C/C++ general > Paths and Symbols.” In the “Includes” tab, click on “C++ Source file” and then on “Add…” Using the “File system…” option, navigate to /usr/local/driveworks/include.

Check “Add to all configurations” and click “OK” twice. Answer “Yes” to rebuild the include search path.

Target Environment Configuration

You will now add a second environment for a DRIVE PX 2 target device. Select the “DriveNet” project again in the left column and add a folder called build-target. In the “Make Target” dialog, add a new configuration by right clicking on the “build-target” folder. As “Target name,” type build-target. Make sure the “Same as the target name” option is not selected, and leave the “Make target” field empty. Uncheck “Use builder settings” and type the following in the  “Build command” field.

sudo cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=/usr/local/driveworks/samples-nsight/cmake/Toolchain-V4L.cmake -DVIBRANTE_PDK:STRING=<V4L_SDK_PATH>/vibrante-t186ref-linux ..

Once again, make sure to substitute <V4L_SDK_PATH> with the path for the Vibrante SDK on your host. Later, I will demonstrate how to run the Nsight Eclipse debugger on this sample, so this configuration specifies -DCMAKE_BUILD_TYPE=Debug. If you are not interested in debugging, type -DCMAKE_BUILD_TYPE=Release instead.
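Because the “Build command” field takes one long line, it is easy to mistype the substituted path. As a sketch, the helper below prints the complete command with the SDK path and build type filled in, ready to paste into the field. target_cmake_command is a hypothetical name, and /home/user/VibranteSDK in the usage line is only an example path.

```shell
# Print the CMake command for the build-target Make Target.
# $1: Vibrante SDK path (the value of <V4L_SDK_PATH>)
# $2: CMake build type, Debug by default
target_cmake_command() {
    sdk_path=$1
    build_type=${2:-Debug}
    printf 'sudo cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=%s -DCMAKE_TOOLCHAIN_FILE=/usr/local/driveworks/samples-nsight/cmake/Toolchain-V4L.cmake -DVIBRANTE_PDK:STRING=%s/vibrante-t186ref-linux ..\n' \
        "$build_type" "$sdk_path"
}

# Usage:
#   target_cmake_command /home/user/VibranteSDK Debug
```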

As before, you will now set up the Eclipse builder to run the Makefile generated by CMake. Don’t worry about the CMake configuration: what you set previously will automatically specify the needed cross-compilation settings. To continue, select “Run all project builders” and click “OK.”

Continue by creating a new configuration for the DRIVE PX 2. Right click on the “DriveNet”  project, then select “Properties.” In the left column, select “Build.” Click on “Manage Configurations” and then on “New.” Set “Debug: CUDA Toolkit 8.0” as “Default configuration”, and click “OK”. Then, as Figure 7 shows, type sudo make -j in the “Build command” field and navigate to ${workspace_loc: /DriveNet/build-target} by clicking on “Workspace…”.

Figure 7. The build-target configuration window.

You can now cross compile the project by following the steps for building the host version. In the “Make Target” window, double click on build-target. If there are no errors, build the project by first selecting “build-target” with “Project > Build Configurations > Set Active”, and then clicking on “Build” with the “Project > Build Project” command. You will find the executable in “build-target > src > drivenet > drivenet.”
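A quick way to confirm that the build really targeted the DRIVE PX 2 rather than the host is to look at the machine field of the executable's ELF header. The sketch below reads it with od; elf_machine is a hypothetical helper, and it assumes little-endian ELF files, which is the case for both x86_64 hosts and aarch64 targets.

```shell
# Print the architecture recorded in an ELF file's header.
# Bytes 0-3 hold the ELF magic number; bytes 18-19 hold e_machine.
elf_machine() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d '[:space:]')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    code=$(od -An -tx1 -j18 -N2 "$1" | tr -d '[:space:]')
    case "$code" in
        b700) echo "aarch64" ;;      # EM_AARCH64 = 183 (0xb7)
        3e00) echo "x86_64" ;;       # EM_X86_64  = 62  (0x3e)
        *)    echo "unknown($code)" ;;
    esac
}

# Usage:
#   elf_machine build-target/src/drivenet/drivenet/sample_drivenet
#   elf_machine build-host/src/drivenet/drivenet/sample_drivenet
```

The build-target executable should report aarch64, while the build-host one should report x86_64.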

Run the Sample on DRIVE PX from Nsight

It’s possible to configure Nsight Eclipse to launch the DriveNet sample remotely on the DRIVE PX. First, find your “<target IP address>” by typing ifconfig in a terminal window on the DRIVE PX. The DRIVE PX must be connected to your local network.
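The ifconfig output lists every network interface, including the loopback one. As a sketch, the filter below keeps only the non-loopback IPv4 addresses. extract_ipv4 is a hypothetical helper, and it assumes the output uses either the older “inet addr:” or the newer “inet” line format.

```shell
# Read ifconfig output on stdin and print the non-loopback IPv4 addresses.
# Lines that start with inet6 do not match, because the pattern expects a
# space right after "inet".
extract_ipv4() {
    sed -n -E 's/^[[:space:]]*inet (addr:)?([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\2/p' |
        grep -v '^127\.'
}

# Usage (on the DRIVE PX):
#   ifconfig | extract_ipv4
```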

You will configure Nsight Eclipse to automatically transfer the updated cross-compiled version of the sample to the DRIVE PX every time you launch it. First, create a new folder and set its ownership to the nvidia user on the DRIVE PX.

$ ssh nvidia@<target IP address>
$ cd /usr/local/driveworks
$ sudo mkdir bin-nsight
$ sudo chown -R nvidia:nvidia bin-nsight
$ exit

Back in Eclipse, click on the “Run > Run configurations” top menu. Add a new configuration below “C/C++ Remote Application”, and type sample_drivenet_remote as “Name”. Next to “Remote connection,” select “Manage” and type the <target IP address> as “Host name,” nvidia as “User name” and nvidia@<target IP address> as “Label” (see Figure 8). Then click on “Finish”.

Figure 8. Specifying the network connection to your DRIVE PX.

Next to “Remote toolkit”, click on “Manage” and in the following section click on “Detect” to identify the Toolkit path: /usr/local/cuda/bin should be found. If not, you need to manually select the CUDA toolkit path on the target.

To complete the configuration, first check “Run remote executable” and type the complete remote path for the executable: it should be /usr/local/driveworks/bin-nsight/sample_drivenet. After that, you can check “Upload local executable” to automatically send the updated DriveNet sample to the target before running it.

In the “Local” tab, make sure to select the “DriveNet” project and select sample_drivenet as the C/C++ Application to run using the “Search Project…” button. In the “Environment” tab, click on “New” and type DISPLAY as “Name” and :0 as “Value.” Optionally, you can add CUDA_VISIBLE_DEVICES as “Name” and 1 as “Value” if you want to run the sample on the integrated GPU (iGPU) rather than the discrete GPU (dGPU) on the DRIVE PX 2. Click on “Run” to launch the application on the target.
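For troubleshooting, it can help to reproduce the upload and launch by hand from a host terminal. The sketch below only prints the two commands so you can inspect them before running them. It is a rough manual equivalent of the run configuration, not a description of what Nsight does internally. deploy_and_run_cmds is a hypothetical helper, and the local path assumes the build-target layout described earlier.

```shell
# Print the scp and ssh commands that copy the cross-compiled sample to the
# DRIVE PX and start it there.
# $1: target IP address
# $2: optional value for CUDA_VISIBLE_DEVICES (omit to leave it unset)
deploy_and_run_cmds() {
    target_ip=$1
    gpu_index=${2:-}
    local_exe=/usr/local/driveworks/samples-nsight/build-target/src/drivenet/drivenet/sample_drivenet
    remote_exe=/usr/local/driveworks/bin-nsight/sample_drivenet
    env_vars="DISPLAY=:0"
    if [ -n "$gpu_index" ]; then
        env_vars="$env_vars CUDA_VISIBLE_DEVICES=$gpu_index"
    fi
    echo "scp ${local_exe} nvidia@${target_ip}:${remote_exe}"
    echo "ssh nvidia@${target_ip} '${env_vars} ${remote_exe}'"
}

# Usage: print the commands, review them, then pipe them to sh to execute.
#   deploy_and_run_cmds <target IP address> 1
```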

Debug Sample Running on DRIVE PX 2

In the previous section, you specified a remote target system configuration for the DRIVE PX 2. This configuration is already available for debugging the sample in the “Run > Debug configurations” menu. Before going on, spend some time looking at the many debug features available in this window.

By default, Nsight automatically downloads shared libraries from the remote target for the debugging process. This considerably increases debugging time, so you can instead point Nsight Eclipse directly to the copies of the target libraries already available on the host system. Switch to the “Debugger > Shared Libraries” tab. Uncheck “Download shared libraries from remote target” and add the following paths by clicking on “Add…”:

  • /usr/local/driveworks/targets/aarch64-linux/lib
  • /usr/local/cuda/targets/aarch64-linux/lib
  • <V4L_SDK_PATH>/vibrante-t186ref-linux/targetfs/usr/lib
  • <V4L_SDK_PATH>/vibrante-t186ref-linux/targetfs/lib/aarch64-linux-gnu
  • <V4L_SDK_PATH>/vibrante-t186ref-linux/targetfs/usr/lib/aarch64-linux-gnu

Replace <V4L_SDK_PATH> with the path for the Vibrante SDK on your host and launch the debugger from this window. Nsight will switch to the debugger perspective and break at the first CPU instruction in the code. Find the CUDA view (cube icon) in the top-right pane and select “break on application kernel launches”: this will cause the debugger to automatically break on any CUDA kernel started on the GPU.
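A mistyped library path here tends to show up only later, as missing symbols in the debugger. As a sketch, the helper below reports which of the folders you pass it do not exist on the host. missing_dirs is a hypothetical name, and the V4L_SDK_PATH shell variable in the usage lines is assumed to hold your Vibrante SDK path.

```shell
# Print every argument that is not an existing directory.
# Prints nothing when all of the folders are present.
missing_dirs() {
    for dir in "$@"; do
        [ -d "$dir" ] || echo "$dir"
    done
}

# Usage:
#   missing_dirs \
#       /usr/local/driveworks/targets/aarch64-linux/lib \
#       /usr/local/cuda/targets/aarch64-linux/lib \
#       "$V4L_SDK_PATH/vibrante-t186ref-linux/targetfs/usr/lib" \
#       "$V4L_SDK_PATH/vibrante-t186ref-linux/targetfs/lib/aarch64-linux-gnu" \
#       "$V4L_SDK_PATH/vibrante-t186ref-linux/targetfs/usr/lib/aarch64-linux-gnu"
```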

You can now resume the application, which will run until the first breakpoint is hit in the CUDA kernel. From there, you can browse the CPU and GPU call stacks in the top-left pane, as Figure 9 shows. In the top-right pane, you can also inspect variables, registers, and the GPU kernel execution configuration (the number of CUDA thread blocks and the number of threads per block). Finally, the disassembly view makes it easy to see how register values are updated while executing the code.

To debug a particular kernel on the GPU, set a breakpoint inside it by double-clicking on the corresponding line number in the code. (Keep in mind, however, that single-stepping a thread causes the other threads in the same warp to step as well.)

When you are finished debugging, click on the red stop button to quit the application.

Figure 9. Remote CUDA debugging with Nsight Eclipse Edition.

Profile the Sample on DRIVE PX 2

Now that the sample is debugged, you’ll want to profile your application from Nsight while it is running on the DRIVE PX 2. Once again, the initial remote target system configuration will also be available for remote profiling. However, remember to change the CMake build type to Release in the “Make Target” configuration.

If the -DCMAKE_BUILD_TYPE variable is not specified, “Release” will be the default setting (see Figure 4). To start profiling, click on “Run > Profile configurations” and select the correct element under the “C/C++ Remote Application” list. In the “Profiler” tab on the right, remember to specify an execution timeout (for instance, 60 seconds), so that the application will be killed automatically after some time.

After that, click on “Profile”: the Nsight profiler perspective will open automatically. Wait while Nsight runs the application to create an execution timeline including all the CUDA Runtime and kernel calls executed on the GPU, as Figure 10 shows. Once finished, the “Properties” tab displays details of any event you select from this timeline; these events can also be viewed in text form in the “Details” tab in the lower pane.

Figure 10. Using the Nsight profiler gives you the possibility to deeply analyze CUDA kernels in an intuitive way.

Check the Analysis tab below the timeline view to further analyze performance. There, you can easily identify bottlenecks by running more advanced profiling sessions on your code. You can refer to the “Guided Performance Analysis with the Visual Profiler” blog post for additional instructions.

Get Started with Nsight and DRIVE PX 2

If you want to learn more about the DRIVE PX platform, visit the DRIVE PX product page, where you can find additional material on NVIDIA’s DRIVE software. Watch this video covering further Nsight Eclipse features, and get Nsight today to start developing your own CUDA and DRIVE PX applications!
