NVIDIA Nsight Eclipse Edition for Jetson TK1

NVIDIA® Nsight™ Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA® applications for either your local (x86) system or a remote (x86 or ARM) target. In this post, I will walk you through the process of remote-developing CUDA applications for the NVIDIA Jetson TK1, an ARM-based development kit.

Nsight supports two remote development modes: cross-compilation and “synchronize projects” mode. Cross-compiling for ARM on your x86 host system requires that all of the ARM libraries with which you will link your application be present on your host system. In synchronize-projects mode, on the other hand, your source code is synchronized between host and target systems and compiled and linked directly on the remote target, which has the advantage that all your libraries get resolved on the target system and need not be present on the host. Neither of these remote development modes requires an NVIDIA GPU to be present in your host system.

Note: CUDA cross-compilation tools for ARM are available only in the Ubuntu 12.04 DEB package of the CUDA 6 Toolkit.  If your host system is running a Linux distribution other than Ubuntu 12.04, I recommend the synchronize-projects remote development mode, which I will cover in detail in a later blog post.

CUDA toolkit setup

The first step involved in cross-compilation is installing the CUDA 6 Toolkit on your host system. To get started, let’s download the required Ubuntu 12.04 DEB package from the CUDA download page. Installation instructions can be found in the Getting Started Guide for Linux, but I will summarize them below for CUDA 6.

1. Enable armhf as a foreign architecture to get the cross-armhf packages installed:

$ sudo sh -c 'echo "foreign-architecture armhf" >> /etc/dpkg/dpkg.cfg.d/multiarch'
$ sudo apt-get update

2. Run dpkg to install and update the repo meta-data:

$ sudo dpkg -i cuda-repo-ubuntu1204_6.0-37_amd64.deb
$ sudo apt-get update

3. Install the CUDA cross-compilation and ARM GNU packages (these will be linked in future toolkit versions):

$ sudo apt-get install cuda-cross-armhf
$ sudo apt-get install g++-4.6-arm-linux-gnueabihf

4. OPTIONAL – if you also wish to do native x86 CUDA development and have an NVIDIA GPU in your host system then you can install the full toolchain and driver:

$ sudo apt-get install cuda

If you installed the driver, reboot your system so that the NVIDIA driver gets loaded. Then update the paths to the toolkit install location as follows:

$ export PATH=/usr/local/cuda/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

At the end of these steps you should see armv7-linux-gnueabihf and the optional x86_64_linux folder under /usr/local/cuda/targets/.
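As a quick sanity check of the steps above, a small shell function can confirm that the armhf line is in the dpkg multiarch file and that the ARM target folder exists. This is only a sketch: the function name check_cross_setup is made up for illustration, and the default paths are simply the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper: verify the cross-compilation setup described above.
# Both arguments are optional; the defaults are the paths used in this post.
check_cross_setup() {
    multiarch_cfg="${1:-/etc/dpkg/dpkg.cfg.d/multiarch}"
    cuda_root="${2:-/usr/local/cuda}"
    status=0

    # Step 1 appended this exact line to the dpkg multiarch file.
    if grep -qx 'foreign-architecture armhf' "$multiarch_cfg" 2>/dev/null; then
        echo "ok: armhf enabled in $multiarch_cfg"
    else
        echo "missing: 'foreign-architecture armhf' in $multiarch_cfg"
        status=1
    fi

    # Step 3 should have created the ARM target folder.
    if [ -d "$cuda_root/targets/armv7-linux-gnueabihf" ]; then
        echo "ok: ARM target folder present"
    else
        echo "missing: $cuda_root/targets/armv7-linux-gnueabihf"
        status=1
    fi

    return "$status"
}
```

Calling check_cross_setup with no arguments checks the default locations; it returns a non-zero status if either check fails, so it can also be used in a setup script.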

For your cross-development needs, Jetson TK1 comes prepopulated with Linux for Tegra (L4T), a modified Ubuntu (13.04 or higher) Linux distribution provided by NVIDIA. NVIDIA provides the board support package and a software stack that includes the CUDA Toolkit, OpenGL 4.4 drivers, and the NVIDIA VisionWorks™ Toolkit. You can download all of these, as well as examples and documentation, from the Jetson TK1 Support Page.

Importing Your First Jetson TK1 CUDA Sample into Nsight

With the CUDA Toolkit installed and the paths set up on the host system, launch Nsight by typing “nsight” (without the quotes) at the command line or by finding the Nsight icon in the Ubuntu dashboard. Once Nsight is loaded, navigate to File->New->CUDA C/C++ Project to start the Project Creation wizard, where you will import an existing CUDA sample. For the project name, enter “boxfilter-arm” and select “Import CUDA Sample” as the project type and “CUDA Toolkit 6.0” as the toolchain. Next, choose the Boxfilter sample, which can be found under the Imaging category. The remaining options in the wizard let you choose which GPU and CPU architectures to generate code for. First, choose the GPU code that the nvcc compiler should generate. Since Jetson TK1 includes an NVIDIA Kepler™ GPU, choose SM32 GPU binary code and SM30 PTX intermediate code. (The latter is so that any Kepler-class GPU can run this application.) The next page in the wizard lets you decide whether to do native x86 development or cross-compile for an ARM system. To cross-compile for ARM, choose ARM architecture in the CPU architecture drop-down box.
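To make the wizard's GPU choices concrete, the sketch below prints the nvcc -gencode flags that correspond to "SM32 binary code plus SM30 PTX". The helper name gencode_flags is hypothetical; the flag syntax itself is the standard nvcc form that also appears in the Nsight build log later in this post.

```shell
#!/bin/sh
# Hypothetical helper: print the nvcc -gencode flags for the wizard choices.
#   $1: SM version to generate GPU binary (SASS) code for, e.g. 32
#   $2: SM version to embed PTX intermediate code for, e.g. 30
gencode_flags() {
    sass_sm="$1"
    ptx_sm="$2"
    printf '%s\n' "-gencode arch=compute_${sass_sm},code=sm_${sass_sm} -gencode arch=compute_${ptx_sm},code=compute_${ptx_sm}"
}

# The choices made for Jetson TK1 in this post:
gencode_flags 32 30
```

The first flag produces machine code for the Jetson TK1's GPU; the second embeds PTX that the driver can compile at load time for other Kepler-class GPUs.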


Building Your First Jetson TK1 Application from Nsight

CUDA samples are generic code samples that can be imported and run on various hardware configurations. For this cross-build exercise, the ARM library dependencies used by this application have to be resolved first. Here’s how you can resolve those:

1. Right click on the project and navigate to Properties->Build->Settings->Tool Settings->NVCC Linker->Libraries and update the paths to point to linux/armv7l instead of linux/x86_64. This will resolve the libGLEW library dependencies. Also remove the entry for GLU since that library is unused.


2. Click on the Miscellaneous tab and add a new -Xlinker option “--unresolved-symbols=ignore-in-shared-libs” (without the quotes).

3. In a terminal window, use the scp utility to copy the remaining libraries from your Jetson TK1 into the /usr/arm-linux-gnueabihf/lib folder on your host:

$ scp ubuntu@your.ip.address:/usr/lib/arm-linux-gnueabihf/libglut.so.3 /usr/arm-linux-gnueabihf/lib
$ scp ubuntu@your.ip.address:/usr/lib/arm-linux-gnueabihf/tegra/libGL.so.1 /usr/arm-linux-gnueabihf/lib
$ scp ubuntu@your.ip.address:/usr/lib/arm-linux-gnueabihf/libX11.so.6 /usr/arm-linux-gnueabihf/lib

Then create a symlink for each library in that folder (libglut.so, libGL.so and libX11.so). For example:

$ ln -s /usr/arm-linux-gnueabihf/lib/libglut.so.3 /usr/arm-linux-gnueabihf/lib/libglut.so

Note: You need to copy these ARM libraries only for the first CUDA sample. You may need additional libraries for other samples.
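If you expect to repeat the library copy for other samples, it can be scripted. The sketch below is one possible approach, not part of Nsight: the names fetch_arm_libs, run, and the RUN variable are made up for illustration. By default it only prints the scp and ln commands so you can review them; set RUN=1 to execute them (writing to /usr/arm-linux-gnueabihf/lib will likely require root privileges).

```shell
#!/bin/sh
# Hypothetical helper: print each command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Copy the three libraries from the Jetson TK1 and create the .so symlinks.
#   $1: Jetson login, e.g. ubuntu@your.ip.address (placeholder)
#   $2: destination folder on the host (defaults to the one used in this post)
fetch_arm_libs() {
    target="$1"
    dest="${2:-/usr/arm-linux-gnueabihf/lib}"
    src=/usr/lib/arm-linux-gnueabihf

    # Paths are relative to $src on the Jetson TK1.
    for lib in libglut.so.3 tegra/libGL.so.1 libX11.so.6; do
        file=${lib##*/}          # strip any folder, e.g. libGL.so.1
        link=${file%.so.*}.so    # unversioned name, e.g. libGL.so
        run scp "$target:$src/$lib" "$dest/"
        run ln -sf "$dest/$file" "$dest/$link"
    done
}

# Dry run: prints six commands without touching the system.
fetch_arm_libs ubuntu@your.ip.address
```

To add libraries for another sample, extend the list in the for loop.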

The build process for ARM cross-development is similar to the local build process. Just click on the build “hammer” icon in the toolbar menu to build a debug ARM binary.  As part of the compilation process, Nsight will launch nvcc for the GPU code and the arm-linux-gnueabihf-g++-4.6 cross-compiler for the CPU code as follows:

Building file: ../src/boxFilter_kernel.cu
Invoking: NVCC Compiler
/usr/local/cuda-6.0/bin/nvcc -I"/usr/local/cuda-6.0/samples/3_Imaging" -I"/usr/local/cuda-6.0/samples/common/inc" -I"/home/satish/cuda-workspace_new/boxfilter-arm" -G -g -O0 -ccbin arm-linux-gnueabihf-g++-4.6 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_32,code=sm_32 --target-cpu-architecture ARM -m32 -odir "src" -M -o "src/boxFilter_kernel.d"
/usr/local/cuda-6.0/bin/nvcc --compile -G -I"/usr/local/cuda-6.0/samples/3_Imaging" -I"/usr/local/cuda-6.0/samples/common/inc" -I"/home/satish/cuda-workspace_new/boxfilter-arm" -O0 -g -gencode arch=compute_30,code=compute_30 -gencode arch=compute_32,code=sm_32 --target-cpu-architecture ARM -m32 -ccbin arm-linux-gnueabihf-g++-4.6 -x cu -o "src/boxFilter_kernel.o"
Finished building: ../src/boxFilter_kernel.cu

After the compilation steps, the linker will resolve all library references, giving you a boxfilter-arm binary that is ready to run.

Running Your First Jetson TK1 Application from Nsight

To run the code on the target Jetson TK1 system, click on Run As->Remote C/C++ Application to set up the target system user and host address.


Once you finish the remote target system configuration setup, click on the Run icon and you will see a new entry to run the boxfilter-arm binary on the Jetson TK1.

Note: The Box Filter application relies on data files that reside in the data/ subfolder of the application, which need to be copied to the target system. Use the scp utility to copy those files into the /tmp/nsight-debug/data/ folder on your Jetson TK1.

Next, edit the boxfilter.cpp file as follows:
1. To ensure that the application runs on the correct display device, add this line to the top of the main function:

setenv("DISPLAY", ":0", 0);

2. Add the following lines to the top of the display function so that the app auto-terminates after a few seconds. This is required to gather deterministic execution data across multiple runs of the application, which we will need later in the profiling section:

static int icnt = 120;

Click on Run to execute the modified Box Filter application on your Jetson TK1.

Debugging Your First Jetson TK1 Application in Nsight

The remote target system configuration that you set up in Nsight earlier will also be visible under the debugger icon in the toolbar.

Before you launch the debugger, note that by default Jetson TK1 does not allow any application to solely occupy the GPU 100% of the time. In order to run the debugger, we need to fix this. On your Jetson TK1, log in as root (sudo su) and then disable the timeout as follows (in future releases of CUDA, the debugger will handle this automatically):

root@tegra-ubuntu:/home/ubuntu# echo N > /sys/kernel/debug/gk20a.0/timeouts_enabled
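If you toggle this setting often, a small wrapper that validates the value and reads it back can help avoid typos. This is a sketch only: the function name set_gpu_timeouts is made up, and it assumes the control file accepts Y to re-enable the timeout, as debugfs boolean files normally do. It must be run as root on the Jetson TK1.

```shell
#!/bin/sh
# Hypothetical helper: write Y or N to the gk20a timeout control and read it back.
#   $1: Y (enable timeouts) or N (disable them for debugging)
#   $2: control file (defaults to the debugfs path used in this post)
set_gpu_timeouts() {
    value="$1"
    ctl="${2:-/sys/kernel/debug/gk20a.0/timeouts_enabled}"

    # Reject anything other than Y or N before touching the file.
    case "$value" in
        Y|N) ;;
        *) echo "usage: set_gpu_timeouts Y|N [control-file]" >&2; return 2 ;;
    esac

    echo "$value" > "$ctl" || return 1
    echo "timeouts_enabled is now: $(cat "$ctl")"
}
```

For example, run set_gpu_timeouts N before a debug session and set_gpu_timeouts Y afterwards to restore the default behavior.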

Now we can launch the debugger using the debug icon back on the host system. Nsight will switch you to its debugger perspective and break on the first instruction in the CPU code. You can single-step a bit there to see the execution on the CPU and watch the variables and registers as they are updated.

To break on any and all CUDA kernels executing on the GPU, go to the breakpoint tab in the top-right pane of Nsight and click on the cube icon dropdown. Then select the “break on application kernel launches” feature to break on the first instruction of a CUDA kernel launch. You can now resume the application, which will run until the first breakpoint is hit in the CUDA kernel. From here, you can browse the CPU and GPU call stack in the top-left pane. You can also view the variables, registers and HW state in the top-right pane. In addition, you can see that the Jetson TK1’s GPU is executing 16 blocks of 64 threads each running on the single Streaming Multiprocessor (SMX) of this GK20A GPU.

You can also switch to disassembly view and watch the register values being updated by clicking on the i-> icon to do GPU instruction-level single-stepping.


To “pin” (focus on) specific GPU threads, double click the thread(s) of interest in the CUDA tab in the top-right pane. The pinned CUDA threads will appear in the top-left pane, allowing you to select and single-step just those threads. (Keep in mind, however, that single-stepping a given thread causes the remaining threads of the same warp to step as well, since they share a program counter.)  You can experiment and watch this by pinning threads that belong to different warps.

There are more useful debug features that you will find by going into the debug configuration settings from the debug icon drop down, such as enabling cuda-memcheck and attaching to a running process (on the host system only).

To quit the application you are debugging, click the red stop button in the debugger perspective.

Profiling Your First Jetson TK1 Application in Nsight

Let’s switch back to the C++ project editor view to start the profiler run. The remote target system configuration you set up in Nsight earlier will also be visible to you under the profiler icon in the toolbar.

Before you launch the profiler, note that you need to create a release build with -lineinfo included in the compile options. This tells the compiler to generate information on source-to-instruction correlation. To do this, first go to the project settings by right-clicking on the project in the left pane. Then navigate to Properties->Build->Settings->Tool Settings->Debugging and check the box that says “Generate line-number…” and click Apply.

Back in the main window, click on the build hammer dropdown menu to create a release build. Resolve any build issues as you did during the first run above, then click on Run As->Remote C/C++ Application to run the release build of the application. At this point Nsight will overwrite the binary on the Jetson TK1 system with the release binary you want to profile and run it once.

Next click on the profile icon dropdown and choose Profile Configurations where you must select “Profile Remote Application” since the binary is already on the Jetson TK1. Nsight will then switch you to the profiler perspective while it runs the application to gather an execution timeline view of all the CUDA Runtime and Driver API calls and of the kernels that executed on the GPU. The properties tab displays details of any event you select from this timeline; the details of the events can also be viewed in text form in the Details tab in the lower pane.


Below the timeline view in the lower pane, there is also an Analysis tab that is very useful for performance tuning. It guides you through a step-by-step approach on resolving performance bottlenecks in your application. You can switch between guided and unguided analysis by clicking on their icons under the Analysis tab.

You can also get a source-to-instruction correlation view, with hot spots (where the instructions-executed count was particularly high) identified in red as shown in the figure below. You get this view from within the guided analysis mode by first clicking on “Examine Individual Kernels” and selecting the highest ranked (100) kernel from the list of examined kernels, then clicking “Perform Kernel Analysis” followed by “Perform Compute Analysis.” From there, clicking “Show Kernel Profile” will show d_boxfilter_rgba_a kernel in the right pane. Double-click on the kernel name to see the source-to-instruction view. Clicking on a given line of source code highlights the corresponding GPU instructions.


As you can see, whether you are new to NVIDIA® Nsight™ Eclipse Edition or an avid Nsight user, Nsight makes it just as easy and straightforward to create CUDA applications for the Jetson TK1 platform as for all your CUDA-enabled GPUs.


About Satish Salian

Satish Salian is a Sr. Software Engineering Manager at NVIDIA responsible for the software stack and developer experience of the world’s fastest deskside deep learning machine, the DIGITS DevBox. Satish has over 13 years of experience at NVIDIA, with prior projects that include building CUDA developer tools, display control UI tools and SDKs. He has a Bachelor's degree in Computer Engineering from the University of Pune, India.
  • Josh Smith

    Thanks for this post, Satish! I’m still waiting to get my Jetson TK1. When I get it, I intend on developing on it from my Mac (OS X Mavericks). Do you know if that setup is supported, or should I use a Ubuntu partition instead?

    • Satish Salian

      Josh, good to know that you have a board on the way. Please use Ubuntu 12.04 LTS on the host for cross-development. Mac OS X is also a supported host platform, but the “synchronize-projects” remote development mode is the way to go on Mac. I’ll add more details on Mac in a future post.

  • Is there any way to compile the program inside the Jetson itself?! I couldn’t find any instruction for that?

  • Alexander Koumis

    Great guide Satish, looking forward to the synchronize-projects version.

  • Miner

    Hi, I followed the same steps but I keep on running into Xlib : extension “GLX” missing on display “:0” error… Can anybody guide me.

    • Satish

      You would usually see this error if you don’t have an active desktop running on Jetson TK1. Do you have a panel connected to Jetson TK1?

      • Miner

        Thank you. That helped.

  • Satish

    All my earlier replies on these questions/comments were made from the blog portal and were thus lost. So if you are seeing late replies, you know why :-) I am now using Disqus for the replies.

  • Loukas Bampis

    Hello, I followed the above instructions and everything worked very good. Thank you for your great tutorial. I have one problem though. I am using Nsight in order to debug and profile my code and when I time the output lets say that I get x seconds. If I take the exact same code and compile it on the board, I am getting y seconds, with y secs being smaller than x. So the algorithms run faster if I compile them on the board and without using cross-compilation. Does anybody have any idea for that?
    Thank you.

    • Satish

      The generated GPU (SASS) code will be the same whether cross-compiled or natively compiled. Please check that the GPU code generation options (which I mentioned above in the blog) are the same in both the cross-compile and the native-compile case; they both need to be SM32 for code and SM30 for PTX. Also check whether you are using any debug options such as -G, and make sure any such flags are the same across both compile paths.

  • payal talati

    Can I upgrade graphics driver in Jetson TK1 platform?

    – I am having NVIDIA Jetson TK1 kit and I am having Linux ubuntu inside. Now I need to try latest ES3.1 extension like tesselation shader or draw indirect but I am getting linker error as those functions are not available in the library.

    – I am assuming NVIDIA is working on new ES3.1 extension with google. So, I believe there must be new version of drivers for that toolkit.


    • Satish

      No you should never update just the driver on JetsonTK1 since the driver is part of the L4T OS image. ES3.1 is supported in the upcoming Rel21 to be announced soon.

      • Shiney

        Ok, Thanks Satish.

  • Graham

    “sudo apt-get update” has problem something like this. Is there any suggestions for this?

    Err http://archive.ubuntu.com precise-security/universe armhf Packages
    404 Not Found [IP: 2001:67c:1360:8c01::19 80]

    • Mark Ebersole

      Graham, this is a known issue after adding “foreign-architecture armhf” to the multiarch file and we’re working to fix it.

      However, this shouldn’t have any effect on your system (it’s a harmless error). Are you seeing other problems?

      • Graham

        So far it is good! Thank you Mark!!!

  • Archith


    The .deb file for the cross compilers installs the 6.5 version of CUDA’s cross compilers. On the otherhand, Jetson TK1 is at CUDA 6.0. This causes a version mismatch between the gdb server on TK1 and the gdb client on the host. Is there a way to resolve this? I am running ubuntu 12.04.


    • Hi Archith, please make sure you download the CUDA 6.0 cross compilation toolkit from the Jetson TK1 page (https://developer.nvidia.com/jetson-tk1-support), not the CUDA 6.5 toolkit from the CUDA download page. Alternatively you can wait for the next release of L4T, coming soon, which will support CUDA 6.5.

      • Archith

        Hi Mark,

        Thank you for your response.

        I did use the deb file for ubuntu 12.04. I have posted more details on the nvidia devtalk forum (https://devtalk.nvidia.com/default/topic/774786/cuda-setup-and-installation/cuda-6-0-on-ubuntu-14-04).

        The gist of the discussion was that the cross compiler deb file meant for ubuntu 12.04 points to cuda 6.5 tools instead of 6.0, and this causes an incompatibility which prevents cross-debugging. Is there an ETA on the 6.5 support for Jetson TK1?


        • The cross-compiler .deb file I referred to is this one:

          That is CUDA 6.0.

          • Archith

            I have used that exact deb file and I still run into CUDA 6.5 tools when I execute ‘apt-get install cuda-cross-armhf’, which is quite puzzling.


          • Ahah! We figured it out. :) Due to updates since the .deb was posted, you need to specify that you want the 6.0 tools like this:

            apt-get install cuda-cross-armhf-6-0

          • Archith

            Yes, that was it! I can cross-debug now. Thank you for pointing that out. It might be useful to add a note somewhere for the benefit of CUDA newbies like me.


          • Zhaoyufei

            Please help me ,when i try ”apt-get install cuda-cross-armhf-6-0”, there are some problems as below.What can i do?

            ~$ sudo apt-get install cuda-cross-armhf-6-0
            Reading package lists… Done
            Building dependency tree
            Reading state information… Done
            Some packages could not be installed. This may mean that you have
            requested an impossible situation or if you are using the unstable
            distribution that some required packages have not yet been created
            or been moved out of Incoming.
            The following information may help to resolve the situation:

            The following packages have unmet dependencies:
            cuda-cross-armhf-6-0 : Depends: cuda-driver-libs-cross-armhf (>= 331.00) but it is not installable
            Depends: cuda-driver-headers-cross-armhf (>= 331.00) but it is not installable
            Depends: cuda-core-libs-cross-armhf-6-0 (= 6.0-52) but it is not installable
            Depends: cuda-extra-libs-cross-armhf-6-0 (= 6.0-52) but it is not installable
            Depends: cuda-headers-cross-armhf-6-0 (= 6.0-52) but it is not installable
            E: Unable to correct problems, you have held broken packages.

          • Satish

            Maybe you didn’t enable armhf as a foreign arch, again here’s how you would enable armhf:

            $ sudo sh -c 'echo "foreign-architecture armhf" >> /etc/dpkg/dpkg.cfg.d/multiarch'
            $ sudo apt-get update

          • Zhaoyufei

            Thanks,but it gave some problems when i tried $ sudo apt-get update,like this:
            Err http://extras.ubuntu.com precise/main armhf Packages
            404 Not Found
            Err http://security.ubuntu.com precise-security/main armhf Packages
            404 Not Found [IP: 80]
            Err http://security.ubuntu.com precise-security/restricted armhf Packages
            404 Not Found [IP: 80]
            Err http://security.ubuntu.com precise-security/universe armhf Packages
            404 Not Found [IP: 80]
            Err http://security.ubuntu.com precise-security/multiverse armhf Packages
            404 Not Found [IP: 80]

            Also when i try $sudo apt-get install g++-4.6-arm-linux-gnueabinf ,it show a problem like:
            Reading package lists… Done
            Building dependency tree
            Reading state information… Done
            E: Unable to locate package g++-4.6-arm-linux-gnueabinf
            E: Couldn’t find any package by regex ‘g++-4.6-arm-linux-gnueabinf’

            so how can i get ‘g++-4.6-arm-linux-gnueabinf’??
            please help me

          • Zhaoyufei

            thanks for all your reply! They really help a lot. Finally i have set up the toolkit.

    • Ashish Rajput

      Hi Archith,

      Did you use 32 bit or 64 bit? Also i got stuck on

      “sudo dpkg – i cuda-repo-ubuntu1204_6.0-37_amd64.deb” command. It always gives me some error.
      In short, thing are not favorable. Any suggestion would be much appreciated.

      • Archith

        Maybe you could post the error you are seeing?


  • Ashish Rajput

    Hi, i am having trouble deciding version. Please share your suggestions. I will try keep it simple.

    Ubunut 14 or 12 ?
    32bit or 64 bit?
    Cuda toolkit 6 or 6.5?

    i have tried my best and it appears dpkg command is not working in any version. i have tried ubuntu 14, 12 32-64bit.

    This paper suggests we should use ubuntu 12 (32 bit) but this file “sudo dpkg -i cuda-repo-ubuntu1204_6.0-37_amd64.deb” seems 64bit to me.

    • Satish

      On the host packages there are no x86 32b debian packages from NVIDIA so host system has to be 64b system. For cross compilation stay with Ubuntu12.04 on the host. You are using the right CUDA6.0 toolkit package cuda-repo-ubuntu1204_6.0-37_amd64.deb for your 12.04 host system.

      Regarding CUDA 6.5 toolkit, please note the current shipping Jetson TK1 OS image for L4T (Linux for Tegra) does not contain the latest CUDA 6.5 toolkit or the related driver. CUDA6.5 toolkit will be available in a future L4T release (Rel21.2). You can check the L4T version with the following command:

      > head -1 /etc/nv_tegra_release

      If you want more flexibility on the host OS please use the Nsight synchronized-project mode.
      More info @ http://devblogs.nvidia.com/parallelforall/remote-application-development-nvidia-nsight-eclipse-edition/

      • Ashish Rajput

        Thanks for help. It’s working fine at the moment. Also, i manage to cross-compile cuda samples on both Ubuntu 14.01 and 12.04

      • Guest

        Thanks for help. It’s working fine at the moment. Also, i manage to do cross-compilation on both Ubuntu 14.01 and 12.04

  • Zhaoyufei

    I have installed the toolkit on my Jetson Tk1,and checked that by ‘nvcc -v’on the terminal.The document says that the Nsight is inside the toolkit,how can i find the Nsight? PS,I have tried type the ‘nsight’ on the terminal,but it said there is no such commend. What should I do to get the Nsight on my TK1 ?And I only want to write native CUDA code ,not cross-compilation,only TK1,is that possible ?

    • Satish

      There are no native UI tools in the ARM Jetson TK1 toolkit, so there is no Nsight Eclipse Edition on Jetson TK1. If you want to do native compilation instead of cross-compilation, you can use the remote synchronized-project mode. More info @ http://devblogs.nvidia.com/parallelforall/remote-application-development-nvidia-nsight-eclipse-edition/

      • Zhaoyufei

        Thank you very much !
        I have some problems when I try to install the toolkit to my ubuntu 14.04(32bit) host(double OS with Win7 64bit). After install the driver Version 340 for my GT555m,I can’t get my UI desktop back by start lightdm.I can’t fix it .Is the problem of the driver ?

        As i don’t have another PC, can i make remote or cross compilation on Windows platform?

        Or maybe I should change to 12.04(64bit)?Is ubuntu-12.04.4-desktop-amd64 OK?

        Dose double OS make any effect to the host?

        • Ashish Rajput

          Hi Zhaoyufei,

          You should not use 32 bit version either of Ubuntu 14 or 12 OS (nvidia provides 64bit cross compilation toolkit package only).

          Does not matter whether you have graphics card installed on host PC.

          Windows, you should NOT do this.

          Yes, it would be a lot easier to set up cross-compilation using Ubuntu 12.04 (64 bit) on the host. You may encounter a problem installing the dpkg package (running as the root user would resolve it). Also, while installing the CUDA cross ARM compiler, use “sudo apt-get install cuda-cross-armhf-6-0” instead of what is mentioned above (it may install cuda-cross-6.5).

          Though i have not tested with double OS but based on my understading it should not affect at all.

          Hopefully, this will bring smile on your face.


          • Zhaoyufei

            Hi Ashish
            Thanks for your reply! I have installed the toolkit 6.5,so i am wandering will toolkit 6.5+cross 6.0 work?i am a undergraduate student,and the gurduate project my teacher gave me is to achieve an image enhance algorithm on Jetson TK1 by Gpu programing. I never involved cuda before,so could you please recommend some basic CUDA study materials ?

          • Satish

            Ashish, great to see you helping out Zhaoyufei with his questions after your recent success.
            Zhaoyufei, you will see version mismatch if you mix toolkit versions so stay with 6.0TK on the host too. Here’s links to imaging samples and programming guide: http://docs.nvidia.com/cuda/cuda-samples/index.html#imaging and http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#programming-model

          • Zhaoyufei

            Hi Satish
            Thanks for your reply.I downloaded my toolkite 6.5 on Nvidia’s web :
            There are only toolkit 6.5.Could you please tell me where can i find the toolkit 6.0? How can i return my toolkit to 6.0?Just uninstall the 6.5 and install the 6.0?Please give me some guides.
            Besides thanks for the links,that help me a lot.^_^

          • Satish

            Execute “sudo apt-get install cuda-toolkit-6.0” to get 6.0 toolkit installed side-by-side with your existing 6.5 install. That’s it since you have 6.5 already installed.
            For users who don’t have latest toolkit installed, older versions of released toolkits can be found here:


          • Zhaoyufei

            Thanks for reply.
            That sounds easy for me ,are they compatible with each other?
            All i have to do is to ececute “sudo apt-get install cuda-toolkit-6.0”?
            If i stay with 6.5 ,can i use the remote synchronized-project mode for Jetson TK1?

            thank you very much!

          • Satish

            Yes execute that command I mentioned above to install 6.0TK side by side with 6.5TK. And no you cannot use 6.5TK with JetsonTK1 since it has a 6.0TK, there will be version mismatch issues.

  • Zhaoyufei

    If my host uses toolkit 6.5 can I make cross compilation for jetson tk 1? What about the remote synchronized-project mode? If not, what Should I do ? Please

    • Ashish Rajput

      Hi Zhaoyufei,

      The current Linux for Tegra (L4T) r19 OS version does not have drivers included to support cuda-toolkit 6.5. It will be included in the forthcoming version of L4T r21. (I read it somewhere but could not recall exact source)


  • Zhaoyufei

    Excuse me,it’s me again.When i try the image sample on my host Nsight, it gives an error like the picture ”freeglut (/home/zhaoyufei/cuda-workspace/1/Debug/1): OpenGL GLX extension not supported by display ‘:0’)”.Some people say that’s because the driver,the picture 3 is my driver setting(version 340),and it seems not in use .But i can use the Nsight for some other samples not involve image which means i can use the driver Version.340.

    Is that because some thing wrong with my GL lib?

    There is also a question bother me, can i use my independent video card (GT 555m)for both OS UI and CUDA GPU computation? The situiation on my laptop seems to be that the Ubuntu use my Integrated graphics for UI and my GT55m for GPU computations,so my Nvidia X server setting looks like pic 2.
    Is that promble aboved caused by this?Should i use my GPU for both X server and computation?

    Waiting for your reply! thanks

    • Satish

      Based on your laptop config, that’s correct: you can run only non-GL CUDA apps on your laptop. Please use Nsight on your host system to connect to the Jetson TK1 and run the OpenGL-CUDA app there.

      • Zhaoyufei

        You mean i have to use remote mode to run those image samples ?
        I have to connect my laptop to JetsonTK1 and there should be a display connect to the TK1? The result will be shown on the display,not my laptop,am i right?

        • Satish

          Yes on all those Qs.

          • Zhaoyufei

            Thanks for reply.

            I’m soory i still have some questions.Can you give some detials about connecting host with Jetson TK1? Should i use twisted -pair cable? Should the OS on Jetson TK1 be running? when i use scp to copy libs ,the “IP_ADDR” is the host’s ip address?

          • You just need to connect both the TK1 and your host system to the same network. You may find the guides here useful for getting started with Jetson TK1: http://elinux.org/Jetson_TK1

  • Alexander

    Hello. I’m sorry, i have this problem and i don’t how a can solve it. Please help me.

    • Satish

      Please follow the steps listed under:
      Building Your First Jetson TK1 Application from Nsight

      • Alexander

        I’m sorry but i make step-by-step/ I don’t know what certainly i make wrong

        • Satish

          Here’s how you copy libglut from your JetsonTk1 to your host system and create symlink:
          scp ubuntu@IP_ADDR:/usr/lib/arm-linux-gnueabihf/libglut.so.3 /usr/arm-linux-gnueabihf/lib
          ln -s /usr/arm-linux-gnueabihf/lib/libglut.so.3 /usr/arm-linux-gnueabihf/lib/libglut.so
          You can repeat the same for other libs.
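Repeating the symlink step for every copied library can be scripted. The following is only a minimal sketch, not part of the original instructions: `link_unversioned` is a hypothetical helper name, and it assumes the versioned libraries (libfoo.so.N) have already been copied into the directory you pass to it.

```shell
# Hypothetical helper: for every versioned library (libfoo.so.N) in a
# directory, create the unversioned linker name (libfoo.so) next to it.
link_unversioned() {
    dir=$1
    for lib in "$dir"/lib*.so.*; do
        [ -e "$lib" ] || continue        # the glob matched nothing
        base=${lib##*/}                  # e.g. libglut.so.3
        name=${base%%.so.*}.so           # e.g. libglut.so
        # Leave existing files and symlinks (even dangling ones) untouched.
        if [ ! -e "$dir/$name" ] && [ ! -L "$dir/$name" ]; then
            ln -s "$base" "$dir/$name"   # relative link within the directory
        fi
    done
}

# Usage on the host (the directory is root-owned, so this likely needs sudo):
#   link_unversioned /usr/arm-linux-gnueabihf/lib
```

If a library exists in several versions, the first one the glob returns wins, so check the resulting links with `ls -l` before building.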

          • Alexander

            Satish, thank you very much. It works fine.

          • Zhaoyufei

            Hi Satish, I tried this:

            “scp ubuntu@ /usr/arm-linux/gnueabihf/lib folder, with a symlink to libGL.so”

            The IP address is my Jetson TK1’s address,
            but it gave an error like this:
            “libGL.so: No such file or directory”

            I have checked that my laptop and the Jetson can ping each other, and that ‘/usr/lib/arm-linux-gnueabihf/tegra/libGL.so.1’ exists. Is there something else wrong? Please give me your suggestions. Thanks.

          • Graham

            This should work.
            “scp ubuntu@ /usr/arm-linux-gnueabihf/lib”

  • Alexander

    Hello. I’m sorry, I have this problem and I don’t know how I can solve it.

  • Wilson

    Hello, when I try to build the project, I get this error: cannot find -lcudadevrt and -lcudart_static.
    I looked for these two files in /usr/local/cuda/lib64, but I only found libcudadevrt.a and libcudart_static.a, not libcudadevrt.so and libcudart_static.so.
    Please help me, thanks.

    • Satish

      Make sure you have the CUDA cross packages installed, and also reread build step #1: right-click on the project, navigate to Properties->Build->Settings->Tool Settings->NVCC Linker->Libraries, and update the paths to point to linux/armv7l instead of linux/x86_64.

      • Wilson

        – First, thanks for your reply. I have made sure that step #1 is done, but it still doesn’t work. Maybe I didn’t install the CUDA cross packages correctly. If so, how can I verify that they were installed successfully?
        I tried typing “nvcc -V”, and it showed:
        nvcc: NVIDIA (R) Cuda compiler driver
        Copyright (c) 2005-2013 NVIDIA Corporation
        Built on Thu_May__8_22:30:05_PDT_2014
        Cuda compilation tools, release 6.0, V6.0.1

        – And for the step #2, if I add a new -Xlinker option “—unresolved-symbols=ignore-in-shared-libs”, another error occurs:
        “cannot find —unresolved-symbols=ignore-in-shared-libs: No such file or directory”

        Why does this error occur?

        • Wilson

          I just discovered why the error occurs in step #2. On the web page, the leading “—” of “—unresolved-symbols=ignore-in-shared-libs” is a single fullwidth dash character rather than two ASCII hyphens (--), so Nsight can’t recognize it as an option.
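Options pasted from a web page can be checked for this kind of typographic substitution before they go into the linker settings. This is a small sketch under my own assumptions: `is_plain_ascii` is a hypothetical helper name, and it simply rejects any string containing bytes outside the printable ASCII range.

```shell
# Hypothetical helper: succeeds only if the argument consists entirely of
# printable ASCII characters (bytes 0x20 through 0x7E).
is_plain_ascii() {
    # Delete every printable ASCII byte; anything left over is suspicious.
    rest=$(printf '%s' "$1" | LC_ALL=C tr -d ' -~')
    [ -z "$rest" ]
}

# Usage:
#   is_plain_ascii '--unresolved-symbols=ignore-in-shared-libs' || echo "retype the option by hand"
```

The same check would also have flagged the “dpkg – i” command from the post, where the dash is not an ASCII hyphen.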

      • Wilson

        Hello Satish, I resolved the problem by reinstalling Ubuntu and the CUDA toolkit. When installing cuda-cross-armhf, though, I used “sudo apt-get install cuda-cross-armhf-6-0” instead of “sudo apt-get install cuda-cross-armhf”. If I use “sudo apt-get install cuda-cross-armhf”, it installs CUDA 6.5 by default.
        Thanks a lot!

  • Graham

    Hello All,

    I have a problem with
    “$ sudo dpkg – i cuda-repo-ubuntu1204_6.0-37_amd64.deb”

    When I enter the command I am getting this response

    “Unknown configuration key `foreign-architecture’ found in your `dpkg’
    configuration files. This warning will become a hard error at a later
    date, so please remove the offending configuration options and replace
    them with `dpkg --add-architecture’ invocations at the command line.”

    Any idea?

    • Satish

      Are you running this command on your Linux64 host system?

      • Graham

        I solved this problem by typing the command below. The problem was the dash.

        $ sudo dpkg -i cuda-repo-ubuntu1204_6.0-37_amd64.deb
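The warning quoted above is separate from the dash typo: it comes from the `foreign-architecture` line that step 1 of the post writes into the dpkg configuration. As a hedged sketch, the check below looks for that key in a given file; `uses_deprecated_multiarch_key` is a hypothetical helper name, and whether your dpkg version already supports the replacement command is an assumption you would need to verify on your host.

```shell
# Hypothetical helper: succeeds if the given dpkg configuration file still
# contains the old "foreign-architecture" key.
uses_deprecated_multiarch_key() {
    grep -q '^[[:space:]]*foreign-architecture' "$1"
}

# Usage on the host (file path taken from step 1 of the post):
#   uses_deprecated_multiarch_key /etc/dpkg/dpkg.cfg.d/multiarch && echo "old key present"
#
# On dpkg versions that print this warning, the suggested replacement is:
#   sudo dpkg --add-architecture armhf
#   sudo apt-get update
```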

  • Guest

    Hello Satish,
    First of all, thank you for this documentation.

    My problem is below. I implemented all the steps. Can you help me?
    Thank you

  • Graham

    Hello Satish,

    First of all, thank you for this documentation.

    My problem is in the screenshot. I implemented all the steps. Can you help me?

    Thank you

    • Satish

      The misc linker option in step #2 must appear like this: --unresolved-symbols=ignore-in-shared-libs

      Since you have successfully copied the ARM libs, make sure you updated the paths to point to linux/armv7l to resolve the remaining issue.

      • Graham

        Thank You Satish,

        I got some improvement; however, I still need to solve this problem. I posted the screenshots.

        I also tried to change the permission of files that I transferred from JTK1. But it did not work.

        • Satish

          You need to copy all three target library files listed under build step #3. Also make sure the symlinks are set up.

          • Graham

            Satish, I made it work. Thank you for your help!

            My goal is to implement an image processing algorithm using OpenCV on the JTK1.

            Can I run the GPU module of OpenCV on the JTK1 using cross-compilation? If so, how can I do that?

            I have one more question. Is there any mathematical optimization in CUDA or in CUDA-related libraries?

            Thanks again.

  • Alexander

    Hello Satish, thank you! I want to install CUDA Toolkit 6.0 on my host machine. I downloaded the .deb file for version 6.0, but my host downloads 6.5 from the NVIDIA site. How do I fix that? Do you have a direct link to CUDA Toolkit 6.0 for Ubuntu x64?

    • Satish

      Use this 6.0 specific command: apt-get install cuda-cross-armhf-6-0

  • peepo

    Satish, a lot has happened in 6 months! Could we please have a clean write up for JetPack with 14.04?
    I’m working through this, but at a loss to be sure what has already been incorporated, and what has not…
    thanks again


    No source available for “sched_yield() at 0xb6d6cdb6”

    • Satish

      Jonathan, yes, agreed: an updated write-up of the install steps with the newer toolkit version is needed. It may become a CUDA Pro Tip. We’ll see.

  • Ham

    When I click on the build “hammer” icon in the toolbar menu to build a debug ARM binary, an error occurs.

    The error is below.

    19:59:22 **** Incremental Build of configuration Debug for project boxfilter-arm ****

    make all

    Building file: ../src/boxFilter.cpp

    Invoking: NVCC Compiler

    /usr/local/cuda-6.5/bin/nvcc -I"/usr/local/cuda-6.5/samples/3_Imaging" -I"/usr/local/cuda-6.5/samples/common/inc" -I"/home/ubuntu/cuda-workspace/boxfilter-arm" -G -g -O0 -ccbin arm-linux-gnueabihf-g++-4.6 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_32,code=sm_32 --target-cpu-architecture ARM -m32 -odir "src" -M -o "src/boxFilter.d"

    arm-linux-gnueabihf-g++-4.6: No such file or directory

    make: *** [src/boxFilter.o] Error 1

    19:59:22 Build Finished (took 108ms)

    What should I do?
    Please help me.
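The “No such file or directory” line in this log names the cross compiler that nvcc was told to use via -ccbin, so a first diagnostic is to check whether that binary is on the PATH of the machine doing the build. The sketch below is only an illustration; `have_tool` is a hypothetical helper name, and it does not say which package would provide the compiler on your distribution.

```shell
# Hypothetical helper: report whether a command can be found on PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Usage, with the compiler name taken from the build log above:
#   have_tool arm-linux-gnueabihf-g++-4.6
```

If the compiler is reported missing, either install a matching cross toolchain on the host or point the project's -ccbin setting at the cross compiler version that is actually installed.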

  • Chulian

    Hello Satish,

    I followed all the instructions presented in this post, and everything went well except remote profiling. It always ends with the error message shown below:

    “Unable to profile application.
    com.nvidia.viper.jni.CuptiException: CUPTI_ERROR_NOT_COMPATIBLE”

    To figure out what the problem is, I tried the following:
    1) Ran a couple of different CUDA samples; all show the same error.
    2) Reset the Jetson TK1 and the host machine and profiled again; still the same error.
    3) Did remote profiling from different host machines; all show the same error.
    4) Ran the application with nvprof on the Jetson TK1 board, generating an output file called “timeline.nvprof”, and imported this file into the NVIDIA Visual Profiler on the host machine; the same error happens again!
    5) Same as 4), but imported the output file with the command “nvprof --import-profile timeline.nvprof” on the host machine. It complains “Warning: The profile is invalid or incomplete” and then shows a partial result.
    6) Same as 4), but imported the output file with the command “nvprof --import-profile timeline.nvprof” on the Jetson TK1 board. Everything works perfectly: it shows all the timeline info without any warning.

    Based on these experiments, I think the problem is that nvprof on the host machine cannot “understand” the output file generated by nvprof on the Jetson TK1. But that still doesn’t explain the error “com.nvidia.viper.jni.CuptiException: CUPTI_ERROR_NOT_COMPATIBLE”. Could you take a look and tell me whether my analysis makes sense? If not, what could the problem be?

    For now, I can just use the nvprof on Jetson-TK1 to do profiling. But I still want to use the Nvidia Visual Profiler in the future since it’s much more convenient.

    FYI, The jetson-TK1 board has the latest SW package (JetPack 1.0). The host machine is running Ubuntu-12.04 LTS.



    • Chulian

      Hello Satish,

      After a couple of days of trial and error, I finally figured out the problem (special thanks to Yang Zhang, an engineer from DeepGlint). The reason is that nvprof on the device and nvprof on the host are different versions.

      It’s quite “interesting” that JetPack TK1 1.0 (the latest version at the time of writing), as provided on the NVIDIA website, installs different versions of nvprof on the device and on the host. As a result, remote profiling with the Visual Profiler always fails.

      To solve this problem, I just installed the older version cuda-toolkit-6-5 on the device, using this package: cuda-repo-l4t-r21.1-6-5-prod_6.5-14_armhf.deb. To be more clear, the device is actually running L4T-r21.2.


      • Satish

        That’s right, Chulian: you need the same version of the CUDA toolkit on the host system and the target system. Installing 6.5.14 was the right thing to do, since that is the 6.5 production release required on both systems.

        We will get the JetPack installer fixed to install the same version of the toolkit on both systems.
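A quick way to spot this kind of mismatch is to compare the version line that `nvcc -V` prints on each machine. The following is a minimal sketch, assuming the output format shown earlier in this thread (“Cuda compilation tools, release 6.0, V6.0.1”); `cuda_release` is a hypothetical helper name, and the nvcc path on the target is an assumption.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the version
# token from the "Cuda compilation tools" line (for example V6.0.1).
cuda_release() {
    sed -n 's/^Cuda compilation tools, release [0-9.]*, \(V[0-9.]*\).*$/\1/p'
}

# Usage from the host (IP_ADDR is the target's address; the nvcc path on the
# target is assumed, adjust it to your installation):
#   host=$(nvcc -V | cuda_release)
#   target=$(ssh ubuntu@IP_ADDR /usr/local/cuda/bin/nvcc -V | cuda_release)
#   [ "$host" = "$target" ] || echo "toolkit mismatch: host $host, target $target"
```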

  • disqus_T8ujNM74Ki

    I am new to the Jetson TK1. After following the steps to create my first cross-compiled project, I get some errors when linking:
    /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/bin/ld: warning: libxcb.so.1, needed by /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/lib/../lib/libX11.so, not found (try using -rpath or -rpath-link)
    /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/lib/../lib/libGL.so: undefined reference to ‘XextCreateExtension’
    /usr/lib/gcc-cross/arm-linux-gnueabihf/4.8/../../../../arm-linux-gnueabihf/lib/../lib/libglut.so: undefined reference to ‘XF86VidModeGetViewPort’

    Can anyone help me to fix them?

  • Prem Khanal

    Hi Satish,
    Thanks for the wonderful article. I was able to cross-compile for my Jetson TK1 board, but I am having issues with the OpenCV and FlyCapture SDK libraries when cross-compiling, so I have decided to develop locally on the Jetson TK1. The toolkit that I installed on the ARM system doesn’t include Nsight. Please let me know where I can find Nsight Eclipse Edition for local ARM development.

    • Satish

      There is no native ARM version of Nsight Eclipse Edition yet so your only choice is to use nvcc locally on the command line.

  • Marcus_V

    Can we *please* get an updated version of this page? The information is 15 months and several revisions of everything out of date.

    Please? Must we beg?

  • meng jun

    /usr/lib/gcc/arm-linux-gnueabihf/4.6/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lGLU
    and other errors like this one.

    I have done the following:
    sudo scp ubuntu@ /usr/arm-linux-gnueabihf/lib
    sudo ln -s /usr/arm-linux-gnueabihf/libGL.so.1 /usr/local/cuda/lib/libGL.so

    • Kevin Kang

      Did you try:
      $ sudo ln -s /usr/arm-linux-gnueabihf/libGL.so.1 /usr/arm-linux-gnueabihf/libGL.so
      instead of
      $ sudo ln -s /usr/arm-linux-gnueabihf/libGL.so.1 /usr/local/cuda/lib/libGL.so

  • Hong Xu

    Hi Satish. First of all, thanks for your article.
    I have the following problem. Can you help me?
    The problem (my host system is Ubuntu 14.04 (32-bit) on the Jetson TK1):

    ubuntu@tegra-ubuntu:/disk1$ sudo dpkg -i cuda-repo-ubuntu1404_6.5-14_amd64.deb
    Unknown configuration key `foreign-architecture’ found in your `dpkg’
    configuration files. This warning will become a hard error at a later
    date, so please remove the offending configuration options and replace
    them with `dpkg --add-architecture’ invocations at the command line.
    Unknown configuration key `foreign-architecture’ found in your `dpkg’
    configuration files. This warning will become a hard error at a later
    date, so please remove the offending configuration options and replace
    them with `dpkg --add-architecture’ invocations at the command line.
    Unknown configuration key `foreign-architecture’ found in your `dpkg’
    configuration files. This warning will become a hard error at a later
    date, so please remove the offending configuration options and replace
    them with `dpkg --add-architecture’ invocations at the command line.
    dpkg: error processing archive cuda-repo-ubuntu1404_6.5-14_amd64.deb (--install):

    • kevin kang

      Hi Hong Xu,

      The “cuda-repo-ubuntu1404_6.5-14_amd64.deb” installer is an x86 64-bit DEB installer, but it looks like you’re trying to install it on the target platform (“ubuntu@tegra-ubuntu:/disk1$”). Please note that the first step in cross-compilation is installing the CUDA Toolkit on your host system.

      You can also refer to https://developer.nvidia.com/linux-tegra-r214 for more information about the software supported on the TK1 platform. Thanks.
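The architecture a DEB package targets is usually visible in its file name (name_version_arch.deb), so it can be compared against the machine before running dpkg. This is a small sketch relying on that naming convention; `deb_arch` is a hypothetical helper name.

```shell
# Hypothetical helper: print the architecture suffix of a Debian package
# file name of the form name_version_arch.deb.
deb_arch() {
    file=${1##*/}        # drop any directory part
    file=${file%.deb}    # drop the extension
    printf '%s\n' "${file##*_}"
}

# Usage: compare against what dpkg reports for the current machine.
#   pkg=cuda-repo-ubuntu1404_6.5-14_amd64.deb
#   [ "$(deb_arch "$pkg")" = "$(dpkg --print-architecture)" ] || echo "$pkg does not match this machine"
```

On the Jetson TK1 the second command would report an ARM architecture, so the amd64 package from the log above would be flagged before installation is attempted.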

      • Hong Xu

        Thank you!

  • tamo2

    Hi Satish,
    I followed your instructions, and I can run the boxfilter sample remotely. I am having a problem debugging remotely. When I start remote debugging, I always get “Error in final launch sequence Connection is shut down ,Connection is shut down….” message. Any ideas?

    • tamo2

      OK, fixed. I had “gdb” in the “CUDA GDB executable” field in the Debugger tab. Replacing it with “cuda-gdb” fixed the issue.