Developing New Materials with GPU-Accelerated Supercomputers

Dr. Joshua A. Anderson is a Research Area Specialist at the University of Michigan who was an early user of GPU computing technology. He began his career developing software on the first CUDA-capable GPU and now runs simulations on one of the world’s most powerful supercomputers.

Anderson’s “contributions to the development and dissemination of the open source, GPU-enabled molecular simulation software, HOOMD-blue, which enables scientific computations with unprecedented speed” earned him the 2015 CoMSEF Young Investigator Award for Modeling and Simulation.

Brad Nemire: Can you talk a bit about your current research?

Joshua Anderson: I work with the Glotzer Group at the University of Michigan. We use computer simulation to discover the fundamental principles of how nanoscale systems of building blocks self-assemble, and to discover how to control the assembly process to engineer new materials. Specifically, we focus on the role of particle shape and how changing the shape can result in different material properties.

Figure 1: an example system configuration from the shape allophiles project: Eric S. Harper, Ryan Marson, Joshua A. Anderson, Greg van Anders, and Sharon C Glotzer. Shape Allophiles Improve Entropic Assembly. Soft Matter, 2015. (doi:10.1039/C5SM01351H).

Over the past few years, I have been focusing on two-dimensional systems, using large-scale simulations to study hexatic phase transitions for hard disks, and how patterning the surfaces of polygons can create shape allophiles that improve self-assembly. The hexatic phase is an intermediate between the fluid and the hexagonally ordered solid. In the hexatic phase, the orientation of bonds between particles has long-range order, but translational order is short-range and there is no crystal lattice. Shape allophiles are polygonal shapes cut so they fit together like puzzle pieces. These research projects are computationally demanding and could not have been run on any existing code. So before I could even begin the scientific research, I needed to develop, implement, and optimize the parallel algorithms necessary for these studies.
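As a rough illustration of the bond-orientational order mentioned above, one standard diagnostic is the sixfold order parameter psi_6, the average of exp(6iθ) over the bond angles θ from a particle to its neighbors. The sketch below is an assumption-laden toy, not code from HOOMD-blue or from the studies described here: the function name `psi6` and the hand-built neighbor lists are hypothetical, and a real analysis would first have to determine each particle's neighbors.

```python
import cmath
import math


def psi6(center, neighbors):
    """Local sixfold bond-orientational order parameter for one particle.

    center    -- (x, y) position of the particle
    neighbors -- list of (x, y) positions of its neighbors (assumed given)
    """
    if not neighbors:
        return 0j
    total = 0j
    for x, y in neighbors:
        # Angle of the bond from the particle to this neighbor.
        theta = math.atan2(y - center[1], x - center[0])
        total += cmath.exp(6j * theta)
    return total / len(neighbors)


# Six neighbors on a regular hexagon: every bond contributes the same phase.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]

# Four neighbors on a square: the sixfold phases cancel.
square = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]

print(abs(psi6((0.0, 0.0), hexagon)))
print(abs(psi6((0.0, 0.0), square)))
```

The magnitude of psi_6 is close to one for a hexagonally coordinated particle and close to zero for the square arrangement. How the phase of psi_6 decorrelates with distance across a large system is what distinguishes fluid, hexatic, and solid behavior.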

Accelerating Materials Discovery with CUDA

In this post, we discuss how CUDA has facilitated materials research in the Department of Chemical and Biomolecular Engineering at UC Berkeley and Lawrence Berkeley National Laboratory. This post is a collaboration between Cory Simon, Jihan Kim, Richard L. Martin, Maciej Haranczyk, and Berend Smit.

Engineering Applications of Nanoporous Materials

Figure 1: The repeating crystal structure of metal-organic framework IRMOF-1. Atom color dictionary = {carbon: gray, oxygen: red, zinc: blue, hydrogen: white}.

Nanoporous materials have nano-sized pores such that only a few molecules can fit inside. Figure 1 shows the chemical structure of metal-organic framework IRMOF-1, just one of the many thousands of nanoporous materials that have been synthesized.

Nanoporous materials have many potential engineering applications based on gas adsorption: the process by which gas molecules adhere to a surface. In this case, the walls of the material’s pores form the surface to which gas molecules stick. Figure 2 shows the unit cell of the IRMOF-1 crystal structure and the corresponding depiction of IRMOF-1 as a raveled-up surface.

If we could unravel and flatten out the surface of IRMOF-1 in Figure 2, the surface area contained in a single gram of it could cover more than a soccer field! This provides a lot of surface area on which gas molecules can adsorb. These high surface areas are part of the reason that nanoporous materials are so promising for many engineering applications.

Figure 2: A nanoporous material can be abstracted as a raveled-up surface. On the left is the unit cell of the IRMOF-1 crystal structure. On the right is a depiction of the surface that IRMOF-1 forms.

How We Achieved Record Finance Benchmark Performance on Tesla K80

STAC Research develops financial benchmarks in partnership with leading banks and software or hardware vendors. The STAC-A2 suite of benchmarks aims to represent the standard risk analysis workload that banks and insurance companies use to measure exposure on the financial markets. Earlier this year we published a Parallel Forall post on Monte Carlo simulation for the pricing of American options in STAC-A2.

Record Performance with Tesla K80

Recently, STAC Research published astonishing performance results for the STAC-A2 benchmarks on an NVIDIA Tesla K80. In short, a single Tesla K80 driven by two CPU cores outperforms all previously audited systems in terms of pure performance and power efficiency.

For more on these results, read “Bank on It: Tesla Platform Shatters Record on Risk-Management Benchmark” on the NVIDIA Blog.

We obtained these new results after several optimizations of our previously audited code. First, a large fraction of the computations are now avoided thanks to a better factorization of the underlying mathematical process. Second, we tuned some of the kernel parameters to take advantage of the larger register file of the Tesla K80. Finally, we were able to significantly reduce the latency in one of the main loops of the benchmark. Let’s take a look at these optimizations.

American Option Pricing with Monte Carlo Simulation in CUDA C++

In finance, an option (or derivative) is the common name for a contract that, under certain conditions, gives a firm the right or obligation to receive or supply certain assets or cash flows. A financial firm uses options to hedge risks when it operates in the markets. It is critical for a firm to be able to accurately price those instruments and understand their dynamics to evaluate its positions, balance its portfolio and limit exposure to potential threats. The calculation of risk and prices for options is a computationally intensive task for which GPUs have a lot to offer. This post describes an efficient implementation of American Option Pricing using Monte Carlo Simulation with a GPU-optimized implementation of the Longstaff-Schwartz algorithm.
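To make the idea concrete, here is a minimal CPU sketch of least-squares Monte Carlo pricing of an American put in the style of Longstaff and Schwartz. It is not the GPU implementation the post describes: the function names, the geometric Brownian motion model, the quadratic regression basis, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit y ~ c0 + c1*x + c2*x^2 via the normal equations."""
    a = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, y in zip(xs, ys):
        basis = (1.0, x, x * x)
        for i in range(3):
            b[i] += basis[i] * y
            for j in range(3):
                a[i][j] += basis[i] * basis[j]
    return solve_3x3(a, b)


def american_put_lsmc(s0, strike, rate, sigma, maturity, steps, paths, seed=0):
    """Estimate an American put price by regression-based backward induction."""
    rng = random.Random(seed)
    dt = maturity / steps
    disc = math.exp(-rate * dt)
    drift = (rate - 0.5 * sigma * sigma) * dt
    vol = sigma * math.sqrt(dt)

    # Simulate price paths; prices[i][k] is path i at time (k + 1) * dt.
    prices = []
    for _ in range(paths):
        s, path = s0, []
        for _ in range(steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            path.append(s)
        prices.append(path)

    # Start from the payoff at maturity and step backward in time.
    cash = [max(strike - p[-1], 0.0) for p in prices]
    for t in range(steps - 2, -1, -1):
        cash = [c * disc for c in cash]  # discount one step back to time index t
        itm = [i for i in range(paths) if prices[i][t] < strike]
        if len(itm) < 3:
            continue  # too few in-the-money paths to fit three coefficients
        # Regress the discounted future cash flow on moneyness (price / strike).
        xs = [prices[i][t] / strike for i in itm]
        coef = fit_quadratic(xs, [cash[i] for i in itm])
        for i, x in zip(itm, xs):
            continuation = coef[0] + coef[1] * x + coef[2] * x * x
            exercise = strike - prices[i][t]
            if exercise > continuation:
                cash[i] = exercise  # exercising now beats the estimated continuation

    # Discount the remaining step from time dt back to time zero.
    return disc * sum(cash) / paths


price = american_put_lsmc(36.0, 40.0, 0.06, 0.2, 1.0, steps=20, paths=2000)
print(round(price, 3))
```

The path simulation and the per-path exercise decision are independent across paths, which is the part that maps naturally onto GPU threads. The regression at each time step couples the paths together, so it needs a reduction and is usually the harder step to parallelize.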

NVIDIA recently partnered with IBM and STAC to implement the STAC-A2™ benchmark on two NVIDIA Tesla K20X GPUs. It is the first system that was able to calculate the risk and pricing of this particular complex option in less than a second. A system with two Tesla K20X GPUs is up to 6 times faster than a state-of-the-art configuration using only CPUs. Even more interestingly, adding one or two Tesla K20X GPUs to a system offers speedups of slightly more than 5x and 9x, respectively, compared to the same system without GPUs.

CUDA Spotlight: GPU-Accelerated Guidance and Control for Robotic Systems

This week’s CUDA Spotlight is on Jon Rogers of Texas A&M University. Jon is director of the Helicopter and Unmanned Systems Lab, where he works on new technologies for autonomous systems.

He is currently exploring new algorithms and sensing technologies to increase task complexity of robotic devices. His research encompasses the fields of non-linear dynamics, robust control, and high-performance computing.

NVIDIA: What problems has CUDA helped you solve?

Jon: CUDA has provided an entry point to GPU programming and execution that is highly compatible with our current guidance and control software. As we search for new ways to incorporate uncertainty quantification in real-time guidance laws, we are naturally drawn to GPU-based Monte Carlo due to its flexibility in handling nonlinear dynamics and non-Gaussian behavior.

We leverage CUDA primarily for parallel trajectory simulation, which means we have developed dynamic models for several vehicles (mostly aircraft) that run within a GPU kernel. Launching thousands of threads means we can run numerous dynamic simulations at once.
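The pattern Jon describes, many independent trajectory simulations with randomly perturbed inputs, can be sketched on the CPU as follows. This is only an illustration under stated assumptions: the point-mass model with quadratic drag, the function names, and the input distributions are all hypothetical and are not the lab's vehicle models. On a GPU, each call to `simulate_trajectory` would correspond to the work done by one thread.

```python
import math
import random


def simulate_trajectory(v0, angle, drag, dt=0.01, g=9.81):
    """Integrate a 2D point mass with quadratic drag; return the impact range in meters."""
    x, y = 0.0, 0.0
    vx, vy = v0 * math.cos(angle), v0 * math.sin(angle)
    while True:
        speed = math.hypot(vx, vy)
        # Semi-implicit Euler step: update velocity first, then position.
        vx += -drag * speed * vx * dt
        vy += (-g - drag * speed * vy) * dt
        x += vx * dt
        y += vy * dt
        if y < 0.0:
            return x


def monte_carlo_impact(samples, seed=0):
    """Propagate uncertain launch conditions through the dynamics model."""
    rng = random.Random(seed)
    ranges = []
    for _ in range(samples):
        # Each sample is independent of the others (one GPU thread per sample).
        v0 = rng.gauss(50.0, 2.0)                   # launch speed, m/s
        angle = math.radians(rng.gauss(45.0, 1.0))  # launch angle
        drag = abs(rng.gauss(0.002, 0.0005))        # drag coefficient, 1/m
        ranges.append(simulate_trajectory(v0, angle, drag))
    mean = sum(ranges) / samples
    var = sum((r - mean) ** 2 for r in ranges) / (samples - 1)
    return mean, math.sqrt(var)


mean_range, std_range = monte_carlo_impact(200)
print(round(mean_range, 1), round(std_range, 1))
```

Because no sample depends on any other, the loop over samples can be replaced by a kernel launch with one thread per trajectory. Nothing in this approach requires the dynamics to be linear or the input distributions to be Gaussian.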
