cuPyNumeric Lets Scientists Harness GPU Acceleration at Cluster Scale

Whether they’re looking at nanoscale electron behaviors or starry galaxies colliding millions of light years away, many scientists share a common challenge: they must comb through petabytes of data to extract insights that can advance their fields.

With the NVIDIA cuPyNumeric accelerated computing library, researchers can now take their data-crunching Python code and effortlessly run it on CPU-based laptops and GPU-accelerated workstations, cloud servers or massive supercomputers. The faster they can work through their data, the quicker they can make decisions about promising data points, trends worth investigating and adjustments to their experiments.

To make the leap to accelerated computing, researchers don’t need expertise in computer science. They can simply write code using the familiar NumPy interface or apply cuPyNumeric to existing code, following best practices for performance and scalability.

Once cuPyNumeric is applied, they can run their code on one or thousands of GPUs with zero code changes.
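As a rough illustration of what applying cuPyNumeric to existing NumPy code can look like, the sketch below changes only the import line. It assumes the library is importable as `cupynumeric` and falls back to NumPy when it is not installed. The `column_zscores` helper and the sample data are hypothetical, chosen only to show that the array code itself stays the same.

```python
# Assumption: cuPyNumeric is installed and importable as `cupynumeric`.
# If it is not available, fall back to plain NumPy so the script still runs.
try:
    import cupynumeric as np
except ImportError:
    import numpy as np


def column_zscores(data):
    """Standardize each column to zero mean and unit variance."""
    return (data - data.mean(axis=0)) / data.std(axis=0)


# Hypothetical sample data: 4 observations of 3 variables.
samples = np.arange(12.0).reshape(4, 3)
normalized = column_zscores(samples)
print(normalized.mean(axis=0))
```

Everything below the import is ordinary NumPy-style code, which is the property the article describes. How the script is launched across multiple GPUs or nodes is outside the scope of this sketch.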

The latest version of cuPyNumeric, now available on Conda and GitHub, offers support for the NVIDIA GH200 Grace Hopper Superchip, automatic resource configuration at run time and improved memory scaling. It also supports HDF5, a popular file format in the scientific community that helps efficiently manage large, complex data.

Researchers at the SLAC National Accelerator Laboratory, Los Alamos National Laboratory, Australia National University, UMass Boston, the Center for Turbulence Research at Stanford University and the National Payments Corporation of India are among those who have integrated cuPyNumeric to achieve significant improvements in their data analysis workflows.

Less Is More: Limitless GPU Scalability Without Code Changes

Python is the most common programming language for data science, machine learning and numerical computing, used by millions of researchers in scientific fields including astronomy, drug discovery, materials science and nuclear physics. Tens of thousands of packages on GitHub depend on the NumPy math and matrix library, which had over 300 million downloads last month. All of these applications could benefit from accelerated computing with cuPyNumeric.

Many of these scientists build programs that use NumPy and run on a single CPU-only node, which limits how quickly their algorithms can crunch through the increasingly large datasets collected by instruments like electron microscopes, particle colliders and radio telescopes.

cuPyNumeric helps researchers keep pace with the growing size and complexity of their datasets by providing a drop-in replacement for NumPy that can scale to thousands of GPUs. cuPyNumeric doesn’t require code changes when scaling from a single GPU to a whole supercomputer. This makes it easy for researchers to run their analyses on accelerated computing systems of any size.

Solving the Big Data Problem, Accelerating Scientific Discovery

Researchers at SLAC National Accelerator Laboratory, a U.S. Department of Energy lab operated by Stanford University, have found that cuPyNumeric helps them speed up X-ray experiments conducted at the Linac Coherent Light Source.

A SLAC team focused on materials science discovery for semiconductors found that cuPyNumeric accelerated its data analysis application by 6x, decreasing run time from minutes to seconds. This speedup allows the team to run important analyses in parallel when conducting experiments at this highly specialized facility.

By using experiment hours more efficiently, the team anticipates it will be able to discover new material properties, share results and publish work more quickly.

Other institutions using cuPyNumeric include:

  • Australia National University, where researchers used cuPyNumeric to scale the Levenberg-Marquardt optimization algorithm to run on multi-GPU systems at the country’s National Computational Infrastructure. While the algorithm can be used for many applications, the researchers’ initial target is large-scale climate and weather models.
  • Los Alamos National Laboratory, where researchers are applying cuPyNumeric to accelerate data science, computational science and machine learning algorithms. cuPyNumeric will provide them with additional tools to effectively use the recently launched Venado supercomputer, which features over 2,500 NVIDIA GH200 Grace Hopper Superchips.
  • Stanford University’s Center for Turbulence Research, where researchers are developing Python-based computational fluid dynamics solvers that can run at scale on large accelerated computing clusters using cuPyNumeric. These solvers can seamlessly integrate large collections of fluid simulations with popular machine learning libraries like PyTorch, enabling complex applications including online training and reinforcement learning.
  • UMass Boston, where a research team is accelerating linear algebra calculations to analyze microscopy videos and determine the energy dissipated by active materials. The team used cuPyNumeric to decompose a matrix of 16 million rows and 4,000 columns.
  • National Payments Corporation of India, the organization behind a real-time digital payment system used by around 250 million Indians daily and expanding globally. NPCI uses complex matrix calculations to trace transaction paths between payers and payees. With current methods, it takes about 5 hours to process data for a one-week transaction window on CPU systems. A trial showed that applying cuPyNumeric to accelerate the calculations on multi-node NVIDIA DGX systems could speed up matrix multiplication by 50x, enabling NPCI to process larger transaction windows in less than an hour and detect suspected money laundering in near real time.
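For readers unfamiliar with the Levenberg-Marquardt algorithm mentioned in the Australia National University example, here is a minimal textbook-style sketch written against the NumPy interface. It is not the researchers’ implementation. The toy exponential-fit problem, the function names and the damping schedule are all illustrative assumptions. Swapping the import for `cupynumeric` is the kind of change the article describes, though whether every call used here is accelerated has not been verified.

```python
import numpy as np  # assumption: could be replaced by `import cupynumeric as np`


def levenberg_marquardt(residual, jacobian, params, damping=1e-2, iterations=50):
    """Minimize sum(residual(params)**2) with a basic damped Gauss-Newton loop."""
    params = np.asarray(params, dtype=float)
    cost = float(np.sum(residual(params) ** 2))
    for _ in range(iterations):
        r = residual(params)
        J = jacobian(params)
        # Damped normal equations: (J^T J + damping * I) step = -J^T r
        lhs = J.T @ J + damping * np.eye(params.size)
        step = np.linalg.solve(lhs, -(J.T @ r))
        trial = params + step
        trial_cost = float(np.sum(residual(trial) ** 2))
        if trial_cost < cost:
            # Step reduced the error: accept it and trust Gauss-Newton more.
            params, cost = trial, trial_cost
            damping *= 0.5
        else:
            # Step made things worse: reject it and damp more heavily.
            damping *= 2.0
    return params


# Hypothetical toy problem: recover a and b in y = a * exp(b * x).
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * x)


def residual(p):
    return p[0] * np.exp(p[1] * x) - y


def jacobian(p):
    e = np.exp(p[1] * x)
    # Columns are the partial derivatives with respect to a and b.
    return np.stack([e, p[0] * x * e], axis=1)


fitted = levenberg_marquardt(residual, jacobian, [1.0, 0.0])
print(fitted)
```

The dominant costs in each iteration are the matrix products and the linear solve, which is why array-level acceleration matters once the Jacobian grows to the sizes found in climate and weather models.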

To learn more about cuPyNumeric, see a live demo in the NVIDIA booth at the Supercomputing 2024 conference in Atlanta, join the theater talk in the expo hall and participate in the cuPyNumeric workshop.

Watch the NVIDIA special address at SC24.