To unlock next-generation discoveries, scientists rely on simulation to better understand complex molecules for drug discovery, physics for new sources of energy, and atmospheric data to better predict extreme weather patterns. Leading simulation applications use NVIDIA Magnum IO to shorten time to insight. Magnum IO exposes hardware-level acceleration engines and smart offloads, such as RDMA, NVIDIA GPUDirect, and NVIDIA SHARP, while bolstering the high bandwidth and ultra-low latency of GPUs networked with NVIDIA InfiniBand and NVIDIA NVLink.
In multi-tenant environments, user applications may be unaware of indiscriminate interference from neighboring application traffic. Magnum IO, on the latest NVIDIA Quantum-2 InfiniBand platform, features new and improved capabilities for mitigating the negative impact of that interference on a user’s performance. This delivers optimal results, as well as the most efficient HPC and ML deployments at any scale.
Magnum IO Libraries and HPC Apps
VASP performance improves significantly when MPI is replaced with NCCL. UCX accelerates scientific computing applications, such as VASP, Chroma, MIA-AI, FUN3D, CP2K, and SPEChpc 2021, for faster wall-clock run times.
NVIDIA HPC-X, which is distributed by various HPC ISVs, increases CPU availability, application scalability, and system efficiency for improved application performance. NCCL, UCX, and HPC-X are all part of the NVIDIA HPC SDK.
Fast Fourier Transforms (FFTs) are widely used in a variety of fields, ranging from molecular dynamics, signal processing, and computational fluid dynamics (CFD) to wireless multimedia and ML applications. By using NVIDIA Shared Memory Library (NVSHMEM)™, cuFFTMp is independent of the MPI implementation and operates closer to the speed of light, that is, the theoretical peak of the hardware. This independence is critical because performance can vary significantly from one MPI implementation to another.
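To see why the communication layer matters so much for a multi-GPU FFT, the following is a minimal single-process sketch of a slab-decomposed 2D transform. It is not the cuFFTMp API. The "ranks" are plain Python lists, a naive DFT stands in for the local FFT, and the function names (dft, all_to_all_transpose, distributed_dft2, reference_dft2) are hypothetical. The point it illustrates is that every distributed 2D transform contains a global all-to-all exchange, which is the step a library such as NVSHMEM or MPI would carry over the network.

```python
import cmath


def dft(x):
    """Naive O(n^2) 1D discrete Fourier transform (stand-in for a local FFT)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def all_to_all_transpose(slabs):
    """Globally transpose a matrix that is split row-wise across ranks.

    slabs[r] holds the rows owned by rank r. In a real multi-GPU run this
    exchange is network traffic; here it is simulated with list copies.
    """
    rows = [row for slab in slabs for row in slab]  # gather (simulation only)
    cols = [list(col) for col in zip(*rows)]        # transpose
    per_rank = len(cols) // len(slabs)
    return [cols[r * per_rank:(r + 1) * per_rank] for r in range(len(slabs))]


def distributed_dft2(matrix, num_ranks):
    """2D DFT using a slab decomposition over num_ranks simulated ranks."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    assert n_rows % num_ranks == 0 and n_cols % num_ranks == 0
    per_rank = n_rows // num_ranks
    slabs = [matrix[r * per_rank:(r + 1) * per_rank] for r in range(num_ranks)]

    # Step 1: each rank transforms the rows it owns (purely local work).
    slabs = [[dft(row) for row in slab] for slab in slabs]
    # Step 2: all-to-all exchange so each rank owns whole columns.
    slabs = all_to_all_transpose(slabs)
    # Step 3: each rank transforms its columns (stored as rows, local work).
    slabs = [[dft(row) for row in slab] for slab in slabs]
    # Step 4: exchange again to restore the original row-wise layout.
    slabs = all_to_all_transpose(slabs)
    return [row for slab in slabs for row in slab]


def reference_dft2(matrix):
    """Single-rank 2D DFT used to check the distributed version."""
    rows = [dft(row) for row in matrix]
    cols = [dft(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

Steps 1 and 3 scale with the number of ranks, while steps 2 and 4 depend entirely on how quickly data moves between them, so the speed of the exchange largely determines how well the transform scales.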
QUDA (QCD on CUDA), a library for lattice quantum chromodynamics, can use NVSHMEM for communication to reduce overheads from CPU and GPU synchronization and to improve the overlap of compute and communication. This reduces latencies and improves strong scaling.
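The link between latency, overlap, and strong scaling can be made concrete with a toy cost model. The sketch below is an illustrative assumption, not a measurement of QUDA or NVSHMEM: it assumes compute time divides evenly across GPUs, that the per-step communication cost does not shrink as GPUs are added, and that some fraction of that cost can be hidden behind compute. The function name and parameters are hypothetical.

```python
def strong_scaling_efficiency(compute_s, comm_s, num_gpus, overlap=0.0):
    """Parallel efficiency for a fixed-size problem under a toy model.

    compute_s: single-GPU compute time per step, in seconds
    comm_s:    per-step communication cost, assumed independent of num_gpus
    overlap:   fraction (0..1) of communication hidden behind compute
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0 and 1")
    per_gpu_compute = compute_s / num_gpus
    if num_gpus == 1:
        exposed_comm = 0.0  # a single GPU has nothing to exchange
    else:
        # Communication can only hide behind compute that actually exists.
        hidden = min(overlap * comm_s, per_gpu_compute)
        exposed_comm = comm_s - hidden
    step_time = per_gpu_compute + exposed_comm
    return compute_s / (num_gpus * step_time)


if __name__ == "__main__":
    # Illustrative numbers only: 1 s of compute per step on a single GPU.
    for comm_s, overlap in [(0.010, 0.0), (0.005, 0.0), (0.005, 0.5)]:
        eff = strong_scaling_efficiency(1.0, comm_s, 64, overlap)
        print(f"comm={comm_s * 1e3:.0f} ms overlap={overlap:.1f} -> efficiency {eff:.2f}")
```

In this model the per-GPU compute shrinks as GPUs are added but the exposed communication does not, so at high GPU counts any reduction in latency or gain in overlap shows up directly as better efficiency.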
Multi-Node Multi-GPU: Using NVIDIA cuFFTMp FFTs at Scale
Largest Interactive Volume Visualization - 150TB NASA Mars Lander Simulation