Executive Summary

  • Announcement of a 500+ Petaflop sustained performance milestone using the Hardware Accelerated Cosmology Code (HACC) on the Frontier system at SC25.
  • Analysis of the software orchestration across 9,400+ nodes using AMD Instinct GPUs and the Slingshot-11 interconnect.
  • Discussion of the implications for large-scale cosmic simulations and dark matter research.

Strategic Deep-Dive

The SC25 conference marked a historic moment for the supercomputing community as the Frontier system, housed at Oak Ridge National Laboratory, achieved a sustained performance of over 500 Petaflops using the Hardware Accelerated Cosmology Code (HACC). While Frontier’s peak theoretical performance is rated in the exascale range, reaching a sustained 500+ Petaflops on a real-world scientific application is a profound achievement that validates the entire exascale investment. HACC is a highly specialized code used to simulate the distribution of dark matter and the formation of large-scale structures in the universe.

This performance level allows cosmologists to simulate billions of particles with gravitational interactions at a resolution previously thought impossible within reasonable timeframes.

Achieving this milestone required extreme software orchestration across more than 9,400 nodes. Each node in Frontier consists of an AMD EPYC CPU paired with four AMD Instinct GPUs, linked by the Slingshot-11 interconnect. The complexity of managing data movement across nearly 40,000 GPUs without hitting a communication bottleneck is a testament to the maturation of the ROCm 6.x software stack.

Specifically, the HACC team utilized advanced congestion control algorithms within the Slingshot fabric to ensure that the N-body gravitational solvers remained compute-bound rather than stalled by network latency. This level of “hardware-software co-design” is the hallmark of modern exascale computing, where the code is written to exploit the specific instruction sets and memory hierarchies of the underlying silicon.

Beyond the raw numbers, this milestone has immediate implications for global scientific research. The ability to run HACC at 500+ Petaflops means that simulations which once took months can now be completed in a matter of days. This acceleration enables a faster iterative loop for testing theories on dark energy and the early expansion of the universe.

Furthermore, it proves that the AMD-based architecture of Frontier is not just a benchmark champion but a practical, reliable workhorse for the world’s most complex computational problems. As the industry looks toward future “post-exascale” systems, the success of HACC on Frontier provides a blueprint for how to scale specialized scientific kernels to tens of thousands of nodes while maintaining high efficiency and numerical accuracy.