TACC's Stampede3 Supercomputer Uses Intel's Xeon Max with HBM2E and Ponte Vecchio
by Anton Shilov on July 25, 2023 12:00 PM EST
The Texas Advanced Computing Center (TACC) unveiled its latest Stampede supercomputer for open science research projects, Stampede3. TACC anticipates that Stampede3 will come online this fall and will deliver its full performance in early 2024. The supercomputer will be a crucial component of the U.S. National Science Foundation’s (NSF) ACCESS scientific supercomputing ecosystem, and it is projected to serve the open science community from 2024 until 2029.
The third-generation Stampede cluster, which will be built by Dell, will incorporate 560 nodes equipped with Intel's Sapphire Rapids generation Xeon CPU Max processors, each offering 56 CPU cores and 64GB of on-package HBM2E memory. Surprisingly, TACC is going to be operating these nodes in HBM-only mode, so no additional DRAM will be attached to the CPU nodes; all of their memory will come from the on-package HBM stacks.
With these specifications, Stampede3 is expected to have a peak performance of approximately 4 FP64 PetaFLOPS, while offering nearly 63,000 general-purpose cores. In addition, TACC plans to install 10 Dell PowerEdge XE9640 servers with a total of 40 Intel Data Center GPU Max compute GPUs for artificial intelligence and machine learning workloads.
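As a rough sanity check on these figures, the "nearly 63,000 cores" number works out if each of the 560 nodes carries two 56-core CPUs. The dual-socket layout is an assumption here, not something stated in the announcement, and the names in this short Python sketch are purely illustrative.

```python
# Figures quoted in the article
NODES = 560             # Xeon CPU Max nodes
CORES_PER_CPU = 56      # cores per Xeon CPU Max processor
HBM_GB_PER_CPU = 64     # on-package HBM2E per processor, in GB
GPU_SERVERS = 10        # Dell PowerEdge XE9640 servers
GPUS_TOTAL = 40         # Intel Data Center GPU Max GPUs across those servers

# ASSUMPTION (not stated in the article): two CPU sockets per node
SOCKETS_PER_NODE = 2


def cpu_partition_totals(nodes=NODES, sockets=SOCKETS_PER_NODE):
    """Return CPU, core, and HBM totals for the Xeon Max partition."""
    cpus = nodes * sockets
    return {
        "cpus": cpus,
        "cores": cpus * CORES_PER_CPU,
        "hbm_gb": cpus * HBM_GB_PER_CPU,
    }


totals = cpu_partition_totals()
gpus_per_server = GPUS_TOTAL // GPU_SERVERS

print(f"CPUs:            {totals['cpus']}")
print(f"Cores:           {totals['cores']}")
print(f"HBM2E (GB):      {totals['hbm_gb']}")
print(f"GPUs per server: {gpus_per_server}")
```

Under the two-socket assumption the core total comes to 62,720, in line with the quoted "nearly 63,000"; a single-socket layout would give only half of that.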
Given this layout, the bulk of Stampede3's compute performance will be supplied by CPUs. This makes Stampede3 a bit of a rarity in this day and age, as most high-performance systems are GPU driven, leaving Stampede3 as one of the last supercomputers that relies almost solely on general-purpose CPUs.
And while the current cluster is primarily focused on CPU performance, TACC is also going to use the Intel GPUs in the latest Stampede revamp to investigate how to incorporate larger numbers of GPUs into future versions of the system. For now, most of TACC's AI tasks are run on its Lone Star systems, which are powered by hundreds of Nvidia A100 compute GPUs. So the organization's aim is to explore whether a portion of this workload can be transferred to Intel's Ponte Vecchio.
"We are going to put in a small system with exploratory capability using Intel Ponte Vecchio," said Dan Stanzione, executive director of TACC. "We are still negotiating exactly how much of that will have, but I would say a minimum of 40 nodes and maximum of a hundred or so. […] We are just putting a couple of racks of Ponte Vecchio out there to see how people work with it."
Stampede3 will leverage 400 Gb/s Omni-Path Fabric technology that will enable a backplane bandwidth of 24TB/s. This setup will allow the machine to efficiently scale and minimize latencies, making it well-suited for various applications requiring simulations.
TACC also plans to reincorporate nodes from the previous version, Stampede2, which were based on older-generation Xeon Scalable CPUs. This integration will enhance the capacity of Stampede3 for high-memory applications, high-throughput computing, interactive workloads, and other previous-generation applications. In total, the new supercomputer system will feature 1,858 compute nodes with over 140,000 cores, more than 330 TB of RAM, 13 PB of new storage capacity, and a peak performance close to 10 PetaFLOPS.