
HPE will be one of the first system vendors to bring AMD’s new ‘Helios’ rack-scale AI architecture to market globally, starting in 2026, giving cloud providers, research institutions, and large enterprises an open, Ethernet-based alternative for building very large GPU clusters.
Helios is AMD’s attempt to define an industry-standard blueprint for AI factories at rack scale. Rather than focusing on a single chip or server, the architecture combines AMD EPYC CPUs, AMD Instinct GPUs, AMD Pensando smart networking, and the ROCm open software stack into a tightly integrated system. The platform is designed to deliver high performance per watt and easier scalability while avoiding some of the lock-in associated with proprietary interconnects and closed software ecosystems.
At the heart of the design is an emphasis on Ethernet as the fabric for scale-up and scale-out. Helios will support high-bandwidth communication using software-optimized Ethernet and a custom HPE Juniper Networking scale-up switch developed in collaboration with Broadcom. This switch implements the Ultra Accelerator Link over Ethernet (UALoE) standard to drive low-latency, high-throughput data movement between accelerators, aligning AMD’s AI infrastructure strategy with open, standards-based networking rather than vendor-specific fabrics.
AMD says a Helios rack configured with AMD Instinct MI455X GPUs, next-generation AMD EPYC ‘Venice’ CPUs, and AMD Pensando Vulcano NICs for scale-out networking can reach up to 2.9 exaFLOPS of FP4 performance per rack. Those components are tied together and exposed through the ROCm software ecosystem, which AMD positions as a common programming model for AI and HPC workloads across EPYC CPUs and Instinct GPUs.
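The headline per-rack figure can be sanity-checked with simple arithmetic. A minimal sketch, assuming a configuration of 72 Instinct GPUs per Helios rack (the GPU count is an assumption here, not stated in this article):

```python
# Back-of-the-envelope check on AMD's quoted per-rack FP4 peak.
# Assumption (not from this article): a Helios rack carries 72 Instinct GPUs.
GPUS_PER_RACK = 72                # assumed rack configuration
RACK_FP4_EXAFLOPS = 2.9           # AMD's quoted per-rack FP4 peak

# Convert exaFLOPS to petaFLOPS (1 EF = 1000 PF) and divide by GPU count.
per_gpu_petaflops = RACK_FP4_EXAFLOPS * 1000 / GPUS_PER_RACK

print(f"Implied FP4 peak per GPU: {per_gpu_petaflops:.1f} PFLOPS")
```

Under that assumed configuration, each GPU would need roughly 40 PFLOPS of FP4 throughput, which is the order of magnitude expected for a next-generation rack-scale accelerator.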
Physically, Helios is based on the OCP Open Rack Wide design, the Open Compute Project standard that aims to simplify integration, deployment, and serviceability at rack scale. For customers and partners, this means the architecture is intended to slot into existing open-rack environments and shorten deployment timelines, rather than requiring entirely bespoke mechanical or power designs. HPE is using that flexibility to integrate its own networking elements, including the custom Ethernet switch and associated software stack tuned for AI traffic patterns.
For AMD, the extended partnership with HPE is a continuation of a long-running collaboration around supercomputing and exascale systems. For HPE, Helios offers a way to give its cloud service provider and large-scale AI customers faster deployment options, more flexibility in system design, and potentially lower risk when scaling up AI infrastructure. By adopting an open, rack-scale architecture with standards-based networking, HPE can offer a portfolio that aligns with customers who want large GPU clusters without being tied to a single proprietary stack.
Beyond Helios, AMD and HPE are also working together on new infrastructure for Europe’s HPC and AI ecosystems. The High-Performance Computing Center Stuttgart (HLRS) in Germany has selected AMD EPYC ‘Venice’ processors and AMD Instinct MI430X GPUs for its next flagship supercomputer, named Herder. The system will be built on the HPE Cray Supercomputing GX5000 platform and is intended to support both traditional numerical simulation workloads and a growing set of AI and machine learning applications.
Herder reflects a broader architectural shift in HPC centers, where classic simulation codes and next-generation AI models are increasingly co-located and often combined in hybrid workflows. HLRS leadership emphasizes that the center must continue to deliver strong performance for established HPC applications while accommodating rising demand for AI. By using a common hardware platform for both, HLRS aims to enable new computational methods that combine simulation, data analytics, and AI inference or training.
The Herder system is expected to be delivered in the second half of 2027 and enter production by the end of that year, replacing HLRS’s current primary supercomputer, Hunter. With this deployment, AMD and HPE are reinforcing their presence in the European research and industrial innovation landscape, where energy efficiency, open software, and flexibility in system usage are increasingly part of procurement criteria.
Taken together, the Helios architecture and the Herder deployment illustrate how AMD and HPE are trying to carve out a distinct position in an AI infrastructure market dominated by a small number of GPU and system vendors. The focus on open standards, Ethernet-based scaling, and a shared architecture for AI and classical HPC is designed to appeal to organizations that want high-end performance without locking into a single proprietary ecosystem.
Executive Insights FAQ
What makes AMD’s Helios architecture different from traditional GPU clusters?
Helios is a rack-scale, open, Ethernet-based AI platform that combines EPYC CPUs, Instinct GPUs, Pensando networking, and the ROCm software stack in a single, unified design, rather than leaving customers to assemble and tune these elements themselves.
Why is HPE’s custom scale-up Ethernet switch important for Helios?
The HPE Juniper Networking switch, developed with Broadcom and implementing the UALoE standard, is engineered specifically to provide high-bandwidth, low-latency communication between accelerators over Ethernet, delivering the class of accelerator-to-accelerator performance that has traditionally required proprietary interconnects.
How does Helios support both AI and HPC workloads?
By using the ROCm open software ecosystem across EPYC CPUs and Instinct GPUs, Helios provides a common programming and runtime environment that can run deep learning, generative AI, and classical HPC codes on the same infrastructure.
What role does the OCP Open Rack Wide design play in Helios deployments?
The Open Rack Wide mechanical and power standard simplifies integration into existing data centers, helping partners and customers shorten deployment timelines and scale out AI racks in a more modular, serviceable way.
Why is the Herder supercomputer significant for AMD and HPE in Europe?
Herder showcases the use of next-generation EPYC and Instinct devices in a major European research center, demonstrating how AMD and HPE can support both large-scale simulation and emerging AI workloads in a single, energy-efficient supercomputing platform.