Broadcom has announced Tomahawk Ultra, a new Ethernet switch chip for high-performance computing (HPC) and AI environments. The chip delivers 250 ns latency at 51.2 Tbps throughput, making Ethernet more competitive with traditional HPC interconnects.
The Tomahawk Ultra chip is designed to eliminate packet loss through Link Layer Retry (LLR) and Credit-Based Flow Control (CBFC). LLR uses Forward Error Correction (FEC) to detect link errors and automatically retransmits the affected packets, while CBFC prevents buffer overflows. Together, these mechanisms create a lossless Ethernet fabric.
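To make the idea behind credit-based flow control concrete, here is a minimal sketch in Python. It is a toy model, not Broadcom's implementation: the class `CreditedLink`, its methods, and the buffer size are hypothetical names and values chosen for illustration. The sender holds one credit per free receiver buffer slot and may only transmit while it has credits, so the receiver's buffer can never overflow.

```python
from collections import deque


class CreditedLink:
    """Toy model of credit-based flow control on one link (illustrative only)."""

    def __init__(self, receiver_buffer_slots: int):
        # The sender starts with one credit per free receiver buffer slot.
        self.capacity = receiver_buffer_slots
        self.credits = receiver_buffer_slots
        self.receiver_buffer = deque()

    def try_send(self, packet) -> bool:
        """Send only if a credit is available; otherwise the sender must wait."""
        if self.credits == 0:
            return False  # back-pressure: the packet is held back, not dropped
        self.credits -= 1
        self.receiver_buffer.append(packet)
        return True

    def receiver_drain(self):
        """Receiver forwards one packet and returns a credit to the sender."""
        if not self.receiver_buffer:
            return None
        packet = self.receiver_buffer.popleft()
        self.credits += 1
        return packet


link = CreditedLink(receiver_buffer_slots=2)
results = [link.try_send(p) for p in ("p0", "p1", "p2")]
print(results)  # [True, True, False]

# Once the receiver frees a slot, the held-back packet can be sent.
link.receiver_drain()
retry = link.try_send("p2")
print(retry)  # True
```

In this model a sender that runs out of credits waits instead of overrunning the receiver, which is the property that distinguishes a lossless fabric from one that drops packets under congestion.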
The chip also shrinks Ethernet header overhead from 46 to 10 bytes, a reduction of about 78 percent. The optimized headers remain fully Ethernet-compliant while substantially increasing network efficiency, and they can be customized to the needs of specific applications.
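The header figures can be checked with a few lines of arithmetic. The sketch below uses the 46-byte and 10-byte values from the article. The 64-byte payload is an assumed example value, and treating the overhead as bytes added to each payload is a simplification, since the article does not break down what the overhead consists of.

```python
# Overhead figures from the article; the 64-byte payload is an assumed example.
STANDARD_OVERHEAD_BYTES = 46
OPTIMIZED_OVERHEAD_BYTES = 10
EXAMPLE_PAYLOAD_BYTES = 64


def overhead_reduction(before: int, after: int) -> float:
    """Fraction of the header overhead that is removed."""
    return (before - after) / before


def payload_efficiency(payload: int, overhead: int) -> float:
    """Share of transmitted bytes that carry payload."""
    return payload / (payload + overhead)


reduction = overhead_reduction(STANDARD_OVERHEAD_BYTES, OPTIMIZED_OVERHEAD_BYTES)
standard_eff = payload_efficiency(EXAMPLE_PAYLOAD_BYTES, STANDARD_OVERHEAD_BYTES)
optimized_eff = payload_efficiency(EXAMPLE_PAYLOAD_BYTES, OPTIMIZED_OVERHEAD_BYTES)

print(f"overhead reduction: {reduction:.1%}")                 # 78.3%
print(f"efficiency, 46-byte overhead: {standard_eff:.1%}")    # 58.2%
print(f"efficiency, 10-byte overhead: {optimized_eff:.1%}")   # 86.5%
```

The gain is largest for small payloads, where header bytes make up a large share of everything sent on the wire.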
According to Broadcom, Tomahawk Ultra is the result of a multi-year re-engineering of Ethernet switching by hundreds of engineers. The company argues that 250 ns latency at the full 51.2 Tbps throughput dispels myths about Ethernet's limitations in high-performance environments. The chip processes up to 77 billion packets per second, even at the minimum packet size of 64 bytes.
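As a rough plausibility check of the packet rate, the sketch below divides the 51.2 Tbps throughput by the bits each minimum-size frame occupies on the wire. It assumes standard Ethernet framing, where every frame is accompanied by 20 extra bytes (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-packet gap). The article does not say how the 77 billion figure was derived, so this is only an estimate under that assumption.

```python
THROUGHPUT_BITS_PER_SECOND = 51.2e12  # 51.2 Tbps, from the article
MIN_FRAME_BYTES = 64                  # minimum packet size, from the article
PREAMBLE_AND_GAP_BYTES = 20           # assumed: 7 preamble + 1 SFD + 12 inter-packet gap


def max_packets_per_second(bits_per_second: float,
                           frame_bytes: int,
                           per_frame_extra_bytes: int) -> float:
    """Upper bound on the packet rate for a given wire cost per packet."""
    bits_per_packet = (frame_bytes + per_frame_extra_bytes) * 8
    return bits_per_second / bits_per_packet


pps = max_packets_per_second(
    THROUGHPUT_BITS_PER_SECOND, MIN_FRAME_BYTES, PREAMBLE_AND_GAP_BYTES
)
print(f"{pps / 1e9:.1f} billion packets per second")  # 76.2 billion packets per second
```

Under these assumptions the estimate lands at roughly 76 billion packets per second, close to the quoted figure of up to 77 billion.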
A key feature is the ability to perform collective operations, such as AllReduce, Broadcast, and AllGather, directly within the switch chip. Traditionally, these operations place a heavy load on XPUs; Tomahawk Ultra offloads them from those expensive compute resources. The capability is endpoint-agnostic, so it works across different architectures and vendor ecosystems.
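The data flow of an in-switch AllReduce can be sketched as follows. This is a conceptual illustration only: the function `switch_allreduce` is a hypothetical name, the reduction is assumed to be an element-wise sum, and the article does not describe how the chip implements the operation internally. Each endpoint sends its vector to the switch once, the switch reduces the vectors, and every endpoint receives the same result.

```python
from typing import List


def switch_allreduce(contributions: List[List[float]]) -> List[List[float]]:
    """Toy in-switch AllReduce (sum): reduce once, then return a copy per endpoint."""
    if not contributions:
        return []
    length = len(contributions[0])
    if any(len(vector) != length for vector in contributions):
        raise ValueError("all endpoints must contribute vectors of equal length")
    # Element-wise sum across all endpoint contributions.
    reduced = [sum(column) for column in zip(*contributions)]
    # Every endpoint receives its own copy of the reduced vector.
    return [list(reduced) for _ in contributions]


endpoint_vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
outputs = switch_allreduce(endpoint_vectors)
print(outputs)  # [[9.0, 12.0], [9.0, 12.0], [9.0, 12.0]]
```

In this model each endpoint sends one vector and receives one vector, and the additions happen in the switch, which is the sense in which the work is taken off the XPUs.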
The chip supports advanced HPC topologies such as Dragonfly, Mesh, and Torus through topology-aware routing. Compliance with the Ultra Ethernet Consortium (UEC) standard preserves the openness of the broad Ethernet ecosystem.
SUE-Lite and unified architecture
In addition to the full Scale-Up Ethernet (SUE) specification, Broadcom is introducing SUE-Lite, an optimized version for power- and area-sensitive accelerator applications. This lightweight variant retains the lossless and low-latency characteristics while significantly reducing silicon footprint and power consumption.
Tomahawk Ultra, together with the 102.4 Tbps Tomahawk 6, forms the basis for a unified Ethernet architecture. This combination supports both scale-up Ethernet for AI and scale-out Ethernet for HPC and distributed workloads.
The chip is 100% pin-compatible with Tomahawk 5, which shortens time-to-market. Shipments have already started for deployment in rack-scale AI training clusters and supercomputing environments.