Broadcom has unveiled the Tomahawk 6 chip, which enables Ethernet switches in data centers to process up to 102.4 terabits per second. This bandwidth makes the chips ideal for large AI clusters, according to the chip company.
AI training is spread across many GPUs (usually from Nvidia) that must constantly exchange data with each other. This communication demands substantial bandwidth and is often the bottleneck for data-intensive AI workloads. Inference, the day-to-day serving of AI models, also places high demands on the network, because the GPUs regularly retrieve data from external storage.
Tomahawk 6 addresses these network bottlenecks with what Broadcom calls Cognitive Routing 2.0. This feature detects congestion and redirects traffic onto other links, while also collecting telemetry on technical issues for monitoring.
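Broadcom has not published the internals of Cognitive Routing 2.0, but the core idea it describes, steering flows away from congested links while logging congestion events, can be sketched roughly. The following is a minimal, hypothetical illustration; the class names, the 0.8 congestion threshold, and the least-loaded-path policy are all assumptions for the sake of the example, not Broadcom's implementation.

```python
# Hypothetical sketch of congestion-aware routing: pick the least-loaded
# of several equal-cost paths and record congested links for monitoring.
from dataclasses import dataclass, field

@dataclass
class Path:
    name: str
    utilization: float  # fraction of link capacity in use, 0.0 to 1.0

@dataclass
class Switch:
    paths: list
    congestion_log: list = field(default_factory=list)
    threshold: float = 0.8  # assumed congestion threshold

    def route(self, flow_id: str) -> Path:
        # Redirect traffic: choose the least-utilized path.
        best = min(self.paths, key=lambda p: p.utilization)
        # Telemetry: log every link currently above the threshold.
        for p in self.paths:
            if p.utilization >= self.threshold:
                self.congestion_log.append((flow_id, p.name, p.utilization))
        return best

sw = Switch(paths=[Path("link-a", 0.95), Path("link-b", 0.30), Path("link-c", 0.60)])
chosen = sw.route("flow-1")
print(chosen.name)        # link-b, the least-loaded path
print(sw.congestion_log)  # [('flow-1', 'link-a', 0.95)]
```

A real switch ASIC does this per packet or per flowlet in hardware at line rate; the sketch only shows the decision logic.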
Co-packaged optics save costs
Data centers are increasingly connecting AI servers with fiber optic cables for higher speeds than traditional copper cables. For optical networks, Broadcom offers a Tomahawk 6 version with co-packaged optics (CPO).
CPO integrates transceivers directly into the switch processor. This eliminates the need for separate transceiver devices, saving hardware costs and reducing power consumption. Data is converted directly in the switch into light for transmission over fiber optics.
For copper cables, the chip supports long-reach passive copper cables. Standard copper cables have limited range, which means AI servers have to be located close together. The new cables relax these design constraints.
Support for up to 100,000 processors
A single Tomahawk 6 can connect clusters of up to 512 processors. In scale-out configurations, two-layer switch networks enable connections among more than 100,000 processors.
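The two figures follow from simple port arithmetic. Assuming the 102.4 Tb/s is exposed as 512 ports of 200 Gb/s each (the per-port speed is an assumption here), a single switch reaches 512 processors directly, and a two-layer leaf-spine network that dedicates half of each leaf's ports to uplinks clears the 100,000 mark:

```python
# Back-of-the-envelope check of the 512 and >100,000 processor figures.
TOTAL_TBPS = 102.4   # Tomahawk 6 switching bandwidth
PORT_GBPS = 200      # assumed per-port speed

radix = int(TOTAL_TBPS * 1000 / PORT_GBPS)  # ports per switch
print(radix)  # 512 -> matches the single-switch cluster size

# Two-layer (leaf-spine) network: each leaf splits its ports evenly
# between processors below and spine switches above.
down_per_leaf = radix // 2   # 256 processor-facing ports per leaf
leaves = radix               # a full two-tier Clos supports R leaves
max_processors = down_per_leaf * leaves
print(max_processors)  # 131072 -> more than 100,000 processors
```

The exact deployable count depends on port speed and oversubscription choices, but the order of magnitude matches Broadcom's claim.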
The chip is available starting today. According to Broadcom, several customers plan to integrate Tomahawk 6 into AI clusters with more than 100,000 processors.