2024 is the year of Ethernet, thanks in part to AI

Ethernet looks set to win out over the competition and become the standard networking fabric. For that to happen, however, it must continue to evolve, and it is doing so: a new consortium has formed around Ethernet, and bandwidth is increasing significantly. Jonathan Davidson, EVP and GM of Networking at Cisco, discusses these developments and makes some predictions about Ethernet.

When it comes to networking fabrics, there aren’t that many flavors. Besides Ethernet, there are InfiniBand, Fibre Channel and Omni-Path. Ethernet and Fibre Channel are, respectively, the oldest (developed at Xerox in the 1970s and standardized in the early 1980s) and second oldest (late 1980s), while InfiniBand’s history began in 1999 and Omni-Path’s in 2012.

The development of InfiniBand and that of Omni-Path are also somewhat intertwined: QLogic launched an InfiniBand product in 2006 but sold the technology to Intel in 2012. Together with a part of Cray that Intel also acquired, it entered the market in 2015/2016 as Omni-Path. In this way, the roots of both technologies can be traced back to roughly the same period, 1999/2000. In 2019, Intel decided to stop developing Omni-Path itself and subsequently spun it out into a new venture, Cornelis Networks. Nvidia announced its acquisition of Mellanox, the last independent supplier of InfiniBand products, in that same year.

The main reason alternatives to Ethernet exist is that they offered better performance, especially in terms of available bandwidth. With the rise of HPC, supercomputers and later AI/ML, this was increasingly seen as necessary. Not surprisingly, InfiniBand was the most popular interconnect technology for supercomputers somewhere in the mid-2010s.

Network fabrics in 2024

With that rough outline of the history of network fabrics behind us, we arrive in 2024. To be honest, we rarely hear anything about Omni-Path, although it is certainly still around. InfiniBand has received quite a bit of attention in recent years, thanks to Nvidia’s dominance. Ethernet and Fibre Channel, it seems, have always been around. Ethernet, of course, was and is the standard network fabric for most home and business networks. Fibre Channel has also taken root in business environments, particularly in data centers (SAN).

Ethernet is catching up

Ethernet is not standing still, however. Many readers who work with networking, servers or storage will no doubt be familiar with the abbreviations FCoE and RoCE. These stand for Fibre Channel over Ethernet and RDMA over Converged Ethernet (RDMA stands for Remote Direct Memory Access). The second is particularly interesting because it also goes by the name IBoE, or InfiniBand over Ethernet, which puts Ethernet in direct competition with InfiniBand. That competition is not new, by the way. It started just after the aforementioned peak for InfiniBand in the mid-2010s, when 10G Ethernet slowly but surely began to replace it.

By now we are well past 10G, and 100G is already quite common. However, we also need to move beyond that, we infer from Davidson’s predictions. “It’s not a stretch to think the industry will soon reach a tipping point where network architectures require higher performance capacities to support new applications, data, and workflows including AI/ML,” he indicates. This higher performance, he says, will come in the form of 400G and 800G networks. 400G/800G spine-leaf networks, combined with 100G SerDes on the chips, can provide a huge leap in bandwidth. This ultimately benefits both the server and client sides.
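To put those numbers in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes nominal line rates only and ignores encoding, FEC and protocol overhead; the function names and the 1 TB example payload are hypothetical and chosen purely for illustration.

```python
def lanes_needed(port_gbps: int, serdes_gbps: int = 100) -> int:
    """Number of SerDes lanes needed to carry a port at its nominal rate."""
    # Ceiling division, so a port that is not an exact multiple still fits.
    return -(-port_gbps // serdes_gbps)


def transfer_seconds(size_gigabytes: float, port_gbps: int) -> float:
    """Idealized time to move a payload at full line rate (no overhead)."""
    # Gigabytes -> gigabits (x8), divided by gigabits per second.
    return size_gigabytes * 8 / port_gbps


if __name__ == "__main__":
    # Assumed example payload: 1 TB (1000 GB) of training data.
    for speed in (100, 400, 800):
        print(
            f"{speed}G port: {lanes_needed(speed)} x 100G lanes, "
            f"1 TB in {transfer_seconds(1000, speed):.0f} s"
        )
```

With 100G SerDes, a 400G port maps onto four lanes and an 800G port onto eight, and the idealized time to move the same payload drops proportionally. Real-world throughput will be lower than these nominal figures.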

Ethernet reduces complexity and costs

Now that Ethernet has caught up with InfiniBand in performance, (large) organizations can run modern and demanding AI/ML workloads over Ethernet. That is not the only benefit. “As most customers look to build out their networks to handle new types and greater volume of workloads, most want a single architecture,” predicts Davidson. A single architecture makes it less complex to keep the network up and running, and it also saves organizations money. In Davidson’s words, “One network fabric to run them all.”

Jonathan Davidson, EVP and GM Networking at Cisco, shown on stage during Cisco Live EMEA 2024 in Amsterdam

As it stands, it looks like Ethernet is the network fabric that’s going to win out. We’re not just saying that because someone from Cisco says so. This is a fairly broadly supported sentiment. The main reason for going for InfiniBand was always the performance it offers. That reason has now been more or less eliminated. That has allowed other trade-offs to be made. The standardization that Ethernet brings and the lower cost compared to InfiniBand will be the deciding factors. The collaboration announced by Cisco and Nvidia at Cisco Live in Amsterdam also underscores this. Together, the two companies will build an AI infrastructure using Ethernet as its foundation.

The evolution of network fabrics, by the way, is far from over, Davidson also predicts. Proprietary innovation, of which InfiniBand was and is an example, will continue. There are still I/O challenges that will push the industry “through cycles of proprietary innovation and standards development.” He also gives an example of one such I/O challenge. “Moving data faster and more efficiently between models and data stores will bring GPU’s closer to primary storage,” he states. He adds that this will “blur the lines between internal and external fabrics in server design.”

Collaboration is necessary

Even a company the size of Cisco can’t do all of this by itself. It must work with other vendors in the industry to ensure that customers can deploy AI-ready networks. It must be possible to configure these networks in multiple ways and with multiple fabrics. This is the only way to “unlock the promise of AI,” according to Davidson.

Clearly, collaboration is needed to optimize Ethernet for what is coming. In that regard, it is worth noting that the Joint Development Foundation (part of the Linux Foundation) formed the Ultra Ethernet Consortium in the summer of 2023. Cisco was one of the founding members, along with AMD, Arista, Broadcom, Eviden (an Atos Business), HPE, Intel, Meta and Microsoft. The consortium’s members aim to improve Ethernet jointly and optimize network performance across the entire network stack, from the physical layer to the application layer, especially for HPC and AI workloads.

In November 2023, another 27 members joined the consortium. According to Davidson, this rapid growth indicates that there is “strong interest in using Ethernet as the basis for networking solutions targeting AI/ML workloads.” It also indicates the need for broad industry collaboration to solve the major challenges that companies share.

Clearly, Ethernet is in good shape, and its development certainly won’t stop. The alternative network fabrics will have to be very strong to meaningfully slow the general standardization on Ethernet, although there will always be developments around proprietary fabrics. That’s just how innovation works. Chances are, however, that even these will eventually standardize on Ethernet, or at least coexist with it.