AI is upending everything at the moment, and the design of data centers is no exception. It raises big questions for companies like Schneider Electric. We sat down with Steve Carlini, Chief Advocate for AI in Data Centers at Schneider Electric, to hear some of the answers to those questions.
The data center industry is racing towards 1 megawatt per rack. During Schneider Electric’s Global Innovation Summit, we even heard from Nvidia that 2 and even 4 megawatts per rack are already on its roadmap. It’s fair to say that data centers will transform drastically over the course of just a couple of years. The implications of these giant leaps for power, cooling, and facility design are significant. “Things escalated over the last few years”, in the words of Carlini. Let’s dig into what that means.
End of Intel’s domination
The main reason for the big changes in data center design is easily found: the rapid build-out of AI infrastructure. We will return to that shortly. However, Carlini also mentions, in passing, another reason why we have been stuck with the same data center design for so long.
The traditional data center architecture was dominated by one- and two-socket x86 servers, primarily from Intel. Those x86 servers rarely exceeded 5-10 kilowatts per rack. Until the rise of AI workloads in recent years, the industry as a whole and Intel in particular were quite happy to keep it that way. This also meant that “the industry and Intel tried to keep everyone away from liquid cooling”, says Carlini. That simply was not necessary for those racks. Interestingly, this suggests that the technological capability to increase densities existed, but market dynamics prevented its adoption.
However, with AI workloads operating at 40-50 kilowatts per rack, and some deployments exceeding 100 kilowatts, that position became untenable. A fundamental shift was necessary. Modern AI servers run many GPUs in parallel, alongside CPUs, DPUs (Data Processing Units), and massive memory configurations. The thermal design power (TDP) of these systems far exceeds that of traditional processors, creating the need for new cooling designs.
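To get a feel for those numbers, here is a back-of-the-envelope sketch. All component counts and TDP figures below are assumptions for illustration, not vendor specifications.

```python
# Rough rack power estimate from component TDPs (illustrative figures only).

def rack_power_kw(servers_per_rack: int, gpus_per_server: int,
                  gpu_tdp_w: float, cpu_tdp_w: float, other_w: float) -> float:
    """Sum the component TDPs per server, then scale to the whole rack."""
    per_server_w = gpus_per_server * gpu_tdp_w + cpu_tdp_w + other_w
    return servers_per_rack * per_server_w / 1000

# Traditional two-socket x86 rack (assumed: 15 servers, no accelerators)
print(rack_power_kw(15, 0, 0, cpu_tdp_w=300, other_w=200))     # ~7.5 kW

# AI rack (assumed: 6 servers, 8 GPUs each at ~700 W, plus CPUs, memory, fans)
print(rack_power_kw(6, 8, 700, cpu_tdp_w=700, other_w=1000))   # ~44 kW
```

Swap in denser configurations and the total quickly climbs past 100 kilowatts, which is where the cooling and power questions below come from.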
Leapfrogging to 800V DC power distribution
Cooling is one of the components undergoing big changes as a result of the rise of AI. Power delivery is another. AI infrastructure is developing so quickly that this has fundamental implications for the roadmap of companies like Schneider Electric. Schneider’s collaboration with Nvidia demonstrates this. The company originally planned to introduce 600V DC systems, but realized this voltage couldn’t support the 400 kilowatt per rack requirements of the upcoming Nvidia Vera Rubin Ultra GPUs arriving at the end of 2026 or early 2027.
The solution to this problem is as simple as it is drastic. Schneider jumped directly to an 800V DC architecture. This requires redesigning power distribution from utility connections through to servers. One approach involves “sidecars”: removing power supplies from server cabinets and housing them separately, similar to how data centers are increasingly deploying infrastructure modules outside the main facility.
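The arithmetic behind the voltage jump is straightforward: for a fixed rack power, a higher distribution voltage means proportionally lower current, and thus thinner busbars and lower resistive losses. The sketch below uses the 400 kilowatt figure mentioned above; it is a simplified DC comparison, not a Schneider design document.

```python
# Current needed to deliver 400 kW at different distribution voltages.
# Simplified DC view (I = P / V); three-phase AC math differs slightly.

def current_amps(power_w: float, voltage_v: float) -> float:
    return power_w / voltage_v

RACK_POWER_W = 400_000  # the ~400 kW per rack cited for Vera Rubin Ultra

for volts in (415, 600, 800):
    print(f"{volts} V: {current_amps(RACK_POWER_W, volts):,.0f} A")
# 415 V: 964 A
# 600 V: 667 A
# 800 V: 500 A  -> roughly half the conductor cross-section of a 415 V feed
```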
Looking further ahead to 2028-2029, Carlini reveals that Nvidia’s Feynman architecture will push densities to 1 megawatt per rack, with Schneider already working on solutions for whatever comes next.
Liquid cooling becomes the standard
Air cooling reaches its practical limit around 50 kilowatts per rack. Rear door heat exchangers can stretch that to 72 kilowatts, but beyond that, liquid cooling becomes essential, we hear from Carlini. However, current liquid-cooled data centers typically use direct-to-chip cooling only for processors, leaving 20-30% of equipment air cooled. The industry is developing designs to immerse networking equipment, power supplies, and other components in liquid cooling systems. This would really simplify things, according to Carlini: “Combining the two [liquid cooling and air cooling, ed.] inside a single data center is very complex, so that [standardizing on liquid cooling, ed.] could be an interesting idea.”
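A quick thermodynamic sketch shows why air runs out of steam at those densities: the airflow required to carry the heat away grows linearly with rack power. The numbers below use standard air properties and an assumed 15 °C temperature rise; real designs vary.

```python
# Airflow needed to remove a given heat load with air: Q = P / (rho * cp * dT).

RHO_AIR = 1.2      # kg/m^3 at roughly room temperature
CP_AIR = 1005.0    # J/(kg*K)

def airflow_m3_per_s(power_w: float, delta_t_k: float) -> float:
    """Volumetric airflow needed to absorb power_w with a delta_t_k air temperature rise."""
    return power_w / (RHO_AIR * CP_AIR * delta_t_k)

for kw in (10, 50, 100):
    flow = airflow_m3_per_s(kw * 1000, delta_t_k=15)
    print(f"{kw} kW rack: ~{flow:.1f} m^3/s (~{flow * 2119:.0f} CFM)")
# 10 kW:  ~0.6 m^3/s
# 50 kW:  ~2.8 m^3/s  -> already demands aggressive containment and fan power
# 100 kW: ~5.5 m^3/s  -> impractical; water carries heat roughly 3,500x better per volume
```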
Contrary to public concerns about water consumption, Carlini emphasizes that closed-loop liquid cooling systems recirculate coolant rather than consuming water like traditional evaporative cooling towers. Some facilities even use seawater to cool the closed-loop systems, as demonstrated in a Schneider project in Portugal that we reported on earlier this year.
Liquid cooling continues to evolve
Liquid cooling itself will also continue to develop, we hear from Carlini. More specifically, he mentions microfluidic cooling. That basically means very small channels etched into the silicon of a chip. Cooling liquid runs through those channels and cools the chip. This could have a big impact on cooling capacities, because it brings cooling as close to the source of the heat as possible. It should also be highly efficient, not only because of that proximity, but also because cooling can be directed at the hottest parts of a chip. Microsoft claims it has successfully tested this new way of cooling.
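One way to see why proximity matters is the simple thermal model T_junction = T_coolant + P × R_thermal: etching channels into the silicon removes most of the package and cold-plate layers between die and coolant, lowering the thermal resistance. The values below are purely illustrative assumptions, not measurements of any product.

```python
# Hotspot temperature as a function of thermal resistance between die and coolant.

def junction_temp_c(coolant_c: float, power_w: float, r_thermal_k_per_w: float) -> float:
    return coolant_c + power_w * r_thermal_k_per_w

CHIP_POWER_W = 1000   # assumed accelerator power
COOLANT_C = 35        # assumed coolant supply temperature

for name, r_th in [("direct-to-chip cold plate", 0.045),
                   ("microfluidic channels in silicon", 0.020)]:
    print(f"{name}: ~{junction_temp_c(COOLANT_C, CHIP_POWER_W, r_th):.0f} °C")
# Lower thermal resistance means a cooler chip at the same power,
# or more power (a higher TDP) within the same temperature limit.
```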
Developments such as microfluidic cooling could have a profound impact on how racks and the accompanying infrastructure will be built in the future. It is also not just about the type of cooling, but about the way chips communicate with each other and internally. What will the impact of an all-photonics network be on cooling, for example?
The first stages of building that type of end-to-end connection have been completed. The interesting parts for this discussion are next on the roadmap for all-photonics networks: using photonic connections between and inside silicon on boards. By eliminating electronic connections altogether, this should result in lower energy use and therefore less heat. When we ask Carlini what the impact of this could be, he quotes 30-50% efficiency improvements in test sites using photonics. So the impact of this transition could be huge, including on what Schneider Electric’s roadmap will look like.
Liquid cooling systems as sources of data
When it comes to liquid cooling, it’s not just about adding the system of choice (e.g. direct-to-chip, immersion, microfluidic in future deployments) to the equation. These systems also generate data that can be used for monitoring/observability purposes. “We can integrate data from liquid cooling systems to the Nvidia platform. This means you can get alerts in your Nvidia environment”, Carlini says. It wasn’t an easy integration to build for Schneider, Carlini stresses. They worked on that for a year.
A year to build something that connects the cooling systems to the Nvidia platform sounds like a long time. However, it is important work. At the end of the day, the AI infrastructure game is all about optimization: organizations want to squeeze every last GPU cycle out of their systems. That becomes easier if you know exactly what is going on end-to-end in your infrastructure from a single dashboard or management layer.
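To illustrate the kind of signal such an integration surfaces, here is a minimal monitoring sketch. The data structure, thresholds, and function names are all hypothetical; this is not the actual Schneider-Nvidia integration, just the general idea of turning coolant telemetry into alerts alongside GPU metrics.

```python
# Minimal sketch: turn liquid cooling telemetry into alerts (hypothetical names).

from dataclasses import dataclass

@dataclass
class CoolantReading:
    loop_id: str
    supply_temp_c: float   # coolant temperature going to the racks
    return_temp_c: float   # coolant temperature coming back
    flow_lpm: float        # flow in litres per minute

def check_coolant(reading: CoolantReading,
                  max_supply_c: float = 40.0,
                  min_flow_lpm: float = 30.0) -> list[str]:
    """Return alert messages for out-of-range coolant conditions."""
    alerts = []
    if reading.supply_temp_c > max_supply_c:
        alerts.append(f"{reading.loop_id}: supply temperature {reading.supply_temp_c} °C too high")
    if reading.flow_lpm < min_flow_lpm:
        alerts.append(f"{reading.loop_id}: flow {reading.flow_lpm} l/min below minimum")
    return alerts

# Example reading from a hypothetical coolant distribution unit (CDU) loop
print(check_coolant(CoolantReading("cdu-loop-3", 42.5, 51.0, 28.0)))
```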
A different way of thinking about data center locations
Any discussion about data centers cannot ignore one of the bigger elephants in the room: how will all of them be connected to the grid? Carlini states during his presentation that, assuming the water discussion gets solved (see above), this is the next big challenge. There are two parts to it. The first is how to make sure data centers receive enough power to operate. The second is how the role of the data center can and should change as part of the grid.
To solve the first part of the challenge, data center operators have various options. The location of data centers is an obvious one. Carlini mentions that Alaska has developed an appetite for hosting data centers. The reason is twofold. On the one hand, it’s not very densely populated. On the other, it’s cold there, so there’s a lot of natural cooling available. That obviously also reduces power usage.
We saw another example of location as an important deciding factor earlier this year, when we visited the new Start Campus site in Sines, on the coast of Portugal. That data center is gradually being built out to become a 1.2 GW site. The reasons for choosing the location? The first is that Portugal has a lot of excess renewable energy that data centers can use. The second is that Start Campus wants to cool the liquid in the data center’s closed-loop system with seawater. The third is that it is built next to a defunct power plant, so it can benefit from a lot of infrastructure that is already there, including power delivery.
Data centers and the grid
The second part of the energy equation around data centers has to do with how they fit into the grid as a whole. Really big data centers, like the new one being built in Abilene, Texas, partly have their own power supply. This particular site uses gas-powered turbines in addition to the ‘normal’ grid. On-site Small Modular Reactors (SMRs), running on recycled uranium from traditional nuclear facilities, are an interesting route too. SMRs offer high inertia compared to wind and solar, which is good for grid stability.
When it comes to grid stability, data centers can play an interesting role too. More specifically, they can help stabilize and support the grid. Carlini mentions the Data Center Flexible Load Initiative (DCFlex) in this respect. The idea behind this is that data centers would not only be passive parts of the grid, but would also give back. That means that large data centers can become a grid resource and provide services to help balance it better. This initiative was launched in late 2024 and is supposed to be completed by the end of 2027. We have heard of smaller-scale initiatives than DCFlex around this too. These will most likely be operational sooner.
The types of initiatives above definitely sound like a good direction to move in. However, there are many moving parts to take into account. It will require a more dynamic approach to selling data center capacity, which today is usually based on the number of watts a customer wants. Irrespective of the actual load, the data center reserves that capacity for the customer. If data centers need to become more dynamic, so do the contracts. New models must allow operators to participate in grid services (including contracts with utility providers) while guaranteeing customer power needs. Although not impossible to achieve, it is going to take some work.
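To make the tension concrete, here is a rough sketch of the decision an operator faces when the grid asks for a demand reduction: it can only commit load that is deferrable or coverable by on-site generation, because capacity customers have reserved and are actually using is off limits. All names and numbers are hypothetical, not part of DCFlex or any real contract model.

```python
# Can the site honor a grid request without breaking customer guarantees?
# Hypothetical sketch of the flexible-load trade-off described above.

def can_meet_grid_request(requested_reduction_kw: float,
                          deferrable_load_kw: float,
                          onsite_generation_kw: float) -> bool:
    """True if the requested cut in grid draw can be covered by deferring
    flexible workloads and/or shifting load to on-site generation."""
    return deferrable_load_kw + onsite_generation_kw >= requested_reduction_kw

# Grid asks for a 5 MW reduction; 2 MW of batch jobs can wait, and 4 MW of
# turbines or batteries can be brought online -> the request can be accepted.
print(can_meet_grid_request(5_000, deferrable_load_kw=2_000, onsite_generation_kw=4_000))
```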
The data center of the future
The data center of the future will be characterized by high-density computing, liquid cooling, sustainable power sources, and a more integrated role in the grid ecosystem. As technology continues to advance, data centers will become more efficient, flexible, and environmentally responsible. That may sound like an oxymoron to many people nowadays, but it’s the only way to get to the densities we need moving forward.
The article above is based on a presentation and a conversation we had with Steve Carlini during the Schneider Electric Global Innovation Summit. The complete conversation can be viewed below:
The opening image for this story is Gemini’s interpretation of a data center in the future, with some additional characteristics we asked for in our prompt. Very likely not even close to what they will actually look like.