GPU shortage drives Fujitsu to make best use of existing hardware

Fujitsu has announced a new technology that makes optimal use of CPUs and GPUs. Processes that have high execution efficiency are given priority. The Japanese company hopes to bail out organizations plagued by the global GPU shortage caused by the ubiquitous AI hype.

Earlier this year, Nvidia stated that data centers need a shake-up, arguing that more GPUs would be required to support the intensive workloads that depend on them. Fujitsu, however, has come up with a different solution. The pair of new techniques has not been given a clear name, but the Japanese company hopes to eventually deliver the technology in a software package.

Utilizing CPU and GPU

First, Fujitsu details a technology that makes optimal use of CPU and GPU resources. It distinguishes programs that could also be processed by a CPU from those that depend on GPUs. It does this by predicting how long each program would take with hardware acceleration, then reallocating GPUs in real time to the high-priority programs.

In the example below, a user wants to process three programs with a single CPU and two GPUs. The two GPUs are optimally utilized while the CPU comes to the rescue where needed, minimizing the total time to process the three programs. The alternative, presumably, would be for one program to wait until one of the GPUs, both occupied by the other two programs, is freed up.

An example of “allocation switching” between CPU and GPU. Source: Fujitsu
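The trade-off in this example can be illustrated with a minimal Python sketch. Fujitsu has not published implementation details, so this is only a toy model: the predicted runtimes, the device names and the functions `makespan` and `best_assignment` are all hypothetical. It also assigns each program to one device for its whole run, whereas Fujitsu describes switching allocations in real time.

```python
from itertools import product

# Hypothetical predicted runtimes in seconds for each program on each device type.
# These numbers are invented for illustration; they do not come from Fujitsu.
PREDICTED = {
    "A": {"cpu": 40.0, "gpu": 10.0},
    "B": {"cpu": 60.0, "gpu": 12.0},
    "C": {"cpu": 15.0, "gpu": 9.0},
}


def kind(device):
    """Map a device name such as 'gpu1' to its type, 'gpu'."""
    return device.rstrip("0123456789")


def makespan(assignment, predicted):
    """Time until the busiest device finishes, running its programs back to back."""
    load = {}
    for program, device in assignment.items():
        load[device] = load.get(device, 0.0) + predicted[program][kind(device)]
    return max(load.values())


def best_assignment(predicted, devices):
    """Try every program-to-device mapping and keep the one that finishes first."""
    programs = sorted(predicted)
    best = None
    for combo in product(devices, repeat=len(programs)):
        candidate = dict(zip(programs, combo))
        cost = makespan(candidate, predicted)
        if best is None or cost < best[1]:
            best = (candidate, cost)
    return best


# One program must wait for a GPU when only the two GPUs are used.
gpu_only = best_assignment(PREDICTED, ["gpu0", "gpu1"])
# Letting the CPU take the program that benefits least from a GPU avoids the wait.
mixed = best_assignment(PREDICTED, ["cpu0", "gpu0", "gpu1"])

print("GPUs only:", gpu_only)
print("CPU + GPUs:", mixed)
```

With these made-up numbers, the GPU-only plan needs 19 seconds, while handing program C to the CPU brings the total down to 15 seconds. The brute-force search is only practical for a handful of programs; a real scheduler would need a cheaper heuristic.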

Less waiting

Another solution from Fujitsu makes real-time switching between multiple programs faster. With this technology, an HPC system with multiple computers would not have to wait for one program to finish before starting another. Fujitsu aims to make such HPC systems usable for running programs with strict latency requirements.

Fujitsu explains that the conventional communication method in an HPC system has many inherent latencies. This is due to timing gaps in switching between servers. Known as unicast, this method informs each server of a new switch sequentially. This is very reliable, but Fujitsu hopes to save seconds in select situations with an alternative. ‘Broadcast’ communication lets each server know simultaneously that a switch is taking place, otherwise known as ‘real-time batch switching.’ Packet drops are said to be rare, but reliability is nevertheless slightly lower than with unicast. It is up to the user whether the time savings are worth it.

Specific applications for the new broadcast method include digital twins, generative AI and drug discovery, according to Fujitsu.

The two communication methods. Source: Fujitsu
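To get a feel for why sequential notification creates timing gaps, consider the following rough Python model. It is an assumption-laden sketch rather than Fujitsu's method: the server count, the per-message delay and the function names are hypothetical, and packet loss on the broadcast path is not modeled.

```python
def unicast_arrivals_us(num_servers, per_message_us):
    """Unicast: servers are notified one after another, so each waits its turn."""
    return [(i + 1) * per_message_us for i in range(num_servers)]


def broadcast_arrivals_us(num_servers, per_message_us):
    """Broadcast: a single message reaches every server at the same time."""
    return [per_message_us] * num_servers


def skew_us(arrivals):
    """Gap between the first and the last server learning about the switch."""
    return max(arrivals) - min(arrivals)


# Hypothetical figures: 256 servers, 100 microseconds per notification.
NUM_SERVERS = 256
PER_MESSAGE_US = 100

unicast = unicast_arrivals_us(NUM_SERVERS, PER_MESSAGE_US)
broadcast = broadcast_arrivals_us(NUM_SERVERS, PER_MESSAGE_US)

print("unicast skew (us):", skew_us(unicast))
print("broadcast skew (us):", skew_us(broadcast))
```

In this model the unicast gap grows linearly with the number of servers, while the broadcast gap stays at zero. In practice a broadcast message can be dropped, which is the reliability trade-off the article describes.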

Future application

Fujitsu itself has a platform to test advanced AI technologies, codenamed Kozuchi. On it, the company plans to apply the CPU/GPU optimization technique. Fujitsu hopes to use the new communication method for HPC systems for the 40-qubit quantum computer simulation it has under construction.

Other applications are still uncertain, but it is possible that software will become available that will make the new inventions more widely applicable.
