Researchers speed up neural network processing

Researchers at ETH Zurich in Switzerland have developed a technique that can speed up neural networks by more than 300 times. It does so by greatly reducing the computing power needed for inference.

In the study, the researchers show that the technique can cut the computing power required for inference with the transformer model BERT by as much as 99 percent.

Transformers are the neural networks that underlie large language models (LLMs). They consist of several kinds of layers, among them feedforward layers, which account for many of an LLM's parameters. These layers require a lot of computing power because they must compute the product of all neurons and input dimensions.
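
To make that cost concrete, here is a minimal Python sketch that counts the multiplications in a single dense feedforward projection. The layer sizes are illustrative assumptions (768 input dimensions and 3,072 neurons, sizes commonly cited for BERT-base); the article gives no dimensions, and `dense_multiplications` is a hypothetical helper name.

```python
def dense_multiplications(input_dim: int, neurons: int) -> int:
    """Multiplications needed when every neuron reads every input dimension."""
    return input_dim * neurons


# Assumed, illustrative sizes; not taken from the article.
INPUT_DIM = 768
NEURONS = 3072

cost = dense_multiplications(INPUT_DIM, NEURONS)
print(f"{cost:,} multiplications per token for one projection")
```

The count grows with the product of the two sizes, which is why skipping neurons that are not needed for a given input can save so much work.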

Introducing FFFs

The research shows that not every neuron in the ‘feedforward’ layers needs to be active during inference for each input. The researchers therefore propose ‘fast feedforward’ layers, or FFFs, to replace the traditional feedforward layers.

Because an FFF identifies the appropriate neurons for each computation, it reduces the computational load and thus the overall computing power required. Ultimately, this leads to faster and more efficient LLMs.

To do this, FFFs use a mathematical operation called conditional matrix multiplication (CMM). It replaces the dense matrix multiplication (DMM) used by traditional feedforward networks.
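
The difference between the two operations can be sketched in a few lines of Python. This is a simplified illustration, not the researchers' implementation. It assumes the neurons are arranged as a balanced binary tree and that the sign of each neuron's pre-activation decides which child is visited next. The article does not describe this layout, and the function names, the ReLU activation and the random weights are all hypothetical choices for the example.

```python
import random


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def dense_layer(x, w_in, w_out):
    """Dense multiplication: every neuron is evaluated for every input."""
    out = [0.0] * len(w_out[0])
    for n_in, n_out in zip(w_in, w_out):
        act = max(dot(x, n_in), 0.0)  # ReLU, assumed for simplicity
        for j, w in enumerate(n_out):
            out[j] += act * w
    return out, len(w_in)  # output and number of neurons evaluated


def conditional_layer(x, w_in, w_out, depth):
    """Conditional multiplication: follow one root-to-leaf path only."""
    out = [0.0] * len(w_out[0])
    node, used = 0, 0
    for _ in range(depth + 1):
        pre = dot(x, w_in[node])
        used += 1
        act = max(pre, 0.0)
        for j, w in enumerate(w_out[node]):
            out[j] += act * w
        # Assumed routing rule: positive goes to the right child, else left.
        node = 2 * node + 1 + (1 if pre > 0 else 0)
    return out, used


random.seed(0)
DEPTH, DIM = 3, 8                # illustrative sizes
N = 2 ** (DEPTH + 1) - 1         # 15 neurons in a tree of depth 3
w_in = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
w_out = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
x = [random.gauss(0, 1) for _ in range(DIM)]

_, dense_used = dense_layer(x, w_in, w_out)
_, cond_used = conditional_layer(x, w_in, w_out, DEPTH)
print(dense_used, cond_used)  # 15 4
```

The two functions are different models and do not return the same output. A network has to be trained with the conditional layer in place. The sketch only shows that the work per input grows with the depth of the tree instead of the total number of neurons.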

Up to 341 times faster

Experiments with BERT models show that the technique can significantly speed up the processing of large AI models. Tests showed it to be up to 341 times faster.
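
A back-of-the-envelope calculation shows how a figure of this size could arise. The sketch below assumes that the neurons of a layer form a balanced binary tree of depth 11 and that each input follows a single root-to-leaf path. Neither the layout nor the depth is stated in this article, so both are assumptions, and the helper names are hypothetical. The calculation counts neurons evaluated, not measured running time.

```python
def tree_neurons(depth: int) -> int:
    """Total neurons in a balanced binary tree with levels 0..depth."""
    return 2 ** (depth + 1) - 1


def path_neurons(depth: int) -> int:
    """Neurons visited on one root-to-leaf path."""
    return depth + 1


DEPTH = 11  # assumed for illustration; not given in the article
ratio = tree_neurons(DEPTH) / path_neurons(DEPTH)
print(tree_neurons(DEPTH), path_neurons(DEPTH), ratio)  # 4095 12 341.25
```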

The technique can also be applied to LLMs like GPT-3, according to the Zurich researchers. This opens up new possibilities for faster and more efficient natural language processing.
