Meta has announced EnCodec, an AI-powered audio compression method that can make audio up to ten times smaller than MP3 without compromising quality.
Meta says the technology improves the sound quality of speech over low-bandwidth connections, such as phone calls in areas with poor network coverage, and that it works well for music too.
EnCodec debuted on October 25 in the paper ‘High Fidelity Neural Audio Compression’, written by Meta AI researchers including Yossi Adi, Jade Copet, Gabriel Synnaeve and Alexandre Défossez. Meta also published a blog post summarizing the research.
According to Meta, the technique uses a three-part system to compress audio to a target size. First, an encoder transforms the uncompressed audio into a lower-frame-rate representation known as the ‘latent space’.
Next, a quantizer compresses that latent representation to the target size while retaining the most important information, so the original signal can be reconstructed. Finally, a decoder converts the compressed data back into audio in real time, running as a neural network on a single CPU.
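The encoder–quantizer–decoder pipeline can be illustrated with a deliberately simplified numpy sketch. This is not Meta's model: the real EnCodec uses learned convolutional networks and residual vector quantization, while here the "encoder" is plain window averaging and the "codebook" is a fixed scalar grid, chosen only to show how frame-rate reduction and quantization shrink the data.

```python
import numpy as np

def encode(audio, hop=320):
    # "Encoder": collapse each hop-length window into one latent frame,
    # reducing the frame rate (EnCodec uses learned convolutions instead).
    n = len(audio) // hop
    return audio[: n * hop].reshape(n, hop).mean(axis=1)

def quantize(latents, codebook):
    # "Quantizer": map each latent value to its nearest codebook entry,
    # so only small integer indices need to be stored or transmitted.
    return np.abs(latents[:, None] - codebook[None, :]).argmin(axis=1)

def decode(indices, codebook, hop=320):
    # "Decoder": look up codebook values and upsample back to audio rate
    # (EnCodec reconstructs the waveform with a neural network here).
    return np.repeat(codebook[indices], hop)

audio = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
codebook = np.linspace(-1.0, 1.0, 256)   # 256 entries -> 8 bits per frame
idx = quantize(encode(audio), codebook)
restored = decode(idx, codebook)
# Stored data: 50 eight-bit indices per second instead of 16,000 samples.
```

The point of the sketch is the data flow, not the fidelity: the transmitted payload is the index stream, and the decoder never sees the original waveform.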
Discriminators, which judge reconstructed audio during training, allow EnCodec to compress aggressively while preserving perceptual quality. Meta claims its researchers are the first to apply a neural network to compression and decompression of 48 kHz stereo audio.
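The discriminator idea comes from adversarial training: a discriminator learns to tell real audio from reconstructions, while the codec is trained to fool it, which pushes reconstructions toward perceptually convincing output. A toy sketch of the two opposing loss terms, with random vectors standing in for audio frames and a fixed linear probe standing in for the discriminator (neither resembles Meta's actual multi-scale discriminators):

```python
import numpy as np

rng = np.random.default_rng(0)

def disc(x, w):
    # Toy discriminator: logistic score from a fixed linear probe.
    return 1.0 / (1.0 + np.exp(-(x @ w)))

real = rng.standard_normal(64)                # stands in for a real frame
fake = real + 0.1 * rng.standard_normal(64)   # stands in for a reconstruction
w = rng.standard_normal(64)

# Discriminator objective: score real frames high, reconstructions low.
d_loss = -np.log(disc(real, w)) - np.log(1.0 - disc(fake, w))
# Codec ("generator") objective: make reconstructions score high.
g_loss = -np.log(disc(fake, w))
```

In training, minimizing these two losses in alternation rewards the codec for reconstructions the discriminator cannot distinguish from real audio.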
In terms of applications, the technique can provide ‘faster and better quality calls’ in areas with poor network coverage, and can deliver ‘rich metaverse experiences without requiring major bandwidth improvements’, Meta said.
Despite its potential, Meta’s EnCodec technology is still in the research phase. It is a promising technique for reducing the bandwidth needs of high-quality audio without sacrificing fidelity, and could benefit mobile broadband providers strained by heavy media-streaming demand.