Meta AI resolves background noise issues with AV-HuBERT

Facebook’s parent company Meta may have found a way past one of the biggest hurdles in speech recognition: background noise. The firm has added visual-cue analysis to its speech recognition platform to filter out surrounding chatter.

Background noise is a persistent problem for contemporary speech recognition platforms, making it difficult for AI systems to decipher speech in a noisy space. Traditionally, noise-suppression techniques have tried to separate the main voice from the chatter. However, these techniques are not as effective as the human ability to combine what is heard with what is seen.

Taking that into account, Meta AI, the research arm of Facebook’s parent company, has launched its latest conversational AI framework. Audio-Visual Hidden Unit BERT (AV-HuBERT) is a system designed to train AI models on both auditory and visual cues. According to Meta AI, AV-HuBERT learns from the speech and lip movements in a video without needing transcriptions.

How is AV-HuBERT different from other speech recognition platforms?

Meta presents AV-HuBERT as a step beyond other systems of its kind. Most speech recognition software on the market today relies solely on audio input, and such platforms struggle to tell multiple speakers apart. AV-HuBERT, by contrast, is engineered to combine visual cues with auditory data. The platform studies lip and teeth movement alongside the audio to tell the input streams apart. As a result, the program can pick out the speaker’s voice and separate it from background noise or chatter.
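
As a rough illustration of what combining visual cues with auditory data can mean in practice, the sketch below joins per-frame audio features with per-frame lip-region features into a single input stream. It is a toy example and not Meta’s implementation: the function names, the feature values and the choice of simple concatenation are all assumptions made for illustration.

```python
from typing import List

Frame = List[float]


def fuse_frames(audio: List[Frame], video: List[Frame]) -> List[Frame]:
    """Concatenate audio and visual features frame by frame (early fusion)."""
    if len(audio) != len(video):
        raise ValueError("audio and video must have the same number of frames")
    return [a + v for a, v in zip(audio, video)]


def drop_modality(frames: List[Frame]) -> List[Frame]:
    """Replace a stream with zeros, e.g. to mimic audio drowned out by noise."""
    return [[0.0] * len(f) for f in frames]


# Hypothetical toy features: 3 time steps, 2 audio values, 2 lip-region values.
audio_feats = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
lip_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused = fuse_frames(audio_feats, lip_feats)
noisy = fuse_frames(drop_modality(audio_feats), lip_feats)

print(fused[0])  # [0.1, 0.2, 1.0, 0.0]
print(noisy[0])  # [0.0, 0.0, 1.0, 0.0]
```

Even when the audio half of a frame carries no usable signal, the visual half is still present in the fused input, which is the intuition behind using lip movement to cope with noise.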

How effective is AV-HuBERT?

According to Meta AI, AV-HuBERT is 75% more accurate than the best existing audio-visual speech recognition systems when trained on the same amount of transcribed data. In addition, the firm says the model needs only one-tenth of the labeled data that other systems require to achieve the same results.

Because it needs so little labeled data, the system is a promising basis for understanding and decoding visual and auditory cues in other languages. Meta’s approach could be used to build advanced speech recognition systems for languages that lack large-scale labeled datasets.

Applications of AV-HuBERT

An audio-visual speech recognition system has many potential applications. For starters, the technology could be used in smartphones and smart-home devices so that speech is still understood accurately in high-noise environments. The system could also help identify deepfakes, as it can analyze the subtle associations between auditory and visual cues. Meta’s AV-HuBERT could also be game-changing for VR avatars, giving them a more realistic touch.

Meta AI also says it will release a set of pre-trained models to other researchers to accelerate progress within the industry.
