Mirasol3B: Google reaches milestone with AI model for video analysis

Google has presented a new AI model that can analyze lengthy videos. While AI solutions focused on text, images and sound have each achieved commercial success, there is not yet a tool that processes all of them together. With Mirasol3B, Google believes it has found an approach that can.

Few will describe AI development as easy, but by now a variety of applications such as ChatGPT, Midjourney and numerous business-oriented ML solutions have shown that much is already possible with the technology. Great strides have also been made in the audio field, for example with synthetic singing voices. However, “multimodality,” meaning the combination of video, audio and textual content, is considerably more difficult to handle.

Combiners

According to Google researchers Isaac Noble and Anelia Angelova, it is difficult to keep modalities in sync when they are processed jointly. They therefore present Mirasol3B, which contains different components for audio and video and splits both into segments that stay time-aligned. This makes it possible to analyze what the researchers describe as “long videos.” They cite 512 frames as the largest input, although not every individual frame from a video is actually analyzed. Other AI models use only 32 to 64 frames per video, even if it is several minutes long, according to the researchers. With Mirasol3B, a video is divided into chunks of 4 to 64 frames, each of which is analyzed together with the matching segment of audio. A “learning module” called a “Combiner” processes the combined data, after which the process repeats for the next chunk. However, each Combiner step after the first concentrates on the changes that have taken place, so duplicate frames do not require the same calculations.
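The chunking step described above can be illustrated with a small sketch. This is not Google's code, which has not been released. It only shows one plausible way to split a video into fixed-size frame chunks and to find the audio samples that cover the same time span. The names `Chunk` and `split_synchronously`, the frame rate of 16 fps and the audio rate of 16,000 samples per second are all assumptions made for illustration. The Combiner itself is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    frame_start: int  # index of the first video frame in the chunk (inclusive)
    frame_end: int    # index one past the last video frame (exclusive)
    audio_start: int  # index of the first audio sample aligned with the chunk
    audio_end: int    # index one past the last aligned audio sample


def split_synchronously(num_frames, fps, audio_rate, frames_per_chunk):
    """Partition a video into frame chunks with time-aligned audio spans.

    Hypothetical helper: fps and audio_rate are assumed to be integers.
    """
    if not 4 <= frames_per_chunk <= 64:
        raise ValueError("the article cites chunks of 4 to 64 frames")
    chunks = []
    for start in range(0, num_frames, frames_per_chunk):
        end = min(start + frames_per_chunk, num_frames)
        # Map frame indices to audio sample indices via their timestamps.
        audio_start = start * audio_rate // fps
        audio_end = end * audio_rate // fps
        chunks.append(Chunk(start, end, audio_start, audio_end))
    return chunks


# Example: the 512-frame maximum from the article, split into 64-frame chunks.
chunks = split_synchronously(
    num_frames=512, fps=16, audio_rate=16000, frames_per_chunk=64
)
for chunk in chunks:
    print(chunk)
```

Each resulting chunk pairs a range of frames with the audio samples from the same time window, which is the unit a Combiner-style module would consume one step at a time.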

A diagram showing the different parts of a video. Source: Google

Possible applications include adding video content to an AI search engine, analyzing user-generated content for moderation and performing QA on professional videos.

For Google itself, AI-powered content moderation will undoubtedly sound appealing: its own YouTube platform receives hundreds of thousands of hours of new content daily, which is already largely moderated by algorithms. False positives can be challenged, and people can still report harmful or banned content manually. During the Covid-19 pandemic, YouTube was forced to use even fewer people for content moderation. A more capable AI model would lighten the workload of the remaining human moderators.

Not open-source

While some ML experts, such as Leo Tronchon of AI platform Hugging Face, have been positive about the tool, others are skeptical. Notably, Google has chosen not to share the model, the training data or the programming code required to run it. Mirasol3B is thus closed-source, and its details are only accessible via the Google blog post and research paper.
