OpenAI is giving ChatGPT a "realistic" voice. It is the company's second attempt, after an earlier voice was withdrawn due to criticism.
Advanced Voice Mode has recently become available. The feature gives ChatGPT a voice and can read the AI assistant's responses aloud to users. The LLM powering the voice is GPT-4o.
The new voice feature is more advanced because GPT-4o combines multiple tasks in a single (multimodal) model, which generates output faster and thus makes the voice sound more natural. The existing Voice Mode in the AI tool requires three models to speak: one to convert your voice to text, one to process the message, and a final one to convert the text back to speech.
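As an illustration, here is a minimal sketch of that classic three-model pipeline built on OpenAI's public API. It is not the internal implementation behind Voice Mode; the model names, file paths, and voice choice are assumptions for the example.

```python
# Sketch of a three-model voice pipeline:
# speech-to-text -> LLM -> text-to-speech.
# Illustrative only; the pipeline inside ChatGPT's Voice Mode is not public.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Convert the user's recorded voice to text (speech recognition).
with open("question.wav", "rb") as audio_file:  # hypothetical input file
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# 2. Process the message with the language model.
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = completion.choices[0].message.content

# 3. Convert the answer back to speech (text-to-speech).
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer,
)
speech.write_to_file("answer.mp3")
```

Every hand-off between these models adds latency, and the intermediate text strips away intonation, which is why collapsing all three steps into one multimodal model responds faster and sounds more natural.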
Because this is an advanced mode, the voice will only be available to paying users. In the fall of 2024, all Plus users of the AI tool will get the voice feature. For now, the rollout is limited to a small alpha group drawn from the pool of Plus users.
New attempt
The launch of GPT-4o was meant to be a total package in which ChatGPT got its realistic voice right away. The "o" in the name refers to that and stands for "omni," as in omnimodel. Initially, OpenAI pulled that off.
Five voices were launched: Sky, Breeze, Cove, Juniper, and Ember. Before the rollout could be completed, however, OpenAI decided to withdraw the Sky voice. The reason was an accusation by actress Scarlett Johansson that the voice copied hers, even though she had explicitly refused permission. The withdrawal caused displeasure among users of the AI tool, who found Sky by far the most "mature and intelligent"-sounding voice.
OpenAI is not launching a new voice for the advanced option. Breeze, Cove, Juniper, and Ember remain the only available voices.
Limitations
The rollout of Advanced Voice Mode is now happening much more cautiously. GPT-4o itself rolled out to all users, including non-paying ones, immediately after its announcement in May. This time, OpenAI is letting a limited group experiment first, and even that comes with restrictions: video and screen-sharing options, for example, are not yet available. These features, which OpenAI demonstrated in May, let the chatbot view live images and function as an interpreter.
Also read: OpenAI makes mini version of powerful GPT-4o available