ChatGPT supports voice conversations and accepts images

OpenAI has further extended its ChatGPT generative AI tool: users can now hold voice conversations with the chatbot and upload images to ask questions about them. For now, the changes apply only to the paid versions, ChatGPT Plus and ChatGPT Enterprise.

ChatGPT users can now converse with the generative AI chatbot by speaking. The newly integrated technology picks up a spoken question and transcribes it so that ChatGPT can process what was said.

The spoken responses are generated by a new text-to-speech model that creates human-like audio from text and a short speech sample. Spoken input is converted to text by Whisper, OpenAI's open-source speech recognition model. Users can choose from five different voices, both female and male, which were developed in collaboration with voice actors.

New image functionality

In addition to voice functionality, ChatGPT has also gained new image functionality. This allows users to upload a picture and ask the generative chatbot questions about it. Afterwards, they can converse with the chat tool about it to solve a problem, get more information about something or find directions to a remote location.

For example, a user can take a picture of the contents of a refrigerator and ask ChatGPT to put together a meal from the available ingredients, or ask for instructions on how to perform a (minor) repair.

Privacy restrictions

OpenAI acknowledges that the new technology could have a major impact on privacy, among other things. The chat tool is therefore limited in analyzing and making direct statements about individuals, including people about whom the AI model may have additional information. According to OpenAI, ChatGPT is prone to hallucinations, and the restrictions are meant to protect the privacy of individuals.

Availability and further plans

The new voice and image functionality will become available to users of the paid versions ChatGPT Plus and Enterprise in the coming weeks. The technology will also become available in the iOS and Android apps.

In addition, OpenAI says it is working with Spotify on a new Voice Translation feature for the audio streaming platform. The feature will use OpenAI’s new AI-based voice model to translate podcasts. Podcast creators are promised that their own voice will be used, so they can extend the reach of their podcasts to other countries.

Also read: OpenAI links ChatGPT with DALL-E in DALL-E 3
