
ChatGPT supports voice conversations and accepts images

OpenAI has further extended the functionality of its ChatGPT generative AI tool, making it possible to have voice conversations with the chatbot and to upload images to learn more about them. The changes are currently available only in the paid versions, ChatGPT Plus and ChatGPT Enterprise.

ChatGPT users can now talk to the generative AI chatbot. The newly integrated technology picks up a spoken question and transcribes it so that ChatGPT can process what is being said.

The spoken answers are generated by a new text-to-speech model that creates human-like audio from text and a short speech sample. To process what the user says, ChatGPT relies on Whisper, OpenAI's open-source AI model that turns spoken words into text. Users can choose from five different voices, both female and male, which were created in collaboration with voice actors.

New image functionality

In addition to voice functionality, ChatGPT has also gained new image functionality. This allows users to upload a picture and ask the generative chatbot questions about it. Afterwards, they can converse with the chat tool about it to solve a problem, get more information about something or find directions to a remote location.

For example, a user can upload a picture of the contents of a refrigerator and ask ChatGPT to put together a meal from the available ingredients, or ask for instructions on how to perform a (minor) repair.

Privacy restrictions

OpenAI acknowledges that the new technology could have a major impact on privacy, among other things. For that reason, the chat tool is limited in its ability to analyze individuals and make direct statements about them. This may also apply to people about whom the AI model has more information. According to OpenAI, ChatGPT is prone to hallucinations, and the restrictions are meant to protect the privacy of individuals.

Availability and further plans

The new voice and image functionality will become available to users of the paid versions ChatGPT Plus and Enterprise in the coming weeks. The technology will also become available in the iOS and Android apps.

In addition, OpenAI says it is working with Spotify on a new Voice Translation feature for its audio streaming platform. This will use OpenAI’s new AI-based voice model to translate podcasts. Podcast creators are promised they can use their own voice to extend the reach of their podcasts to other countries.
