General-purpose chatbots have been able to understand images for some time. Watermelon is now bringing this capability to more AI-driven customer service as well.
Watermelon builds its offering largely on OpenAI’s GPT-4o. With that model, OpenAI leans heavily on improved multimodality over previous LLMs: it can interpret text, audio and images better than ever and combine this information. Watermelon, which lets organizations build their own GPT-4o-based chatbot, focuses specifically on Vision. This lets customer service chatbots look at the pictures users share without requiring any additional steps.
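To make the mechanism concrete: GPT-4o accepts an image and a text question together in a single user message. The sketch below is purely illustrative (it is not Watermelon’s actual integration, and the image URL is hypothetical); it builds the multimodal payload and shows, commented out, where it would be sent with OpenAI’s official Python package.

```python
# Illustrative sketch, not Watermelon's code: combining a customer's photo
# with a text question in one GPT-4o chat completions request.

def build_vision_messages(question: str, image_url: str) -> list:
    """Build a GPT-4o user message that pairs a text question with an image."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                # The image travels alongside the text as a separate content part.
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]

messages = build_vision_messages(
    "What kind of repair does this device need?",
    "https://example.com/photo-of-device.jpg",  # hypothetical URL
)

# With the official openai package, the payload would be sent like this:
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# response = client.chat.completions.create(model="gpt-4o", messages=messages)
# print(response.choices[0].message.content)
```

Because text and image arrive in one message, the model can answer the question in the context of the photo, which is what allows a chatbot to respond to a picture without extra steps from the user.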
Many possibilities
Alexander Wijninga, CEO and founder of Netherlands-based Watermelon, touches on how Vision extends the functionality of AI. “Consider, for example, a veterinarian receiving a photo of a spot on a pet’s skin, or a tech company determining what kind of repair is needed based on a photo. Through Vision, chatbots can now answer more questions where imagery is crucial to arriving at a solution.”
Wijninga promises that after uploading a photo, customers will receive an answer to their accompanying question directly from the chatbot; this will take business-customer interaction “to a new level.”
It won’t stop at the current functionality, Watermelon promises. The current Vision variant, for example, cannot yet properly locate objects within an image. “A simple example: if you share a picture of a room with a chair in it and ask the chatbot where the chair is located, it can’t yet answer that properly,” Wijninga explains.
Also read: OpenAI once again lowers its prices for the latest model of GPT-4o