xAI, Elon Musk’s AI startup, recently previewed its new multimodal LLM, Grok-1.5 Vision (Grokv 1.5). The model is more powerful than the recently launched Grok-1.5 and is meant to compete with comparable LLMs from the likes of OpenAI, Anthropic and Google.

xAI’s preview shows that the new Large Language Model (LLM) is considerably more capable than the earlier Grok-1.5, which could already handle more context than the original Grok-1. A multimodal LLM can process several modalities, such as images, audio and video, in addition to text. According to xAI, Grokv 1.5 ‘understands’ documents, photos, screenshots, graphs, diagrams and other visual material alongside text.

‘Real-world spatial understanding’

According to xAI, the new LLM can compete with comparable models from other providers. Grokv 1.5 specializes in so-called ‘multidisciplinary reasoning’.

In addition, the LLM features ‘real-world spatial understanding’. This capability allows it to reason over entire texts, interpret scientific images and deal with visual content in a ‘human way’.

Whiteboard with a flowchart for a number-guessing game and, written next to it, Python code that mimics the logic of the game.
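To give an idea of what such a flowchart-to-code translation might look like, here is a minimal sketch of a number-guessing loop in Python. It is purely illustrative and is not the code shown in xAI’s demo; the function name play and its parameters are assumptions made for this example.

```python
def play(secret, guesses):
    """Walk through the guesses in order, as the flowchart's loop would.

    Returns the number of attempts needed to hit the secret number,
    or None if the guesses run out first. (Hypothetical helper, not
    taken from xAI's demo.)
    """
    for attempt, guess in enumerate(guesses, start=1):
        if guess == secret:
            print(f"Correct after {attempt} attempt(s)")
            return attempt
        # Decision branch of the flowchart: hint which way to go next.
        hint = "too low" if guess < secret else "too high"
        print(f"{guess} is {hint}")
    return None


if __name__ == "__main__":
    # Example run with a fixed secret and a fixed series of guesses.
    play(7, [5, 9, 7])
```

Passing the guesses in as a list, rather than reading them from the keyboard, keeps the sketch easy to run and check.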

These capabilities allow the LLM to ‘translate’ drawings into children’s stories, identify which objects in a group are the largest, help drivers by checking whether there is enough room to get around an object, and convert tables into CSV format. It could even tell whether a wooden floor is rotten and needs to be replaced.

Benchmark with similar LLMs

Beyond demonstrating this functionality, xAI has also benchmarked the new Grokv 1.5 LLM against comparable models. According to these benchmarks, Grokv 1.5 outperforms, on several tests, OpenAI’s GPT-4V, Anthropic’s Claude 3 Sonnet and Claude 3 Opus, and Google’s Gemini Pro 1.5.

Notably, the new LLM outperforms these competitors on xAI’s own ‘RealWorldQA’ benchmark. Elon Musk’s AI company created this benchmark to measure real-world spatial understanding.

Grokv 1.5 will soon be available to testers, initially to subscribers of the Premium+ service of social media platform X.

