2024-09-13

To tackle complex issues, ChatGPT needed to get smarter. With the AI model o1-preview, OpenAI aims to make the popular chatbot reason, code and compute better than ever.

o1, launching first in preview form, is the new name for “Project Strawberry.” This LLM was already rumored to be a significant improvement over earlier models. Now OpenAI has finally explained exactly what o1 can do.

The reasoning step

“o1 thinks before it answers” is OpenAI’s succinct description of its new AI model. Before an answer actually reaches the user, the LLM goes through a step-by-step reasoning process. As a result, o1 is slower than GPT-4o, the previous state of the art in the OpenAI model series.

The end result is a chatbot that is better at checking the validity of its outputs than ever before. To do this, OpenAI has built chain-of-thought prompting into the model. Previously, it was up to end users to work on the output delivered via the API themselves, adding extra control layers that had limited functionality. Now the model itself can recognize and correct the kinds of errors that have remained a persistent problem since the introduction of ChatGPT.
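
The manual approach that end users relied on before can be sketched roughly as follows. This is a minimal illustration, not OpenAI’s implementation: the names `build_cot_prompt` and `extract_final_answer`, the prompt wording and the `Answer:` convention are all assumptions made for this example, and no API is actually called.

```python
import re
from typing import Optional

# Hypothetical prompt wording; any phrasing that asks for step-by-step
# reasoning would serve the same purpose.
COT_TEMPLATE = (
    "Question: {question}\n"
    "Think through the problem step by step, then give the result "
    "on a final line that starts with 'Answer:'."
)


def build_cot_prompt(question: str) -> str:
    """Wrap a question in a chain-of-thought instruction."""
    return COT_TEMPLATE.format(question=question.strip())


def extract_final_answer(model_output: str) -> Optional[str]:
    """Return the text after the last 'Answer:' line, or None if absent."""
    matches = re.findall(r"^Answer:\s*(.+)$", model_output, flags=re.MULTILINE)
    return matches[-1].strip() if matches else None


if __name__ == "__main__":
    print(build_cot_prompt("How many r's are in 'strawberry'?"))

    # Stand-in for a model reply; a real reply would come from an LLM.
    fake_reply = "s-t-r-a-w-b-e-r-r-y has an r at positions 3, 8 and 9.\nAnswer: 3"
    print(extract_final_answer(fake_reply))
```

A wrapper like this only asks for the reasoning and parses the result; it cannot verify the reasoning itself, which is the limitation that building the step into the model is meant to address.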

Insightful

An important new feature of o1-preview is that it finally provides more insight into the creation of an AI response. OpenAI provides a drop-down message which reads “Thought for a few seconds”. Inside, it shows what’s going on behind the scenes during ChatGPT’s “thought process.” Below is an example:

Apart from the fact that o1-preview correctly counts the r’s in “strawberry” (something GPT-3, GPT-4 and other OpenAI models generally failed to do), this insight into the model’s reasoning is extremely useful. On complex computational problems, o1-preview already significantly outperforms GPT-4o, the previous leader from OpenAI. On the MATH benchmark, which presents computational problems in human language, o1 scores 94.8 percent, while GPT-4o got no further than 60.3 percent.
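
For comparison, the letter-counting task is trivial for conventional code. The snippet below is ordinary Python, not a reproduction of how o1 reasons; the helper name `count_letter` is made up for this example. It checks the count and works out the gap between the two MATH scores quoted above.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


r_count = count_letter("strawberry", "r")
print(r_count)  # 3

# Gap between the MATH scores quoted in the article, in percentage points.
o1_score = 94.8
gpt4o_score = 60.3
gap = round(o1_score - gpt4o_score, 1)
print(gap)  # 34.5
```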

In addition, o1 ranks in the 89th percentile in Codeforces’ competitive programming contests. Based on its test results, it also performs at PhD level in physics, biology and chemistry.

Immediately available

Those who want to try out o1-preview can get started right away, provided they take out a ChatGPT Plus subscription. Unlike with GPT-4o, OpenAI is waiting a while before rolling the model out to general users. Only o1-mini is available for free.

