After months of speculation, Mythos Preview has finally been turned into a properly released LLM. It takes the form of Claude Fable 5, a constrained version of Anthropic’s cybersecurity-busting Mythos. Meanwhile, Mythos 5, which features fewer guardrails, will be available in a limited capacity to security researchers.
With the promise of being able to “do profound good for the world”, Anthropic is displaying an air of confidence around Fable 5. The coming days and weeks will show whether that confidence is justified. At 10 dollars per million input tokens and 50 dollars per million output tokens, Fable 5 is far from cheap, even though that is a deep discount versus Mythos Preview’s costs. Nevertheless, users on paid plans will be able to access the model until June 22nd. After that date, it will move to usage credits until capacity allows Anthropic to offer Fable 5 again the way it does now.
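To put those prices in perspective, here is a minimal sketch of the arithmetic in Python. Only the two per-million-token prices come from Anthropic's announcement as quoted above; the function name and the example token counts are our own hypothetical choices.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 10.0
OUTPUT_PRICE_PER_MILLION = 50.0


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for the given token counts."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 200,000 input tokens and 40,000 output tokens.
print(f"${estimate_cost(200_000, 40_000):.2f}")  # prints $4.00
```

In this made-up example, the output tokens cost as much as the input tokens despite being a fifth of the volume, which is why long answers weigh heavily on the bill.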
The benchmarks don’t say all that much
Over the past few years, AI benchmarks (especially first-party ones) have hardly been a predictor of real-world performance or a positive reception. Meta’s Llama 4 model, for instance, performed well in these tests yet failed to reach meaningful adoption. Meanwhile, scores on some benchmarks have become too high to serve as a differentiator, partly because LLM makers optimize for certain questions and, of course, partly because the LLMs genuinely improve.
At any rate, Mythos 5 and Fable 5 score either a little higher or far higher than Opus 4.8, GPT-5.5 and Gemini 3.1 Pro. A major caveat exists for the Fable 5 scores, however: any question on topics deemed dangerous (cybersecurity, biology, chemistry, and model distillation) gets handed over to Opus 4.8, a significantly less capable model. We’ve played around with Fable 5 and it did indeed block our initial attempts at circumventing these safeguards, but it is still plenty capable.
A showcase
Every LLM has had a certain personality. That is to say: the unthinking model will echo a specific type of query response that is repeatable and, at times, limiting. Examples include earlier Claude models exclaiming “You’re absolutely right!” or GPT-5.1 through GPT-5.4’s strange obsession with gremlins. What stands out immediately with Fable 5 is its tendency to comment on the user’s query. Claude is still very much in the realm of “LLM-y” responses (which it itself says is partly a function of what it’s asked to do), but it will appear to self-reflect a little bit on the input. Of course, this may very well be the same echo of self-reflection that LLMs are “taught” to produce, which can convincingly mimic human thinking.
In addition, the model shows how difficult it is to separate reasoning models from LLMs that don’t reason. Aside from taking some time to answer a question, Fable 5 includes far more of its line of thinking in the displayed answer than other LLMs we’ve tested.
These are early days, and we’ve asked it just a handful of questions. Then again, it did come up with a party trick as requested. It wrote a short essay in which the first letter of each sentence spelled out a message, while each sentence was one word longer than the previous one until the length reached 22 words. It got this self-imposed task exactly right, and it even wrote software to check itself. (The hidden message was, quite snarkily, “the proof is in the pattern”.) Even the text itself was a commentary on the task we gave it. Impressive stuff for a party trick, and realms beyond what the likes of GPT-3.5 or other models from 2023 could muster.
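For readers curious what such a self-check might look like, here is a rough sketch in Python. It is not the code Fable 5 wrote, which we are not reproducing here. The function names, the naive sentence splitting on end punctuation and the tiny demo text are all our own assumptions.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def first_letter(sentence: str) -> str:
    """Return the first alphabetic character of a sentence, lowercased."""
    return next((ch.lower() for ch in sentence if ch.isalpha()), "")


def check_essay(text: str, message: str) -> bool:
    """Check the acrostic and that each sentence is one word longer than the last."""
    sentences = split_sentences(text)
    expected = [ch.lower() for ch in message if ch.isalpha()]
    # One sentence per letter of the hidden message (spaces are ignored).
    if [first_letter(s) for s in sentences] != expected:
        return False
    lengths = [len(s.split()) for s in sentences]
    return all(b - a == 1 for a, b in zip(lengths, lengths[1:]))


# Tiny hypothetical demo: two sentences spelling "hi", with 1 and 2 words.
print(check_essay("Hello. It works.", "hi"))  # prints True
```

The sketch only verifies that the sentence lengths rise by one word at a time; checking the starting and final length (22 words in Fable 5's essay) would be a one-line addition. Abbreviations such as "e.g." would trip up the naive splitter.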
A head start for the defenders
The implications of Fable 5’s release are as yet unclear. What we do know is that guardrails almost invariably get bypassed. The prohibitive costs of this LLM will either drop or be undercut by a similarly capable model. But even if neither happens, we’ve seen time and again how red teamers and threat actors find ways to exploit newly found capabilities. Anthropic’s line of thinking revolves around beating the latter group to the punch by empowering the former. Based on both official communications and what we’ve heard anecdotally, we feel talk of Mythos’ capabilities isn’t overblown. This means Anthropic must have a lot of faith in its ability to prevent misuse. Or, well, it knows what power it can display right before its alleged upcoming IPO, beating OpenAI in more ways than one by doing so.