Details leak on Anthropic’s “step-change” Mythos model
Anthropic has confirmed it is testing Claude Mythos, a new AI model it describes as a “step change” in capabilities, after a data leak inadvertently exposed draft documents about the model. The leak, caused by a configuration error in Anthropic’s content management system, made close to 3,000 unpublished assets publicly accessible.

Anthropic was not the one to break this news. Security researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge discovered the exposed data store, which contained a draft blog post describing the model in detail. Fortune reviewed the documents and informed Anthropic on Thursday, after which the company restricted public access. Anthropic attributed the incident to “human error” in the configuration of its content management system (CMS), describing the exposed material as “early drafts of content considered for publication.”

In all, nearly 3,000 unpublished assets linked to Anthropic’s blog were publicly searchable in the unsecured data store. Assets created in the CMS are public by default and assigned a publicly accessible URL unless a user explicitly changes that setting.

A new tier above Opus

The leaked draft refers to Claude Mythos by the product name “Capybara”, which would represent a new model tier above Anthropic’s current flagship Opus line. “Capybara is a new name for a new tier of model: larger and more intelligent than our Opus models — which were, until now, our most powerful,” the draft stated. The two names appear to refer to the same underlying model.

Anthropic currently offers models in three tiers: Opus (most capable), Sonnet (faster and cheaper), and Haiku (smallest and fastest). Capybara would add a fourth, pricier tier above all three. According to the draft, it scores “dramatically higher” than Claude Opus 4.6 on tests of software coding, academic reasoning, and cybersecurity. Opus 4.6 had only recently topped Terminal-Bench 2.0 at 65.4%, surpassing GPT-5.2-Codex, as we previously reported.

Asked directly, Anthropic confirmed the model: “We’re developing a general purpose model with meaningful advances in reasoning, coding, and cybersecurity. Given the strength of its capabilities, we’re being deliberate about how we release it. We consider this model a step change and the most capable we’ve built to date.”

Unprecedented cybersecurity risks

Cybersecurity is the dimension Anthropic appears most concerned about. There is some irony in details of a security-focused model leaking through a misconfiguration; perhaps Mythos could help prevent a repeat. At any rate, the draft describes the model as “currently far ahead of any other AI model in cyber capabilities” and warns it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”

Because of these concerns, Anthropic is restricting early access to organizations focused on cyber defense, giving them time to harden their systems. The company has dealt with misuse before. Anthropic previously blocked exploitation of Claude for cybercrime, and in one documented case, discovered and disrupted a Chinese state-sponsored campaign that had already used Claude Code to infiltrate roughly 30 organizations. The risk profile of Mythos adds a new dimension to those concerns. In an earlier security test, Claude was turned into a malware factory within eight hours.

The leaked documents also included a PDF about an upcoming invite-only two-day retreat for European CEOs at an 18th-century English countryside manor. Anthropic CEO Dario Amodei is set to attend. A spokesperson described it as “part of an ongoing series of events we’ve hosted over the past year.”