Despite being seemingly well-intentioned, Anthropic’s call for a pause to frontier AI development regurgitates a flawed logic that we thought we were past. Additionally, the AI company is presenting seriously flawed arguments to suggest AI may soon become self-improving. The human factor (and the human fears associated with it) seems as dominant as ever.
In its ominously titled blog “When AI builds itself”, Anthropic highlights how the first LLM-based chatbots were crafted by human researchers, after which autonomous work has started aiding them at an increasing rate. The future beyond autonomous agents (essentially LLM systems or, well, a bunch of AI models wearing a trenchcoat) is one of “closing the loop”. “In the future, agents could become capable enough to build and train models themselves.” The word “could” is load-bearing here, and we can hear the strain all the way from Silicon Valley.
The most important AI improvements are human-made
The writers of the piece, Anthropic Institute Lead Marina Favaro and co-founder of Anthropic Jack Clark, admit that they are mostly raising questions rather than answering any of them. Despite this, the article gestures at evidence of something, though that evidence is no more than anecdotal. One such example is the 8x increase in code contributed per quarter per person at the AI company since the start of 2025. While being “almost certainly an overstatement of the true productivity gain”, the massive uplift in contributions is also “an acceleration” according to the authors.
We do not wish to challenge the points made in the article one by one, but this example illustrates how the questions raised in the piece are no more than loose data points that in no way point to a self-improving AI, the future we’re supposedly heading towards, potentially, someday, in the year 20XX, as the piece puts it.
The important element to note here is that there is no clearly identifiable line from AI chatbots to self-improvement. Instead, evidence has shown that human work (including from Anthropic itself) has upgraded the effective use cases for LLMs. We do note that breakthrough AI models have indeed shown dramatically improved accuracy, consistency and quality of outputs. Mythos may well be one to smash long-established barriers when it comes to LLM applicability. It may well do better than any human security researcher out there. Nevertheless, as Anthropic confirms, AI development has rarely seen “eureka” moments. One thing the company fails to make clear is that such moments have happened around AI development all the time.
RAG, the Model Context Protocol (an Anthropic(!) invention), agent harnesses and inference time compute (i.e. ‘reasoning’ for models) have all been human creations on top of AI. These have shown remarkable step changes for AI use cases and added robustness to the fundamentally probabilistic nature of Transformers. This plumbing around AI model infrastructure looks like it will continue to serve up human-led improvements. Admittedly, yes, automation is helping this process along, but even with the supposedly improved suggestions for next steps and greater agentic freedom to go along with it, there’s no actual self-improvement here.
Add it to the “not-to-do” list
Emergent AI self-improvement isn’t likely to be seen inside corporate environments themselves. One should expect it to happen inside a (hopefully controlled) environment at one of the leading AI labs, Anthropic among them. “The evidence suggests that the human role is narrowing at each step in the AI development process”, the piece states. Compute costs appear to be the only restriction on model suggestions consistently beating human code samples.
What Anthropic is describing is really the implementation of AI as is. As a society and a global economy, we’re learning what the ethical, economic and practical limits of AI deployments are, having barely trusted the technology to do very much at all up to the present. Fear of general algorithms in life is generally a mistrust of the use of a technology, not of the fundamentals of the technology itself. Anyone being laid off by a management team that thinks it can replace jobs with AI is not getting fired “by AI”. The alternative view is that humans have gained widespread, subsidized access to AI API calls and haven’t yet understood what they have.
A further note on the supposed self-improvement Anthropic is highlighting. There is no discussion or explanation of any behavior shown by Claude that one may describe as self-learning. This is somewhat surprising. The only hint of this is Claude apparently getting better at suggesting next steps, a development that can be attributed just as much to an ever-growing “not-to-do” list as to genuine self-reflection.
2026 is 2023 part 2
Anthropic’s completely untenable suggestion is to “have the option to slow or temporarily pause frontier AI development”. We call this untenable because no AI lab can be entirely prevented from developing new LLMs. DeepSeek could easily ignore a centralized plea from Washington, Anthropic itself isn’t exactly in the U.S.’ good graces, and Google and OpenAI are already operating at a heightened level of secrecy around new models. At any rate, the Anthropic Institute will try to prove us wrong here.
All in all, it appears Anthropic is falling back on the flawed and ultimately failed campaign from various tech leaders who, back in 2023, signed an open letter to pause AI development beyond what was then the frontier: GPT-4. In addition, OpenAI CEO Sam Altman suggested at the time that advanced AI could be misused. He formulated that fear in a plethora of ways, including a constant refrain of “Artificial General Intelligence” (AGI), all while advocating against legislative restrictions – as he still does. If the fears uttered by Anthropic and OpenAI are well-founded, then they are actively accelerating us towards these dangers. If they are not, they serve mostly to raise the profile of the technology’s capabilities in the eyes of the world – and the public investors that will soon get a crack at the soon-to-IPO AI companies.
Conclusion: an absence of evidence
No evidence has emerged of Transformer-based AI models iterating upon themselves. Nevertheless, nobody can count out scaling laws applying in such a way that a given parameter count combined with a large enough dataset delivers us self-improving AI. Then again, there are lots of things we can’t rule out for certain. This is why we wait for evidence or telltale signs of a development.
The maturing ecosystem around AI is increasingly adding complexity, robustness, applicability and general improvements to barebones LLMs. How that will lead to economic winners in the AI race long-term, we don’t know yet. What is clear is that Anthropic has a vested interest in making its LLMs the focal point of said ecosystem, just as SaaS vendors look to make the control or implementation layer the focus and system integrators wish to come up with enterprise-ready solutions on top of commodifying technologies.
Anthropic has plenty of motives for its rhetoric. It is actively looking to steer AI policy and governance in a way that will helpfully keep its own ballooning costs under control, all while ensuring the barrier to entry for frontier AI development remains high. OpenAI tried to do something similar around the time GPT-4 emerged. Last time, reality proved that human implementations of AI are a far tougher issue than the technology itself.