AI coding tools hinder skill development, research shows

Developers using AI assistance scored 17 percentage points lower on a skill test than those coding manually, according to new research sponsored by Anthropic. The study examined 52 software engineers learning a Python library and found that productivity gains came at the cost of learning, particularly debugging ability.

The randomized controlled trial split the participants, who were learning the Trio asynchronous programming library, into two groups: one used AI assistance, the other coded by hand. The results were striking, with AI users averaging 50 percent on a subsequent quiz compared to 67 percent for manual coders.
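For context, Trio is a library for structured concurrency in Python. The sketch below shows the general shape of Trio code; the specific task is an illustrative assumption, not one taken from the study.

```python
import trio

async def fetch(name, delay):
    # Simulate an I/O-bound task with a cooperative sleep.
    await trio.sleep(delay)
    print(f"{name} finished after {delay}s")

async def main():
    # A nursery supervises concurrent child tasks and waits for all of them.
    async with trio.open_nursery() as nursery:
        nursery.start_soon(fetch, "task-a", 1)
        nursery.start_soon(fetch, "task-b", 2)

trio.run(main)
```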

The productivity promise didn’t fully materialize either. AI users finished tasks only two minutes faster on average, a difference that wasn’t statistically significant. Some participants spent up to 11 minutes composing AI queries, eating into any time savings.

How developers use AI matters

The study identified six distinct patterns of AI usage. Three led to quiz scores below 40 percent, suggesting the AI actively undermined learning: complete delegation to AI, progressive reliance on AI (one human-led task, then AI-led), and iterative AI debugging. Participants who relied wholly on AI completed tasks fastest but learned least.

Three patterns preserved learning, with scores of 65 to 86 percent: asking AI only conceptual questions, requesting explanations alongside generated code, or using AI to generate code and then asking follow-up questions to build understanding. In other words, those who engaged with AI assistance without leaning on it heavily seemed to retain their ability to learn.

The gap was most pronounced in debugging questions. The manual coding group encountered more errors during tasks but resolved them independently, strengthening their debugging skills. AI users encountered fewer errors but performed worse when asked to identify and diagnose code problems.
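As a hypothetical illustration (not an error drawn from the study), a classic async bug of the kind manual coders would have had to diagnose themselves is calling a coroutine without awaiting it; Python only emits a RuntimeWarning rather than failing outright.

```python
import trio

async def save(record):
    await trio.sleep(0.1)  # stand-in for real async I/O
    print(f"saved {record}")

async def main():
    save("row-1")        # BUG: creates a coroutine that never runs
    await save("row-2")  # FIX: awaiting actually executes the task

trio.run(main)
```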

Implications for the workplace

Anthropic's researchers warn that junior developers may rely on AI to complete tasks quickly at the cost of skill development, particularly the debugging abilities needed to validate AI-generated code. The problem becomes circular: the lost skills require outside assistance to fill the void, deepening a reliance on AI that ever fewer people can correct. As companies push toward higher percentages of AI-written code in their codebases, humans may lose sight of the actual gains from the technology.

The study measured only immediate comprehension after one hour of learning, so it says little about long-term users of AI. To fully grasp the technology's impact, you would need to A/B test entire cohorts of developers as they learn the profession. That would turn today's merely intriguing results into actionable lessons about AI reliance.

At any rate, the researchers suggest managers should consider how to deploy AI tools while ensuring engineers continue learning. Cognitive ability, it turns out, can only keep pace with AI if humans remain willing to think for themselves.

Anthropic publicly states it cares about AI alignment, safety, and the preservation of human skills. In practice, however, Claude can be used in ways the company itself would deem harmful, with little in the way of mitigation, since API usage is easily obfuscated. The technology, thrust into the limelight by ChatGPT, cannot be contained in the ways Anthropic's safety-minded rhetoric suggests. Regardless, research on the topic remains essential so that users can decide how much reliance to allow within their organizations, or for themselves.

Also read: Anthropic Claude hacked: LLM becomes malware factory in eight hours