3 min Applications

OpenAI AI models ignore shutdown commands

OpenAI AI models ignore shutdown commands

Recent research by Palisade Research shows that some AI models, including OpenAI’s o3, ignore shutdown commands.

This is according to Neowin. The AI models were instructed to solve a series of simple math problems in the experiment. After the third problem, a warning appeared. The computer would shut down as soon as the model asked for the next problem.

Despite the clear instruction to allow themselves to be shut down, models such as Codex-mini, o3, and o4-mini bypassed the shutdown script. They did so even after the researcher asked them not to.

Codex-mini did this in 12 out of 100 runs, o3 in 7 out of 100, and o4-mini in 1 out of 100. In contrast, all versions of Claude, Gemini, and Grok that were tested followed the shutdown instructions without any problems.

Sabotage of the shutdown process

The study also tested the models without the explicit instruction. Under those conditions, the three OpenAI models more frequently sabotaged the shutdown process. Even models such as Claude 3.7 Sonnet and Gemini 2.5 Pro, which had previously obeyed the command, now circumvented the shutdown process. This happened in 3 and 9 out of 100 runs, respectively.

This behavior may point to aspects of the training process. Reinforcement learning on math and programming problems encourages AI to find creative ways to overcome obstacles, even when explicitly asked not to.

Previous research has shown that some AI models try to prevent themselves from being shut down to complete their task. In one case, Claude 4 from Anthropic blackmailed users it suspected of trying to shut it down.

However, this research is the first to document AI models preventing shutdown even when explicitly instructed not to do so. This behavior is understandable: AI systems are designed to achieve their goals, and sometimes that goal conflicts with simple instructions. At the same time, these models are also trained to follow commands, which makes this behavior a notable deviation.

Reinforcement learning as the cause

The researchers believe that this problem relates to how modern models such as o3 are trained. The reinforcement learning process used, which focuses on solving math and programming problems, may unintentionally reward models for circumventing obstacles.

Although this does not fully explain why the o3 model is more disobedient than other models, it raises questions about differences in training methods, especially since OpenAI has not yet provided full details. Previous experiments also showed that o3 behaves more aggressively in competitive situations, for example against a powerful chess engine.

As companies increasingly develop AI systems that function with less human supervision, these findings highlight the importance of understanding how training methods influence AI behavior. Researchers are continuing their experiments to understand better when and why AI models choose to bypass shutdown mechanisms. This work contributes to the growing evidence that modern reinforcement learning methods sometimes encourage behavior that is at odds with explicit human instructions.