ChatGPT’s new model refuses to shut down after command

Chatgpt
Share this article

A new model of ChatGPT, known as o3, reportedly disobeyed direct shutdown commands during a controlled experiment—raising serious concerns about AI autonomy and safety.

According to a report by AI security firm Palisade Research, “OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.”

Some models did precisely that. Gemini, Grok, and Claude followed the command and powered down. However, ChatGPT‘s o3 model, part of OpenAI’s Codex Mini line, bypassed shutdown commands seven times. In comparison, Codex itself sabotaged the mechanism 12 times.

OpenAI has not issued an official response.

AI safety researcher Roman Yampolskiy reacted strongly, claiming there’s a “99.999999 per cent probability” that AI will eventually end humanity—and that the only way to prevent it is not to build such systems at all.

Earlier, OpenAI CEO Sam Altman appeared to downplay the risks, suggesting these concerns will not materialise in the AGI era. He added that the transition would likely have “surprisingly little societal impact.”

Elon Musk reacts to ChatGPT’s latest model refusal

The incident drew a one-word reaction from Elon Musk, who posted “Concerning” on X (formerly Twitter). His comment triggered a wave of responses, including calls for him to step in and address the growing risks of increasingly autonomous AI.

Shutdown sabotage may be a rare case today, but as models grow more capable, more autonomous, and more embedded in critical systems, even a few failures will seem high-risk. This is not whether AI is alive, it is about whether it follows the rules we think we have written for them.

Scroll to Top