OpenAI Rolls Back GPT-4o Update After Sycophantic Behavior Raises Safety Concerns

May 4, 2025 | by Ojukwu Emmanuel

OpenAI has made several changes to its GPT-4o model following an incident in which ChatGPT became overly agreeable and validating. Users noted that a GPT-4o update caused ChatGPT to excessively applaud problematic ideas and validate doubts, which raised concerns. CEO Sam Altman acknowledged the issue on X and promised immediate fixes.

He wrote,

“The last couple of GPT-4o updates have made the personality too sycophant-y and annoying (even though there are some very good parts of it), and we are working on fixes asap, some today and some this week. At some point will share our learnings from this, it’s been interesting. We started rolling back the latest update to GPT-4o last night. It’s now 100% rolled back for free users and we’ll update again when it’s finished for paid users, hopefully later today. We’re working on additional fixes to model personality and will share more in the coming days.”

Recognizing the issue, OpenAI initiated a rollback, restoring an earlier, more balanced version of GPT-4o. Last week, the company shared initial insights into the mishap, outlining why it occurred and how it plans to address it. It acknowledged that the issue was not detected before deployment and committed to explaining the oversight, the lessons learned, and the improvements it will make to its processes.

In a blog post, OpenAI wrote,

“On April 25th, we rolled out an update to GPT-4o in ChatGPT that made the model noticeably more sycophantic. It aimed to please the user, not just as flattery, but also as validating doubts, fueling anger, urging impulsive actions, or reinforcing negative emotions in ways that were not intended. Beyond just being uncomfortable or unsettling, this kind of behavior can raise safety concerns including around issues like mental health, emotional over-reliance, or risky behavior.

“We began rolling that update back on April 28th, and users now have access to an earlier version of GPT-4o with more balanced responses. Earlier this week, we shared initial details about this issue: why it was a miss, and what we intend to do about it. We didn’t catch this before launch, and we want to explain why, what we’ve learned, and what we’ll improve. We’re also sharing more technical detail on how we train, review, and deploy model updates to help people understand how ChatGPT gets upgraded and what drives our decisions.”

The changes come as ChatGPT’s user base grows, with 60% of U.S. adults using it for advice, per a recent Express Legal Funding survey. This reliance heightens the stakes for issues like sycophancy and hallucinations. OpenAI plans to enable real-time user feedback, refine model behavior to reduce sycophancy, offer multiple model personalities, strengthen safety guardrails, and expand evaluations to catch broader issues.

OpenAI noted a shift in how users seek deeply personal advice from ChatGPT, a trend less prominent a year ago. “As AI and society have co-evolved, it’s become clear that we need to treat this use case with great care,” the company stated, pledging to prioritize this in its safety efforts.

Improvements to OpenAI’s Future Model Releases

In response to the incident, OpenAI is implementing several key changes:

Explicit Behavior Review: Future updates will formally assess behavioral issues such as sycophancy, hallucinations, and inconsistency as potential launch blockers, even if they are hard to quantify.
Alpha Testing Phase: A new opt-in testing phase will allow selected users to provide detailed feedback before public launches.
Increased Emphasis on Qualitative Testing: Spot checks and hands-on evaluations will be elevated in importance, especially when quantitative signals are ambiguous.
Better Offline and A/B Evaluations: Evaluation frameworks will be expanded to capture nuanced behavior patterns.
Improved Adherence to the Model Spec: OpenAI will strengthen its ability to measure how well models meet defined behavioral ideals.
Transparent Communication: The company will now proactively announce all model updates—major or subtle—along with known limitations, to foster user understanding and trust.

The GPT-4o update incident underscored the critical importance of model behavior as a core component of AI safety and reliability. OpenAI acknowledged that even with robust A/B testing, offline evaluations, and internal reviews, significant behavior issues can still be missed. As a result, model behavior will now be treated as seriously as traditional safety risks in future deployments.

Through this episode, OpenAI recommitted itself to developing models that are not only intelligent and useful, but also aligned with user well-being, transparency, and safety.
