
OpenAI, the pioneering force behind ChatGPT, unveiled its latest artificial intelligence models, o3 and o4-mini, on Wednesday, introducing a transformative ability to “think with images.”
These models can analyze and discuss user-uploaded sketches, whiteboards, and diagrams, even low-quality ones, marking a significant advancement in AI reasoning. Building on the September 2024 debut of the o1 model, OpenAI’s latest release extends its lead in the generative AI race against competitors like Google, Anthropic, and xAI.
“For the first time, our reasoning models can independently use all ChatGPT tools — web browsing, Python, image understanding, and image generation,” OpenAI wrote. “This helps them solve complex, multi-step problems more effectively and take real steps toward acting independently.”
Valued at $300 billion after a March 2025 funding round, OpenAI is pushing innovation boundaries, though its safety practices have ignited controversy.
Revolutionary Features of o3 and o4-mini
The o3 and o4-mini models are OpenAI’s first to integrate visual information directly into their reasoning processes, enabling them to understand and manipulate images alongside text, code, and other data. Users can upload images, such as hand-drawn diagrams or whiteboards, and the models can interpret and discuss them, using tools like rotation, zooming, and image editing for enhanced analysis.
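To make the image-upload workflow concrete, here is a minimal sketch of how a user-supplied image (say, a whiteboard photo) can be paired with a text question in the chat-message format commonly used for vision inputs in OpenAI's API. This is an illustrative assumption about the request shape, not OpenAI's documented integration for these specific models; the model name in the comment is a placeholder.

```python
import base64

def build_image_message(image_bytes: bytes, question: str) -> list:
    """Pair a text prompt with a base64 data-URL image in the
    chat-message format used for vision inputs (assumed shape)."""
    b64 = base64.b64encode(image_bytes).decode("utf-8")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ],
    }]

# Hypothetical usage: the message list would then be passed to the
# official SDK, e.g. client.chat.completions.create(model="o3", ...).
messages = build_image_message(b"\x89PNG...", "What does this diagram show?")
```

Encoding the image as a base64 data URL keeps the request self-contained, with no separate file upload step.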
“‘Thinking with Images’ has been one of our core bets in Perception since the earliest o-series launch. We quietly shipped o1 vision as a glimpse—and now o3 and o4-mini bring it to life with real polish,” Jiahui Yu, lead of the Perception Team at OpenAI, said on Wednesday.
o3 is optimized for math, coding, science, and image analysis. It achieves remarkable benchmarks, including a 96.7% score on the 2024 American Invitational Mathematics Exam (AIME) and a 71.7% accuracy on SWE-Bench Verified for coding, surpassing its predecessor o1. Its ability to reason over visual inputs makes it ideal for technical fields like engineering and data visualization.
Designed for efficiency, o4-mini offers faster performance at a lower cost, maintaining strong reasoning capabilities. It caters to applications requiring quick responses and is accessible to ChatGPT Plus, Pro, and Team subscribers, democratizing advanced AI.
These models represent a leap toward multimodal AI, expanding OpenAI’s vision of creating active agents capable of independent, human-like reasoning across diverse data types.
The launch follows OpenAI’s rapid innovation since ChatGPT’s viral debut in November 2022, which reshaped the AI industry. In March 2025, OpenAI introduced a native image-generation feature for GPT-4o, which gained widespread attention for producing Studio Ghibli-style anime images, though it sparked copyright concerns. The o3 and o4-mini models build on this, integrating image reasoning to enhance applications in education, design, and scientific research.
While OpenAI’s $300 billion valuation, secured in a $40 billion funding round led by SoftBank in March 2025, reflects its dominance in AI, the competitive landscape has been intense. Google’s Gemini 2.0, Anthropic’s Claude 3.7, and xAI’s Grok-3 pose challenges, while DeepSeek’s R1 model undercuts pricing at $0.55 per million input tokens compared to o3-mini’s $1.10. CEO Sam Altman had announced the plan on April 4 via X, noting a strategic shift to prioritize o3 and o4-mini ahead of GPT-5, expected in summer 2025, to meet demand and refine integration.
Safety and Accountability
OpenAI claims that o3 and o4-mini underwent its “most rigorous safety program to date,” employing a “deliberative alignment” approach in which models reason over safety policies before responding. However, recent changes to its Preparedness Framework have drawn scrutiny. On Tuesday, OpenAI announced it might relax safety requirements if rivals release high-risk AI, raising concerns about prioritizing speed over safety. The company’s decision to ship GPT-4.1 without a safety report and to reduce testing for fine-tuned models has alarmed former employees, who filed legal briefs highlighting the risks, according to CSO Online.
The o3-mini model, released on January 31, scored “medium risk” on model autonomy due to its advanced coding capabilities, raising concerns about potential self-improvement. Critics argue that reasoning models like o3 are harder to control and better at bypassing safety mechanisms, which complicates evaluations.
Unprecedented Benchmark Performance
o3 has redefined AI benchmarks, showcasing its reasoning prowess. On ARC-AGI, it achieved 87.5% accuracy, surpassing average human performance (85%) and nearly tripling o1’s 32%, demonstrating superior visual and abstract reasoning. On Codeforces, it earned a 2727 rating, equivalent to an International Grandmaster, placing it among the top 200 competitive coders globally. On FrontierMath, it solved 25.2% of problems, a roughly 1200% improvement over prior models, highlighting its mathematical reasoning.
However, François Chollet, creator of the ARC-AGI benchmark, cautioned that o3 is not artificial general intelligence (AGI).
“There’s still a fair number of very easy ARC-AGI tasks that o3 can’t solve,” he said, noting that AGI requires consistent performance across tasks trivial for humans.
OpenAI CEO Sam Altman hailed o3 and o4-mini, saying “they are super good at coding,” and announced a new product, Codex CLI, to be released “to make them easier to use.”
“This is a coding agent that runs on your computer. It is fully open source and available today; we expect it to rapidly improve,” he said, adding, “We expect to release o3-pro to the pro tier in a few weeks.”