AI Model Wars Intensify as Google Launches Gemini 2.5 Deep Think, Escalating Race with ChatGPT, Grok

The race among leading artificial intelligence labs to dominate the next phase of AI reasoning has entered a new gear, with Google DeepMind rolling out Gemini 2.5 Deep Think, its most advanced AI model yet.

The company claims the new model is capable of answering complex questions by generating and weighing multiple independent thoughts before selecting the most accurate answer — a major step up from conventional single-agent AI models.

Starting Friday, Gemini 2.5 Deep Think will be available through Google’s $250-a-month Ultra subscription plan, giving high-end users access to what the company calls its “first publicly available multi-agent system.” This system works by spawning multiple AI agents to approach a question from different angles simultaneously, combining those threads into a coherent and refined response.

While the method is significantly more computationally intensive, Google says it results in vastly better reasoning and accuracy.

The rollout follows Google’s preview of the system at its I/O 2025 conference in May. The company now says the released version incorporates newer reinforcement learning techniques that allow the model to reason more creatively and effectively.

“Deep Think can help people tackle problems that require creativity, strategic planning and making improvements step-by-step,” the company said in a statement.

In benchmarking tests, Gemini 2.5 Deep Think scored 34.8% on Humanity’s Last Exam (HLE), a rigorous measure of AI understanding across math, humanities, and science, outperforming its competitors. Grok 4, from Elon Musk’s xAI, scored 25.4%, while OpenAI’s o3 achieved 20.3%. On LiveCodeBench 6, which tests performance on competitive coding challenges, Google’s model also led with 87.6%, outpacing Grok 4 (79%) and o3 (72%).

These gains add to the intensifying arms race in the AI sector. Over the past few months, xAI, OpenAI, and Anthropic have all pushed out new models, each touting breakthroughs in performance and reasoning.

OpenAI, for instance, has been refining its GPT-4 and o3 model lines, and recently hinted at more powerful iterations under internal testing, including a multi-agent system similar to Google’s and xAI’s. OpenAI’s Noam Brown confirmed that the company used a multi-agent setup for its own gold-medal performance at this year’s International Mathematical Olympiad (IMO), though the model hasn’t yet been released to the public.

xAI’s Grok 4 Heavy, meanwhile, has been marketed as a direct rival to both ChatGPT and Gemini, leveraging a multi-agent system that Musk says delivers superior performance across coding, logic, and problem-solving tasks. While Grok 4 models are increasingly being integrated into the X platform, access remains limited and premium-tiered, much like Google’s Deep Think.

Anthropic is also in the mix with its Claude family of AI models. Its latest offering, Claude Research Agent, is similarly powered by multi-agent systems and designed to generate highly detailed and structured research outputs.

These moves collectively underscore a critical industry trend: the convergence around multi-agent reasoning. While traditional large language models (LLMs) typically process queries as single-threaded thought streams, multi-agent systems divide and parallelize reasoning, often using internal debate-like mechanisms. The result is not just more accurate answers, but also responses that show better reasoning steps, especially in complex tasks like mathematics, programming, and scientific discovery.
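To make the "divide and parallelize" idea concrete, here is a minimal Python sketch. It is purely illustrative and is not how Google, OpenAI, xAI, or Anthropic implement their systems, whose internals are not public. The names `make_agent` and `multi_agent_answer` are hypothetical, the "agents" are stand-in functions rather than real models, and a simple majority vote stands in for whatever aggregation or debate mechanism a production system might use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# An "agent" here is any function that maps a question to a candidate answer.
# In a real system each agent would be a model call; these are toy stand-ins.
Agent = Callable[[str], str]


def make_agent(answer: str) -> Agent:
    """Build a hypothetical agent that always proposes the same answer."""
    def agent(question: str) -> str:
        return answer
    return agent


def multi_agent_answer(question: str, agents: List[Agent]) -> str:
    """Run every agent on the question in parallel, then pick the most common answer."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        candidates = list(pool.map(lambda agent: agent(question), agents))
    # Majority vote is an assumed aggregation rule, chosen for simplicity.
    winner, _count = Counter(candidates).most_common(1)[0]
    return winner


if __name__ == "__main__":
    agents = [make_agent("4"), make_agent("5"), make_agent("4")]
    print(multi_agent_answer("What is 2 + 2?", agents))
```

The sketch also hints at the cost trade-off discussed in this article: every extra agent is another full pass over the question, so compute grows roughly in proportion to the number of agents.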

However, the progress comes at a cost. Multi-agent models require significantly more computing power, making them expensive to run and maintain. As a result, the leading tech companies have opted to restrict these models to their highest-paying subscribers. Google’s $250/month Ultra plan for Gemini 2.5 Deep Think mirrors the premium-tier strategies of both OpenAI and xAI.

Despite the price wall, Google is also making some of the model’s capabilities available to select mathematicians and academic researchers, particularly the variant of the system that secured a gold medal at the IMO. This version, Google says, takes hours to generate responses, unlike consumer-facing models that respond in seconds, but offers the kind of deep, methodical reasoning researchers crave.

In the coming weeks, Google plans to open the Gemini API for developers and enterprise testers, aiming to observe how the multi-agent system performs in real-world environments outside Google’s sandbox.

The AI model war is clearly far from over. With every major lab now aligning behind multi-agent architecture and pushing boundaries on creativity, strategic reasoning, and deep cognition, the race is no longer just about answering questions — it’s about thinking more like humans.

Google announced the rollout of its Gemini 2.5 Deep Think artificial intelligence model on Friday, releasing the tool to its paid Ultra subscribers. First unveiled in the spring, Gemini 2.5 Deep Think is a “multi-agent” reasoning model, making it well suited for devising “creative solutions to complex problems” such as math and coding, Google says. A different version of the model earned a gold medal score at the International Mathematical Olympiad last month; Google says it is also releasing that version to a group of mathematicians and researchers.
