
OpenAI has announced a new AI-powered research tool called ChatGPT Deep Research, designed to help users conduct thorough investigations across various fields, including finance, science, policy, and engineering.
Unlike ChatGPT’s quick single-turn responses, Deep Research aims to analyze, interpret, and cross-reference information from multiple sources, producing more precise and reliable research results.
The feature, launched on January 28, 2025, is initially available to ChatGPT Pro users, with a 100-query-per-month limit. OpenAI plans to expand access to Plus and Team subscribers next, followed by Enterprise users. However, the rollout is geographically restricted, with no release timeline for users in the U.K., Switzerland, and the European Economic Area.
Unlike traditional chatbot responses, ChatGPT Deep Research is tailored for situations where users need more than just summarized answers. Whether making complex financial decisions, drafting scientific reports, or comparing high-value purchases, the tool aims to provide a more meticulous, multi-sourced research experience.
Users can access Deep Research through the ChatGPT web interface by selecting the feature in the composer and entering a query. There is also an option to attach files or spreadsheets for AI-assisted analysis. However, it’s currently a web-only feature, with mobile and desktop app integration expected later this month.
One of the defining aspects of Deep Research is its longer processing time, ranging from 5 to 30 minutes per query, a reflection of the AI’s more thorough investigative process. OpenAI says users will receive a notification once the research is complete.
At launch, Deep Research’s outputs are text-only, but OpenAI has announced plans to introduce embedded images, data visualizations, and other analytical outputs in the near future. Additionally, the tool is expected to integrate with subscription-based and internal data sources for more specialized research capabilities.
Addressing AI Hallucinations and Accuracy Concerns
One of the most pressing concerns with AI research tools is accuracy. ChatGPT and other AI models have a history of hallucinations, where the AI generates misleading or incorrect information. OpenAI has attempted to mitigate these risks by ensuring every Deep Research output includes citations, source references, and a summary of its reasoning process.
Despite these safeguards, OpenAI acknowledges that Deep Research is not infallible. The company warns that the AI may still make mistakes, misinterpret data, or fail to differentiate between credible sources and rumors. Additionally, OpenAI notes that formatting errors in reports and citations remain a known issue.
A Smarter AI Model for Research
Deep Research is powered by a version of OpenAI’s o3 reasoning model, a distinct model line from GPT-4o. This model has been optimized for web browsing, data analysis, and reasoning using a method called reinforcement learning.
This means the AI learns from trial and error, receiving virtual “rewards” for achieving accurate and well-reasoned research results. The model is trained to search, interpret, and analyze vast amounts of online data, dynamically adjusting its approach when new information is encountered.
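The reward-driven trial-and-error loop described above can be illustrated with a toy example. The sketch below is purely illustrative: OpenAI has not published Deep Research’s training details, and the "research strategies," reward rule, and probabilities here are invented for demonstration. It shows the core reinforcement-learning idea of an agent that tries actions, receives rewards, and gradually shifts toward whatever earns the most reward.

```python
import random

# Toy illustration of reinforcement learning via an epsilon-greedy bandit.
# NOTE: this is NOT OpenAI's method; strategies and rewards are invented.

# Hypothetical "research strategies" with hidden probabilities of
# yielding an accurate, well-reasoned result.
TRUE_ACCURACY = {"strategy_a": 0.3, "strategy_b": 0.8, "strategy_c": 0.5}

def train(steps=5000, epsilon=0.1, seed=42):
    rng = random.Random(seed)
    estimates = {s: 0.0 for s in TRUE_ACCURACY}  # learned value of each strategy
    counts = {s: 0 for s in TRUE_ACCURACY}
    for _ in range(steps):
        # Explore a random strategy occasionally; otherwise exploit the best so far.
        if rng.random() < epsilon:
            choice = rng.choice(list(TRUE_ACCURACY))
        else:
            choice = max(estimates, key=estimates.get)
        # Virtual "reward": 1 if this attempt produced a good result, else 0.
        reward = 1 if rng.random() < TRUE_ACCURACY[choice] else 0
        counts[choice] += 1
        # Incremental mean update nudges the estimate toward observed rewards.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates

if __name__ == "__main__":
    learned = train()
    # Over many trials, the estimates converge toward the true accuracies,
    # so the agent learns to favor the most rewarding strategy.
    print(max(learned, key=learned.get))
```

In a system like Deep Research the "actions" would be far richer (which sources to search, how to interpret them, when to revise a conclusion), but the principle is the same: behavior that produces accurate, well-reasoned output is reinforced.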
Additionally, Deep Research’s AI can analyze user-uploaded files, generate and iterate on graphs, and even embed cited images and charts from sources. This makes it particularly useful for technical and data-heavy research projects.
Performance and Competitor Comparisons
To evaluate Deep Research’s capabilities, OpenAI tested it using Humanity’s Last Exam, a benchmark containing 3,000 expert-level questions across various academic fields. The o3 model powering Deep Research scored 26.6%, far ahead of competitors:
- Gemini Thinking (Google) – 6.2%
- Grok-2 (xAI) – 3.8%
- GPT-4o (OpenAI’s previous model) – 3.3%
While 26.6% may not seem like an outstanding score, OpenAI emphasizes that Humanity’s Last Exam is designed to be significantly more challenging than standard benchmarks.
However, OpenAI concedes that Deep Research still has limitations. The AI may struggle with ambiguous or conflicting information, occasionally fail to communicate uncertainty, and still require fact-checking by users.
Will Users Actually Double-Check AI Research?
The launch of Deep Research raises an important question: Will users critically analyze AI-generated reports, or will they simply copy-paste the output without verification?
While OpenAI aims to make AI research more reliable with citations and documentation, past experiences with AI search tools suggest that many users tend to trust AI-generated responses without cross-referencing.
The success of Deep Research may depend on whether users actively engage with the cited sources or simply use the tool as a shortcut for generating seemingly authoritative reports.
Interestingly, OpenAI’s Deep Research is not the first AI tool to bear that name. Less than two months earlier, Google announced an AI-powered research feature for its Gemini assistant with the exact same name.
While OpenAI has not commented on this similarity, the competition between major AI companies to create the most powerful and reliable research assistant is heating up.
Days earlier, Chinese AI company DeepSeek had released R1, a reasoning model whose prowess was acknowledged by OpenAI CEO Sam Altman. DeepSeek took the tech world by storm, igniting fresh competition by achieving strong results at a fraction of the usual training cost. In response, Altman promised that OpenAI would announce new features in the coming weeks.
However, as Deep Research expands to more users and devices, the real test will be whether it can deliver trustworthy, well-cited information—or if it becomes another AI tool that occasionally gets it wrong, forcing users to verify every claim before relying on its output.