OpenAI has introduced a new artificial intelligence tool named deep research that conducts extensive online research for users, addressing tasks ranging from complex scientific inquiries to personalized product recommendations. The service is available to paying customers through OpenAI’s ChatGPT chatbot.
What is deep research?
Deep research can generate comprehensive reports in five to 30 minutes, work that would typically take a person “many hours,” according to OpenAI. The tool analyzes a variety of sources, including text, images, PDFs, and user-uploaded files, and synthesizes the information much as a research analyst would. Kevin Weil, OpenAI’s chief product officer, emphasized its ability to execute complex tasks, comparing its performance to that of a human researcher.
This launch follows OpenAI’s introduction of another AI agent called Operator, which assists with tasks like booking flights and managing grocery orders. Both services are available exclusively to users subscribed to the $200-per-month ChatGPT Pro plan, indicating a strategic focus on paid subscription services to fund these advanced features.
Model | Humanity’s Last Exam accuracy (%) |
---|---|
GPT-4o | 3.3 |
Grok-2 | 3.8 |
Claude 3.5 Sonnet | 4.3 |
Gemini Thinking | 6.2 |
OpenAI o1 | 9.1 |
DeepSeek-R1* | 9.4 |
OpenAI o3-mini (medium)* | 10.5 |
OpenAI o3-mini (high)* | 13.0 |
OpenAI deep research** | 26.6 |
The tool exemplifies a broader trend in the AI industry towards developing agents capable of performing multi-step tasks with minimal supervision. Competitors, including Microsoft Corp. and Anthropic, are also exploring similar technologies in hopes of enhancing productivity across both personal and professional tasks.
OpenAI CEO Sam Altman has indicated that the development of such agents may represent a significant breakthrough in artificial intelligence. The urgency of this progress is underscored by increasing competition from Chinese AI firms like DeepSeek, which are quickly advancing in the sector.
OpenAI has cautioned that deep research has limitations. The tool may produce fabricated information and can struggle to distinguish credible sources from rumors. During the initial rollout, users are also limited to 100 queries per month.
OpenAI demonstrated deep research at an event in Washington, where the tool compiled information about Albert Einstein and generated relevant questions for hypothetical congressional hearings. Its reports include citations, although they can still contain inaccuracies caused by AI “hallucination,” in which a model presents fabricated information as fact.
GAIA | Level 1 | Level 2 | Level 3 | Avg. |
---|---|---|---|---|
Previous SOTA | 67.92 | 67.44 | 42.31 | 63.64 |
Deep Research (pass@1) | 74.29 | 69.06 | 47.60 | 67.36 |
Deep Research (cons@64) | 78.66 | 73.21 | 58.03 | 72.57 |
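The table's two scoring modes are not defined in the article. Under the common reading, pass@1 is the chance that a single attempt is correct, while cons@64 scores the most frequent answer across 64 attempts per question. The sketch below illustrates that reading on invented toy data with five attempts per question; the function names and sample answers are hypothetical, and this is not OpenAI's evaluation code.

```python
from collections import Counter

def pass_at_1(samples, truth):
    """Mean per-attempt accuracy: how often one random attempt is right."""
    total = 0.0
    for qid, answers in samples.items():
        total += sum(a == truth[qid] for a in answers) / len(answers)
    return total / len(samples)

def cons_at_k(samples, truth):
    """Accuracy of the most common answer among the k attempts per question."""
    correct = 0
    for qid, answers in samples.items():
        majority, _count = Counter(answers).most_common(1)[0]
        correct += majority == truth[qid]
    return correct / len(samples)

# Toy data, invented for illustration: 3 questions, 5 attempts each.
samples = {
    "q1": ["A", "A", "B", "A", "C"],
    "q2": ["D", "C", "D", "D", "C"],
    "q3": ["E", "F", "F", "G", "F"],
}
truth = {"q1": "A", "q2": "C", "q3": "F"}

print(f"pass@1: {pass_at_1(samples, truth):.3f}")
print(f"cons@5: {cons_at_k(samples, truth):.3f}")
```

In this toy case the consensus score is higher than the single-attempt score, which mirrors the gap between the two Deep Research rows: voting across many attempts filters out answers the model reaches only occasionally.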
OpenAI plans to expand access to deep research more broadly in the future, targeting users subscribed to its Plus, Team, and Enterprise plans. The tool utilizes a version of the company’s latest reasoning technology, OpenAI o3, which is specifically optimized for web browsing and data analysis.
Deep research was trained on real-world tasks that require both browsing and reasoning. OpenAI also used reinforcement learning to improve how the model navigates and synthesizes information. According to OpenAI, the model has set new highs on public evaluations of complex research tasks.
OpenAI has reported that, in an evaluation called Humanity’s Last Exam, the model powering deep research scored 26.6% accuracy, a notable achievement for AI systems tackling expert-level questions across diverse subjects. Furthermore, on the GAIA public benchmark, the tool surpassed the previous state of the art on questions that require reasoning and multi-modal fluency.
While the tool is currently very compute-intensive, OpenAI anticipates improvements to make it more efficient and user-friendly over time, with plans for future iterations that may enhance its features and accessibility.
Deep research became available to ChatGPT Pro users on Sunday, with future enhancements expected to roll out across mobile and desktop platforms. OpenAI envisions expanding the tool’s capabilities to include access to more specialized data sources, thereby enriching the context and personalization of its outputs.
Is ChatGPT deep research worth it?
Yes, if:
- You need fast, comprehensive research: Deep research can generate detailed reports in five to 30 minutes, saving you hours of manual work. If you frequently need quick, well-synthesized information, that time saving is the tool’s main draw.
- You handle complex tasks: The tool is designed to perform multi-step tasks, making it ideal for professionals who need to analyze data, compile reports, or conduct in-depth research across various domains.
- You’re a ChatGPT Pro subscriber: If you’re already paying for the $200-per-month ChatGPT Pro plan, you’ll have access to Deep Research and other advanced features like Operator, making it a valuable addition to your toolkit.
- You value AI-driven productivity: If you’re looking to leverage AI to enhance productivity in both personal and professional settings, Deep Research aligns with the broader trend of AI agents performing complex tasks with minimal supervision.
- You’re in a competitive field: With competitors like Microsoft and Anthropic developing similar tools, staying ahead of the curve by using advanced AI research tools could give you an edge.
No, if:
- You’re on a tight budget: At $200 per month, the ChatGPT Pro plan is a significant investment. If you don’t need advanced AI tools frequently, the cost may not justify the benefits.
- You’re concerned about accuracy: Deep research has limitations, including the potential to produce fabricated information or to mistake rumors for credible sources. If your work cannot tolerate unverified claims, the tool might not be reliable enough on its own.
- You exceed query limits: During the initial rollout, users are limited to 100 queries per month. If your research needs exceed this limit, you may find the tool restrictive.
- You prefer manual research: If you enjoy or require hands-on control over your research process, relying on an AI tool might not align with your workflow or preferences.
- You don’t need advanced features: If your research needs are simple or infrequent, the advanced capabilities of Deep Research might be overkill, and you could achieve your goals with more basic tools.
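To put the budget and query-limit points in concrete terms, here is a rough back-of-the-envelope sketch. It uses the $200 monthly price and the 100-query cap cited above, and it assumes, purely for illustration, that the entire subscription cost is attributed to deep research and that other Pro features are ignored. The function name is hypothetical.

```python
PRO_PRICE_USD = 200  # monthly ChatGPT Pro price cited in the article
QUERY_CAP = 100      # monthly deep research query limit cited in the article

def cost_per_query(queries_used, price=PRO_PRICE_USD, cap=QUERY_CAP):
    """Effective cost of one report if the whole fee is charged to deep research."""
    if not 1 <= queries_used <= cap:
        raise ValueError(f"queries_used must be between 1 and {cap}")
    return price / queries_used

for used in (10, 50, 100):
    print(f"{used:>3} queries/month -> ${cost_per_query(used):.2f} per query")
```

Under these assumptions, a subscriber who uses the full allowance pays $2 per report, while one who runs only ten reports a month pays $20 each.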
Featured image credit: OpenAI