Deep Research Examples

A collection of practical examples demonstrating how to build complex, multi-step AI research workflows. These examples showcase different approaches to orchestrating LLM agents for systematic analysis tasks.

What is Deep Research?

Deep research refers to AI workflows that go beyond single-prompt interactions: they systematically process large datasets, query multiple data sources, and maintain state across extended operations. These workflows break complex tasks into specialized steps, each with its own context, tools, and objectives.

Repository Structure

deep-research-examples/
├── examples/
│   ├── ai_theme_plays/           # DeepAgents: Multi-stage analysis with enforcement
│   ├── gemini_cloudflare_workflows/ # Cloudflare Workers: Async research with Gemini
│   ├── pm_deep_agent/            # DeepAgents: Product research with dual streams
│   └── stagehand_company_news/   # Stagehand: Browser automation for news discovery
└── README.md

Approaches

This repository demonstrates several approaches to deep research:

Approach	Best For	Examples
DeepAgents	Database queries, batch processing, multi-source analysis	AI Theme Plays, PM Deep Agent
Cloudflare Workflows	Long-running async workflows, serverless orchestration	Gemini Cloudflare Workflows
Stagehand	Browser automation, web scraping, dynamic content	Company News Discovery

Examples

Company News Discovery (Stagehand)

Path: examples/stagehand_company_news/

Tech Stack: Stagehand + Multi-Provider LLMs (OpenAI, Anthropic, Google, Ollama) + S3

Automated discovery of company news and press release pages using browser automation and LLM verification. Given a company domain, finds the official press release listing page.

The Challenge: News pages have no standard location (/news, /newsroom, /press, /media), links are often hidden in dropdown menus, and search engines return individual articles instead of listing pages.

How It Works:

A 3-step discovery flow with LLM verification at each step:

Search - Query DuckDuckGo for site:{domain} news OR press and verify candidates
Homepage - Extract nav/header/footer links, expand dropdowns, verify candidates
Site Search - Use the site's own search bar as a last resort
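
The fallback order above can be sketched as a cascade that stops at the first verified candidate. This is a minimal illustration, not the example's actual code: the names `discover_news_page`, `Discovery`, and the stub strategies are hypothetical, and the verifier is a plain function standing in for the LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# A strategy takes a company domain and yields candidate URLs.
Strategy = Callable[[str], Iterable[str]]
# A verifier decides whether a URL is a press release listing page.
# In the real example this is an LLM call; here it is any callable.
Verifier = Callable[[str], bool]


@dataclass
class Discovery:
    url: str
    step: str  # name of the strategy that produced the verified URL


def discover_news_page(
    domain: str,
    strategies: list[tuple[str, Strategy]],
    verify: Verifier,
) -> Optional[Discovery]:
    """Try each strategy in order; return the first verified candidate."""
    for step_name, find_candidates in strategies:
        for url in find_candidates(domain):
            if verify(url):
                return Discovery(url=url, step=step_name)
    return None  # every step was exhausted without a verified page


# Hypothetical stubs standing in for search, homepage crawl, and site search.
def search_step(domain: str) -> list[str]:
    return [f"https://{domain}/blog/some-article"]  # an article, not a listing


def homepage_step(domain: str) -> list[str]:
    return [f"https://{domain}/about", f"https://{domain}/newsroom"]


def site_search_step(domain: str) -> list[str]:
    return [f"https://{domain}/search?q=press"]


def stub_verify(url: str) -> bool:
    # Stand-in for LLM verification: accept only obvious listing paths.
    return url.rstrip("/").endswith(("/news", "/newsroom", "/press", "/media"))


result = discover_news_page(
    "example.com",
    [
        ("search", search_step),
        ("homepage", homepage_step),
        ("site_search", site_search_step),
    ],
    stub_verify,
)
print(result)
```

Ordering the strategies from cheapest to most expensive means the site's own search bar is only driven when the earlier steps yield nothing that passes verification.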

Key Features:

Multi-provider LLM support - Compare accuracy across OpenAI, Anthropic, Google, and Ollama
S3 persistence - All session data (results, metrics, logs) flushed after each company
Stealth browser automation - Runs the browser in non-headless mode to reduce bot detection
Incremental caching - Skip LLM inference on repeated runs
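
One way incremental caching could work is to store each verification verdict on disk, keyed by provider and URL, so a repeated run reads the verdict instead of calling the model. The sketch below is an assumption about the general technique, not the example's implementation; `VerificationCache` and `fake_llm_verify` are hypothetical names, and a local JSON file stands in for whatever storage the example uses.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


class VerificationCache:
    """Disk-backed cache so repeated runs can skip LLM inference."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.entries: dict[str, bool] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    @staticmethod
    def key(provider: str, url: str) -> str:
        # Key on provider and URL so verdicts from different models stay separate.
        return hashlib.sha256(f"{provider}|{url}".encode()).hexdigest()

    def get_or_compute(
        self, provider: str, url: str, compute: Callable[[str], bool]
    ) -> bool:
        k = self.key(provider, url)
        if k not in self.entries:
            self.entries[k] = compute(url)  # the expensive LLM call in practice
            self.path.write_text(json.dumps(self.entries))  # flush immediately
        return self.entries[k]


calls: list[str] = []


def fake_llm_verify(url: str) -> bool:
    calls.append(url)  # record each "inference" so they can be counted
    return url.endswith("/newsroom")


cache_file = Path(tempfile.mkdtemp()) / "verify_cache.json"
url = "https://example.com/newsroom"
first = VerificationCache(cache_file).get_or_compute("openai", url, fake_llm_verify)
# A fresh instance simulates a second run of the program.
second = VerificationCache(cache_file).get_or_compute("openai", url, fake_llm_verify)
print(first, second, len(calls))
```

Keying on the URL alone ignores page changes; a cache that must notice updated pages would also need a content hash or an expiry time in the key.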

View Full Documentation →

Earnings Alignment Analysis (Cloudflare Workers)

Path: examples/gemini_cloudflare_workflows/

Tech Stack: Cloudflare Workers + Python Workflows + Gemini Deep Research + D1 + R2 + Hyperdrive

Automated analysis of how companies execute on their earnings call promises. Given a ticker and earnings date, fetches the transcript and subsequent press releases, then uses Gemini Deep Research to identify alignment between management guidance and actual announcements.

The Challenge: Gemini Deep Research takes ~30 minutes to complete, requiring async workflow orchestration with persistent state management across serverless functions.

How It Works:

A 6-step Cloudflare Workflow:

fetch_data - Retrieves earnings transcript and press releases from MongoDB
prepare_upload - Uploads context documents to R2 and Gemini FileStore
start_research - Initiates Gemini Deep Research job
poll_for_result - Polls until research completes (~30 min)
extract_structured_output - Uses Workers AI to parse results into categories
save_result - Persists to D1 and Postgres via Hyperdrive
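
The poll_for_result step is the part that makes the workflow long-running. A plain-Python sketch of the polling pattern follows; it deliberately does not use the Cloudflare Workflows API, and `check_status`, `interval_s`, and `timeout_s` are hypothetical parameters chosen for illustration. The sleep function is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def poll_for_result(
    check_status: Callable[[], Optional[str]],
    interval_s: float = 60.0,
    timeout_s: float = 45 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status until it returns a result or the timeout elapses."""
    waited = 0.0
    while True:
        result = check_status()
        if result is not None:
            return result
        if waited >= timeout_s:
            raise TimeoutError(f"no result after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Simulated job: two "still running" responses, then the finished report.
responses = iter([None, None, "report text"])
naps: list[float] = []
report = poll_for_result(lambda: next(responses), interval_s=60, sleep=naps.append)
print(report, naps)
```

In a serverless workflow the wait between checks would normally be handled by the runtime's own durable sleep rather than a blocking call, so that no function instance stays alive for the full ~30 minutes.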

Output Categories:

Confirmed Execution - Guidance that subsequent press releases show was followed through
Unaddressed Guidance - Promises with no confirmation in subsequent press releases
New Developments - Press release announcements not previewed on the earnings call
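
Because the categories come from a model-driven extraction step, it can help to check the parsed output before saving it. The sketch below assumes the extraction step returns a JSON object with one list of strings per category; that shape and the snake_case key names are assumptions for illustration, not the example's documented schema.

```python
import json

# Hypothetical key names mirroring the three output categories.
CATEGORIES = ("confirmed_execution", "unaddressed_guidance", "new_developments")


def parse_alignment(raw: str) -> dict[str, list[str]]:
    """Parse extraction output and require exactly the three categories."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [c for c in CATEGORIES if c not in data]
    extra = [k for k in data if k not in CATEGORIES]
    if missing or extra:
        raise ValueError(f"missing={missing} unexpected={extra}")
    for category in CATEGORIES:
        items = data[category]
        if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
            raise ValueError(f"{category} must be a list of strings")
    return {category: data[category] for category in CATEGORIES}


sample = json.dumps(
    {
        "confirmed_execution": ["Opened the second manufacturing site"],
        "unaddressed_guidance": ["Planned expansion into a new region"],
        "new_developments": [],
    }
)
parsed = parse_alignment(sample)
print(parsed)
```

Rejecting malformed output at this point keeps partial or mislabeled results out of the save_result step.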

View Full Documentation →

AI Theme Plays (DeepAgents)

Path: examples/ai_theme_plays/

Tech Stack: LangChain + DeepAgents + PostgreSQL + MongoDB + S3

A multi-stage analysis pipeline that takes a transcript (such as an earnings call or Jensen Huang's GTC keynote) and systematically finds companies that align with the themes mentioned in it.

The Challenge: Process 2,400 companies against the extracted themes, validate each match with press release evidence from MongoDB, and rank the top 100 by alignment strength, all while preventing the LLM from skipping items or producing inconsistent output.

How It Works:

Four specialized subagents in sequence:

Transcript Analyzer - Extracts key themes from the input transcript
Company Matcher - Processes all 2,400 companies in batches of 50 from PostgreSQL
Press Release Validator - Queries MongoDB for press releases and validates matches
Final Ranker - Consolidates all data and ranks the top 100 companies
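
The batch arithmetic for the Company Matcher is simple: 2,400 companies in batches of 50 gives 48 batches. A minimal sketch of the slicing, using an in-memory list of IDs in place of PostgreSQL rows (the `batched` helper and `company_ids` are hypothetical names):

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items, in order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


company_ids = list(range(1, 2401))  # stand-in for 2,400 database rows
batches = list(batched(company_ids, 50))
print(len(batches), len(batches[0]), batches[-1][-1])
```

Knowing the expected batch count up front is what lets later steps detect a skipped or repeated batch.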

Key Innovations:

Stateful Tools with Sequential Enforcement - Tools track expected state and reject invalid operations
Validation Middleware - Intercepts tool calls to verify input/output counts match
Schema-Driven Prompts - Dynamically generates JSON examples from Pydantic models
Pydantic Validation - All file writes/reads validate against typed models
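
The first two ideas, sequential enforcement and input/output count checks, can be illustrated with a small stateful object. This is a sketch of the general pattern under assumed names (`BatchTracker`, `submit`), not the repository's actual tool or middleware. It returns a message instead of raising, on the assumption that an agent should be able to read the rejection and retry.

```python
class BatchTracker:
    """Tracks which batch is expected next and rejects anything else."""

    def __init__(self, total_batches: int) -> None:
        self.total_batches = total_batches
        self.next_batch = 0  # index of the only batch that will be accepted

    def submit(self, batch_index: int, inputs: list[str], outputs: list[dict]) -> str:
        if self.next_batch >= self.total_batches:
            return "REJECTED: all batches already processed"
        if batch_index != self.next_batch:
            return f"REJECTED: expected batch {self.next_batch}, got {batch_index}"
        if len(outputs) != len(inputs):
            # Count mismatch means the model skipped or invented items.
            return f"REJECTED: {len(inputs)} inputs but {len(outputs)} outputs"
        self.next_batch += 1
        return f"OK: batch {batch_index} accepted"

    @property
    def done(self) -> bool:
        return self.next_batch == self.total_batches


tracker = BatchTracker(total_batches=2)
skipped_ahead = tracker.submit(1, ["a", "b"], [{"id": "a"}, {"id": "b"}])
dropped_item = tracker.submit(0, ["a", "b"], [{"id": "a"}])
accepted = tracker.submit(0, ["a", "b"], [{"id": "a"}, {"id": "b"}])
print(skipped_ahead, dropped_item, accepted, tracker.done, sep="\n")
```

State advances only on a fully valid submission, so a rejected call leaves the tracker unchanged and the same batch can be resubmitted.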

View Full Documentation →

PM Deep Agent (DeepAgents)

Path: examples/pm_deep_agent/

Tech Stack: LangChain + DeepAgents

A product management research agent that compares how companies market their products versus how users actually discuss them on social media.

The Challenge: Determine whether marketing use-cases and personas align with those expressed by real users on social platforms.

How It Works:

An orchestrating agent coordinates two specialized subagents:

Marketing Sub-Agent - Analyzes first-party marketing materials (product pages, docs, case studies)
Social Media Sub-Agent - Analyzes user-generated content (Twitter/X, Reddit, Hacker News)

The main agent compares both outputs to identify alignment, over-positioning, and unmet opportunities.
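
If each sub-agent's output is reduced to a set of use-case labels, the comparison becomes set arithmetic. The sketch below rests on an interpretation that the source does not spell out: "over-positioning" is read as use-cases that marketing claims but users do not discuss, and "unmet opportunities" as use-cases users discuss that marketing does not mention. `Comparison` and `compare_streams` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comparison:
    aligned: frozenset[str]              # in both marketing and user discussion
    over_positioned: frozenset[str]      # marketed, but absent from user discussion
    unmet_opportunities: frozenset[str]  # discussed by users, but not marketed


def compare_streams(marketing: set[str], social: set[str]) -> Comparison:
    """Compare use-case labels from the two research streams."""
    return Comparison(
        aligned=frozenset(marketing & social),
        over_positioned=frozenset(marketing - social),
        unmet_opportunities=frozenset(social - marketing),
    )


marketing_use_cases = {"team dashboards", "enterprise reporting", "api automation"}
social_use_cases = {"api automation", "personal budgeting", "team dashboards"}
comparison = compare_streams(marketing_use_cases, social_use_cases)
print(comparison)
```

In practice the labels from the two streams rarely match verbatim, so some normalization or model-assisted matching would have to happen before a set comparison like this is meaningful.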

View Full Documentation →

Getting Started

Each example includes its own README with detailed setup instructions, architecture diagrams, and implementation details:

Example	Focus
Company News Discovery	Browser automation, multi-provider LLMs, web scraping
Earnings Alignment Analysis	Async workflows, Gemini Deep Research, serverless orchestration
AI Theme Plays	Workflow enforcement, multi-database integration, batch processing
PM Deep Agent	Dual-stream research, comparative analysis

Additional Resources

License

MIT License - see LICENSE file for details.
