A collection of practical examples demonstrating how to build complex, multi-step AI research workflows. These examples showcase different approaches to orchestrating LLM agents for systematic analysis tasks.
Deep research refers to AI workflows that go beyond simple prompting—systematically processing large datasets, querying multiple data sources, and maintaining state across extended operations. These workflows break complex tasks into specialized steps, each with its own context, tools, and objectives.
```
deep-research-examples/
├── examples/
│   ├── ai_theme_plays/              # DeepAgents: Multi-stage analysis with enforcement
│   ├── gemini_cloudflare_workflows/ # Cloudflare Workers: Async research with Gemini
│   ├── pm_deep_agent/               # DeepAgents: Product research with dual streams
│   └── stagehand_company_news/      # Stagehand: Browser automation for news discovery
└── README.md
```
This repository demonstrates several approaches to deep research:
| Approach | Best For | Examples |
|---|---|---|
| DeepAgents | Database queries, batch processing, multi-source analysis | AI Theme Plays, PM Deep Agent |
| Cloudflare Workflows | Long-running async workflows, serverless orchestration | Gemini Cloudflare Workflows |
| Stagehand | Browser automation, web scraping, dynamic content | Company News Discovery |
Path: examples/stagehand_company_news/
Tech Stack: Stagehand + Multi-Provider LLMs (OpenAI, Anthropic, Google, Ollama) + S3
Automated discovery of company news and press release pages using browser automation and LLM verification. Given a company domain, finds the official press release listing page.
The Challenge: News pages have no standard location (/news, /newsroom, /press, /media), links are often hidden in dropdown menus, and search engines return individual articles instead of listing pages.
How It Works:
A 3-step discovery flow with LLM verification at each step:
- Search - Query DuckDuckGo for `site:{domain} news OR press` and verify candidates
- Homepage - Extract nav/header/footer links, expand dropdowns, verify candidates
- Site Search - Use the site's own search bar as a last resort
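The three steps form a fallback chain: each produces candidate URLs, and the first one the LLM verifies wins. A minimal sketch of that control flow, where the step functions and `verify` are hypothetical stand-ins for the actual Stagehand and LLM calls:

```python
from typing import Callable, Iterable, Optional

def discover_news_page(
    domain: str,
    steps: Iterable[Callable[[str], list[str]]],
    verify: Callable[[str], bool],
) -> Optional[str]:
    """Run each discovery step in order; return the first URL the verifier accepts."""
    for step in steps:                 # search engine, then homepage, then site search
        for candidate in step(domain):
            if verify(candidate):      # LLM checks it is a press-release *listing* page
                return candidate
    return None                        # no verified news page found
```

Later steps only run when earlier, cheaper ones fail to produce a verified candidate.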
Key Features:
- Multi-provider LLM support - Compare accuracy across OpenAI, Anthropic, Google, and Ollama
- S3 persistence - All session data (results, metrics, logs) is flushed to S3 after each company
- Stealth browser automation - Handles bot detection with non-headless mode
- Incremental caching - Skip LLM inference on repeated runs
Path: examples/gemini_cloudflare_workflows/
Tech Stack: Cloudflare Workers + Python Workflows + Gemini Deep Research + D1 + R2 + Hyperdrive
Automated analysis of how companies execute on their earnings call promises. Given a ticker and earnings date, fetches the transcript and subsequent press releases, then uses Gemini Deep Research to identify alignment between management guidance and actual announcements.
The Challenge: Gemini Deep Research takes ~30 minutes to complete, requiring async workflow orchestration with persistent state management across serverless functions.
How It Works:
A 6-step Cloudflare Workflow:
- fetch_data - Retrieves earnings transcript and press releases from MongoDB
- prepare_upload - Uploads context documents to R2 and Gemini FileStore
- start_research - Initiates Gemini Deep Research job
- poll_for_result - Polls until research completes (~30 min)
- extract_structured_output - Uses Workers AI to parse results into categories
- save_result - Persists to D1 and Postgres via Hyperdrive
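The `poll_for_result` step is the long pole: it has to wait out the ~30-minute research job without blocking forever. A generic sketch of that polling loop, with `get_status` as a hypothetical callable wrapping the Gemini job-status check:

```python
import time

def poll_for_result(get_status, interval_s: float = 60.0,
                    timeout_s: float = 45 * 60):
    """Poll until the research job reports done; raise if the deadline passes.

    `get_status` returns (done, result) - a stand-in for the real status API."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        done, result = get_status()
        if done:
            return result
        time.sleep(interval_s)       # in a Workflow this would be a durable sleep
    raise TimeoutError("research did not finish within the timeout")
```

In an actual Cloudflare Workflow the sleep would be a durable step rather than an in-process `time.sleep`, so the function survives worker restarts.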
Output Categories:
- Confirmed Execution - Guidance followed through with press releases
- Unaddressed Guidance - Promises with no subsequent PR confirmation
- New Developments - PR announcements not previewed in earnings
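The matching itself is done by the LLM, but once guidance items and press-release topics are extracted, the three categories reduce to set operations. A sketch, assuming both sides have already been normalized to comparable topic strings:

```python
def categorize(guidance: set[str], pr_topics: set[str]) -> dict[str, set[str]]:
    """Bucket earnings-call guidance against subsequent press-release topics."""
    return {
        "confirmed_execution": guidance & pr_topics,   # promised and announced
        "unaddressed_guidance": guidance - pr_topics,  # promised, no PR yet
        "new_developments": pr_topics - guidance,      # announced, not previewed
    }
```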
Path: examples/ai_theme_plays/
Tech Stack: LangChain + DeepAgents + PostgreSQL + MongoDB + S3
A sophisticated analysis pipeline that takes a keynote or earnings transcript (such as Jensen Huang's GTC keynote) and systematically finds companies that align with the themes it mentions.
The Challenge: Process 2,400 companies against extracted themes, validate each match with press release evidence from MongoDB, and rank the top 100 by alignment strength—all while preventing the LLM from skipping items or producing inconsistent output.
How It Works:
Four specialized subagents in sequence:
- Transcript Analyzer - Extracts key themes from the input transcript
- Company Matcher - Processes all 2,400 companies in batches of 50 from PostgreSQL
- Press Release Validator - Queries MongoDB for press releases and validates matches
- Final Ranker - Consolidates all data and ranks the top 100 companies
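The Company Matcher's batching (2,400 companies, 50 at a time) can be sketched as a simple chunker; the real example pulls each batch from PostgreSQL, but the slicing logic is the same:

```python
from typing import Iterator, TypeVar

T = TypeVar("T")

def batched(items: list[T], size: int = 50) -> Iterator[list[T]]:
    """Yield fixed-size batches so the matcher never sees more than `size` companies."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

2,400 companies in batches of 50 gives the agent exactly 48 batches to work through, which is what the enforcement layer checks against.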
Key Innovations:
- Stateful Tools with Sequential Enforcement - Tools track expected state and reject invalid operations
- Validation Middleware - Intercepts tool calls to verify input/output counts match
- Schema-Driven Prompts - Dynamically generates JSON examples from Pydantic models
- Pydantic Validation - All file writes/reads validate against typed models
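The sequential-enforcement idea can be illustrated with a hypothetical simplification: a tool that remembers which batch it expects next and rejects anything out of order, so the agent cannot silently skip companies:

```python
class SequentialBatchTool:
    """Reject out-of-order batch requests - a sketch of the stateful-tool pattern."""

    def __init__(self, total_batches: int):
        self.expected = 0            # next batch index the tool will accept
        self.total = total_batches

    def fetch_batch(self, index: int) -> str:
        if index != self.expected:
            raise ValueError(f"expected batch {self.expected}, got {index}")
        self.expected += 1
        return f"companies for batch {index}"
```

Because a skipped batch raises instead of returning data, the error surfaces in the agent loop and forces the model back on track rather than letting the gap go unnoticed.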
Path: examples/pm_deep_agent/
Tech Stack: LangChain + DeepAgents
A product management research agent that compares how companies market their products versus how users actually discuss them on social media.
The Challenge: Determine whether marketing use-cases and personas align with those expressed by real users on social platforms.
How It Works:
An orchestrating agent coordinates two specialized subagents:
- Marketing Sub-Agent - Analyzes first-party marketing materials (product pages, docs, case studies)
- Social Media Sub-Agent - Analyzes user-generated content (Twitter/X, Reddit, Hacker News)
The main agent compares both outputs to identify alignment, over-positioning, and unmet opportunities.
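Once each subagent has reduced its stream to a set of use-cases, the comparison is again set logic. A sketch, assuming both outputs have been normalized to comparable labels:

```python
def compare_positioning(marketing: set[str], social: set[str]) -> dict[str, set[str]]:
    """Compare marketed use-cases against those users actually discuss."""
    return {
        "alignment": marketing & social,           # marketed and echoed by users
        "over_positioning": marketing - social,    # marketed but absent from user talk
        "unmet_opportunities": social - marketing, # users discuss it; marketing is silent
    }
```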
Each example includes its own README with detailed setup instructions, architecture diagrams, and implementation details:
| Example | Focus |
|---|---|
| Company News Discovery | Browser automation, multi-provider LLMs, web scraping |
| Earnings Alignment Analysis | Async workflows, Gemini Deep Research, serverless orchestration |
| AI Theme Plays | Workflow enforcement, multi-database integration, batch processing |
| PM Deep Agent | Dual-stream research, comparative analysis |
MIT License - see LICENSE file for details.