Add Firecrawl curated skill for web scraping and search#163

Open
leonardogrig wants to merge 1 commit into openai:main from firecrawl:add-firecrawl-skill

@leonardogrig

I'm Leo, DevRel at Firecrawl. The skills catalog already covers a lot of ground, from deployment to observability to browser automation. One thing that's missing is a dedicated web data extraction skill: scraping pages to clean markdown, searching the web, and discovering URLs across a site. I'm adding Firecrawl here so Codex users can pull structured web content into their workflows without writing scraping logic from scratch.

Happy to adjust anything, add extra credits for testing, or help with integration details.


What this adds

A new curated skill at skills/.curated/firecrawl/ that gives Codex agents the ability to search the web, scrape pages, and extract structured data via the firecrawl CLI.

Files:

  • skills/.curated/firecrawl/SKILL.md - CLI reference with search, scrape, and map commands, auth setup, file organization patterns, parallelization, and sandbox troubleshooting
  • skills/.curated/firecrawl/agents/openai.yaml - Skill metadata (display name, description, icons, default prompt)
  • skills/.curated/firecrawl/assets/firecrawl-small.svg - Small icon (currentColor SVG for theming)
  • skills/.curated/firecrawl/assets/firecrawl.png - Large icon (100x100 PNG)
  • skills/.curated/firecrawl/LICENSE.txt - Apache 2.0

Why this matters for the Skills Catalog

The existing curated skills handle deployment (Vercel, Netlify, Cloudflare, Render), browser automation (Playwright), project management (Linear), observability (Sentry), and content creation (PDF, docs, spreadsheets). What's missing is a way for agents to pull live web content into their context: reading documentation pages, researching APIs, checking current information, extracting data from URLs.

Firecrawl fills that gap. It's a CLI tool that handles:

  • Web search - Search the web with optional scraping of results, filtering by source (web, news, images), time range, and location
  • Page scraping to clean markdown - Convert any URL to LLM-friendly markdown, with JS rendering support and tag-level filtering
  • Site-wide URL discovery - Map all URLs on a domain, with search filtering and subdomain support
  • Structured output - File-based output that keeps agent context clean, with parallel execution support

How agents use it

# Search the web and scrape results
firecrawl search "react server components" --scrape -o .firecrawl/search-rsc.json

# Scrape a documentation page
firecrawl scrape https://docs.example.com/api -o .firecrawl/api-docs.md

# Discover all URLs on a site
firecrawl map https://docs.example.com --search "webhooks" -o .firecrawl/webhook-urls.txt

# Parallel scraping for multiple pages
firecrawl scrape https://site1.com -o .firecrawl/1.md &
firecrawl scrape https://site2.com -o .firecrawl/2.md &
wait
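
When the URL list grows beyond a couple of entries, the same background-job pattern can be wrapped in a small function. This is a minimal sketch, not part of the skill: `scrape_all` and the `FIRECRAWL_BIN` override are hypothetical names introduced for illustration, and the function uses only the `scrape <url> -o <file>` form shown above.

```shell
# Hypothetical helper: scrape each URL in the background, then wait for all jobs.
# FIRECRAWL_BIN is an assumed override so the command can be swapped out (e.g. for a dry run).
scrape_all() {
  out_dir=".firecrawl"
  mkdir -p "$out_dir"
  n=0
  for url in "$@"; do
    n=$((n + 1))
    # One numbered output file per URL, matching the 1.md / 2.md naming above.
    "${FIRECRAWL_BIN:-firecrawl}" scrape "$url" -o "$out_dir/$n.md" &
  done
  wait  # block until every background scrape has exited
}
```

Calling `scrape_all https://site1.com https://site2.com` would then be equivalent to the two-line example above. Note that a bare `wait` does not report individual job failures, so checking the output files afterwards is still advisable.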

Setup

npm install -g firecrawl-cli
firecrawl login --browser
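
Before running any of the commands, a script may want to confirm the CLI is actually on `PATH`. A minimal sketch, assuming a hypothetical helper named `require_cli`; it relies only on the standard `command -v` builtin, and the install hint passed to it is the command shown above.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print the supplied install hint to stderr and fail.
require_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "$1 not found; install with: $2" >&2
  return 1
}

# Example call (commented out so the snippet has no side effects when sourced):
# require_cli firecrawl "npm install -g firecrawl-cli" || exit 1
```

This only checks that the binary exists; it says nothing about whether `firecrawl login` has been completed.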

The skill includes auth error handling and sandbox troubleshooting, including guidance on using sandbox_permissions=require_escalated for network access.

Follows repo conventions

  • Directory structure matches existing curated skills (agents/openai.yaml, assets/, LICENSE.txt, SKILL.md)
  • SKILL.md frontmatter uses only allowed fields (name, description)
  • Skill name follows hyphen-case rules
  • Icons match the format: small SVG with currentColor, large PNG at 100x100
  • Passes quick_validate.py validation
  • Purely additive, no existing files modified
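
The directory-structure convention in the first bullet can be checked mechanically. The following is a rough sketch only, not the repo's `quick_validate.py`: `check_skill_layout` is a hypothetical name, and the required entries are taken directly from the bullets above.

```shell
# Hypothetical check: verify a skill directory contains the entries listed above.
# Prints each missing entry to stderr and returns non-zero if any are absent.
check_skill_layout() {
  dir="$1"
  missing=0
  for f in SKILL.md LICENSE.txt agents/openai.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing file: $f" >&2
      missing=1
    fi
  done
  if [ ! -d "$dir/assets" ]; then
    echo "missing directory: assets/" >&2
    missing=1
  fi
  return "$missing"
}
```

For this PR the call would be `check_skill_layout skills/.curated/firecrawl`. It checks presence only, not frontmatter fields or icon formats.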

Firecrawl DevRel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
leonardogrig requested a review from a team on February 13, 2026, 16:30