Releases: DevsHero/ShadowCrawl
Release v2.0.0-rc
🥷 ShadowCrawl v2.0.0-rc: The "Boss Slayer" Release
This release marks a major evolution from a simple scraper to a Sovereign Stealth Intelligence Engine. We’ve engineered v2.0.0-rc to shatter the "unscrapable" walls of enterprise anti-bot systems while giving full control back to the user.
💥 High-Fidelity "Cyborg" Extraction (Native Only)
The headline feature is the non_robot_search tool—our nuclear option for high-security targets.
- 99.99% Bypass Success: Successfully tested against Cloudflare Turnstile, DataDome, Akamai, and LinkedIn gatekeepers.
- HITL (Human-In-The-Loop): A unique hybrid approach that bridges your native Brave/Chrome session. If a human can see it, ShadowCrawl can scrape it.
⚠️ Docker Limitation: non_robot_search requires a native GUI environment to launch the browser. It is NOT supported within Docker. For full HITL power, run the binary natively on macOS (primary target) or Linux.
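To make the HITL idea concrete, here is a minimal, hypothetical sketch of driving a visible (non-headless) browser with the chromiumoxide crate against a local Brave binary, waiting for a human to clear any challenge, then snapshotting the DOM. The executable path, profile directory, wait time, and target URL are illustrative assumptions, not ShadowCrawl's actual non_robot_search implementation; it assumes the chromiumoxide, tokio, and futures crates.

```rust
// Hypothetical HITL-style session: launch a visible browser so a human can
// clear any challenge, then read the resulting DOM. Paths and timings are
// illustrative only.
use chromiumoxide::browser::{Browser, BrowserConfig};
use futures::StreamExt;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = BrowserConfig::builder()
        .with_head() // show the window instead of running headless
        .chrome_executable("/Applications/Brave Browser.app/Contents/MacOS/Brave Browser") // assumed path
        .user_data_dir("./hitl-profile") // assumed profile directory
        .build()?;

    let (mut browser, mut handler) = Browser::launch(config).await?;
    // The CDP handler must be polled for the connection to make progress.
    let driver = tokio::spawn(async move { while handler.next().await.is_some() {} });

    let page = browser.new_page("https://nowsecure.nl").await?;
    // Give the human (or the challenge) time to finish, then snapshot the DOM.
    tokio::time::sleep(Duration::from_secs(20)).await;
    let html = page.content().await?;
    println!("fetched {} bytes of HTML", html.len());

    browser.close().await?;
    let _ = driver.await;
    Ok(())
}
```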
🚀 Performance & Stability Hardening
- Global Mutex Locking: Prevents browser profile race conditions. Tools now queue and execute sequentially to avoid "Profile Locked" errors (see the sketch after this list).
- Deep Metadata Fallback: Our engine now digs into HTML-embedded IDs (like LinkedIn urn:li:jobPosting) when standard JSON-LD is missing.
- Smart Watchdog: Implemented OS-level process management to guarantee no "zombie" browser processes remain after a scrape.
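As a rough illustration of the global-mutex idea (not the project's actual code), the sketch below serializes every tool call that needs the shared browser profile behind one process-wide tokio mutex; scrape_with_profile and its body are hypothetical placeholders.

```rust
// Illustration of the global-mutex idea: every profile-bound tool call waits
// its turn on one process-wide lock, so the browser profile is never opened
// twice concurrently.
use std::sync::OnceLock;
use tokio::sync::Mutex;

static BROWSER_LOCK: OnceLock<Mutex<()>> = OnceLock::new();

fn browser_lock() -> &'static Mutex<()> {
    BROWSER_LOCK.get_or_init(|| Mutex::new(()))
}

// Hypothetical entry point; the body stands in for the real
// launch + navigate + extract sequence.
async fn scrape_with_profile(url: &str) -> String {
    let _guard = browser_lock().lock().await; // tools queue here and run sequentially
    format!("<html><!-- fetched {url} --></html>")
}
```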
📂 Verified Evidence (Proof of Work)
Check the new sample-results/proof/ directory for live scrape artifacts generated by this version:
- LinkedIn: Verified bypass of job posting gatekeepers.
- nowsecure.nl: Successful extraction through Cloudflare Turnstile.
- cdiscount.com: Confirmed bypass of DataDome behavioral blocks.
📦 Quick Start & Deployment
1. Standard Stack (Automated Scrapes)
Best for server-side deployment and standard JS-heavy sites via Browserless.
docker compose -f docker-compose-local.yml up -d --build
2. Stealth Stack (HITL / Anti-Bot Bypass)
Required for non_robot_search. Run natively on your host machine to allow GUI interaction.
cd mcp-server
cargo run --release --features non_robot_search
🔗 Latest Changes
Full Commit Details: 370f6fe
Highlights: Implement stealth techniques for Chromiumoxide, safety kill switches, and evidence documentation.
🙏 Support the Mission
Built by a Solo Developer for the open-source community. Help us grow:
Star the repo ⭐
Become a Sponsor 💖
Full Changelog: https://github.com/DevsHero/ShadowCrawl/compare/v1.0.0...v2.0.0-rc
Release v1.1.0
Docker Image
Published to: ghcr.io/devshero/shadowcrawl:1.1.0
release: v1.1.0 quality/runtime hardening
Changes
- centralize shared content quality policy helpers
- add runtime quality_mode support across structured/scrape/crawl/extract paths
- implement strict_proxy_health with non-strict diagnostic fallback
- improve proxy connection testing with HEAD->GET fallback (see the sketch after this list)
- harden scrape outputs (raw HTML omitted by default in JSON/batch)
- enrich search output with search_id, dedupe, metadata, published_at
- improve stdio/agent-mode resilience for research_history when memory unavailable
- remove leaked registry tokens and local test artifacts; harden gitignore
- Build and push Docker image
- ShadowCrawl MCP Server v1.1.0
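The HEAD->GET fallback mentioned above can be illustrated with a small reqwest-based probe: try a cheap HEAD request first, and only if it fails or is rejected retry with GET before marking the proxy unhealthy. The test URL, timeout, and status handling here are assumptions for the sketch, not the shipped strict_proxy_health logic.

```rust
use reqwest::{Client, Proxy, StatusCode};
use std::time::Duration;

// Probe a proxy with a cheap HEAD request first; some endpoints reject HEAD,
// so fall back to GET before declaring the proxy unhealthy.
async fn probe_proxy(proxy_url: &str, test_url: &str) -> Result<bool, reqwest::Error> {
    let client = Client::builder()
        .proxy(Proxy::all(proxy_url)?)
        .timeout(Duration::from_secs(10))
        .build()?;

    match client.head(test_url).send().await {
        Ok(resp) if resp.status() != StatusCode::METHOD_NOT_ALLOWED => {
            Ok(resp.status().is_success())
        }
        _ => {
            // HEAD failed or was rejected: retry with GET before giving up.
            let resp = client.get(test_url).send().await?;
            Ok(resp.status().is_success())
        }
    }
}
```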
Pull the image:
docker pull ghcr.io/devshero/shadowcrawl:1.1.0
Full Changelog: v1.0.1...v1.1.0
Release v1.0.1
Docker Image
Published to: ghcr.io/devshero/shadowcrawl:1.0.1
Changes
- Rebrand ShadowCrawl
- Build and push Docker image
- ShadowCrawl MCP Server v1.0.1
Pull the image:
docker pull ghcr.io/devshero/shadowcrawl:1.0.1
Full Changelog: v0.3.0...v1.0.1
Release v1.0.0
🚀 Release v1.0.0 (General Availability)
The search-scrape project has evolved. v1.0.0 GA delivers a robust, self-hosted alternative to premium scraping APIs, specifically engineered for AI Agent workflows and MCP-native environments.
📦 Docker Image
The official image is now available on GitHub Container Registry:
ghcr.io/devshero/search-scrape:1.0.0
💎 Key Features & Enhancements (Since v0.3.0)
- Unified MCP Surface: 100% parity between HTTP and stdio transport layers. No more tool drift between CLI and IDE usage.
- The Power of 8: a fully validated and optimized tool catalog:
  - search_web & search_structured (Federated Intelligence)
  - scrape_url & scrape_batch (Stealth Extraction)
  - crawl_website (Recursive Discovery)
  - extract_structured (Schema-driven Heuristics)
  - research_history (Long-term Semantic Memory via Qdrant)
  - proxy_manager (Autonomous Proxy Rotation)
- Production Reliability: Hardened process lifecycle and improved shutdown handling to prevent premature task cancellation in VS Code/Cursor.
- Enterprise-Grade Proxy Ops: Simplified proxy management using ip.txt as the canonical source, with improved normalization (see the sketch after this list).
- Ready-to-Use Documentation: New specialized guides for IDE Setup and SearXNG Tuning to ensure 99.9% success rates.
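As a sketch of what "ip.txt as the canonical source with normalization" can look like (the real rules may differ), the helper below trims comments and blank lines, defaults missing schemes to http://, and dedupes entries.

```rust
use std::collections::BTreeSet;
use std::fs;

// Load proxies from ip.txt: skip blanks and comments, add a default scheme
// when none is present, and deduplicate. Normalization rules are assumptions.
fn load_proxies(path: &str) -> std::io::Result<Vec<String>> {
    let raw = fs::read_to_string(path)?;
    let set: BTreeSet<String> = raw
        .lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        .map(|l| {
            if l.contains("://") {
                l.to_string()
            } else {
                format!("http://{l}")
            }
        })
        .collect();
    Ok(set.into_iter().collect())
}
```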
⚡ Quick Start
Get up and running in under 60 seconds:
git clone https://github.com/DevsHero/AnvilSynth.git && cd AnvilSynth
docker compose -f docker-compose-local.yml up -d --build
Verify Service Health
curl -s http://localhost:5001/health
Discover Tool Capabilities
curl -s http://localhost:5001/mcp/tools
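If you prefer to probe the service from Rust instead of curl, a minimal check against the same two endpoints might look like this; it assumes a tokio runtime and reqwest with the json feature, and prints the tool-listing response verbatim rather than assuming its shape.

```rust
use serde_json::Value;

// Hit the health and tool-discovery endpoints shown in the Quick Start above.
#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let health = reqwest::get("http://localhost:5001/health").await?.text().await?;
    println!("health: {health}");

    let tools: Value = reqwest::get("http://localhost:5001/mcp/tools").await?.json().await?;
    println!("tools: {tools:#}");
    Ok(())
}
```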
📝 Important Notes
Stealth Performance: While v1.0.0 includes advanced anti-bot measures, highly protected sites (e.g., LinkedIn/PerimeterX) perform best with high-quality residential proxies.
Validation: This release has passed the RELEASE_READINESS_2026-02-12 validation suite.
Pull the image:
docker pull ghcr.io/devshero/search-scrape:latest
Release v0.3.0
docker pull ghcr.io/devshero/search-scrape:0.3.0
Release v0.2.0
v0.2.0
Integrated key enhancements while maintaining original performance:
- crawl_website: Added recursive crawling for deep content extraction.
- scrape_batch: Added concurrent scraping for better efficiency (see the sketch below).
- extract_structured: Added LLM-based structured JSON output.
Special thanks to @lutfi238 for the excellent work on these features!
Ref: https://github.com/lutfi238/search-scrape
- Update documents
- Update cargo packages
- GitHub Actions workflows
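To make the scrape_batch concurrency idea concrete, here is a generic bounded-concurrency fetch loop using futures' buffer_unordered. It is a plain HTTP sketch under assumed reqwest/tokio/futures dependencies, not the actual tool, which also layers stealth and rendering on top.

```rust
// Generic bounded-concurrency fetch, in the spirit of scrape_batch: at most
// `concurrency` requests are in flight at any time.
use futures::stream::{self, StreamExt};

async fn fetch_batch(
    urls: Vec<String>,
    concurrency: usize,
) -> Vec<(String, Result<String, reqwest::Error>)> {
    let client = reqwest::Client::new();
    stream::iter(urls)
        .map(|url| {
            let client = client.clone();
            async move {
                let body = async { client.get(url.as_str()).send().await?.text().await }.await;
                (url, body)
            }
        })
        .buffer_unordered(concurrency) // cap the number of in-flight requests
        .collect()
        .await
}
```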
Docker Image
Published to: ghcr.io/devshero/search-scrape:0.2.0
Changes
- Build and push Docker image
- Search-Scrape MCP Server v0.2.0
Pull the image:
docker pull ghcr.io/devshero/search-scrape:0.2.0