FIREHOSE v5

Smart Web Extraction API for AI Agents & LLMs

Recursive crawling, JS rendering, intelligent extraction, structured data.
Beats Firecrawl at 1/1000th the price. No signup required.

Try it FREE View Pricing

Features

Smart Extraction

Trafilatura-powered article extraction. Strips nav, ads, footers. Returns clean text + metadata (title, author, date, language).

JS Rendering

Headless Chromium in an isolated jail. Renders SPAs, React, Vue. Auto-fallback when static extraction yields too little content.

Recursive Crawl

BFS crawl with parallel workers. Sitemap discovery, robots.txt compliant. Up to 200 pages per request.

Structured Data

Extract JSON-LD and microdata natively. No LLM needed. Schema.org, OpenGraph, product data — all included.

Batch Processing

Send up to 10 URLs in one request. Parallel fetching with 4 workers. One response, all results.

Privacy-First

No logs, no tracking. Free tier needs no API key. Requests proxied through isolated FreeBSD jails.

Endpoints

/v1/markdown

Convert a single URL or raw HTML to Markdown.

# Smart extraction with metadata
curl -s "https://synthetic-context.net/v1/markdown?url=https://example.com"

# With JSON-LD + microdata extraction
curl -s "https://synthetic-context.net/v1/markdown?url=https://example.com&extract=jsonld,microdata"

# JS rendering (paid tier)
curl -s -H "X-API-Key: sk_YOUR_KEY" \
  "https://synthetic-context.net/v1/markdown?url=https://spa-app.com&js=1"

# Batch POST
curl -s -X POST https://synthetic-context.net/v1/markdown \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://a.com", "https://b.com"], "extract": "jsonld"}'

/v1/crawl

Recursively crawl an entire website. Discovers pages via sitemap + link extraction.

# Crawl up to 10 pages, depth 2
curl -s -X POST https://synthetic-context.net/v1/crawl \
  -H "Content-Type: application/json" \
  -d '{"url": "https://docs.example.com", "max_pages": 10, "max_depth": 2}'

# Crawl with JS rendering + structured data (paid tier)
curl -s -X POST https://synthetic-context.net/v1/crawl \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk_YOUR_KEY" \
  -d '{"url": "https://spa.com", "max_pages": 50, "js": true, "extract": "jsonld"}'

Response Format

{
  "markdown": "# Article Title\n\nClean extracted text...",
  "meta": {"title": "...", "author": "...", "date": "...", "language": "en"},
  "jsonld": [{"@type": "Article", "name": "..."}],
  "microdata": [{"@type": "https://schema.org/Product", "name": "..."}],
  "source": "https://example.com",
  "length": 4521
}

Try It

Enter a URL and see the result:


Pricing

FreePaidPremium
Price05 sats/reqContact
Requests/day10010,000Unlimited
JS Rendering-YesYes
Crawl max pages550200
API KeyNot neededX-API-Key headerX-API-Key header
Batch URLs101010
Structured dataYesYesYes

vs Competitors

FeatureFirehoseFirecrawlCrawl4AIJina Reader
Smart extractionYesYesYesYes
JS renderingYesYesYesYes
Recursive crawlYesYesYesNo
Sitemap discoveryYesYesYesNo
JSON-LD extractionNative (0 LLM)Via LLM ($)Via LLM ($)No
Microdata extractionNativeNoNoNo
robots.txtYesYesYesNo
No signupYesNoYesNo
Price5 sats (~$0.005)From $38/moSelf-hostedFrom $0.01/req