Scrape
Extract clean Markdown content, metadata, and links from any URL.
POST https://api.crawly.bikal.co/v1/scrape
The Scrape endpoint extracts the main content from a single URL and returns it as clean Markdown. It renders JavaScript-driven pages in a headless browser, follows redirects, and strips navigation, ads, and boilerplate.
Each request costs 1 credit ($0.001). Failed requests are automatically refunded. The endpoint supports any publicly accessible webpage, including SPAs and dynamically loaded content.
Request Body
- url (string, required): The URL to scrape. Must be a valid HTTP or HTTPS URL including the protocol.
- timeout (number, default: 30): Maximum time in seconds to wait for the page to load. Range: 5 to 60.
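
The two parameters above can be checked on the client before a request is sent, which avoids a round trip that would end in a VALIDATION_ERROR. The following is a minimal sketch, not part of the API itself: the function name `build_scrape_payload` and the constants are hypothetical, and the server remains the authority on what it accepts.

```python
from urllib.parse import urlparse

# Limits taken from the parameter descriptions above.
MIN_TIMEOUT = 5
MAX_TIMEOUT = 60
DEFAULT_TIMEOUT = 30


def build_scrape_payload(url: str, timeout: float = DEFAULT_TIMEOUT) -> dict:
    """Validate the documented parameters and return the JSON request body.

    Hypothetical helper: the checks mirror the documented constraints only.
    """
    parsed = urlparse(url)
    # The URL must carry an explicit http:// or https:// protocol and a host.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute HTTP or HTTPS URL: {url!r}")
    if not MIN_TIMEOUT <= timeout <= MAX_TIMEOUT:
        raise ValueError(
            f"timeout must be between {MIN_TIMEOUT} and {MAX_TIMEOUT} seconds"
        )
    return {"url": url, "timeout": timeout}
```

For example, `build_scrape_payload("https://example.com", timeout=45)` returns a body ready to be JSON-encoded, while `build_scrape_payload("example.com")` raises `ValueError` because the protocol is missing.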
Examples
curl -X POST https://api.crawly.bikal.co/v1/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
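
The same request can be made from Python. This is a sketch using only the standard library; the helper names `build_request` and `scrape` are hypothetical, and the 90-second client-side timeout is an assumption chosen to sit above the endpoint's 60-second maximum page timeout.

```python
import json
import urllib.request

API_URL = "https://api.crawly.bikal.co/v1/scrape"


def build_request(api_key: str, url: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(api_key: str, url: str, client_timeout: float = 90.0) -> dict:
    """Send the request and decode the JSON response body.

    client_timeout is a local socket timeout (an assumption), separate
    from the API's own `timeout` request parameter.
    """
    request = build_request(api_key, url)
    with urllib.request.urlopen(request, timeout=client_timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `scrape("YOUR_API_KEY", "https://example.com")["markdown"]` would return the extracted Markdown on success. Note that `urllib.request.urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so error bodies have to be read from the exception rather than from the return value.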
Response
Success 200
{
  "success": true,
  "url": "https://example.com",
  "markdown": "# Example Domain\n\nThis domain is for use in illustrative examples...",
  "metadata": {
    "title": "Example Domain",
    "description": "Example Domain for documentation",
    "language": "en",
    "og_image": "https://example.com/og.png",
    "status_code": 200
  }
}
Response Fields
- success (boolean): Whether the scrape completed successfully.
- url (string): The final URL after following any redirects.
- markdown (string): The extracted page content as clean Markdown.
- metadata (object): Page metadata including title, description, language, og_image, and status_code.
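
The response fields above map naturally onto a small typed record. The sketch below is one possible shape, not an official client: `ScrapeResult` and `parse_scrape_response` are hypothetical names, and treating every metadata field as optional is an assumption, since the source does not say which ones are always present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScrapeResult:
    url: str                     # final URL after redirects
    markdown: str                # extracted page content
    title: Optional[str]
    description: Optional[str]
    language: Optional[str]
    og_image: Optional[str]
    status_code: Optional[int]


def parse_scrape_response(payload: dict) -> ScrapeResult:
    """Convert a decoded success response into a ScrapeResult."""
    if not payload.get("success"):
        raise ValueError("payload is not a successful scrape response")
    # Assumption: metadata keys may be absent, so read them defensively.
    metadata = payload.get("metadata") or {}
    return ScrapeResult(
        url=payload["url"],
        markdown=payload["markdown"],
        title=metadata.get("title"),
        description=metadata.get("description"),
        language=metadata.get("language"),
        og_image=metadata.get("og_image"),
        status_code=metadata.get("status_code"),
    )
```

Comparing `ScrapeResult.url` with the URL that was requested is a simple way to detect that a redirect was followed.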
Errors
| Status | Type | Description |
|---|---|---|
| 400 | VALIDATION_ERROR | Missing or invalid url parameter |
| 401 | AUTH_ERROR | Missing or invalid API key |
| 402 | CREDITS_EXHAUSTED | Insufficient credits |
| 422 | EMPTY_CONTENT | Page loaded but no content could be extracted |
| 429 | RATE_LIMITED | Rate limit exceeded (120 requests per minute) |
| 502 | CONNECTION_ERROR | Could not connect to the target URL |
| 504 | PAGE_TIMEOUT | Page took too long to load |
All error responses follow this format:
{
  "success": false,
  "error": "Could not extract content from this page",
  "error_type": "EMPTY_CONTENT"
}
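
Because every error shares this shape, a client can turn it into a single exception type and decide centrally what to retry. The following is a sketch under stated assumptions: `ScrapeError`, `error_from_response`, and `call_with_retries` are hypothetical names, and the choice to retry only RATE_LIMITED, CONNECTION_ERROR, and PAGE_TIMEOUT with exponential backoff is a client-side policy, not something the API prescribes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: these error types are treated as transient by this client.
RETRYABLE = frozenset({"RATE_LIMITED", "CONNECTION_ERROR", "PAGE_TIMEOUT"})


class ScrapeError(Exception):
    """Raised for any response with "success": false."""

    def __init__(self, status: int, error_type: str, message: str):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.error_type in RETRYABLE


def error_from_response(status: int, payload: dict) -> ScrapeError:
    """Build a ScrapeError from an HTTP status and a decoded error body."""
    return ScrapeError(
        status, payload.get("error_type", "UNKNOWN"), payload.get("error", "")
    )


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying retryable ScrapeErrors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except ScrapeError as exc:
            # Give up on permanent errors and on the final attempt.
            if not exc.retryable or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

Errors such as VALIDATION_ERROR, AUTH_ERROR, and CREDITS_EXHAUSTED are raised immediately here, since repeating the same request is unlikely to change the outcome.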
Pricing
Credits are deducted upfront when the request is made. If the request fails, the credit is automatically refunded to your balance.
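
As a worked example of the arithmetic, the sketch below estimates spend from the figures stated on this page (1 credit per request, $0.001 per credit), assuming that only successful requests end up billed because failed ones are refunded. The function name `estimate_cost` is hypothetical.

```python
from decimal import Decimal

# Figures taken from this page.
CREDITS_PER_SCRAPE = 1
CREDIT_PRICE_USD = Decimal("0.001")


def estimate_cost(successful_scrapes: int) -> Decimal:
    """Return the estimated cost in USD for a number of successful scrapes."""
    if successful_scrapes < 0:
        raise ValueError("successful_scrapes must not be negative")
    return successful_scrapes * CREDITS_PER_SCRAPE * CREDIT_PRICE_USD
```

Under these assumptions, 2,500 successful scrapes come to 2,500 credits, or $2.50. Since credits are deducted upfront, the balance must still cover a request at the moment it is made, even if that request later fails and is refunded.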
Notes
- Every scrape fetches a live, up-to-date version of the page.
- JavaScript-rendered pages (SPAs) are fully supported via headless browser rendering.
- The API automatically follows redirects and resolves the final URL.
- Content is extracted as clean Markdown with ads, navigation, and boilerplate removed.
- Maximum timeout is 60 seconds. Requests exceeding this return a 504 error.
- All URLs must include the protocol (https://).
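
When scraping many URLs in a loop, it can help to pace requests on the client so that the limit of 120 requests per minute listed under Errors is not hit. The sketch below spaces calls evenly; the `Throttle` class is hypothetical, and even spacing is an assumption, since the source does not describe how the server measures its rate window.

```python
import time
from typing import Callable, Optional


class Throttle:
    """Space out calls so that at most max_per_minute start per minute."""

    def __init__(
        self,
        max_per_minute: int = 120,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ):
        if max_per_minute < 1:
            raise ValueError("max_per_minute must be at least 1")
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> None:
        """Block until the next request may be sent, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()
        self._next_allowed = now + self.min_interval
```

Calling `throttle.wait()` before each request keeps consecutive requests at least 0.5 seconds apart with the default setting. This version is not thread-safe; concurrent workers would need a lock around `wait()`.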