POST /scrape
Endpoint to scrape a single webpage using a custom scraper configuration.
https://api.webcrawlerapi.com/v1/scrape
Format: JSON
Method: POST
Request
Available request params:
crawler_id
- (required) The ID of the custom scraper to use (check the list here).
input
- (required, object) Additional input parameters required by the scraper.
webhook_url
- (optional) URL to receive a POST request when scraping is complete.
Example:
{
"crawler_id": "webcrawler/url-to-md",
"input": {
"url": "https://www.example.com"
}
}
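The request above can be sent with the Python standard library. This is a minimal sketch: the bearer-token `Authorization` header is an assumption here (check your account docs for the exact authentication scheme), and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; use the key from your dashboard

# Build the POST /scrape body shown in the example above.
payload = {
    "crawler_id": "webcrawler/url-to-md",
    "input": {"url": "https://www.example.com"},
}

req = urllib.request.Request(
    "https://api.webcrawlerapi.com/v1/scrape",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        # Assumed auth scheme -- verify against your API credentials docs.
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)

# Sending it requires network access and a valid key:
# with urllib.request.urlopen(req) as resp:
#     job = json.loads(resp.read())
#     print(job["id"])  # the job ID used to poll for results
```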
Response
The response will contain a job ID that can be used to check the scraping status and retrieve results:
{
"id": "23b81e21-c672-4402-a886-303f18de9555"
}
Checking Results
You can check the scraping status and results using these endpoints:
- Get status only:
GET /v1/scrape/{id}/meta
See GET /scrape/:id/meta for details.
- Get structured data only:
GET /v1/scrape/{id}/result
See GET /scrape/:id/result for details.
- Get both status and data:
GET /v1/scrape/{id}
See GET /scrape/:id for details.
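A common pattern is to poll the meta endpoint until the job reaches a terminal status. The sketch below injects the HTTP call as a callable (`fetch_meta`) so the loop can be shown without a live API key; in practice that callable would issue `GET /v1/scrape/{id}/meta` and return the parsed JSON.

```python
import time


def poll_until_done(fetch_meta, job_id, interval=2.0, max_attempts=30):
    """Poll the job's status until it is 'done' or 'error'.

    fetch_meta: callable taking a job ID and returning the parsed
    status JSON (a dict with at least a 'status' key).
    """
    for _ in range(max_attempts):
        meta = fetch_meta(job_id)
        if meta["status"] in ("done", "error"):
            return meta
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} polls")
```

For short jobs a couple of seconds between polls is usually enough; for long-running scrapes, prefer `webhook_url` over polling.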
Status Response Example
{
"id": "23b81e21-c672-4402-a886-303f18de9555",
"url": "https://example.com/product/123",
"status": "done",
"crawler_id": "webcrawler/url-to-md",
"page_status_code": 200,
"created_at": "2024-03-20 15:30:45",
"finished_at": "2024-03-20 15:30:47",
"structured_data": {
"title": "Example Product",
"price": 99.99,
"currency": "USD"
}
}
The status field can be:
new
- Job is queued
in_progress
- Currently scraping
done
- Scraping completed successfully
error
- Scraping failed
When using asynchronous scraping with webhook_url, your server will receive a POST request with the complete result once the scraping is finished.
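A webhook receiver can be a very small HTTP server. The sketch below uses the Python standard library and assumes the callback body is JSON shaped like the status response above (an `id` and `status` at minimum); adapt the field handling to the payload your scraper actually sends.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal receiver for the scraping-complete callback (illustrative)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        result = json.loads(self.rfile.read(length))
        # Assumed payload fields -- mirror of the status response example.
        print(f"job {result.get('id')} finished with status {result.get('status')}")
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        # Silence the default per-request logging for this sketch.
        pass


# To run: HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()
```

Return a 2xx status promptly and do any heavy processing of the result after responding, so the callback is not retried or timed out.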
Error Responses
400 Bad Request
- Invalid parameters or missing required fields
401 Unauthorized
- Invalid or missing API key
402 Payment Required
- Insufficient account balance
404 Not Found
- Specified crawler not found
500 Internal Server Error
- Server-side error
Refer to Async Requests for more information about handling asynchronous scraping jobs.