POST /scrape
Endpoint to scrape a single webpage.
https://api.webcrawlerapi.com/v2/scrape
Format: JSON
Method: POST
Request
Available request params:
url
- (required) The URL of the webpage to scrape.
prompt
- (optional) A prompt to run on the scraped content. This can be used to extract specific information or to format the output.
output_format
- (optional) The format of the output. Can be markdown, cleaned, or html. Default is markdown.
clean_selectors
- (optional) CSS selectors to clean from the output. Read more about advanced cleaning in clean selectors.
Example:
{
"url": "https://www.example.com",
"output_format": "markdown",
"clean_selectors": [
".advertisement,.footer"
]
}
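As a minimal sketch, the request above could be assembled in Python with only the standard library. The function name `build_scrape_request` is hypothetical, and passing the API key as a Bearer token in the `Authorization` header is an assumption: this page mentions an API key but does not say how it is sent. The value given for `clean_selectors` is forwarded unchanged.

```python
import json
import urllib.request

SCRAPE_URL = "https://api.webcrawlerapi.com/v2/scrape"
ALLOWED_FORMATS = {"markdown", "cleaned", "html"}


def build_scrape_request(api_key, url, output_format="markdown",
                         prompt=None, clean_selectors=None):
    """Build (but do not send) a POST request for /scrape."""
    if not url:
        raise ValueError("url is required")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(
            f"output_format must be one of {sorted(ALLOWED_FORMATS)}"
        )

    # Only include optional parameters the caller actually set.
    payload = {"url": url, "output_format": output_format}
    if prompt is not None:
        payload["prompt"] = prompt
    if clean_selectors is not None:
        payload["clean_selectors"] = clean_selectors

    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request)` and the body decoded with `json.load`.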
Response
The response will contain a status and the output in the requested format.
{
"status": "done",
"markdown": "## Example Product\n\nThis is an example product page. It has a title, a price, and a description.",
"page_status_code": 200,
"page_title": "Example Product"
}
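A response like the one above might be unpacked as follows. This is a sketch, not the official client: `extract_output` is a hypothetical helper, and it assumes the output field is named after the requested `output_format`. The example confirms this only for markdown; it is unverified for cleaned and html.

```python
import json


def extract_output(response_body, output_format="markdown"):
    """Pull the scraped content and page metadata out of a /scrape response."""
    data = json.loads(response_body)
    if data.get("status") != "done":
        raise RuntimeError(
            f"scrape not finished: status={data.get('status')!r}"
        )
    return {
        # Assumption: the output key matches the requested output_format.
        "content": data.get(output_format),
        "page_status_code": data.get("page_status_code"),
        "page_title": data.get("page_title"),
    }
```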
Scrape errors
If the scrape fails, the response still has HTTP status code 200, but success is false and error_code and error_message are set.
For example:
{
"success": false,
"error_code": "name_not_resolved",
"error_message": "Unable to resolve domain name"
}
Read more about error codes in the Error section.
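Because a failed scrape still returns HTTP 200, a client has to inspect the body rather than the status code. One way to do that, sketched with a hypothetical `ScrapeError` exception and `raise_for_scrape_error` helper (neither is part of the API):

```python
class ScrapeError(Exception):
    """Raised when the response body reports a failed scrape."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def raise_for_scrape_error(data):
    """Return data unchanged, or raise if the body has success set to false."""
    # Successful responses in the examples carry no "success" field,
    # so only an explicit False is treated as a failure.
    if data.get("success") is False:
        raise ScrapeError(data.get("error_code"), data.get("error_message"))
    return data
```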
Error Responses
400 Bad Request
- Invalid parameters or missing required fields
401 Unauthorized
- Invalid or missing API key
402 Payment Required
- Insufficient account balance
500 Internal Server Error
- Server-side error
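The HTTP-level errors listed above can be handled separately from scrape errors. The sketch below maps each documented status code to its description. The helper names are hypothetical, and treating only 5xx responses as retryable is one possible client policy, not something this page specifies.

```python
HTTP_ERRORS = {
    400: "Bad Request: invalid parameters or missing required fields",
    401: "Unauthorized: invalid or missing API key",
    402: "Payment Required: insufficient account balance",
    500: "Internal Server Error: server-side error",
}


def describe_http_error(status_code):
    """Return a description for a documented error code, or None."""
    return HTTP_ERRORS.get(status_code)


def is_retryable(status_code):
    """Assumed policy: only server-side (5xx) errors are worth retrying."""
    return 500 <= status_code < 600
```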
Refer to Async Requests for more information about handling asynchronous scraping jobs.