POST /scrape

Endpoint to scrape a single webpage.

https://api.webcrawlerapi.com/v2/scrape

Format: JSON
Method: POST

Request

Available request parameters:

  • url - (required) The URL of the webpage to scrape.
  • prompt - (optional) A prompt to run on the scraped content. This can be used to extract specific information or to format the output.
  • output_format - (optional) The format of the output. Can be markdown, cleaned or html. Default is markdown.
  • clean_selectors - (optional) CSS selectors to clean from the output. Read more about advanced cleaning in clean selectors.

Example:

{
    "url": "https://www.example.com",
    "output_format": "markdown",
    "clean_selectors": [
        ".advertisement,.footer"
    ]
}
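
The request above could be sent from Python roughly as follows. This is a minimal sketch using only the standard library. The helper names build_scrape_request and scrape are hypothetical, and the Authorization: Bearer header is an assumption: this page says an API key is required but does not say how it is passed, so check the authentication docs before relying on it.

```python
import json
import urllib.request

SCRAPE_URL = "https://api.webcrawlerapi.com/v2/scrape"


def build_scrape_request(api_key, url, output_format="markdown",
                         clean_selectors=None, prompt=None):
    """Build (but do not send) a POST request for the scrape endpoint."""
    payload = {"url": url, "output_format": output_format}
    # Optional parameters are only included when the caller sets them.
    if clean_selectors:
        payload["clean_selectors"] = clean_selectors
    if prompt:
        payload["prompt"] = prompt
    return urllib.request.Request(
        SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def scrape(api_key, url, **options):
    """Send the request and return the decoded JSON body as a dict."""
    request = build_scrape_request(api_key, url, **options)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling scrape("YOUR_API_KEY", "https://www.example.com", clean_selectors=[".advertisement,.footer"]) would send the example payload. Note that urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so those never reach the JSON decoding step.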

Response

The response contains the status, the output in the requested format (markdown in the example below), and the status code and title of the scraped page.

{
    "status": "done",
    "markdown": "## Example Product\n\nThis is an example product page. It has a title, a price, and a description.",
    "page_status_code": 200,
    "page_title": "Example Product"
}

Scrape errors

If the scrape fails, the response still has HTTP status code 200, but success is false and error_code and error_message are set.

For example:

{
    "success": false,
    "error_code": "name_not_resolved",
    "error_message": "Unable to resolve domain name"
}

Read more about error codes in the Error section.
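
Because a failed scrape still returns HTTP 200, client code has to inspect the JSON body rather than the status code. One way to do that is sketched below. ScrapeError and check_scrape_result are hypothetical names, not part of any official client, and the sketch assumes the body has already been decoded into a dict.

```python
class ScrapeError(Exception):
    """Raised when the API reports a failed scrape inside a 200 response."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def check_scrape_result(body):
    """Return the body unchanged, or raise ScrapeError if the scrape failed."""
    # Only an explicit "success": false is treated as a failure; the
    # successful example on this page carries no "success" field at all.
    if body.get("success") is False:
        raise ScrapeError(body.get("error_code"), body.get("error_message"))
    return body
```

With the error example above, check_scrape_result raises a ScrapeError whose code attribute is "name_not_resolved", which the caller can compare against the codes listed in the Error section.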

Error Responses

  • 400 Bad Request - Invalid parameters or missing required fields
  • 401 Unauthorized - Invalid or missing API key
  • 402 Payment Required - Insufficient account balance
  • 500 Internal Server Error - Server-side error
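
The HTTP-level errors listed above can be translated into readable messages with a small lookup table. This is an illustrative sketch only: describe_http_status is a hypothetical helper, and the fallback message for unlisted codes is an assumption, since this page documents only the four statuses above.

```python
# Status codes and descriptions taken from the list above.
HTTP_ERRORS = {
    400: "Bad Request: invalid parameters or missing required fields",
    401: "Unauthorized: invalid or missing API key",
    402: "Payment Required: insufficient account balance",
    500: "Internal Server Error: server-side error",
}


def describe_http_status(status_code):
    """Return None for 2xx responses, otherwise a human-readable message."""
    if 200 <= status_code < 300:
        return None
    # Assumption: any status not documented here is reported generically.
    return HTTP_ERRORS.get(status_code, f"Unexpected HTTP status {status_code}")
```

A caller might log the returned message and stop on 400, 401, and 402, since those point to a problem with the request or the account rather than a transient failure.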

Refer to Async Requests for more information about handling asynchronous scraping jobs.