Scrape

POST /scrape

Endpoint to scrape a single webpage using a custom scraper configuration.

https://api.webcrawlerapi.com/v1/scrape

Format: JSON
Method: POST

Request

Available request params:

  • crawler_id - (required, string) The ID of the custom scraper to use (check the list of available crawlers).
  • input - (required, object) Additional input parameters required by the scraper.
  • webhook_url - (optional) URL to receive a POST request when scraping is complete.

Example:

{
    "crawler_id": "webcrawler/url-to-md",
    "input": {
        "url": "https://www.example.com"
    }
}
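As a sketch, the request above can be built and submitted with Python's standard library. This assumes a bearer-token Authorization header; check your account documentation for the actual authentication scheme.

```python
import json
import urllib.request

API_URL = "https://api.webcrawlerapi.com/v1/scrape"

def build_payload(crawler_id, input_params, webhook_url=None):
    """Assemble the JSON body for POST /scrape."""
    payload = {"crawler_id": crawler_id, "input": input_params}
    if webhook_url:
        payload["webhook_url"] = webhook_url
    return payload

def submit_scrape(payload, api_key):
    """POST the payload and return the job ID from the response."""
    # Assumption: the API accepts a bearer token; adjust if your
    # account uses a different header or auth scheme.
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["id"]
```

For example, `build_payload("webcrawler/url-to-md", {"url": "https://www.example.com"})` produces exactly the JSON body shown above.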

Response

The response will contain a job ID that can be used to check the scraping status and retrieve results:

{
    "id": "23b81e21-c672-4402-a886-303f18de9555"
}

Checking Results

You can check the scraping status and results using these endpoints:

  1. Get status only:
GET /v1/scrape/{id}/meta

See GET /scrape/:id/meta for details.

  2. Get structured data only:
GET /v1/scrape/{id}/result

See GET /scrape/:id/result for details.

  3. Get both status and data:
GET /v1/scrape/{id}
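If you are not using a webhook, a simple polling loop over these endpoints works: check /meta until the job reaches a terminal status, then fetch /result. The sketch below assumes bearer-token authentication and Python's standard library.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.webcrawlerapi.com/v1/scrape"

# A job stops changing once it reaches one of these statuses.
TERMINAL_STATUSES = {"done", "error"}

def is_terminal(status):
    """True once the job will no longer change state."""
    return status in TERMINAL_STATUSES

def get_json(url, api_key):
    # Assumption: bearer-token auth; adjust to your account's scheme.
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def wait_for_result(job_id, api_key, poll_interval=2.0, timeout=120.0):
    """Poll /meta until the job finishes, then fetch /result."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        meta = get_json(f"{BASE_URL}/{job_id}/meta", api_key)
        if meta["status"] == "done":
            return get_json(f"{BASE_URL}/{job_id}/result", api_key)
        if meta["status"] == "error":
            raise RuntimeError(f"Scrape job {job_id} failed")
        time.sleep(poll_interval)
    raise TimeoutError(f"Scrape job {job_id} did not finish in {timeout}s")
```

Polling /meta rather than the full /scrape/{id} endpoint keeps each check lightweight; the structured data is only fetched once, after the job is done.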

Status Response Example

{
    "id": "23b81e21-c672-4402-a886-303f18de9555",
    "url": "https://example.com/product/123",
    "status": "done",
    "crawler_id": "webcrawler/url-to-md",
    "page_status_code": 200,
    "created_at": "2024-03-20 15:30:45",
    "finished_at": "2024-03-20 15:30:47",
    "structured_data": {
        "title": "Example Product",
        "price": 99.99,
        "currency": "USD"
    }
}

The status field can be:

  • new - Job is queued
  • in_progress - Currently scraping
  • done - Scraping completed successfully
  • error - Scraping failed

When using asynchronous scraping with webhook_url, your server will receive a POST request with the complete result once the scraping is finished.
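A minimal webhook receiver can be sketched with Python's http.server; the port and the exact payload fields beyond `id` and `status` are assumptions here, so adapt them to what your server actually receives.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_webhook_body(raw):
    """Decode the webhook payload; expects at least the job 'id'."""
    data = json.loads(raw)
    return data["id"], data.get("status")

class ScrapeWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        job_id, status = parse_webhook_body(self.rfile.read(length))
        print(f"job {job_id} finished with status {status}")
        self.send_response(200)
        self.end_headers()

# To listen locally (port 8080 is an arbitrary choice), run:
#   HTTPServer(("", 8080), ScrapeWebhookHandler).serve_forever()
```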

Error Responses

  • 400 Bad Request - Invalid parameters or missing required fields
  • 401 Unauthorized - Invalid or missing API key
  • 402 Payment Required - Insufficient account balance
  • 404 Not Found - Specified crawler not found
  • 500 Internal Server Error - Server-side error
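For client-side error handling, the documented codes can be mapped to readable messages; this is just one way to structure it:

```python
# Map of documented /scrape error codes to their meaning.
ERROR_MEANINGS = {
    400: "Invalid parameters or missing required fields",
    401: "Invalid or missing API key",
    402: "Insufficient account balance",
    404: "Specified crawler not found",
    500: "Server-side error",
}

def explain_error(status_code):
    """Return a human-readable reason for a failed /scrape call."""
    return ERROR_MEANINGS.get(status_code, f"Unexpected status {status_code}")
```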

Refer to Async Requests for more information about handling asynchronous scraping jobs.