
Getting started with scraping

Scraping is the process of extracting data from web pages. Currently, only raw content extraction is supported.

Prerequisites

To use the Webcrawler API, you first need to obtain an API key:

  1. Register on the Webcrawler API Dashboard
  2. Navigate to the API key section
  3. Copy your API key
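
Once you have the key, it is usually safer to keep it out of source code. The sketch below reads it from an environment variable and builds the Bearer header used by the curl examples in this guide. The variable name `WEBCRAWLERAPI_KEY` and both helper functions are illustrative assumptions, not part of the Webcrawler API.

```python
import os

# Hypothetical environment variable name; the docs do not prescribe one.
API_KEY_ENV_VAR = "WEBCRAWLERAPI_KEY"


def load_api_key(env=os.environ):
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_ENV_VAR} to your Webcrawler API key")
    return key


def auth_headers(api_key):
    """Build the Bearer authorization header shown in the curl examples."""
    return {"Authorization": f"Bearer {api_key}"}
```

Passing `env` as a parameter keeps the helper easy to exercise without touching the real environment.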

First request

To make your first request, use the following curl command:

curl --request POST \
  --url https://api.webcrawlerapi.com/v1/scrape \
  --header 'Authorization: Bearer <PASTE YOUR ACCESS KEY HERE>' \
  --data '{
    "url": "https://stripe.com/"
  }'

This command starts a new scrape job that extracts the raw content of https://stripe.com/.

Response:

{
  "task_id": "b1b1b1b1-b1b1-b1b1-b1b1-b1b1b1b1b1b1"
}

Scrape requests are processed asynchronously. The response does not contain the scraped content, only a task ID, which you then use to check the status of the scraping task.
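
The same request can be sent from Python with only the standard library. This is a minimal sketch rather than an official client: the function names are illustrative, and the `Content-Type` header is an assumption added here because the body is JSON (the curl example does not set it explicitly).

```python
import json
import urllib.request

SCRAPE_URL = "https://api.webcrawlerapi.com/v1/scrape"


def build_scrape_request(api_key, page_url):
    """Build the POST request that starts a scrape job (nothing is sent yet)."""
    body = json.dumps({"url": page_url}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: not present in the curl example; added because the body is JSON.
            "Content-Type": "application/json",
        },
    )


def start_scrape(api_key, page_url, timeout=30):
    """Send the request and return the task id from the JSON response."""
    request = build_scrape_request(api_key, page_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["task_id"]
```

Calling `start_scrape(api_key, "https://stripe.com/")` should return the task ID string shown in the response above, provided the key is valid.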

Get scraping result

To retrieve the result, send a GET request with the task ID you received when starting the job:

curl --request GET \
  --url https://api.webcrawlerapi.com/v1/scrape/<TASK_ID> \
  --header 'Authorization: Bearer <PASTE YOUR ACCESS KEY HERE>'

Response:

{
  "job_id": "bd98c98a-99a5-43ea-b650-a8c7662d4d28",
  "type": "html",
  "extracted_content": "<!doctype html>\n<html>\n<head>\n <title>Example Domain</title>\n\n ... \n</body>\n</html>\n\n",
  "page_status_code": 200,
  "created_at": "2024-06-17 07:02:51"
}
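
Because the job runs asynchronously, the result may not be available on the first GET. The sketch below polls the endpoint until a result appears. It rests on an assumption this guide does not confirm: a task is treated as finished once `extracted_content` is present and non-empty. The function names, retry count, and delay are likewise illustrative.

```python
import json
import time
import urllib.request

RESULT_URL = "https://api.webcrawlerapi.com/v1/scrape/{task_id}"


def fetch_result(api_key, task_id, timeout=30):
    """GET the current state of a scrape task and return it as a dict."""
    request = urllib.request.Request(
        RESULT_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def is_finished(result):
    # Assumption: finished means "extracted_content" is present and non-empty.
    return bool(result.get("extracted_content"))


def wait_for_result(fetch, attempts=10, delay=2.0, sleep=time.sleep):
    """Call fetch() until is_finished() is true or the attempts run out."""
    for attempt in range(attempts):
        result = fetch()
        if is_finished(result):
            return result
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next poll
    raise TimeoutError(f"scrape result not ready after {attempts} attempts")
```

A typical call would be `wait_for_result(lambda: fetch_result(api_key, task_id))`. If the API signals an unfinished task differently, for example with an explicit status field or an HTTP error, `is_finished` is the only place that needs to change.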