n8n WebcrawlerAPI integration
n8n is a workflow automation tool that lets you connect services and automate tasks. You can use n8n with WebcrawlerAPI to crawl websites and extract data, which can then be used for training large language models (LLMs) or for other purposes.
There are two ways to integrate WebcrawlerAPI with n8n: the official WebcrawlerAPI node or the HTTP Request node. Both methods let you scrape webpages and extract data.
Using the official WebcrawlerAPI node (recommended)
- Go to Settings and select Community Nodes. Search for n8n-nodes-webcrawlerapi and click "Install".
After installation, the WebcrawlerAPI node appears in the Community Nodes list.
- In your workflow, add a new node and search for the WebcrawlerAPI node. It should appear in the search results.
- Click on the node to configure it. You will need to add your API key to connect your WebcrawlerAPI account.
- After entering your API key, click "Connect" to verify your credentials.
- Now you can configure the node to scrape a webpage: enter the URL you want to scrape in the "URL" field and select the output format (Markdown, Cleaned Text, or HTML).
Using HTTP Request node
- In your workflow, add a new node and select the HTTP Request node.
- Click "Import cURL" and paste the following snippet, replacing <YOUR API KEY> with your API key and the "url" value with the URL you want to crawl:
curl --request POST \
--url https://api.webcrawlerapi.com/v2/scrape \
--header 'Authorization: Bearer <YOUR API KEY>' \
--header 'Content-Type: application/json' \
--data '{
"url": "https://webcrawlerapi.com"
}'
Click "Test step" to make sure everything works. You should see the API response, containing the page's Markdown, in the output panel.
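To check your API key and target URL outside n8n first, the same request can be built in a few lines of Python. This is a minimal sketch using only the standard library. The endpoint, headers, and body mirror the cURL snippet above. The function names `build_scrape_request` and `scrape` are hypothetical helpers introduced for illustration. The assumption that the API replies with a JSON object is not confirmed by this guide, so the response is returned without interpreting any of its fields.

```python
import json
import urllib.request

# Endpoint taken from the cURL snippet in this guide.
SCRAPE_ENDPOINT = "https://api.webcrawlerapi.com/v2/scrape"


def build_scrape_request(api_key: str, target_url: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the cURL snippet."""
    # json.dumps escapes the URL safely, unlike hand-written JSON strings.
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        SCRAPE_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def scrape(api_key: str, target_url: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response.

    Assumption: the API replies with a JSON object. Its exact fields are
    not described in this guide, so the result is returned unmodified.
    """
    request = build_scrape_request(api_key, target_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `scrape("<YOUR API KEY>", "https://webcrawlerapi.com")` sends the request. Building the request in a separate function lets you inspect the headers and body without making a network call, which helps when the HTTP Request node reports an authorization or payload error.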