
POST /feed

Create a new scheduled feed to monitor website changes

Feeds automatically crawl the specified URL at regular intervals and detect new or changed content.

https://api.webcrawlerapi.com/v2/feed

Format: JSON
Method: POST

Request

Required Parameters

  • url - the seed URL where the crawler starts. Can be any valid URL.

Optional Parameters

  • name - a friendly name for the feed
  • scrape_type - content format: markdown (default), cleaned, or html
  • items_limit - maximum number of pages to crawl per feed run (default: 10)
  • max_depth - maximum depth of crawling from the starting URL (0-10)
  • allow_subdomains - if true, the crawler will also crawl subdomains (default: false)
  • whitelist_regexp - regular expression to whitelist URLs. Only URLs that match the pattern will be crawled
  • blacklist_regexp - regular expression to blacklist URLs. URLs that match the pattern will be skipped
  • respect_robots_txt - if true, the crawler will respect the website's robots.txt file (default: false)
  • main_content_only - extract only the main content of an article or blog post (default: false)
  • webhook_url - URL to receive POST notifications when changes are detected

Example:

{
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "allow_subdomains": false,
    "respect_robots_txt": true,
    "max_depth": 1,
    "webhook_url": "https://yourserver.com/webhook"
}
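
For illustration, here is a minimal sketch of creating a feed from Python with the requests library. The bearer-token Authorization header, the API_KEY placeholder, and the whitelist_regexp value are assumptions; check your dashboard for the exact authentication scheme.

import requests

API_KEY = "your-api-key"  # assumption: substitute your real key

payload = {
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "respect_robots_txt": True,
    "max_depth": 1,
    "whitelist_regexp": ".*/blog/.*",  # illustration: crawl only blog URLs
    "webhook_url": "https://yourserver.com/webhook",
}

# Assumption: the API accepts a bearer token; adjust if your account uses another scheme.
resp = requests.post(
    "https://api.webcrawlerapi.com/v2/feed",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
feed = resp.json()
print(feed["id"], feed["next_run_at"])

Store the returned id; every other feed endpoint is keyed on it.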

Response

Example:

{
    "id": "2hGdKxPqR5mNvL3wYzT8j",
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "status": "active",
    "next_run_at": "2024-01-02T12:00:00Z",
    "webhook_url": "https://yourserver.com/webhook",
    "created_at": "2024-01-01T12:00:00Z"
}

Response Fields

  • id - unique feed identifier (use this to access feed endpoints)
  • url - the monitored URL
  • name - friendly name (if provided)
  • scrape_type - content format
  • items_limit - max pages per crawl run
  • status - feed status: active, paused, or canceled
  • next_run_at - ISO 8601 timestamp when the next crawl will run
  • webhook_url - webhook endpoint (if configured); see the receiver sketch after this list
  • created_at - ISO 8601 timestamp when the feed was created
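
When changes are detected, the API sends a POST notification to webhook_url. The payload shape is not specified here, so the sketch below (Python standard library only) simply logs whatever JSON arrives and acknowledges with 200; treat the port and field handling as assumptions.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the notification body; the payload shape is an assumption,
        # so we log the raw JSON rather than relying on specific fields.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print("feed notification:", json.loads(body))
        except json.JSONDecodeError:
            print("non-JSON notification:", body)
        # Respond 200 quickly so the sender treats delivery as successful.
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()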

Error Responses

  • 400 Bad Request - Invalid request parameters (invalid URL, invalid regex pattern, etc.)
  • 402 Payment Required - Insufficient balance to create the feed
  • 403 Forbidden - Organization is suspended
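
As a sketch, you can branch on the documented statuses instead of calling raise_for_status() in the creation example above; the error body shape is undocumented, so printing the raw text is an assumption.

# Reuses `resp` from the creation sketch; drop the raise_for_status() call there
# if you prefer explicit branching.
if resp.status_code == 400:
    print("Bad request - check the URL and regex patterns:", resp.text)
elif resp.status_code == 402:
    print("Insufficient balance - top up before creating feeds:", resp.text)
elif resp.status_code == 403:
    print("Organization suspended - contact support:", resp.text)
elif resp.ok:
    print("Feed created:", resp.json()["id"])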

Pricing

Feeds use the same pricing as regular crawl jobs - you are charged per page crawled during each scheduled run.

Limits

  • Maximum 100 feeds per organization
  • All feeds run automatically at regular intervals
  • Force-run interval: Minimum 1 hour between manual runs