
POST /feed

Create a new scheduled feed to monitor website changes

Feeds automatically crawl the specified URL at regular intervals and detect new or changed content.

https://api.webcrawlerapi.com/v2/feed

Format: JSON
Method: POST

Request

Required Parameters

  • url - the seed URL where the crawler starts. Can be any valid URL.

Optional Parameters

  • name - a friendly name for the feed
  • scrape_type - content format: markdown (default), cleaned, or html
  • items_limit - maximum number of pages to crawl per feed run (default: 10)
  • max_depth - maximum depth of crawling from the starting URL (0-10)
  • allow_subdomains - if true, the crawler will also crawl subdomains (default: false)
  • whitelist_regexp - regular expression to whitelist URLs. Only URLs that match the pattern will be crawled
  • blacklist_regexp - regular expression to blacklist URLs. URLs that match the pattern will be skipped
  • respect_robots_txt - if true, the crawler will respect the website's robots.txt file (default: false)
  • main_content_only - extract only the main content of an article or blog post (default: false)
  • webhook_url - URL to receive POST notifications when changes are detected

Example:

{
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "allow_subdomains": false,
    "respect_robots_txt": true,
    "max_depth": 1,
    "webhook_url": "https://yourserver.com/webhook"
}
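
For illustration, here is a minimal sketch of creating a feed from Python with the requests library. The bearer-token Authorization header, the API_KEY placeholder, and the whitelist_regexp value are assumptions; check your dashboard for the exact authentication scheme.

import requests

API_KEY = "your-api-key"  # assumption: substitute your real key

payload = {
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "respect_robots_txt": True,
    "max_depth": 1,
    "whitelist_regexp": ".*/blog/.*",  # illustration: crawl only blog URLs
    "webhook_url": "https://yourserver.com/webhook",
}

# Assumption: the API accepts a bearer token; adjust if your account uses another scheme.
resp = requests.post(
    "https://api.webcrawlerapi.com/v2/feed",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()
feed = resp.json()
print(feed["id"], feed["next_run_at"])

Store the returned id; every other feed endpoint is keyed on it.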

Response

Example:

{
    "id": "2hGdKxPqR5mNvL3wYzT8j",
    "url": "https://example.com/blog",
    "name": "Example Blog Feed",
    "scrape_type": "markdown",
    "items_limit": 20,
    "status": "active",
    "next_run_at": "2024-01-02T12:00:00Z",
    "webhook_url": "https://yourserver.com/webhook",
    "created_at": "2024-01-01T12:00:00Z"
}

Response Fields

  • id - unique feed identifier (use this to access feed endpoints)
  • url - the monitored URL
  • name - friendly name (if provided)
  • scrape_type - content format
  • items_limit - max pages per crawl run
  • status - feed status: active, paused, or canceled
  • next_run_at - ISO 8601 timestamp when the next crawl will run
  • webhook_url - webhook endpoint (if configured); see the receiver sketch after this list
  • created_at - ISO 8601 timestamp when the feed was created
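
When changes are detected, the API sends a POST notification to webhook_url. The payload shape is not specified here, so the sketch below (Python standard library only) simply logs whatever JSON arrives and acknowledges with 200; treat the port and field handling as assumptions.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the notification body; the payload shape is an assumption,
        # so we log the raw JSON rather than relying on specific fields.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            print("feed notification:", json.loads(body))
        except json.JSONDecodeError:
            print("non-JSON notification:", body)
        # Respond 200 quickly so the sender treats delivery as successful.
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), WebhookHandler).serve_forever()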

Error Responses

  • 400 Bad Request - Invalid request parameters (invalid URL, invalid regex pattern, etc.)
  • 402 Payment Required - Insufficient balance to create the feed
  • 403 Forbidden - Organization is suspended
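
As a sketch, you can branch on the documented statuses instead of calling raise_for_status() in the creation example above; the error body shape is undocumented, so printing the raw text is an assumption.

# Reuses `resp` from the creation sketch; drop the raise_for_status() call there
# if you prefer explicit branching.
if resp.status_code == 400:
    print("Bad request - check the URL and regex patterns:", resp.text)
elif resp.status_code == 402:
    print("Insufficient balance - top up before creating feeds:", resp.text)
elif resp.status_code == 403:
    print("Organization suspended - contact support:", resp.text)
elif resp.ok:
    print("Feed created:", resp.json()["id"])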

Pricing

Feeds use the same pricing as regular crawl jobs - you are charged per page crawled during each scheduled run.

Limits

  • Maximum 100 feeds per organization
  • All feeds run automatically at regular intervals
  • Force-run interval: Minimum 1 hour between manual runs