GET /feed/:id/json

Get feed changes in JSON Feed format

Returns the feed content in JSON Feed format (https://jsonfeed.org/).

https://api.webcrawlerapi.com/v2/feed/:id/json

Format: JSON Feed 1.1
Method: GET

Request

URL Parameters:

  • id - (required) the feed ID

Query Parameters:

  • page - (optional, default: 1) page number for pagination
  • page_size - (optional, default: 1000, max: 1000) number of items per page

Headers:

  • Authorization: Bearer {api_key} - (required) your API key
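As a rough illustration of how the URL parameter, query parameters, and header above fit together, here is a minimal sketch that assembles a request without sending it. The helper name `build_feed_request` is hypothetical, and the lower bounds it enforces (`page >= 1`, `page_size >= 1`) are assumptions; only the defaults and the maximum of 1000 come from this page.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.webcrawlerapi.com/v2/feed"  # endpoint prefix from this page
MAX_PAGE_SIZE = 1000  # documented maximum for page_size


def build_feed_request(feed_id: str, api_key: str, page: int = 1, page_size: int = 1000):
    """Return (url, headers) for GET /feed/:id/json. Hypothetical helper."""
    # Assumption: pages are 1-based and page_size must be at least 1.
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    query = urlencode({"page": page, "page_size": page_size})
    url = f"{BASE_URL}/{feed_id}/json?{query}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


url, headers = build_feed_request("abc123", "YOUR_API_KEY", page=2, page_size=100)
print(url)  # https://api.webcrawlerapi.com/v2/feed/abc123/json?page=2&page_size=100
```

Validating `page_size` on the client side is optional; it simply surfaces an out-of-range value before the API answers with 400 Bad Request.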

Response

Content-Type: application/feed+json; charset=utf-8

Example:

{
    "version": "https://jsonfeed.org/version/1.1",
    "title": "WebCrawlerAPI Feed: example.com",
    "home_page_url": "https://example.com/blog",
    "feed_url": "https://api.webcrawlerapi.com/v2/feed/abc123/json",
    "description": "Change tracking feed for example.com",
    "authors": [
        {
            "name": "WebCrawlerAPI",
            "url": "https://webcrawlerapi.com"
        }
    ],
    "items": [
        {
            "id": "item123",
            "url": "https://example.com/blog/web-scraping-guide",
            "title": "Getting Started with Web Scraping",
            "summary": "New page discovered",
            "date_modified": "2024-01-01T12:05:00Z",
            "tags": ["new"],
            "_webcrawlerapi": {
                "change_type": "new",
                "page_status_code": 200,
                "content_url": "https://cdn.webcrawlerapi.com/content/...",
                "page_size": 45678
            }
        },
        {
            "id": "item456",
            "url": "https://example.com/docs/api",
            "title": "API Documentation",
            "summary": "Page content has changed",
            "date_modified": "2024-01-01T12:05:00Z",
            "tags": ["changed"],
            "_webcrawlerapi": {
                "change_type": "changed",
                "page_status_code": 200,
                "content_url": "https://cdn.webcrawlerapi.com/content/...",
                "page_size": 23456
            }
        }
    ]
}

Response Fields

Standard JSON Feed Fields

  • version - JSON Feed specification version
  • title - feed title (feed name or domain)
  • home_page_url - the monitored URL
  • feed_url - URL of this JSON feed endpoint
  • description - feed description
  • authors - array of authors

Item Fields

  • id - unique item identifier
  • url - page URL
  • title - page title (if available)
  • summary - description of the change
  • date_modified - when the change was detected
  • tags - array containing the change type
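Because each item carries a date_modified timestamp, a client can keep only the changes detected after its last run. The sketch below assumes the timestamps are ISO 8601 strings with a trailing "Z", as in the example response; the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_date_modified(value: str) -> datetime:
    """Parse a timestamp such as '2024-01-01T12:05:00Z' into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def changes_since(items: list, cutoff: datetime) -> list:
    """Return the items whose date_modified is strictly later than cutoff."""
    return [
        item for item in items
        if parse_date_modified(item["date_modified"]) > cutoff
    ]


items = [
    {"id": "item123", "date_modified": "2024-01-01T12:05:00Z"},
    {"id": "item456", "date_modified": "2023-12-31T08:00:00Z"},
]
cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
print([item["id"] for item in changes_since(items, cutoff)])  # ['item123']
```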

WebCrawlerAPI Extension (_webcrawlerapi)

  • change_type - type of change: new, changed, or unavailable
  • page_status_code - HTTP status code when crawled
  • content_url - URL to fetch the page content in the format specified by the feed's scrape_type setting (markdown, cleaned, or html)
  • page_size - size of the page in bytes
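To make the extension fields easier to work with, one option is to copy them into a small typed record. This is only a sketch: the `PageChange` class and `parse_change` function are hypothetical names, and the fallback to "unknown" when the extension object is missing is an assumption, since this page does not say whether the fields can be absent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageChange:  # hypothetical container, not part of the API
    url: str
    change_type: str
    page_status_code: Optional[int]
    content_url: Optional[str]
    page_size: Optional[int]  # size of the page in bytes


def parse_change(item: dict) -> PageChange:
    """Extract the _webcrawlerapi extension fields from one feed item."""
    ext = item.get("_webcrawlerapi", {})
    return PageChange(
        url=item["url"],
        # Assumption: fall back to "unknown" if the extension is missing.
        change_type=ext.get("change_type", "unknown"),
        page_status_code=ext.get("page_status_code"),
        content_url=ext.get("content_url"),
        page_size=ext.get("page_size"),
    )


sample_item = {
    "id": "item123",
    "url": "https://example.com/blog/web-scraping-guide",
    "_webcrawlerapi": {
        "change_type": "new",
        "page_status_code": 200,
        "content_url": "https://cdn.webcrawlerapi.com/content/...",
        "page_size": 45678,
    },
}
change = parse_change(sample_item)
print(change.change_type, change.page_size)  # new 45678
```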

Change Types

  • new - A new page was discovered that wasn't in the previous crawl
  • changed - The page content has changed since the last crawl
  • unavailable - A page that was previously available is no longer accessible
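The three change types lend themselves to simple routing, for example re-indexing new and changed pages while flagging unavailable ones. A minimal sketch, assuming a feed dictionary shaped like the example response; `group_by_change_type` is a hypothetical helper, and collecting unexpected values under their own key is an assumption.

```python
KNOWN_CHANGE_TYPES = ("new", "changed", "unavailable")  # values documented above


def group_by_change_type(feed: dict) -> dict:
    """Map each change type to the list of page URLs reported with it."""
    groups = {name: [] for name in KNOWN_CHANGE_TYPES}
    for item in feed.get("items", []):
        change_type = item.get("_webcrawlerapi", {}).get("change_type")
        # Assumption: any value not listed above still gets its own bucket.
        groups.setdefault(change_type, []).append(item["url"])
    return groups


feed = {
    "items": [
        {"url": "https://example.com/blog/web-scraping-guide",
         "_webcrawlerapi": {"change_type": "new"}},
        {"url": "https://example.com/docs/api",
         "_webcrawlerapi": {"change_type": "changed"}},
    ]
}
groups = group_by_change_type(feed)
print(groups["new"])          # ['https://example.com/blog/web-scraping-guide']
print(groups["unavailable"])  # []
```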

Pagination

Results are paginated with the page and page_size query parameters, so large numbers of changes can be retrieved in batches:

  • page - Navigate through pages of results (default: 1)
  • page_size - Control how many items per page (default: 1000, max: 1000)
  • Page 1 contains the most recent changes
  • Empty pages return a valid feed structure with an empty items array

Example pagination workflow:

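const base = 'https://api.webcrawlerapi.com/v2/feed/abc123/json';
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };

// Get the first page (the most recent changes)
const page1 = await fetch(`${base}?page=1&page_size=100`, { headers });

// Fetch subsequent pages until a page comes back with an empty items array
const page2 = await fetch(`${base}?page=2&page_size=100`, { headers });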
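The same workflow can be written as a loop that stops at the first empty page. The sketch below relies only on the documented behaviour that empty pages return an empty items array; it does not assume the response reports a total count. The names `fetch_page` and `iter_feed_items` are hypothetical, and the feed ID and API key are placeholders.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import Request, urlopen

FEED_URL = "https://api.webcrawlerapi.com/v2/feed/abc123/json"  # placeholder feed ID


def fetch_page(page: int, page_size: int = 100, api_key: str = "YOUR_API_KEY") -> dict:
    """Fetch one page of the feed. This performs a real network call."""
    query = urlencode({"page": page, "page_size": page_size})
    request = Request(
        f"{FEED_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urlopen(request) as response:
        return json.load(response)


def iter_feed_items(get_page: Callable[[int], dict] = fetch_page) -> Iterator[dict]:
    """Yield every item, most recent first, stopping at the first empty page."""
    page = 1
    while True:
        items = get_page(page).get("items", [])
        if not items:
            return
        yield from items
        page += 1
```

Calling `for item in iter_feed_items(): ...` walks the whole feed. Passing the page fetcher as an argument keeps the loop testable with a stub and lets you swap in another HTTP client.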

Example Usage

curl

# Get first page with default limit (1000 items)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.webcrawlerapi.com/v2/feed/abc123/json"

# Get specific page with custom limit
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.webcrawlerapi.com/v2/feed/abc123/json?page=2&page_size=100"

JavaScript

// Get first page with custom page size
const response = await fetch(
  'https://api.webcrawlerapi.com/v2/feed/abc123/json?page=1&page_size=100',
  {
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY'
    }
  }
);
const feed = await response.json();

// Log the change type and URL of every item
feed.items.forEach(item => {
  console.log(`${item._webcrawlerapi.change_type}: ${item.url}`);
});

Python

import requests

# Get second page with 100 items per page
response = requests.get(
    'https://api.webcrawlerapi.com/v2/feed/abc123/json',
    params={'page': 2, 'page_size': 100},
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)
response.raise_for_status()  # fail fast on 400/404 instead of parsing an error body
feed = response.json()

for item in feed['items']:
    change_type = item['_webcrawlerapi']['change_type']
    print(f"{change_type}: {item['url']}")

Error Responses

  • 400 Bad Request - Invalid request parameters
  • 404 Not Found - Feed not found or does not belong to your organization
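A client may want to translate these status codes into distinct exceptions so that callers can react differently to a bad request and a missing feed. This is a sketch only: the exception classes and `check_status` are hypothetical names, and the generic handling of any other status of 400 or above is an assumption, since this page lists only 400 and 404.

```python
class FeedError(Exception):
    """Base class for feed request failures (hypothetical)."""


class BadFeedRequest(FeedError):
    """400 Bad Request: invalid request parameters."""


class FeedNotFound(FeedError):
    """404 Not Found: feed missing or owned by another organization."""


def check_status(status_code: int) -> None:
    """Raise a matching exception for an error status, otherwise return None."""
    if status_code == 400:
        raise BadFeedRequest("invalid request parameters")
    if status_code == 404:
        raise FeedNotFound("feed not found or does not belong to your organization")
    if status_code >= 400:
        # Assumption: other error codes are not documented here, so report them generically.
        raise FeedError(f"unexpected status code {status_code}")


try:
    check_status(404)
except FeedNotFound as error:
    print(f"Feed lookup failed: {error}")
```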