GET /feed/:id/json
Get feed changes in JSON Feed format
Returns the feed content in JSON Feed format (https://jsonfeed.org/).
https://api.webcrawlerapi.com/v2/feed/:id/json
Format: JSON Feed 1.1
Method: GET
Request
URL Parameters:
id - (required) the feed ID
Query Parameters:
page - (optional, default: 1) page number for pagination
page_size - (optional, default: 1000, max: 1000) number of items per page
Headers:
Authorization: Bearer {api_key} - (required) your API key
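As an illustration of how the URL parameter, query parameters, and header above fit together, here is a minimal Python sketch that builds (but does not send) the request using only the standard library. The helper name `build_feed_request` is hypothetical, and the client-side range checks are an assumption: the documentation does not say whether the API rejects or clamps out-of-range values.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.webcrawlerapi.com/v2/feed"
MAX_PAGE_SIZE = 1000  # documented maximum for page_size


def build_feed_request(feed_id, api_key, page=1, page_size=1000):
    """Build a GET request for the JSON feed (hypothetical helper)."""
    # Assumption: validate locally instead of relying on a 400 from the API.
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")

    query = urllib.parse.urlencode({"page": page, "page_size": page_size})
    path_id = urllib.parse.quote(feed_id, safe="")
    url = f"{BASE_URL}/{path_id}/json?{query}"
    # No request body is attached, so urllib sends this as a GET.
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call.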
Response
Content-Type: application/feed+json; charset=utf-8
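A client may want to confirm that it received a JSON Feed before processing it. The following Python sketch checks the media type and the feed's `version` field; the function name `is_json_feed_response` is hypothetical, and accepting any `https://jsonfeed.org/version/` prefix is an assumption rather than documented behavior.

```python
def is_json_feed_response(content_type: str, body: dict) -> bool:
    """Return True if the header and body look like a JSON Feed (sketch)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    version = str(body.get("version", ""))
    return (
        media_type == "application/feed+json"
        and version.startswith("https://jsonfeed.org/version/")
    )
```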
Example:
{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "WebCrawlerAPI Feed: example.com",
  "home_page_url": "https://example.com/blog",
  "feed_url": "https://api.webcrawlerapi.com/v2/feed/abc123/json",
  "description": "Change tracking feed for example.com",
  "authors": [
    {
      "name": "WebCrawlerAPI",
      "url": "https://webcrawlerapi.com"
    }
  ],
  "items": [
    {
      "id": "item123",
      "url": "https://example.com/blog/web-scraping-guide",
      "title": "Getting Started with Web Scraping",
      "summary": "New page discovered",
      "date_modified": "2024-01-01T12:05:00Z",
      "tags": ["new"],
      "_webcrawlerapi": {
        "change_type": "new",
        "page_status_code": 200,
        "content_url": "https://cdn.webcrawlerapi.com/content/...",
        "page_size": 45678
      }
    },
    {
      "id": "item456",
      "url": "https://example.com/docs/api",
      "title": "API Documentation",
      "summary": "Page content has changed",
      "date_modified": "2024-01-01T12:05:00Z",
      "tags": ["changed"],
      "_webcrawlerapi": {
        "change_type": "changed",
        "page_status_code": 200,
        "content_url": "https://cdn.webcrawlerapi.com/content/...",
        "page_size": 23456
      }
    }
  ]
}
Response Fields
Standard JSON Feed Fields
version - JSON Feed specification version
title - feed title (feed name or domain)
home_page_url - the monitored URL
feed_url - URL of this JSON feed endpoint
description - feed description
authors - array of authors
Item Fields
id - unique item identifier
url - page URL
title - page title (if available)
summary - description of the change
date_modified - when the change was detected
tags - array containing the change type
WebCrawlerAPI Extension (_webcrawlerapi)
change_type - type of change: new, changed, or unavailable
page_status_code - HTTP status code when crawled
content_url - URL to fetch the page content in the format specified by the feed's scrape_type setting (markdown, cleaned, or html)
page_size - size of the page in bytes
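To show how the item fields and the `_webcrawlerapi` extension relate, here is a small Python sketch that flattens one feed item into a typed record. The `FeedChange` class and `parse_item` function are hypothetical names. Treating every field except `url` as optional is an assumption, since the documentation only marks `title` as "if available".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FeedChange:
    """Hypothetical flattened view of one feed item."""
    url: str
    title: Optional[str]
    change_type: str
    status_code: Optional[int]
    content_url: Optional[str]
    size_bytes: Optional[int]
    detected_at: Optional[datetime]


def parse_item(item: dict) -> FeedChange:
    ext = item.get("_webcrawlerapi", {})
    raw_date = item.get("date_modified")
    # Older Python versions reject a trailing "Z", so normalise it to an offset.
    detected_at = (
        datetime.fromisoformat(raw_date.replace("Z", "+00:00")) if raw_date else None
    )
    return FeedChange(
        url=item["url"],
        title=item.get("title"),
        change_type=ext.get("change_type", "unknown"),
        status_code=ext.get("page_status_code"),
        content_url=ext.get("content_url"),
        size_bytes=ext.get("page_size"),
        detected_at=detected_at,
    )
```

Note that `page_size` inside `_webcrawlerapi` is the page's size in bytes, which is unrelated to the `page_size` query parameter; the sketch renames it to `size_bytes` to avoid confusion.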
Change Types
new - A new page was discovered that wasn't in the previous crawl
changed - The page content has changed since the last crawl
unavailable - A page that was previously available is no longer accessible
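Consumers often handle each change type differently, for example re-indexing new and changed pages while flagging unavailable ones. The Python sketch below groups item URLs by `change_type`. The function name is hypothetical, and the `"unknown"` fallback for items without the extension is an assumption.

```python
from collections import defaultdict


def group_by_change_type(feed: dict) -> dict:
    """Map each change type to the list of page URLs that reported it."""
    groups = defaultdict(list)
    for item in feed.get("items", []):
        ext = item.get("_webcrawlerapi", {})
        # Assumption: items lacking the extension are bucketed as "unknown".
        groups[ext.get("change_type", "unknown")].append(item["url"])
    return dict(groups)
```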
Pagination
The JSON feed supports flexible pagination to help you manage large numbers of changes:
- page - Navigate through pages of results (default: 1)
- page_size - Control how many items per page (default: 1000, max: 1000)
- Page 1 contains the most recent changes
- Empty pages return a valid feed structure with an empty items array
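Because an empty page still returns a valid feed with an empty items array, a client can walk the pages in order and stop at the first empty one. The Python sketch below illustrates that loop. `iter_feed_items` and `fetch_page` are hypothetical names; `fetch_page` stands for any callable that performs the GET request for a given page number and returns the parsed JSON. The `max_pages` guard is an added safety assumption, not part of the API.

```python
from typing import Callable, Iterator


def iter_feed_items(
    fetch_page: Callable[[int], dict], max_pages: int = 1000
) -> Iterator[dict]:
    """Yield items from page 1 onward until a page has no items."""
    for page in range(1, max_pages + 1):
        feed = fetch_page(page)
        items = feed.get("items", [])
        if not items:
            return  # empty page: nothing further to read
        yield from items


# Stand-in fetcher with canned pages, so the loop can be tried offline.
fake_pages = {
    1: {"items": [{"id": "a"}, {"id": "b"}]},
    2: {"items": [{"id": "c"}]},
}


def fake_fetch(page: int) -> dict:
    return fake_pages.get(page, {"items": []})


print([item["id"] for item in iter_feed_items(fake_fetch)])  # prints ['a', 'b', 'c']
```

Since page 1 holds the most recent changes, items are yielded newest page first.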
Example pagination workflow:
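// Fetch the first page (most recent changes)
const headers = { 'Authorization': 'Bearer YOUR_API_KEY' };
const base = 'https://api.webcrawlerapi.com/v2/feed/abc123/json';
const page1 = await fetch(`${base}?page=1&page_size=100`, { headers });
// Fetch subsequent pages until one returns an empty items array
const page2 = await fetch(`${base}?page=2&page_size=100`, { headers });
Example Usage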
curl
# Get first page with default limit (1000 items)
curl -H "Authorization: Bearer YOUR_API_KEY" \
"https://api.webcrawlerapi.com/v2/feed/abc123/json"
# Get specific page with custom limit
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.webcrawlerapi.com/v2/feed/abc123/json?page=2&page_size=100"
JavaScript
// Get first page with custom page size
const response = await fetch(
  'https://api.webcrawlerapi.com/v2/feed/abc123/json?page=1&page_size=100',
  {
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY'
    }
  }
);
const feed = await response.json();
// Process new and changed items
feed.items.forEach(item => {
  console.log(`${item._webcrawlerapi.change_type}: ${item.url}`);
});
Python
import requests

# Get second page with 100 items per page
response = requests.get(
    'https://api.webcrawlerapi.com/v2/feed/abc123/json',
    params={'page': 2, 'page_size': 100},
    headers={'Authorization': 'Bearer YOUR_API_KEY'}
)
feed = response.json()
for item in feed['items']:
    change_type = item['_webcrawlerapi']['change_type']
    print(f"{change_type}: {item['url']}")
Error Responses
400 Bad Request - Invalid request parameters
404 Not Found - Feed not found or does not belong to your organization
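A client can translate these status codes into explicit errors before trying to parse the body. The Python sketch below is one way to do that. `FeedError` and `check_feed_status` are hypothetical names, and other non-2xx codes, which are not documented here, are treated as unexpected.

```python
class FeedError(Exception):
    """Raised when the feed endpoint returns a non-success status (sketch)."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


# Messages taken from the documented error responses.
_ERROR_MESSAGES = {
    400: "Invalid request parameters",
    404: "Feed not found or does not belong to your organization",
}


def check_feed_status(status_code: int) -> None:
    """Return silently on 2xx, otherwise raise FeedError."""
    if 200 <= status_code < 300:
        return
    # Assumption: any undocumented status is reported generically.
    message = _ERROR_MESSAGES.get(status_code, "Unexpected response")
    raise FeedError(status_code, message)
```

With the `requests` example above, this would be called as `check_feed_status(response.status_code)` before `response.json()`.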