Advanced Cleaning on Crawled Data
You can add extra cleaning options to a crawling job with the clean_selectors parameter.
Cleaning Selectors
Cleaning selectors clean up the content of crawled pages. They are applied after a page has been crawled, and every element that matches one of the selectors is cleaned.
The default value is:
script, style, noscript, iframe, img, footer, header, nav, head
The format is a comma-separated list of CSS selectors.
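The comma-separated format can be sketched in Python as follows. This is only an illustration of how such a string might be split and assembled on the client side; the helper names split_selectors and join_selectors are hypothetical and not part of the API, and the naive split does not handle selectors that contain commas themselves, such as ":is(a, b)".

```python
# Default value as listed in this guide.
DEFAULT_CLEAN_SELECTORS = "script, style, noscript, iframe, img, footer, header, nav, head"


def split_selectors(value: str) -> list[str]:
    """Split a comma-separated selector string into trimmed, non-empty selectors.

    Hypothetical helper: a naive split that does not handle commas
    nested inside a single selector, e.g. ":is(a, b)".
    """
    return [part.strip() for part in value.split(",") if part.strip()]


def join_selectors(selectors: list[str]) -> str:
    """Join individual selectors into the comma-separated form used by clean_selectors."""
    return ", ".join(selector.strip() for selector in selectors if selector.strip())


default_list = split_selectors(DEFAULT_CLEAN_SELECTORS)
custom_value = join_selectors([".card", "#main-header"])
```

Here default_list holds the nine default selectors as separate strings, and custom_value is a string suitable for the clean_selectors field.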
API Example
{
  "url": "https://books.toscrape.com/",
  "scrape_type": "markdown",
  "items_limit": 10,
  "clean_selectors": ".card, #main-header"
}
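As a minimal sketch, the same request body could be assembled in Python using only the standard library. The endpoint URL below is a placeholder and an assumption, as is the helper name build_crawl_request; this guide does not state the real endpoint or the authentication scheme, so take both from your API reference. The request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholder endpoint (assumption): replace with the real crawl endpoint.
API_URL = "https://api.example.com/crawl"


def build_crawl_request(url, clean_selectors, items_limit=10,
                        scrape_type="markdown", api_url=API_URL):
    """Build a POST request whose JSON body mirrors the example in this guide.

    clean_selectors is a list of CSS selectors; it is joined into the
    comma-separated string that the clean_selectors parameter expects.
    """
    payload = {
        "url": url,
        "scrape_type": scrape_type,
        "items_limit": items_limit,
        "clean_selectors": ", ".join(clean_selectors),
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_crawl_request("https://books.toscrape.com/", [".card", "#main-header"])
# Sending is left out on purpose; authentication headers are not covered here.
# response = urllib.request.urlopen(request)
```

Passing the selectors as a list and joining them in one place keeps the comma-separated format consistent across requests.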