How to fix "Execution context was destroyed, most likely because of a navigation"?
This happens when evaluation starts on one document and the page navigates before it finishes. Coordinate the click and ...
Frame was detached means the iframe was removed or reloaded while you were using it. Re-acquire the frame right before a...
net::ERR_ABORTED often appears when navigation is interrupted by another navigation, redirect, or page close. Avoid over...
net::ERR_CONNECTION_REFUSED means nothing is listening on the target host/port. Start the web server before tests and veri...
These errors indicate network failure (offline state, blocked DNS, proxy/VPN issues, or request blocking). Stabilize tes...
This means HTTP auth credentials are missing or incorrect. Set valid httpCredentials when creating the browser context. ...
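The httpCredentials fix above can be sketched as a config fragment — a minimal playwright.config.js, assuming the credentials live in environment variables (BASIC_AUTH_USER and BASIC_AUTH_PASS are placeholder names):

```js
// playwright.config.js — sketch: supply HTTP Basic auth credentials to
// every browser context. The env variable names are placeholders.
module.exports = {
  use: {
    httpCredentials: {
      username: process.env.BASIC_AUTH_USER,
      password: process.env.BASIC_AUTH_PASS,
    },
  },
};
```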
API requests and browser pages may run in separate contexts, so auth cookies do not automatically sync. Share auth via s...
This usually means storageState was saved in one context but not loaded in the project that runs your tests. Create auth...
This is a test code bug, usually from missing fixture args or wrong variable scope. Always destructure fixtures in the t...
The locator exists but is hidden, covered, or outside expected UI state. Wait for visibility and stable state before cli...
This occurs when test() is invoked outside a test file context, often in config/helpers imported by config. Keep test() ...
Downloads hang when listeners are attached after the click or when the app opens downloads in a new flow. Register waitF...
The element was re-rendered between lookup and action, so the handle became stale. Prefer locators (auto-retry) instead ...
ERR_NAME_NOT_RESOLVED means DNS lookup failed for the URL host. Verify baseURL, environment variables, and network access ...
Playwright package is installed, but browser binaries are missing in the environment. Install browsers (and system deps ...
This happens when the page navigates or reloads while your script is still evaluating in the old document. Wait for navi...
Upload failures usually come from wrong file paths or hidden/non-file inputs. Target a real <input type="file"> and pass...
Location APIs fail when permission is not granted for the right origin or context. Set geolocation and permissions at co...
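A minimal sketch of setting geolocation and permissions at the context level via the config; the coordinates are placeholders:

```js
// playwright.config.js — sketch: grant the geolocation permission and
// set fixed coordinates for every context. Values are placeholders.
module.exports = {
  use: {
    geolocation: { latitude: 52.52, longitude: 13.405 },
    permissions: ['geolocation'],
  },
};
```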
Mobile layout can fail when viewport/device settings are overridden later in config or test. Use one clear project devic...
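One way to keep a single clear device configuration — a config fragment assuming @playwright/test and its built-in Pixel 5 descriptor:

```js
// playwright.config.js — sketch: pin one project to a single device
// descriptor so later use overrides don't fight the mobile viewport.
const { devices } = require('@playwright/test');

module.exports = {
  projects: [
    {
      name: 'mobile-chrome',
      use: { ...devices['Pixel 5'] },
    },
  ],
};
```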
Elements inside iframes are not reachable from top-level page.locator(...). Use frameLocator() and wait for iframe conte...
Popup tests fail when code listens for the new page too late. Create the popup wait promise first, then click the opener...
A protocol invalid-argument error means a browser command received unsupported or malformed input. Validate option shape...
Random exits are commonly resource-related: low memory, too many workers, or unstable shared CI agents. Reduce concurren...
Interception fails when route patterns do not match actual request URLs or are registered too late. Register page.route(...
App code using Shadow DOM can hide internals from brittle CSS selectors. Prefer user-facing locators (getByRole, getByLa...
A "strict mode violation" means your locator matches multiple elements, but the action needs exactly one. Narrow the locator...
This error appears when the browser process exits unexpectedly or test cleanup closes context/page early. Check for cras...
TimeoutError usually means Playwright could not find an actionable element in time. Use a more specific locator, wait fo...
Strict mode means a locator used for an action must match exactly one element. Refine the locator with role/name/test id...
This error appears when code uses page after page.close(), context.close(), or browser.close(). Ensure cleanup runs afte...
DevTools windows can be treated as regular pages by enabling the handleDevToolsAsPage option when launching or connectin...
BackendNodeId is exposed in the a11y snapshot. Each node in the snapshot includes a backendNodeId that lets you map acce...
Use the capability to retrieve detailed initiator data from CDP when available, and filter out goog: data from events by...
This feature allows opening a page in a tab or a window. newPage() can now be called with window options to choose where...
Overview Landmarks such as header, nav, main, aside, and footer provide semantic regions that assist screen readers and ...
The CDP message ID generator can be configured by passing a custom idGenerator to the Connection constructor. This enabl...
To stop the xdg-open popup in Puppeteer, configure a Chrome policy URLAllowlist and use a Chrome binary that reads that ...
Summary This change adds a public getter to CdpBrowserContext to expose the internal Connection object. It returns the p...
How to expose the url property for links: If you need the full URL of a link in Puppeteer, use the url property that was ...
Summary Duplicate header values should normally be merged into a single header value separated by a comma and a space. T...
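The merge rule above can be sketched as a small helper. Treating Set-Cookie as the exception collected separately is an assumption in this sketch, since its values can themselves contain commas:

```js
// Merge duplicate header values into one comma-and-space separated
// value. Set-Cookie is kept as a separate list rather than merged,
// because cookie values may contain commas (an assumption here).
function mergeHeaders(pairs) {
  const merged = {};
  const setCookie = [];
  for (const [name, value] of pairs) {
    const key = name.toLowerCase();
    if (key === 'set-cookie') {
      setCookie.push(value);
      continue;
    }
    merged[key] = key in merged ? `${merged[key]}, ${value}` : value;
  }
  return { merged, setCookie };
}
```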
Fetch.enable wasn't found is raised when trying to enable the Fetch domain for a worker. The fix is to ignore this error...
Puppeteer now dispatches each CDP message in its own JavaScript task by scheduling dispatch with setTimeout. This ensure...
To open DevTools for a page in Puppeteer, use the new Page.openDevTools() method. It calls the DevTools interface for th...
Use the ignoreCache option with Page.reload to reload while ignoring the browser cache: `await page.reload({ ignoreCache: true })`.
Solution Use the Emulation.setUserAgentOverride command via a CDP session to override the user agent instead of relying ...
The deprecation note indicates that the HTTPRequest.postData API is deprecated in Puppeteer. This means you should avoid...
Fixes Puppeteer not waiting for all targets when connecting by only awaiting child targets for tab targets. When connect...
To align with the protocol behavior, create the Response when the responseStarted event fires, rather than after the res...
Summary The pageerror event may emit not only Error objects but also values of unknown type. Treat the payload as unknow...
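Treating the payload as unknown can look like this — a hypothetical toError helper, not part of Puppeteer's API:

```js
// Normalize a pageerror payload of unknown type into an Error. The
// event may emit plain strings, objects, or anything else, so narrow
// the value before touching .message or .stack.
function toError(payload) {
  if (payload instanceof Error) return payload;
  if (typeof payload === 'string') return new Error(payload);
  let text;
  try {
    text = JSON.stringify(payload);
  } catch {
    text = String(payload);
  }
  return new Error(text === undefined ? String(payload) : text);
}
```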
browser.close() and browser.disconnect() both end your current control flow, but they affect the browser lifecycle diffe...
The test server was removed from the release-please workflow to simplify the release process and remove an unnecessary e...
If you run into TS18028 private identifiers errors when compiling Puppeteer types with TypeScript, set the TypeScript ta...
To fix the TS18028 error, set the TypeScript target to ES2015 or higher. The error occurs because private identifiers (#...
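A minimal tsconfig fragment reflecting that fix:

```json
{
  "compilerOptions": {
    "target": "ES2015"
  }
}
```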
Firefox currently mutates the headers object returned by request.headers() in a way that does not reflect in the respons...
This was fixed to prevent accidental mutations of the underlying headers. HttpRequest.headers() no longer allows mutatin...
Summary Firefox addon pages navigated via moz-extension:// are treated as webextension contexts. Puppeteer currently doe...
Answer Avoid blocks by scraping politely and limiting request rates. Respect robots.txt, identify your user agent, and s...
Answer Clean scraped data by trimming whitespace, normalizing formats, and removing duplicates. Validate fields with sch...
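A minimal cleaning pass along those lines — trim, normalize, and de-duplicate; the name/email fields are hypothetical:

```js
// Clean scraped records: collapse whitespace in the name, lower-case
// the email, and drop records with a missing or duplicate email.
// The field names are illustrative, not a fixed schema.
function cleanRecords(records) {
  const seen = new Set();
  const out = [];
  for (const r of records) {
    const name = (r.name ?? '').trim().replace(/\s+/g, ' ');
    const email = (r.email ?? '').trim().toLowerCase();
    if (!email || seen.has(email)) continue;
    seen.add(email);
    out.push({ name, email });
  }
  return out;
}
```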
Answer Handle pagination by identifying the next page link, page parameter, or API cursor. Start from the first page and...
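The cursor loop can be sketched with an injected fetcher; fetchPage and nextCursor are illustrative names, not a specific API:

```js
// Follow an API cursor until it is exhausted. fetchPage is injected so
// the loop is testable without a network; a real implementation would
// call fetch() with the cursor as a query parameter. maxPages guards
// against pagination loops.
async function collectAll(fetchPage, maxPages = 100) {
  const items = [];
  let cursor = null;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(cursor);
    items.push(...page.items);
    if (!page.nextCursor) break;
    cursor = page.nextCursor;
  }
  return items;
}
```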
Answer Use a headless browser to render the page before extracting data. Wait for key selectors to appear or for network...
Answer Web crawling is about discovering and fetching pages, while web scraping is about extracting data from those page...
Answer Web scraping legality depends on the site terms, the data collected, and local laws. Public data may be allowed, ...
Answer Common tools include Beautiful Soup, Scrapy, Playwright, Puppeteer, and Selenium. Lightweight parsers are great f...
Answer Ethical scraping means minimizing harm and respecting site owners and users. Follow robots.txt, terms of service,...
Answer The best format depends on how you plan to use the data. CSV is simple and works well for tabular data and quick ...
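For the CSV case, a small serializer sketch that quotes fields per the usual RFC 4180 rules:

```js
// Serialize an array of flat records to CSV. Fields containing commas,
// quotes, or newlines are quoted, with inner quotes doubled.
function toCsv(records) {
  if (records.length === 0) return '';
  const headers = Object.keys(records[0]);
  const escape = (v) => {
    const s = String(v ?? '');
    return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
  };
  const lines = [headers.join(',')];
  for (const r of records) {
    lines.push(headers.map((h) => escape(r[h])).join(','));
  }
  return lines.join('\n');
}
```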
Answer Web scraping is the process of extracting specific data from web pages and converting it into structured formats....
Answer Web crawling focuses on discovering and retrieving pages, while web scraping extracts specific data from those pa...
Answer Match crawl frequency to how often content changes and how quickly you need updates. High‑change sites may need m...
Answer To avoid getting blocked, crawl politely and predictably. Respect robots.txt, use reasonable rate limits, and ide...
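A sketch of the rate-limited loop; the fetcher and sleeper are injected so nothing here is tied to a real network or clock:

```js
// A minimal polite fetch queue: process URLs one at a time with a
// fixed delay between requests. fetchFn and sleepFn are injected,
// which keeps the scheduling logic testable.
async function politeCrawl(urls, fetchFn, sleepFn, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    if (i > 0) await sleepFn(delayMs);
    results.push(await fetchFn(urls[i]));
  }
  return results;
}
```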
Answer To crawl JavaScript‑heavy sites, use a headless browser to render pages before extracting content. Wait for criti...
Answer Web crawling legality depends on the website, the data you collect, and the laws in your jurisdiction. Many sites...
Answer Common web crawling tools include Scrapy, Apache Nutch, Playwright, Puppeteer, and managed crawler platforms. Scr...
Answer Common crawler data includes URLs, status codes, headers, page content, metadata, links, and timestamps. Many sys...
Answer Crawl budget is the number of pages a crawler can fetch within time and resource constraints. It is limited by yo...
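The budget idea can be sketched as a bounded breadth-first frontier; fetchLinks is a hypothetical callback returning a page's outgoing links:

```js
// Breadth-first frontier that stops after `budget` fetches. Visited
// URLs are tracked so each page is fetched at most once.
async function crawlWithBudget(seed, fetchLinks, budget) {
  const visited = new Set();
  const queue = [seed];
  while (queue.length > 0 && visited.size < budget) {
    const url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    for (const link of await fetchLinks(url)) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return [...visited];
}
```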
Answer robots.txt is a file at a site root that tells crawlers which paths they may or may not access. It uses a simple ...
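A deliberately tiny allow/deny check over that simple format — prefix matching only, with no wildcard or Allow support:

```js
// Parse robots.txt and answer whether a path is allowed for a given
// user agent. Only Disallow prefix rules are honored; wildcards,
// Allow overrides, and crawl-delay are out of scope for this sketch.
function isAllowed(robotsTxt, userAgent, path) {
  const disallows = [];
  let applies = false;
  for (const raw of robotsTxt.split('\n')) {
    const line = raw.split('#')[0].trim();
    if (!line) continue;
    const [field, ...rest] = line.split(':');
    const value = rest.join(':').trim();
    const key = field.trim().toLowerCase();
    if (key === 'user-agent') {
      applies =
        value === '*' ||
        userAgent.toLowerCase().includes(value.toLowerCase());
    } else if (key === 'disallow' && applies && value) {
      disallows.push(value);
    }
  }
  return !disallows.some((rule) => path.startsWith(rule));
}
```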
Answer Web crawling is the automated process of discovering and fetching web pages by following links so you can build a...