How to fix hidden flakes when retries make failures "pass" in Playwright?

Playwright retries are designed to improve stability under genuinely intermittent conditions such as network latency, slow CI machines, and external service variability. But when retries mask test logic bugs or fragile selectors, they create a false sense of reliability. A test that passes on the second attempt every time appears green in reports, yet its consistent first-attempt failure is a signal that real bugs could be slipping through. Retries should reduce noise, not hide problems.

Common mistake

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: 3, // High retry count: obscures consistent first-attempt failures
});

// modal.spec.ts (file name is illustrative)
import { test } from '@playwright/test';

// Test with a fragile selector that consistently fails on first attempt
test('closes modal', async ({ page }) => {
  await page.goto('/dashboard');
  await page.locator('.modal-close-btn').click(); // Fails first attempt when modal hasn't opened yet
});

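A failed first attempt can run all the way to the test timeout (30 seconds by default) before the retry starts. So even if the modal needs only 200 ms to open, the test is technically "passing" while burning up to 30 seconds per run to mask a missing await expect(modal).toBeVisible().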

The fix

Use retries selectively and monitor the retry rate as a quality metric:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  retries: process.env.CI ? 1 : 0, // 1 retry in CI, 0 locally to expose flakes during dev
  reporter: [
    ['html'],
    ['json', { outputFile: 'test-results/results.json' }],
  ],
  use: {
    trace: 'on-first-retry', // Record a trace when a failed test is retried for the first time
    video: 'on-first-retry', // Same policy for video
  },
});

Review the HTML report after CI runs and look for tests that needed a retry (the report labels tests that passed on a retry as "flaky"). These are your flake candidates. Fix them systematically:

import { test, expect } from '@playwright/test';

// Before: flake masked by retry
test('dismisses toast', async ({ page }) => {
  await page.goto('/dashboard');
  await page.locator('.toast-close').click(); // No explicit wait: fails until toast renders
});

// After: explicit wait removes the flake
test('dismisses toast', async ({ page }) => {
  await page.goto('/dashboard');
  const toast = page.getByRole('alert');
  await expect(toast).toBeVisible({ timeout: 5000 }); // Wait for toast to appear
  await toast.getByRole('button', { name: 'Dismiss' }).click();
  await expect(toast).toBeHidden();
});

Track flaky tests formally by failing the CI build when the first-attempt failure rate exceeds a threshold:

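// Use Playwright's built-in flaky test detection in the JSON output:
// tests that passed only on a retry are reported with status "flaky".
// Parse test-results/results.json and count tests that have a result with retry !== 0.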
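
The counting step described in those comments could look roughly like this. It is a minimal sketch, assuming the JSON report nests suites that contain specs, each with tests that carry a status and a list of per-attempt results with a retry index. The interface names, the collectRetried and exceedsThreshold helpers, and the 5% default are illustrative and not part of Playwright, so check the field names against a report produced by your Playwright version.

```typescript
// Minimal structural types for the parts of results.json this sketch reads.
// Assumption: these names mirror the JSON reporter's output.
interface ReportResult { retry: number; status: string }
interface ReportTest { projectName: string; status: string; results: ReportResult[] }
interface ReportSpec { title: string; file: string; tests: ReportTest[] }
interface ReportSuite { title: string; specs?: ReportSpec[]; suites?: ReportSuite[] }
interface Report { suites: ReportSuite[] }

interface RetrySummary { total: number; retried: string[] }

// Walk the nested suites and collect every test that needed at least one retry.
function collectRetried(report: Report): RetrySummary {
  const summary: RetrySummary = { total: 0, retried: [] };
  const visit = (suite: ReportSuite): void => {
    for (const spec of suite.specs ?? []) {
      for (const t of spec.tests) {
        if (t.status === 'skipped') continue; // skipped tests never ran
        summary.total += 1;
        if (t.results.some((r) => r.retry > 0)) {
          summary.retried.push(`${spec.file} > ${spec.title} [${t.projectName}]`);
        }
      }
    }
    for (const child of suite.suites ?? []) visit(child);
  };
  for (const suite of report.suites) visit(suite);
  return summary;
}

// True when the share of retried tests is above the allowed rate.
function exceedsThreshold(summary: RetrySummary, maxRate = 0.05): boolean {
  if (summary.total === 0) return false;
  return summary.retried.length / summary.total > maxRate;
}
```

In a CI step you might load the file with JSON.parse(fs.readFileSync('test-results/results.json', 'utf8')), print summary.retried, and exit with a non-zero code when exceedsThreshold returns true.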

Why it works

Playwright records the attempt number of each run in the test result object (the retry field is 0 for the first attempt), so a pass that needed a retry can be told apart from a clean pass. Traces captured on the first retry contain the DOM state, network requests, and console logs around the moment of failure, which is usually enough to diagnose why the selector or assertion was fragile. By keeping the retry count at 1 (just enough to avoid hard failures from one-off network blips) and reviewing every retried test, you keep the benefit of retries while exposing the flake signal.
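
One way to surface that signal during the run itself is a custom reporter. The sketch below is only an illustration: FlakeReporter, TestCaseLike, and TestResultLike are hypothetical names, and the two interfaces are local stand-ins for the fields read from Playwright's reporter types so the snippet stays self-contained. It assumes onTestEnd is invoked once per attempt and that result.retry is 0 on the first attempt.

```typescript
// Structural stand-ins for the fields this sketch reads. Assumption: in a real
// project you would use TestCase and TestResult from '@playwright/test/reporter'.
interface TestCaseLike { titlePath(): string[] }
interface TestResultLike { retry: number; status: string }

// Hypothetical reporter that records every test needing more than one attempt.
class FlakeReporter {
  // Maps a test's full title to the highest retry index seen for it.
  readonly retried = new Map<string, number>();

  // Called once per attempt, so a retried test shows up here several times
  // with increasing result.retry values.
  onTestEnd(test: TestCaseLike, result: TestResultLike): void {
    if (result.retry === 0) return; // first attempt: nothing to record
    const title = test.titlePath().filter(Boolean).join(' > ');
    const seen = this.retried.get(title) ?? 0;
    this.retried.set(title, Math.max(seen, result.retry));
  }

  // Called once after the whole run: print the flake candidates.
  onEnd(): void {
    this.retried.forEach((retry, title) => {
      console.log(`needed ${retry} retry attempt(s): ${title}`);
    });
  }
}
```

To try something like this, default-export the class from its own file and add that file's path to the reporter array in playwright.config.ts.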

Tips

Run your test suite with retries: 0 locally before committing (for example, npx playwright test --retries=0). Any test that now fails is a flake that retries were masking.
Add a CI quality gate: if more than 5% of test runs required retries in the last week, treat it as a bug backlog item, not as acceptable baseline noise.
Playwright's --repeat-each=5 flag runs every test 5 times in a single run, which is useful for isolating tests that fail only intermittently.
Some first-attempt failures are genuinely acceptable (cold start of a container, first-request latency). Keep retries: 1 in CI, but treat every retried test as something to investigate rather than something to accept automatically.
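
The weekly quality gate from the tips above can be sketched as a small calculation over stored run history. This is an assumption-heavy illustration: the RunRecord shape, the retryRate and gatePasses helpers, and where the history is stored are all hypothetical. Only the 5% threshold and the one-week window come from the tip itself.

```typescript
// Hypothetical record of one test execution, as you might store it per CI run.
interface RunRecord { testId: string; finishedAt: Date; retries: number }

// Share of executions inside the window that needed at least one retry.
function retryRate(records: RunRecord[], now: Date, windowDays = 7): number {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const recent = records.filter((r) => r.finishedAt.getTime() >= cutoff);
  if (recent.length === 0) return 0; // no data in the window: nothing to flag
  const retried = recent.filter((r) => r.retries > 0).length;
  return retried / recent.length;
}

// The gate passes while the retry rate stays at or below the allowed maximum.
function gatePasses(records: RunRecord[], now: Date, maxRate = 0.05): boolean {
  return retryRate(records, now) <= maxRate;
}
```

Passing the current time in as a parameter keeps the calculation deterministic, which makes the gate itself easy to unit test.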