5 Playwright Patterns That Eliminated Flaky Tests on My Team
Suneet Malhotra
Mar 10, 2026
If you run a test suite long enough, flaky tests become the background radiation of your CI pipeline. Green on retry, red on Tuesdays, mysteriously passing on your laptop but failing in CI. I spent years fighting this at companies like Amazon, Tinder, and now Motorola Solutions — and I finally have a playbook that actually works.
Here are five Playwright patterns that took our end-to-end pass rate from 74% to 99.2% in under six weeks.
1. Replace Hard Waits with Auto-Retrying Assertions
The single biggest source of flakiness I see on teams is page.waitForTimeout(3000). It's a guess. Sometimes three seconds is enough, sometimes it isn't.
Playwright's auto-retrying assertions like expect(locator).toBeVisible() and expect(locator).toHaveText() re-check the condition automatically until it holds or the assertion timeout (5 seconds by default) expires. No guessing. No arbitrary sleeps.
Before: Tests that passed 80% of the time. After: The same tests passing 99%+ because they wait for exactly what they need.
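Here is a minimal sketch of the swap. The inline page markup and the 1.5-second delay are made up for illustration so the test runs without an external app; only the final assertion is the pattern itself.

```typescript
import { test, expect } from '@playwright/test';

test('confirmation appears after an async save', async ({ page }) => {
  // Hypothetical page: clicking Save fills in the status text after 1.5 s.
  await page.setContent(`
    <button id="save">Save</button>
    <p role="status"></p>
    <script>
      document.getElementById('save').addEventListener('click', () => {
        setTimeout(() => {
          document.querySelector('[role=status]').textContent = 'Saved';
        }, 1500);
      });
    </script>
  `);

  await page.getByRole('button', { name: 'Save' }).click();

  // Hard-wait version (a guess): await page.waitForTimeout(3000);
  // Retrying version: re-checks the text until it matches or the
  // assertion timeout runs out, then moves on immediately.
  await expect(page.getByRole('status')).toHaveText('Saved');
});
```

The retrying version finishes as soon as the text shows up, so it is usually faster than the sleep it replaces as well as more reliable.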
2. Prefer User-Facing Locators over CSS Selectors
CSS selectors are brittle. A designer changes a class name and suddenly 40 tests break. Playwright's user-facing locators (getByRole(), getByLabel(), getByText()) find elements the way a user would: by role, label, and visible text.
I migrated our entire suite from CSS selectors to role-based locators over a single sprint. The number of "locator not found" failures dropped by 85%. Bonus: a role-based locator only matches when the element exposes the right role and accessible name, so your tests start catching some accessibility regressions as a side effect.
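A small sketch of what that migration looks like in one test. The form markup and the class names are invented for the example; the point is that the locators survive any change to those classes.

```typescript
import { test, expect } from '@playwright/test';

test('sign-in form is reachable by label and role', async ({ page }) => {
  // Hypothetical markup with the kind of generated class names that churn.
  await page.setContent(`
    <form>
      <label for="email">Email</label>
      <input id="email" type="email" class="css-1x9zq2" />
      <button type="submit" class="btn btn-primary-v2">Sign in</button>
    </form>
  `);

  // Brittle version: page.locator('.css-1x9zq2') breaks on the next restyle.
  // User-facing version: matches on the label and the accessible name.
  const email = page.getByLabel('Email');
  await email.fill('user@example.com');

  await expect(email).toHaveValue('user@example.com');
  await expect(page.getByRole('button', { name: 'Sign in' })).toBeEnabled();
});
```

If the label or the button name goes missing, these locators stop matching, which is exactly the kind of failure you want a test to report.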
3. Isolate Test State with API Setup
Every test that clicks through a login flow, creates a user, and navigates three pages just to test a single button is a test waiting to break. We moved all setup into API calls using Playwright's request fixture.
Our tests now create their own users, seed their own data, and tear it down — all via API. Each test is independent. Parallelization went from "terrifying" to "trivial."
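One way to package this is a custom fixture built on the request fixture. Treat the following as a sketch under assumptions, not our exact code: the /api/users endpoints, the TestUser shape, and the /users/:id page are hypothetical, and the relative URLs assume baseURL is set in the Playwright config.

```typescript
import { test as base, expect } from '@playwright/test';

// Hypothetical shape of the user returned by the (assumed) API.
type TestUser = { id: string; email: string };

const test = base.extend<{ user: TestUser }>({
  user: async ({ request }, use, testInfo) => {
    // Unique email per worker and run, so parallel tests never collide.
    const email = `e2e-${testInfo.workerIndex}-${Date.now()}@example.com`;

    // Setup through the API instead of clicking through the UI.
    const created = await request.post('/api/users', { data: { email } });
    expect(created.ok()).toBeTruthy();
    const user = (await created.json()) as TestUser;

    await use(user); // the test body runs here

    // Teardown: runs after the test, whether it passed or failed.
    await request.delete(`/api/users/${user.id}`);
  },
});

test('profile page shows the seeded user', async ({ page, user }) => {
  await page.goto(`/users/${user.id}`);
  await expect(page.getByText(user.email)).toBeVisible();
});
```

Any test that lists user in its arguments gets a fresh account and automatic cleanup, and tests that do not ask for it pay nothing.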
4. Leverage Trace Viewer for Debugging — Not Console Logs
When a test fails in CI, most engineers add console.log statements and re-run. That's a 10-minute feedback loop at best. Playwright's Trace Viewer captures a full timeline: screenshots, DOM snapshots, network requests, and console output — all in one interactive UI.
We configured traces to capture on first retry (trace: 'on-first-retry'), which only takes effect when retries are enabled. Now when a test fails, the engineer opens the trace, scrubs to the failure point, and sees exactly what happened. Average debug time dropped from 45 minutes to under 10.
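In playwright.config.ts that setup could look like the fragment below. The retry count of 2 is an assumption for illustration; the only requirement is that retries are greater than zero wherever you want traces recorded.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Assumed values: retry in CI only, so local failures surface right away.
  retries: process.env.CI ? 2 : 0,
  use: {
    // Record a trace only when a failed test is retried for the first time.
    trace: 'on-first-retry',
  },
});
```

A downloaded trace opens locally with npx playwright show-trace followed by the path to the trace zip.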
5. Shard Tests Across CI Workers
A 60-minute test suite is a 60-minute bottleneck on every PR. Playwright's built-in sharding (--shard=1/4) splits the suite into slices that run on separate CI machines. It needs no changes to the tests themselves; the only setup is one CI job per shard.
We shard across four GitHub Actions runners. Total wall-clock time: 14 minutes. And because each shard runs fewer tests, flakiness from resource contention dropped too. Win-win.
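A sketch of the Playwright side of that setup follows. These settings would sit alongside whatever else is in playwright.config.ts, and the ./all-blob-reports directory name in the comments is an assumption about where your CI collects the per-shard reports.

```typescript
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // With fullyParallel, shards are balanced per test rather than per file.
  fullyParallel: true,
  // Blob reports from each shard can be merged into one HTML report later.
  reporter: process.env.CI ? 'blob' : 'html',
});

// Each of the four CI jobs then runs one slice, for example:
//   npx playwright test --shard=1/4
// and a final job merges the collected reports:
//   npx playwright merge-reports --reporter html ./all-blob-reports
```

Without the merge step you end up with four separate reports, which makes a failed PR harder to read than it needs to be.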
The Results
After implementing these five patterns across our Playwright suite at Motorola Solutions:
- Pass rate: 74% → 99.2%
- Suite duration: 58 min → 14 min
- Weekly flaky-test triage: 12 hours → under 1 hour
- Developer trust in tests: "I'll just skip it" → "If it's red, it's real"
The best part? None of this required exotic tooling or a framework migration. It's all built into Playwright today.
What I'd Do Next
If you've got these basics locked down, the next frontier is AI-powered self-healing locators. I've been experimenting with using local LLMs via Ollama to dynamically regenerate broken selectors — and the early results are promising.
QA automation isn't about writing more tests. It's about writing tests that actually tell you the truth. These five patterns are how you get there.
— Suneet Malhotra, Sr. Manager of Test Engineering at Motorola Solutions