
AI Rules for QA Automation Teams

QA automation teams build and maintain test suites that catch bugs before production. AI rules must encode test pyramid principles, page object patterns, test data management, and the conventions that keep test suites maintainable.

6 min read · July 5, 2025

A flaky test suite is worse than no tests: it slows development and erodes trust. AI rules generate reliable, maintainable tests.

Test pyramid, page objects, data factories, flaky test prevention, test isolation, and CI pipeline integration

QA Automation: Tests That Catch Bugs and Stay Green

QA automation teams face two challenges: catching bugs before production AND maintaining a test suite that does not become a burden. A test suite that catches every bug but is flaky (fails randomly), slow (takes 2 hours to run), or brittle (breaks when the UI changes slightly) is worse than no test suite: it slows down development and erodes trust in testing. AI rules for QA automation encode: the right testing strategy (test pyramid), maintainable patterns (page objects, fixtures), reliability practices (no flaky tests), and CI integration (tests run on every PR).

The QA automation stack: test frameworks (Playwright, Cypress, Selenium for E2E; Jest, Vitest, pytest for unit/integration), assertion libraries, test data management (factories, fixtures, seeding), and CI integration (GitHub Actions, Jenkins, CircleCI). AI rule: 'Detect the project's test framework. Generate tests using the existing framework and conventions. Playwright project: Playwright patterns. Jest project: Jest patterns. Never introduce a second test framework without explicit team decision.'

The core QA automation AI rules: follow the test pyramid (more unit tests, fewer E2E tests), use page objects for UI tests (abstract page details from test logic), manage test data with factories (not hardcoded values), prevent flaky tests (no sleeps, proper waits), and integrate with CI (tests run on every PR, block merge on failure).

Test Pyramid and Test Strategy

The test pyramid: many unit tests (fast, isolated, test individual functions), fewer integration tests (test interactions between components), and few E2E tests (test full user workflows through the UI). AI rule: 'Unit tests: 70% of the test suite. Test business logic, utilities, and data transformations. Fast (< 1 second each). No external dependencies (mock database, API calls). Integration tests: 20%. Test component interactions, API endpoints, database queries. Moderate speed. E2E tests: 10%. Test critical user workflows (login, checkout, signup). Slower but high confidence.'

When to write which: AI rule: 'New utility function: unit test. New API endpoint: integration test (test the endpoint with a real database). New user flow: E2E test (only for critical paths). New bug fix: regression test at the lowest level that catches the bug (prefer unit over E2E). The AI should default to unit tests and only generate E2E tests for explicitly critical flows.'

Test naming: test names should describe the expected behavior, not the implementation. AI rule: 'Test names: describe what should happen. Good: "returns empty array when no users match the filter". Bad: "test getUsersByFilter". The test name is the documentation: when it fails, the name should explain what broke without reading the test code.'

💡 Default to Unit Tests, Not E2E

When the AI generates a test for a new function: a unit test runs in 10ms. An E2E test for the same functionality: 5-30 seconds (browser startup, page load, navigation). 100 unit tests: 1 second. 100 E2E tests: 8-50 minutes. The E2E tests catch more integration issues, but the cost compounds. AI rule: write a unit test unless the behavior specifically requires browser interaction (navigation, form submission, visual rendering). Most business logic can be tested at the unit level.

Page Objects and Test Data Management

Page Object Model (POM): abstract page elements and interactions into reusable classes. The test calls page.login(username, password). The page object knows: the username field selector, the password field selector, the submit button selector, and the post-login verification. When the UI changes: update the page object, not every test. AI rule: 'E2E tests: use page objects for every page/component interaction. One page object per page or significant component. Test files import page objects, not selectors. When the AI generates an E2E test: generate or update the page object alongside the test.'
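A minimal sketch of the pattern, with Driver standing in for a real browser driver such as Playwright's Page (the interface, method names, and selectors here are illustrative, not a real library API):

```typescript
// Stand-in for a browser driver interface (e.g. Playwright's Page).
interface Driver {
  fill(selector: string, value: string): Promise<void>;
  click(selector: string): Promise<void>;
  isVisible(selector: string): Promise<boolean>;
}

class LoginPage {
  // Selectors live here, in one place, not scattered across tests.
  private readonly username = "#username";
  private readonly password = "#password";
  private readonly submit = "#submit-button";
  private readonly welcomeBanner = ".welcome";

  constructor(private readonly driver: Driver) {}

  // Tests call page.login(...) and never touch selectors directly.
  async login(user: string, pass: string): Promise<boolean> {
    await this.driver.fill(this.username, user);
    await this.driver.fill(this.password, pass);
    await this.driver.click(this.submit);
    // Post-login verification also belongs to the page object.
    return this.driver.isVisible(this.welcomeBanner);
  }
}
```

When the login form's markup changes, only the four selector fields change; every test that calls login() keeps working untouched.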

Test data factories: generate test data programmatically instead of hardcoding values. AI rule: 'Test data: use factories (factory-bot pattern). Factory generates a valid user with all required fields and sensible defaults. Tests override only the fields relevant to the test case. Factory: createUser({ role: "admin" }) generates a complete user with role=admin and random values for everything else. Never hardcode test data across multiple tests โ€” it creates hidden dependencies.'
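A minimal sketch of such a factory (the User shape and createUser name are illustrative): every call produces a complete, valid, unique record, and the caller overrides only the fields the test is actually about.

```typescript
interface User {
  id: string;
  email: string;
  name: string;
  role: "member" | "admin";
}

let seq = 0;

// Factory: sensible defaults plus unique values per call, so no two
// tests ever share data by accident. Tests override only what matters.
function createUser(overrides: Partial<User> = {}): User {
  seq += 1;
  return {
    id: `user-${seq}`,
    email: `user-${seq}@example.test`, // unique per call: no hidden sharing
    name: `Test User ${seq}`,
    role: "member",
    ...overrides,
  };
}

// The test states only the field it cares about:
const admin = createUser({ role: "admin" });
```

Because the email is generated, two tests can each create "a user" in parallel without colliding, which is exactly the hidden dependency that hardcoded values create.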

Test isolation: each test should be independent. No test should depend on another test's output or side effects. AI rule: 'Each test: sets up its own data (using factories), runs independently, and cleans up after itself (or runs in a transaction that is rolled back). Test order should not matter โ€” shuffling test order should not cause failures. The AI must never generate tests that depend on execution order.'

⚠️ Hardcoded Test Data Creates Hidden Dependencies

Test A creates a user with email 'test@example.com'. Test B expects that user to exist. Test A runs first: both pass. Tests run in parallel or shuffled order: Test B fails. The dependency is invisible: nothing in the code says Test B needs Test A. Factories solve this: each test creates its own user with a unique random email. No shared data, no hidden dependencies, no mysterious failures when test order changes.

Flaky Test Prevention and CI Integration

Flaky test prevention: the #1 cause of flaky tests is timing issues (race conditions, animations, network delays). AI rule: 'No hardcoded sleeps (sleep(2000) is unreliable: too short on slow CI, too long on fast machines). Use explicit waits: Playwright: await page.waitForSelector(), await expect(locator).toBeVisible(). Cypress: cy.get() with automatic retry. Jest: waitFor() from testing-library. The AI must generate explicit wait conditions, never arbitrary sleeps.'
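The mechanism behind all of these explicit waits can be sketched as a generic polling helper (illustrative, not a framework API; real frameworks add DOM observation, retry backoff, and richer error messages):

```typescript
// Poll a condition until it holds or a timeout expires. This is the
// core idea behind waitForSelector / waitFor: succeed as soon as the
// condition is true, fail loudly if it never becomes true.
async function waitFor(
  condition: () => boolean | Promise<boolean>,
  { timeoutMs = 5000, intervalMs = 50 } = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await condition()) return; // done the moment the condition holds
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`waitFor: condition not met within ${timeoutMs}ms`);
}
```

If the condition is already true, this returns in one poll interval instead of wasting a fixed 2 seconds; if CI is slow, it keeps polling up to the timeout instead of failing early.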

Other flaky test causes: shared state (test B fails because test A did not clean up), time-dependent tests (fail at midnight or on weekends), and external dependencies (test fails when a third-party API is slow). AI rule: 'Shared state: each test gets a fresh environment. Time: use fixed dates (vi.setSystemTime() or jest.useFakeTimers()). External APIs: mock in tests (never call real external APIs from automated tests). The AI generates isolated, deterministic tests by default.'
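One framework-agnostic sketch of the fixed-date idea: inject a clock instead of reading the system time inside the logic (isExpired is a hypothetical function; vi.setSystemTime() and jest.useFakeTimers() achieve the same effect without changing the production signature):

```typescript
// A Clock is just a function that returns "now". Production code uses
// the real clock; tests pass a fixed one, so results never depend on
// when or where the suite runs.
type Clock = () => Date;

// Hypothetical business logic under test: has a subscription expired?
function isExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
  return now().getTime() > expiresAt.getTime();
}

// In tests: pin "now" to a fixed instant. No midnight or weekend failures.
const fixedNow: Clock = () => new Date("2025-07-05T12:00:00Z");
```

The same injection pattern applies to random seeds and external API clients: anything nondeterministic becomes a parameter the test can control.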

CI integration: tests run on every PR and block merge on failure. AI rule: 'CI pipeline: run unit tests first (fastest feedback). If unit tests pass: run integration tests. If integration tests pass: run E2E tests (slowest but highest confidence). Cache dependencies between runs. Parallelize test execution where possible. Generate CI configuration alongside test files.'
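A sketch of such a staged pipeline in GitHub Actions (job names and npm scripts like test:unit are assumptions about the project, not a prescribed layout):

```yaml
# Staged pipeline: unit tests gate integration tests, which gate E2E.
# Configure branch protection to require these jobs, blocking merge on failure.
name: tests
on: pull_request
jobs:
  unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm } # cache dependencies between runs
      - run: npm ci
      - run: npm run test:unit        # fastest feedback first
  integration:
    needs: unit                       # runs only if unit tests pass
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run test:integration
  e2e:
    needs: integration                # slowest, highest-confidence suite last
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run test:e2e
```

The needs: keyword enforces the unit-first ordering, and a red job at any stage stops the pipeline before the expensive suites run.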

โ„น๏ธ sleep(2000) Is the Root of All Flaky Tests

sleep(2000) means: wait 2 seconds and hope the element is ready. On a fast machine: it wastes 1.9 seconds. On a slow CI runner: 2 seconds is not enough and the test fails. The fix: await page.waitForSelector('#submit-button'), which waits exactly until the element exists, whether that takes 50ms or 5 seconds. Explicit waits are faster (no wasted time) AND more reliable (they adapt to actual conditions). The AI must never generate sleep() in automated tests.

QA Automation AI Rules Summary

Summary of AI rules for QA automation teams building maintainable test suites.

  • Test pyramid: 70% unit, 20% integration, 10% E2E. Default to lowest level that catches the bug
  • Test names: describe expected behavior. The name explains what broke when the test fails
  • Page objects: abstract selectors from tests. One POM per page. Update POM, not every test
  • Factories: generate test data programmatically. Override only relevant fields. No hardcoded values
  • Test isolation: independent tests. Own data setup. No execution order dependencies
  • No sleeps: explicit waits (waitForSelector, waitFor). Deterministic timing, not arbitrary delays
  • Mocking: mock external APIs. Fixed dates for time-dependent logic. Fresh state per test
  • CI: unit → integration → E2E pipeline. Cache dependencies. Parallelize. Block merge on failure
AI Rules for QA Automation Teams · RuleSync Blog