How to Create a Testing Ruleset

A step-by-step tutorial for creating testing AI rules: test pyramid enforcement, naming conventions, assertion patterns, mock strategies, and the rules that make AI-generated tests actually useful.

7 min read · July 5, 2025

12 testing rules. Specific assertions, behavior testing, smart mocking, and factory data. AI-generated tests that catch real bugs.

Test pyramid, naming conventions, assertion quality, mock strategy, test isolation, and deterministic execution

Why Testing Rules Produce Better AI Tests

Without testing rules, AI-generated tests look correct but are often superficial (happy path only, no edge cases), weakly asserted (toBeTruthy instead of specific value checks), implementation-coupled (they break when you refactor, even though behavior is preserved), and inconsistent (different test styles across the codebase). With testing rules, the AI generates tests that follow your team's testing patterns, assert meaningful values, cover edge cases, and stay consistent.

The testing ruleset covers: test strategy (which level of test for which type of code), naming conventions (how tests are named and organized), assertion patterns (what to assert and how specifically), mock strategy (when to mock, what to mock, what not to mock), test isolation (each test independent, no shared state), and test data (factories over hardcoded values). These six categories produce 10-12 rules that transform AI-generated test quality.

The investment: 30 minutes to write the testing rules. The return: every AI-generated test follows your patterns. No more reviewing tests that use the wrong framework, the wrong naming convention, or the wrong assertion style. Testing rules are the second-highest-impact rules after security.

Step 1: Test Strategy and Naming Rules (4 Rules)

Rule 1 — Test pyramid: 'Default to the lowest test level that verifies the behavior. Pure functions: unit test. API endpoints: integration test (with real database). User flows: E2E test (Playwright). Do not write E2E tests for logic that can be verified with a unit test.' This rule prevents the AI from generating slow E2E tests for everything.

Rule 2 — Framework and location: 'Test framework: Vitest (or Jest/pytest/Go testing — specify yours). Test files: co-located with source (user-service.ts → user-service.test.ts). Not in a separate __tests__ directory. Import the test subject with a relative import.' This rule ensures tests are where your team expects them and use the right framework.

Rule 3 — Naming convention: 'describe("FunctionName" or "ComponentName"). it("should [expected behavior] when [condition]"). Example: it("should return empty array when no users match the filter"). The test name: readable as a sentence, describes what and when.' Rule 4 — Test organization: 'Group related tests with describe. Group by: behavior category (describe("validation")), or by input type (describe("with valid input"), describe("with invalid input")). Each describe: independent, can run in isolation.'
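
Rules 3 and 4 combined might produce a test file like the following sketch. It assumes a Vitest-style API; the `describe`/`it`/`expect` stand-ins at the top exist only so the sketch runs standalone (in a real project you would import them from `vitest`), and `filterUsers` is a hypothetical subject under test:

```typescript
// Minimal stand-ins for Vitest's describe/it/expect so this sketch runs
// standalone; in a real project: import { describe, it, expect } from "vitest".
const describe = (_name: string, fn: () => void) => fn();
const it = (_name: string, fn: () => void) => fn();
const expect = (actual: unknown) => ({
  toEqual(expected: unknown) {
    if (JSON.stringify(actual) !== JSON.stringify(expected))
      throw new Error(`expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`);
  },
});

// Hypothetical subject under test: a pure function, so a unit test (rule 1).
interface User { name: string; active: boolean }
const filterUsers = (users: User[], active: boolean) =>
  users.filter((u) => u.active === active);

describe("filterUsers", () => {
  // Rule 4: group by input type. Rule 3: each name reads as a sentence.
  describe("with matching users", () => {
    it("should return only active users when active filter is set", () => {
      const users = [{ name: "a", active: true }, { name: "b", active: false }];
      expect(filterUsers(users, true)).toEqual([{ name: "a", active: true }]);
    });
  });
  describe("with no matching users", () => {
    it("should return empty array when no users match the filter", () => {
      expect(filterUsers([{ name: "b", active: false }], true)).toEqual([]);
    });
  });
});
console.log("all tests passed");
```

Each `describe` block sets up its own data, so either group can run in isolation.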

💡 Co-Located Tests Are Found and Maintained

Tests in a separate __tests__/ directory are out of sight, out of mind. A developer adds a new function to user-service.ts; the test lives in __tests__/services/user-service.test.ts, four directories away, and they forget to update it. With tests next to the source (user-service.test.ts beside user-service.ts), the developer sees the test file every time they edit the source and is reminded to update it. Co-location increases the probability that tests are maintained as the code evolves.

Step 2: Assertion and Mock Rules (4 Rules)

Rule 5 — Specific assertions: 'Assert specific values, not truthiness. Not: expect(result).toBeTruthy(). Instead: expect(result.email).toBe("alice@test.com"). Not: expect(users).toHaveLength(expect.any(Number)). Instead: expect(users).toHaveLength(3). Every assertion: verifies a specific, meaningful property.' This is the single most impactful testing rule; it prevents the most common anti-pattern in AI-generated tests.
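
To see the difference rule 5 makes, here is a self-contained sketch. A tiny `expect` stand-in replaces Vitest's, and `findUser` is a hypothetical, deliberately buggy lookup:

```typescript
// Tiny stand-in for Vitest's expect, just enough for this sketch to run
// standalone; in a real project: import { expect } from "vitest".
const expect = (actual: any) => ({
  toBeTruthy() {
    if (!actual) throw new Error(`expected truthy, got ${actual}`);
  },
  toBe(expected: any) {
    if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
  },
});

// Hypothetical, deliberately buggy lookup: it returns the wrong user.
const findUser = (_email: string) => ({ email: "bob@test.com", role: "viewer" });

const result = findUser("alice@test.com");

// Weak assertion: passes even though the wrong user came back.
expect(result).toBeTruthy();

// Specific assertion: fails loudly on the bug the weak one missed.
let caught = false;
try {
  expect(result.email).toBe("alice@test.com");
} catch {
  caught = true;
}
console.log(caught ? "specific assertion caught the bug" : "bug slipped through");
```

The weak assertion green-lights a broken function; the specific one turns the same bug into a failing test.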

Rule 6 — Test behavior, not implementation: 'Tests verify what the function does (output for given input), not how it does it (which internal methods are called). Do not mock internal methods of the subject under test. Do not assert call counts on implementation details. Tests should not break when the implementation is refactored but the behavior is preserved.' Rule 7 — Mock strategy: 'Mock: external services (API calls, email, payment processors), time (use fake timers for time-dependent logic), and randomness (seed random generators for deterministic tests). Do not mock: the database in integration tests (use a test database), the subject under test, or simple utility functions.'
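
Rules 6 and 7 can be sketched together with plain dependency injection; `registerUser` and `EmailSender` are hypothetical names. The external email service is mocked, the subject under test runs for real, and the assertions target observable behavior rather than internal calls:

```typescript
// The external boundary: an email service we never call for real in tests.
interface EmailSender { send(to: string, body: string): void }

// Subject under test: runs for real (rule 6 — never mock the subject itself).
function registerUser(email: string, mailer: EmailSender): { email: string } {
  const user = { email };
  mailer.send(email, "Welcome!"); // side effect at the external boundary
  return user;
}

// Rule 7: mock only the external service, by recording what crosses the boundary.
const sent: string[] = [];
const fakeMailer: EmailSender = { send: (to) => { sent.push(to); } };

const user = registerUser("alice@test.com", fakeMailer);
console.assert(user.email === "alice@test.com");              // behavior: output
console.assert(sent.length === 1 && sent[0] === "alice@test.com"); // boundary effect
console.log("external boundary mocked, behavior verified");
```

Because the test asserts on the returned user and the message that crossed the boundary, refactoring the internals of `registerUser` cannot break it.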

Rule 8 — Error path testing: 'Every test suite includes error path tests: invalid input (returns validation error), missing resource (returns 404), unauthorized access (returns 401/403), and external dependency failure (service returns error, network timeout). The AI generates error path tests alongside happy path tests — not as an afterthought.' Together, these four assertion and mock rules cover the most common quality gaps in AI-generated tests: specific assertions + behavior testing + smart mocking + error paths = tests that catch real bugs.

⚠️ expect(result).toBeTruthy() Is the Most Common AI Test Flaw

The AI generates expect(result).toBeTruthy(). This passes if result is a user object with the wrong data (truthy), an empty object {} (truthy), the string 'error' (truthy), or the number 42 (truthy). It only fails for null, undefined, 0, '', and false. A test that passes for almost every value catches almost no bugs. Replace it with expect(result.email).toBe('alice@test.com') and expect(result.role).toBe('admin'). Specific assertions catch specific bugs; truthiness checks catch almost nothing.

Step 3: Isolation and Test Data Rules (4 Rules)

Rule 9 — Test isolation: 'Each test: sets up its own data, runs independently, and cleans up after itself. No test depends on another test's output or side effects. Test execution order: must not matter. Tests can run in parallel without interference.' This rule prevents the most common cause of flaky tests: shared state between tests.

Rule 10 — Test data factories: 'Create test data with factory functions: createTestUser({ role: "admin" }). The factory: generates complete, valid objects with sensible defaults. Tests: override only the fields relevant to the test case. Never hardcode the same test data across multiple tests — it creates hidden dependencies.' Rule 11 — No hardcoded waits: 'No sleep(), setTimeout(), or hardcoded delays in tests. Use: explicit waits (waitFor(), waitForSelector()), event-based synchronization, or fake timers. Hardcoded waits: flaky on slow CI, wasteful on fast machines.'
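
Rule 10's factory might be sketched like this; the names are illustrative, and a simple counter stands in for random email generation:

```typescript
// A minimal test-data factory: sensible defaults, a unique email per call,
// and per-test overrides via a Partial.
interface User { email: string; name: string; role: "admin" | "member" }

let seq = 0;
function createTestUser(overrides: Partial<User> = {}): User {
  seq += 1;
  return {
    email: `user${seq}@test.com`, // unique per call: no hidden cross-test data
    name: "Test User",
    role: "member",
    ...overrides, // tests override only the fields they care about
  };
}

const admin = createTestUser({ role: "admin" });
const member = createTestUser();
console.assert(admin.role === "admin");
console.assert(admin.email !== member.email); // no shared identity between tests
console.log("factory produced independent users");
```

Because every call yields a distinct, complete user, two tests can never collide on the same record, regardless of execution order.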

Rule 12 — Deterministic tests: 'Tests produce the same result every time. Use: fixed dates (vi.setSystemTime(new Date("2026-01-01"))), seeded random values, and mocked external services (no real API calls from tests). A test that passes locally but fails in CI: violates this rule.' Together, these four isolation and data rules eliminate flaky tests: isolation + factories + no waits + determinism = a test suite that is reliable in every environment.
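
One framework-free way to satisfy rule 12 is to inject the clock; Vitest's vi.setSystemTime achieves the same effect inside tests. `isExpired` is a hypothetical subject under test:

```typescript
// Determinism via an injected clock: production code passes nothing and gets
// the real time; tests pass a fixed date so every run produces the same result.
function isExpired(expiresAt: Date, now: () => Date = () => new Date()): boolean {
  return now().getTime() > expiresAt.getTime();
}

// Fixed "current time" for the test run.
const fixedNow = () => new Date("2026-01-01T00:00:00Z");

console.assert(isExpired(new Date("2025-12-31T00:00:00Z"), fixedNow) === true);
console.assert(isExpired(new Date("2026-02-01T00:00:00Z"), fixedNow) === false);
console.log("deterministic time checks passed");
```

The same injection pattern works for randomness: pass a seeded generator in tests, the real one in production.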

ℹ️ Factory Functions Eliminate Hidden Test Dependencies

Test A creates user { email: 'test@example.com' }. Test B expects that user to exist. When Test A runs first, both pass; when the tests run shuffled, Test B fails. The dependency is invisible in the code. With factory functions, each test creates its own user with a unique email (createTestUser() generates a random email): no shared data, no hidden dependencies. Tests run in any order, in parallel, without interference.

Testing Ruleset Summary

The complete 12-rule testing ruleset.

  • Strategy (2 rules): test pyramid (unit > integration > E2E), framework and co-located files
  • Naming (2 rules): describe/it convention with behavior descriptions, grouped by category
  • Assertions (2 rules): specific values (not truthiness), test behavior (not implementation)
  • Mocking (2 rules): mock externals only, include error path tests for every suite
  • Isolation (2 rules): independent tests, factory-generated data (no hardcoded values across tests)
  • Reliability (2 rules): no hardcoded waits (explicit waits only), deterministic (fixed dates, seeded random)
  • Impact: AI-generated tests match your team's patterns. No more reviewing wrong-framework tests
  • Verification: prompt 'Write tests for createUser.' Check: naming, assertions, edge cases, factories