The AI Testing Problem
Ask an AI to 'write tests for this function' and you'll get one of two extremes. Either the tests are trivially shallow — testing that a function returns its input, that true is truthy, that an empty array has length 0 — or they're painfully over-mocked, replacing every dependency with a mock so the test verifies nothing about real behavior.
Neither extreme is useful. Shallow tests give false confidence. Over-mocked tests break on every refactor without catching real bugs. What you want is meaningful tests that verify behavior through realistic scenarios — and the AI can write these, but only with the right rules.
Rule 1: What to Test (and What Not To)
The most impactful testing rule is defining what deserves a test. Without this rule, the AI either tests everything (including trivial getters and simple pass-through functions) or tests nothing beyond the happy path.
The rule: 'Write tests for business logic, data transformations, and edge cases. Do not test framework behavior, simple property access, or functions that only delegate to a single dependency. Test the behavior, not the implementation — tests should verify what the function does, not how it does it.'
Add specific guidance for your project: 'Test all functions in src/services/ and src/lib/. Do not test React component rendering unless it contains conditional logic. Test API route handlers through HTTP-level integration tests, not unit tests.'
AI tests are either trivially shallow (testing that true is true) or painfully over-mocked (replacing every dependency). Rules push the AI toward the useful middle ground: meaningful behavior tests.
Rule 2: Mock Boundaries, Not Internals
AI assistants default to mocking everything because it's the safe choice — mocked tests never fail due to environment issues. But they also never catch integration bugs, which are the bugs that matter most.
The rule: 'Mock external boundaries only: HTTP APIs, third-party services, and time-dependent operations. Never mock your own code (internal functions, services, repositories). For database tests, use a real test database — not mocked queries. Use dependency injection to make boundaries replaceable, not to mock internals.'
This rule prevents the 'all green, nothing works' anti-pattern where tests pass because they're testing mocks, not code.
Mock external services and APIs. Never mock your own code — it creates tests that pass while the real integration is broken. Use a real test database, not mocked queries.
Rule 3: Test File Organization
How tests are organized affects whether the AI generates tests at all. If your project has a clear pattern, the AI follows it. If there's no pattern, the AI invents one — and it'll be different every time.
The rule: 'Place test files adjacent to source files with a .test.ts suffix (or your language's convention: a _test.go suffix, a test_*.py prefix). Use describe blocks for the function/class being tested. Use it/test blocks for specific behaviors. Name tests as sentences: "returns user when email exists", not "test1" or "should work".'
'returns 404 when user not found' tells you exactly what broke. 'test1' or 'should work' tells you nothing. Sentence-style test names are self-documenting.
Testing Rules Template
Consolidated testing rules for any project. Adapt the framework-specific references to your stack.
- Test business logic and edge cases — not trivial wrappers or framework behavior
- Mock external boundaries only — never mock your own functions or database queries
- Test files adjacent to source with .test.ts suffix — one test file per source file
- Describe/it structure with sentence-style names: 'returns 404 when user not found'
- Integration tests for API routes — test through HTTP, not by calling handler directly
- No snapshot tests for dynamic content — only for stable UI components
- Test error paths explicitly — every try/catch should have a test that triggers the catch
- Use factories for test data — never hardcode user objects across multiple tests