Comparisons

TDD vs Test-After: AI Rules for Each Approach

TDD: write the test first, then the implementation. Test-after: write the code first, then the test. Both are valid with AI, but each needs different rules governing the order in which the AI generates code and tests.

7 min read · June 13, 2025

TDD: the test constrains the AI to spec. Test-after: the AI reads its own code for comprehensive coverage. Both valid.

TDD rules, test-after rules, when each works best, the agentic testing loop, and the hybrid approach

Two Valid Testing Workflows with AI

Test-Driven Development (TDD) with AI: describe the behavior you want, ask the AI to write the test first (the test defines the expected behavior), then ask the AI to write the implementation that makes the test pass. The test is the specification; the implementation is whatever makes the spec pass. TDD with AI produces well-defined APIs (the test forces you to think about the interface before the implementation), focused code (the implementation only does what the test requires), and high coverage (every behavior has a test by definition).

Test-after with AI: ask the AI to implement the feature (the code comes first), then ask the AI to write tests for the implementation (the tests verify the code works). Test-after is the more common approach with AI coding, because most developers describe the feature, not the tests. Test-after with AI produces a faster initial implementation (no test overhead during feature development), comprehensive test coverage (the AI can read the implementation and generate tests for every branch and edge case), and a natural agentic flow (Claude Code implements, then tests, then fixes failures: the agentic loop).

Neither is objectively better with AI. TDD produces better API design (the test forces interface thinking). Test-after produces faster feature development (one generation step, not two). This article covers the AI rules for each approach, when each works best, and the prompting strategies that maximize quality for both workflows.

TDD Rules: Test First, Then Implement

TDD AI rule: "When I describe a behavior, write the test first. Do not implement until the test exists. Test file: describe the expected behavior with it() blocks. Use placeholder implementations (throw new Error('not implemented')). After the test: implement the code that makes the test pass. Run the test to verify. If the test fails: fix the implementation, not the test (unless the test has a bug)."

TDD prompting strategy: "I want a function that calculates the total price of a cart with items, applying a percentage discount if a coupon is provided. Write the test first." The AI generates a spec such as: describe('calculateTotal', () => { it('returns sum of item prices with no coupon', ...), it('applies percentage discount when coupon provided', ...), it('returns 0 for empty cart', ...), it('throws for negative discount', ...) }). Then: "Now implement calculateTotal to make these tests pass." The AI generates an implementation that matches the test specification exactly.
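The flow can be sketched dependency-free. The names (CartItem, Coupon, calculateTotal) follow the prompt above but are illustrative; in a real project the assertions would be it() blocks run by a test runner such as Vitest:

```typescript
// Sketch of the TDD flow: the spec (step 1) is written before the
// implementation (step 2), and the implementation does only what the
// spec demands.

interface CartItem { name: string; price: number }
interface Coupon { percentOff: number }

// Step 2: the implementation the AI writes to make the spec pass.
function calculateTotal(items: CartItem[], coupon?: Coupon): number {
  if (coupon && coupon.percentOff < 0) {
    throw new Error("discount cannot be negative");
  }
  const subtotal = items.reduce((sum, item) => sum + item.price, 0);
  return coupon ? subtotal * (1 - coupon.percentOff / 100) : subtotal;
}

// Step 1 (written first in TDD): the test is the specification.
function expect(cond: boolean, label: string): void {
  if (!cond) throw new Error(`spec failed: ${label}`);
}

const items: CartItem[] = [{ name: "a", price: 10 }, { name: "b", price: 20 }];
expect(calculateTotal(items) === 30, "returns sum with no coupon");
expect(calculateTotal(items, { percentOff: 10 }) === 27, "applies percentage discount");
expect(calculateTotal([]) === 0, "returns 0 for empty cart");
let threw = false;
try { calculateTotal(items, { percentOff: -5 }); } catch { threw = true; }
expect(threw, "throws for negative discount");
```

Note that the implementation contains nothing the spec did not ask for: no currency handling, no stacking coupons, no rounding policy. That is the constraint working.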

Why TDD helps AI quality: the test constrains the implementation. Without a test, the AI generates whatever implementation it thinks is best (it may over-engineer or miss edge cases). With a test-first specification, the AI generates the minimum implementation that passes the tests: no over-engineering, and the edge cases are defined by the tests. TDD with AI is specification-driven development. The test is the spec; the AI implements the spec. The result is code that does exactly what was specified, nothing more.

  • Rule: 'Describe behavior, AI writes test first, then implements to pass the test'
  • Prompt: 'I want X behavior. Write the test first.' Then: 'Now implement to pass.'
  • Test constrains implementation: no over-engineering, edge cases defined by tests
  • If test fails: fix implementation, not the test (unless the test has a bug)
  • TDD with AI = specification-driven development: test is the spec, AI implements the spec
💡 Test Constrains the AI to Spec

Without a test: the AI generates whatever implementation it thinks best (may over-engineer, may miss edge cases). With test-first: the AI generates the minimum implementation that passes the tests. No over-engineering, edge cases defined by tests. TDD with AI = specification-driven development.

Test-After Rules: Implement First, Then Test

Test-after AI rule: "When implementing a feature: generate the code first, then generate tests. Tests should: cover all public functions and exported APIs, include happy path (normal usage), edge cases (null, empty, boundary values), error cases (invalid input, missing data), and integration scenarios (how the code interacts with dependencies). Read the implementation to understand all branches and paths, and generate a test for each."

Test-after prompting strategy: "Implement a user settings page with email change, password change, and notification preferences." The AI generates SettingsPage.tsx, SettingsForm.tsx, settings-actions.ts, and the Zod schemas. Then: "Now write comprehensive tests for all the settings components and actions." The AI reads the implementation it just generated, identifies every function, branch, and error path, and generates tests that cover everything. The AI has perfect knowledge of the implementation (it just wrote it), which produces comprehensive tests.

The AI-specific advantage of test-after: the AI reads its own implementation and generates tests that cover every branch and edge case. A human writing tests after the fact may miss branches they forgot about or edge cases they did not consider. The AI reads every line of the implementation and generates tests for every path, so test-after with AI tends to produce higher coverage than test-after with humans: the AI has perfect implementation knowledge. The trade-off: the tests verify the implementation, not a specification. If the implementation is wrong, the tests may verify wrong behavior.
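As a concrete sketch, consider a small settings action with four code paths. A test-after pass reads the implementation and emits one check per branch. updateEmail, its validation order, and its error strings are all invented for illustration:

```typescript
// Hypothetical test-after target: updateEmail has four paths (empty
// input, bad format, unchanged address, success).

type Result = { ok: true } | { ok: false; error: string };

function updateEmail(current: string, next: string): Result {
  if (next.trim() === "") return { ok: false, error: "email required" };
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(next)) return { ok: false, error: "invalid email" };
  if (next === current) return { ok: false, error: "email unchanged" };
  return { ok: true };
}

// Test-after: reading the implementation, one check per branch.
function check(cond: boolean, label: string): void {
  if (!cond) throw new Error(`branch not covered: ${label}`);
}

check(!updateEmail("a@b.co", "   ").ok, "empty input branch");
check(!updateEmail("a@b.co", "not-an-email").ok, "format branch");
check(!updateEmail("a@b.co", "a@b.co").ok, "unchanged branch");
check(updateEmail("a@b.co", "new@b.co").ok, "success branch");
```

The trade-off is visible here too: if the unchanged-address check were a bug rather than intended behavior, the generated test would faithfully lock the bug in.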

  • Rule: 'Generate code first, then generate tests. Tests cover all branches and edge cases'
  • Prompt: 'Implement feature X.' Then: 'Write comprehensive tests for all the code you just generated.'
  • AI advantage: reads its own code, generates tests for every branch. Perfect implementation knowledge
  • Coverage: AI test-after produces higher coverage than human test-after (misses fewer branches)
  • Trade-off: tests verify the implementation, not a specification (wrong code = tests verify wrong behavior)
โ„น๏ธ AI Has Perfect Implementation Knowledge

Human writing tests-after: may miss branches they forgot or edge cases they did not consider. AI writing tests-after: reads every line of its own implementation, identifies every branch and error path, generates tests for each. Test-after with AI produces: higher coverage than test-after with humans.

When Each Approach Works Best with AI

TDD works best with AI when: the API design matters (the test forces you to define the interface before the AI implements it, so it suits reusable libraries, public APIs, and shared components), the behavior is well-defined (you can specify the expected inputs and outputs before implementation: utility functions, validators, transformers), or you want to prevent over-engineering (the test defines the scope, and the AI implements only what is needed). TDD with AI is slower (two generation steps) but produces better-designed, more focused code.

Test-after works best with AI when: you are building features quickly (implement first, test second is the natural flow for feature development), the AI is in agentic mode (Claude Code: implement → test → fix failures → iterate, the agentic loop), the implementation is exploratory (you are not sure what the API should look like until you see the code), or you want maximum coverage with minimal effort (the AI generates comprehensive tests from its own implementation). Test-after with AI is faster and produces comprehensive tests, though the tests may not drive better API design.

The hybrid approach: TDD for critical API boundaries (the function signature and behavior are specified by tests) plus test-after for implementation details (the AI implements and then tests the internals). Example: TDD the public API of a service (define what it accepts and returns), then test-after the internal helper functions (the AI implements and tests them). The hybrid gives you specification-driven public APIs plus comprehensive internal coverage. Many teams use TDD for libraries and test-after for features.
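A minimal sketch of the hybrid split, with invented names: createPost is the TDD-specified public boundary (its spec existed before the code), and slugify is an internal helper that was implemented first and tested after:

```typescript
// Hybrid: the public API is pinned by a spec written first; the internal
// helper is implemented, then tested by reading its code.

// Internal helper (test-after): lowercase, collapse non-alphanumeric runs
// to single hyphens, strip leading/trailing hyphens.
function slugify(title: string): string {
  return title.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-|-$/g, "");
}

// Public boundary (TDD): accepts a title, returns the post shape.
function createPost(title: string): { title: string; slug: string } {
  return { title, slug: slugify(title) };
}

function check(cond: boolean, label: string): void {
  if (!cond) throw new Error(`failed: ${label}`);
}

// TDD spec for the public API, written before createPost existed:
check(createPost("Hello World").slug === "hello-world", "public API spec");

// Test-after for the internal helper, written by reading slugify:
check(slugify("  A  B  ") === "a-b", "whitespace runs collapse");
check(slugify("already-slug") === "already-slug", "idempotent on slugs");
```

If slugify is later rewritten, its test-after tests get rewritten with it; the TDD spec on createPost stays fixed and keeps the boundary honest.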

  • TDD: API design matters, behavior is well-defined, prevent over-engineering. Slower, better design
  • Test-after: feature velocity, agentic workflow, exploratory code, maximum coverage. Faster, comprehensive
  • Hybrid: TDD for public API boundaries, test-after for internal implementation. Best of both
  • Agentic loop: implement → test → fix → iterate = natural test-after flow for Claude Code
  • Libraries: TDD (API design critical). Features: test-after (velocity critical). Both: valid with AI

Rules for AI Agentic Testing Workflows

Claude Code agentic testing rule: "After implementing a feature: run existing tests to verify nothing is broken. Then: generate tests for the new code. Then: run all tests. If any test fails: read the error, determine whether the code or the test is wrong, fix the correct one, and re-run. Iterate until all tests pass. Do not mark the task complete until all tests pass."

The agentic testing loop: the AI implements code (following CLAUDE.md rules), generates tests (following testing rules), runs tests (pnpm test), reads failures (error output), fixes the issue (code bug or test bug), and re-runs. This loop produces code with verified correctness. The rule ensures the AI does not skip testing, does not mark incomplete work as done, and iterates until the tests actually pass. Without the rule, the AI may generate tests but not run them, and the tests may have syntax errors or false assumptions.
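The loop is driven by the agent rather than a script, but its control flow can be sketched in shell. Here run_tests is a stand-in for pnpm test, hard-coded to fail twice to simulate a bug the agent fixes on each iteration; the attempt cap is an invented safeguard, not part of any tool:

```shell
#!/bin/sh
# Sketch of the agentic testing loop: run tests, on failure apply a fix,
# re-run, and give up after a bounded number of attempts.

attempts=0
fixes=0

run_tests() {
  # stand-in for `pnpm test`: passes only after two fixes have landed
  [ "$fixes" -ge 2 ]
}

until run_tests; do
  attempts=$((attempts + 1))
  if [ "$attempts" -gt 5 ]; then
    echo "giving up after 5 attempts"
    exit 1
  fi
  echo "run $attempts failed; reading errors and applying a fix"
  fixes=$((fixes + 1))
done

echo "all tests pass after $attempts fix iterations"
```

The bound matters: without it, an agent that keeps "fixing" the wrong thing loops forever instead of surfacing the failure to a human.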

The coverage rule for agentic mode: "After generating tests: check coverage (pnpm test:coverage). New code should have minimum 80% line coverage. If under 80%: identify uncovered lines and add tests for them. Do not add trivial tests to reach coverage (testing that a constructor sets a property is not valuable); add tests for uncovered logic paths and error handling." This rule prevents the AI from gaming coverage with trivial tests while ensuring meaningful coverage of new code.
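The 80% floor can also be enforced mechanically rather than by prompt alone. A sketch of a vitest.config.ts (thresholds syntax as in recent Vitest versions; treat the exact option names as something to verify against your Vitest release) makes pnpm test:coverage itself fail when coverage drops below the bar:

```typescript
// Hypothetical vitest.config.ts: the coverage run exits non-zero when
// thresholds are not met, so the agentic loop cannot mark the task done.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      thresholds: {
        lines: 80,
        branches: 80,
      },
    },
  },
});
```

With the threshold in the config, "do not mark complete until tests pass" automatically covers the coverage rule too, since a below-threshold run is a failing run.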

  • Agentic rule: implement → test → run → fix failures → re-run → all pass before marking done
  • The AI must RUN tests, not just generate them โ€” generated tests may have syntax errors
  • Fix the correct thing: code bug = fix code. Test bug = fix test. Determine which is wrong
  • Coverage: minimum 80% on new code. No trivial tests for coverage gaming
  • Do not mark complete until all tests pass: the agentic loop ensures verified correctness
โš ๏ธ Generate Tests AND Run Them

Without the agentic rule, the AI generates tests but does not run them, and the tests may have syntax errors or false assertions. The rule: 'Run all tests. If any fail: fix code or test. Iterate until all pass. Do not mark complete until tests pass.' Generated-but-not-run tests provide false confidence.

Testing Workflow Summary

Summary of TDD vs test-after AI rules.

  • TDD: test first, then implement. The test is the specification. Better API design, focused code
  • Test-after: implement first, then test. AI reads its own code for comprehensive coverage
  • TDD with AI: slower but prevents over-engineering. Specification-driven development
  • Test-after with AI: faster, higher coverage (AI has perfect implementation knowledge)
  • AI advantage of test-after: reads every branch and generates tests for each (humans miss branches)
  • Hybrid: TDD for public APIs (design matters), test-after for internals (velocity matters)
  • Agentic: implement → test → run → fix → re-run. All tests must pass before task is complete
  • Coverage: 80% minimum on new code. Meaningful tests, not trivial coverage gaming