Rule Compliance Check (Items 1-4)
Item 1: Error handling matches the project pattern. Does the AI-generated code use the project's error handling convention (Result pattern, typed exceptions, or error codes)? AI tools occasionally fall back to generic try-catch even when rules specify alternatives. Check every error boundary. Item 2: Import and export conventions followed. Named exports vs default exports, import ordering, barrel file usage – the conventions most commonly violated because AI tools have strong defaults that may conflict with your rules.
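To make item 1 concrete, here is a minimal sketch of the kind of convention the check looks for. The `Result` type and `parsePort` function are hypothetical examples, not from any particular project:

```typescript
// Hypothetical Result pattern (item 1): errors are values, not thrown exceptions.
type Result<T, E = string> =
  | { ok: true; value: T }
  | { ok: false; error: E };

// Convention-compliant: the caller is forced to handle the failure case.
function parsePort(input: string): Result<number> {
  const port = Number(input);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return { ok: false, error: `invalid port: ${input}` };
  }
  return { ok: true, value: port };
}

// The violation this check catches: the AI falling back to a thrown generic
// Error inside try-catch, even though the rules specify the Result pattern.
```

The review question is mechanical: does every error boundary in the generated code return this shape, or did the AI fall back to its default?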
Item 3: Naming conventions applied consistently. Variable naming (camelCase vs snake_case), file naming (kebab-case vs PascalCase), function naming (verb-first: getUser, createOrder). AI tools generally follow naming rules well, but check compound names and acronyms (userId vs userID, apiUrl vs apiURL). Item 4: File structure and organization. Is the generated code in the correct directory? Does it follow the project's module structure (vertical slices, layer-based, feature-based)? AI tools sometimes create files in incorrect locations when the rules do not specify the exact path pattern.
Why rule compliance is checked first: if the AI-generated code violates basic conventions, the review should stop. Request a regeneration with explicit rule references rather than manually fixing convention violations. Manual convention fixes: time-consuming and error-prone. Regeneration with clearer rules: produces compliant code and improves the rules for future generation. AI rule: 'Rule compliance is the gate check. If conventions are wrong: regenerate. If conventions are right: proceed to deeper review. Skipping the gate: leads to merged code with convention violations that propagate to future AI-generated code.'
Business Logic Verification (Items 5-8)
Item 5: Does the code solve the actual problem? AI-generated code can be syntactically correct and convention-compliant but solve the wrong problem. The AI: interpreted the prompt literally. The intent: something different. Read the code with the user story in mind. Does the notification endpoint actually send notifications, or just store them? Does the search function handle partial matches, or only exact matches? The business logic: the most important review item because the AI cannot verify business correctness.
Item 6: Are all edge cases handled? AI tools tend to implement the happy path thoroughly and handle obvious error cases, but miss subtle edge cases. Check: empty collections (what happens when the user has zero notifications?), boundary values (what happens at exactly the rate limit?), null/undefined (what happens when an optional field is missing?), concurrent access (what happens when two requests modify the same resource simultaneously?). The edge cases: where AI-generated code most often fails.
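The boundary-value check in item 6 can be illustrated with a hypothetical rate limiter (the function name and limit semantics below are assumptions for the sketch):

```typescript
// Hypothetical rate-limit check (item 6). The happy path and the obvious
// cases behave the same either way; the review question is what happens
// at EXACTLY the limit: `<` rejects the request that reaches the limit,
// `<=` would allow it.
function isAllowed(requestsThisWindow: number, limit: number): boolean {
  return requestsThisWindow < limit;
}
```

Whether `<` or `<=` is correct depends on the spec, which is exactly why the AI cannot decide it and the reviewer must.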
Item 7: Is the data flow correct? Trace the data from input to output. Are transformations correct? Is data validation applied at the right boundary (API input, not after database write)? Are types preserved through the flow (no accidental string-to-number conversions)? Item 8: Are the tests testing the right things? AI-generated tests often test that the function was called (implementation detail) rather than that the function produces the correct output (behavior). Check: do tests verify behavior, not implementation? Do tests cover the edge cases identified in item 6? AI rule: 'Business logic verification is where human review adds the most value. The AI: handles syntax, conventions, and patterns (with rules). The human: verifies that the code solves the business problem. This division: makes both the AI and the reviewer more effective.'
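The behavior-vs-implementation distinction in item 8 can be sketched with a hypothetical discount function:

```typescript
// Hypothetical function under test for item 8.
function applyDiscount(total: number, percent: number): number {
  return Math.round(total * (1 - percent / 100) * 100) / 100;
}

// Behavior test (what the reviewer wants): asserts the OUTPUT for a known
// input. Survives refactoring; fails only when the result is actually wrong.
console.assert(applyDiscount(200, 10) === 180);

// Implementation test (what AI-generated tests often do): asserting that an
// internal helper or mock was called. It can pass even when the math is
// wrong, and it breaks on every refactor, so it verifies nothing the user
// cares about.
```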
AI-generated code typically handles the happy path perfectly and obvious errors (404 not found, 401 unauthorized) correctly. Where it fails: subtle edge cases. Empty collection (user has 0 items – does the UI show an empty state or crash?). Boundary value (exactly at the rate limit – does it allow or reject?). Concurrent access (two users edit the same resource – does it merge, overwrite, or error?). The reviewer's highest-value activity: systematically checking every edge case the AI did not consider.
Security Verification (Items 9-11)
Item 9: Input validation present at every boundary. Check: API endpoints validate all input parameters. Form handlers sanitize user input. File upload handlers verify MIME type, size, and filename. URL parameters are validated before use. The AI: should add validation if security rules exist. If validation is missing: check whether the security rules are specific enough. 'Validate input' is too vague. 'All API endpoints validate input with zod schemas before processing' is specific enough for the AI to follow.
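A minimal sketch of boundary validation for item 9. The example rule names zod; for self-containment this sketch hand-rolls the guard, and the input shape is hypothetical:

```typescript
// Hypothetical API input for item 9. In a project following the example rule,
// this would be a zod schema; a hand-rolled guard keeps the sketch
// dependency-free.
type CreateUserInput = { email: string; age: number };

function validateCreateUser(body: unknown): CreateUserInput | null {
  if (typeof body !== "object" || body === null) return null;
  const { email, age } = body as Record<string, unknown>;
  if (typeof email !== "string" || !email.includes("@")) return null;
  if (typeof age !== "number" || !Number.isInteger(age) || age < 0) return null;
  return { email, age };
}
```

The review check is placement as much as presence: this guard must run at the API boundary, before any processing, not after the data has already been used.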
Item 10: Authentication and authorization applied correctly. Check: all protected routes include auth middleware. Role checks use the correct required role (not just 'any authenticated user'). Data queries filter by the current user's tenant/organization. Admin endpoints verify admin role. The AI: applies auth middleware if the rule specifies it. The reviewer: verifies that the correct auth LEVEL is applied (is this endpoint admin-only or any-user?).
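The "correct auth LEVEL" check in item 10, sketched with a hypothetical two-role model:

```typescript
// Hypothetical role model for item 10. Middleware presence is easy to verify;
// the reviewer's job is verifying the REQUIRED role per endpoint.
type Role = "user" | "admin";

function canAccess(userRole: Role, requiredRole: Role): boolean {
  if (requiredRole === "admin") return userRole === "admin";
  return true; // "user" level: any authenticated role is sufficient
}

// The bug this check catches: an admin endpoint wired with the generic auth
// middleware, so canAccess(..., "user") is effectively what runs.
```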
Item 11: No sensitive data exposed. Check: API responses do not include password hashes, internal IDs, or system metadata. Error messages do not reveal stack traces or database details. Logging does not record sensitive user data. The AI: may include debug information that should not reach production. The reviewer: checks every response shape and every log statement for sensitive data leakage. AI rule: 'Security verification in AI-generated code focuses on completeness and correctness. The AI applies security patterns (from rules). The reviewer verifies: the patterns are applied everywhere they should be, and the specific security levels are correct for each endpoint.'
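Item 11 is easiest to enforce when responses are built explicitly rather than returned straight from the database row. A hypothetical sketch:

```typescript
// Hypothetical database row and public shape for item 11. Returning `row`
// directly would leak the hash and internal metadata; an explicit serializer
// makes leakage impossible by construction.
type UserRow = {
  id: number;
  email: string;
  passwordHash: string;
  internalFlags: string[];
};

function toPublicUser(row: UserRow): { id: number; email: string } {
  return { id: row.id, email: row.email };
}
```

The reviewer applies the same pattern check to every response shape and every log statement.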
AI prompt: 'Fetch all orders with their items.' AI output:

```typescript
const orders = await db.query.orders.findMany();
for (const order of orders) {
  order.items = await db.query.items.findMany({
    where: eq(items.orderId, order.id),
  });
}
```

For 100 orders: 101 database queries (1 for orders + 100 for items). Correct approach:

```typescript
const orders = await db.query.orders.findMany({ with: { items: true } });
```

One query with a join. The AI generated correct code (it works). The AI generated slow code (N+1). The reviewer: checks every loop that contains a database call.
Performance and Quality (Items 12-15)
Item 12: No N+1 queries. AI-generated database code frequently introduces N+1 patterns: fetch a list, then loop and fetch related data for each item. Check: are related data loaded in a single query (join or include)? Are list endpoints paginated? Are frequently accessed queries using appropriate indexes? The N+1 pattern: the most common performance issue in AI-generated code because the AI generates the simplest correct implementation, which is often the least performant.
Item 13: No unnecessary re-renders (frontend). AI-generated React/Vue components may trigger excessive re-renders through: inline function props, object literals in JSX, missing useMemo/useCallback, or state stored at too high a level. Check: are expensive computations memoized? Are callback props stable (not recreated on each render)? Is state local to the component that uses it? Item 14: No resource leaks. Check: event listeners cleaned up in useEffect return. Database connections released after use. File handles closed. Timers and intervals cleared. The AI: creates resources but may forget cleanup in some code paths.
Item 15: Code is appropriately simple. AI-generated code sometimes over-engineers: unnecessary abstractions, premature generalization, or patterns more complex than the problem requires. Check: is there a simpler way to achieve the same result? Are there abstractions that have only one implementation? Are there utility functions that are called only once? Simple code: easier to maintain, easier to debug, and less likely to contain hidden bugs. AI rule: 'Performance and quality review catches the issues that rules cannot prevent: N+1 queries (context-dependent), unnecessary complexity (judgment-dependent), and resource leaks (path-dependent). These require human review because the AI cannot evaluate them without understanding the full execution context.'
AI prompt: 'Create a function to format a user's name.' AI output: a NameFormatter class with a Strategy pattern, a FormatterRegistry, a NameParts interface, and 3 formatting strategies. Total: 80 lines. The actual need:

```typescript
function formatName(first: string, last: string): string { return first + ' ' + last; }
```

Total: 1 line. The AI: generated an extensible, pattern-rich solution. The project: needed a one-liner. Review item 15 (appropriate simplicity): catches this pattern. The question: 'Is there a simpler way?' often has the answer 'yes.'
Code Review Checklist Quick Reference
The complete 15-item AI code review checklist.
- Items 1-4 (Rule compliance): error handling, imports, naming, file structure – gate check, regenerate if wrong
- Item 5: Does code solve the actual problem? Read with user story in mind
- Item 6: Edge cases – empty collections, boundaries, null/undefined, concurrency
- Item 7: Data flow – trace input to output, verify transformations and types
- Item 8: Test quality โ tests verify behavior not implementation, cover edge cases
- Item 9: Input validation at every boundary – zod schemas, sanitization, upload checks
- Item 10: Auth applied correctly – right middleware, right role level, tenant filtering
- Item 11: No sensitive data exposed – check responses, errors, and logs
- Item 12: No N+1 queries – joins/includes instead of loops, pagination on lists
- Item 13: No unnecessary re-renders – stable props, memoization, local state
- Items 14-15: No resource leaks, appropriate simplicity – cleanup paths, no over-engineering