AI Rules for SQL and Database Queries

Why SQL Rules Are the Highest-ROI AI Rules You Can Write

SQL is where AI coding rules have the highest security and performance ROI. A single AI-generated query with string interpolation creates an injection vulnerability. A single missing index on a frequently queried column creates a performance bottleneck that grows linearly with your data. And AI assistants generate both of these problems constantly.

The root cause is that AI optimizes for correctness, not security or performance. A query that concatenates user input into a WHERE clause is correct — it returns the right results. It's also a textbook SQL injection vector. The AI doesn't distinguish between 'works' and 'works safely' without explicit rules.

These rules apply regardless of your ORM, language, or database. Whether you use Drizzle, Prisma, SQLAlchemy, ActiveRecord, or raw SQL — the principles are universal. The implementation varies but the rules don't.

Rule 1: Parameterized Queries — No Exceptions

The rule: 'Every database query that includes external data must use parameterized statements or your ORM's query builder. Never construct SQL strings with template literals, string concatenation, f-strings, or any form of string interpolation. If the ORM can't express the query, use raw parameterized queries with placeholder syntax ($1, ?, :param). This rule has zero exceptions.'

What the AI generates without this rule: `SELECT * FROM users WHERE email = '${email}'`. What it should generate: `SELECT * FROM users WHERE email = $1` with `[email]` as parameters. The difference is the entire SQL injection vulnerability class.

For ORMs: 'Use the ORM's query builder for all standard operations. The ORM parameterizes automatically. Only drop to raw SQL when the ORM genuinely can't express the query — and when you do, use the ORM's raw query method with parameter binding, not string construction.'

⚠️ Zero Exceptions

The parameterized query rule has zero exceptions. Not 'unless it's an internal tool.' Not 'unless the input is validated.' Always parameterize. The cost is zero and the protection is complete.

Rule 2: Prevent N+1 Queries

N+1 queries are the most common performance problem in AI-generated code that interacts with databases. The pattern: fetch a list of items (1 query), then for each item, fetch related data (N queries). The page loads fine with 10 items and crawls with 1,000.

The rule: 'Never query the database inside a loop. For related data, use JOIN or your ORM's eager loading (include/preload/with). When you need data from multiple tables, fetch it in as few queries as possible — one query with a JOIN or two queries with an IN clause, never N queries in a loop.'

For ORMs specifically: 'Drizzle: use .with() for relations. Prisma: use include: {}. SQLAlchemy: use joinedload() or selectinload(). ActiveRecord: use includes(). Django: use select_related() for ForeignKey, prefetch_related() for ManyToMany. The syntax differs but the pattern is the same — load related data in the initial query, not lazily.'

Never query inside a loop — use JOINs or eager loading
Drizzle: .with() | Prisma: include: {} | SQLAlchemy: joinedload()
ActiveRecord: includes() | Django: select_related() / prefetch_related()
Use IN clauses for batch lookups — WHERE id IN ($1, $2, $3)
Monitor with query logging in development — catch N+1 before production

💡 Never Query in a Loop

If you see a database call inside a for/forEach/map, it's an N+1. Replace with a JOIN, eager load, or IN clause. This one pattern causes more production performance issues than any other.

Rule 3: Indexing Strategy

AI assistants create queries but almost never suggest indexes. They generate a WHERE clause on a column that has no index, and the query works perfectly on your 100-row dev database but table-scans on your million-row production database.

The rule: 'Every column used in WHERE, JOIN ON, ORDER BY, or GROUP BY should have an index unless the table is small (<1000 rows). Create composite indexes for queries that filter on multiple columns — column order matters (most selective first). Add unique indexes for columns with uniqueness constraints. Never create an index on every column — indexes slow writes and consume storage.'

For migrations: 'When adding a new query pattern, add the supporting index in the same migration. When reviewing AI-generated queries, always ask: is there an index on the filtered/joined columns? Use EXPLAIN ANALYZE to verify query plans before deploying new queries.'

Rule 4: Migration Safety

AI assistants generate migrations that work on an empty database but can lock tables or cause downtime on production databases with millions of rows. Adding a column with a default value, creating an index without CONCURRENTLY, and renaming columns are all operations that can lock tables for minutes.

The rule: 'All schema changes go through migrations — never manual DDL. For PostgreSQL: use CREATE INDEX CONCURRENTLY for indexes on large tables. Add new columns as nullable first, then backfill, then add the NOT NULL constraint. Never rename or drop columns in the same deployment that stops reading them — use a two-phase deployment.'

For rollback safety: 'Every migration should be reversible. Write both up and down migrations. Test migrations on a copy of production data before deploying. Never write a migration that can't be rolled back without data loss.'

All schema changes through migrations — never manual DDL
CREATE INDEX CONCURRENTLY for large tables (PostgreSQL)
New columns: nullable first → backfill → add NOT NULL constraint
Two-phase deployment for column renames/drops
Every migration must be reversible — test up and down

ℹ️ Migration Safety

Adding a NOT NULL column with a default locks the table on large PostgreSQL tables. Pattern: add nullable → backfill → add constraint. Test migrations on a production data copy first.

Rule 5: Query Patterns and Anti-Patterns

The rule: 'Use COUNT(*) not COUNT(column) unless you specifically need to count non-NULL values. Use EXISTS instead of COUNT > 0 for existence checks — it short-circuits. Use LIMIT for pagination, never fetch all rows and paginate in application code. Use transactions for multi-step operations that must be atomic. Never use SELECT * in production queries — select only the columns you need.'

For bulk operations: 'Use INSERT ... ON CONFLICT for upserts. Use batch INSERT with multiple value sets, not individual INSERT in a loop. Use UPDATE ... FROM for bulk updates based on a subquery. Use DELETE with a subquery or CTE, not DELETE in a loop.'

For connection management: 'Use connection pooling in production (PgBouncer, built-in ORM pooling). Set query timeouts to prevent runaway queries. Use read replicas for heavy read operations. Never hold a database connection across an HTTP request boundary — release connections promptly.'

Complete SQL Rules Template

Consolidated SQL rules that apply to any language, ORM, or database.

Parameterized queries only — zero exceptions, never string interpolation
No queries inside loops — JOIN or eager load for related data
Index every WHERE/JOIN/ORDER BY column on tables >1000 rows
Migrations for all schema changes — reversible, tested on production data copy
CREATE INDEX CONCURRENTLY on large tables — nullable first for new columns
SELECT specific columns — never SELECT * in production
EXISTS over COUNT for existence — LIMIT for pagination — transactions for atomicity
Connection pooling — query timeouts — read replicas for heavy reads