AI Maps Objects to Tables Without Thinking
AI generates data models with: one-to-one class-to-table mapping (every TypeScript interface becomes a database table regardless of relationships), no cardinality analysis (is it one-to-one, one-to-many, or many-to-many? AI does not ask), no polymorphic strategy (different entity types crammed into one table or spread across many with no clear pattern), hard deletes (DELETE FROM users — data gone, audit trail broken, foreign keys cascading), and no temporal awareness (only current state stored, history lost on every UPDATE).
Modern data modeling is: relationship-aware (ER diagrams before schema, cardinality defined and constrained), polymorphism-patterned (single-table inheritance, class-table inheritance, or discriminated unions — chosen intentionally), soft-delete-capable (deleted_at timestamp, filtered by default, restorable), temporally-aware (valid_from/valid_to for historical queries, or event sourcing for full history), and evolution-planned (additive changes, backward compatibility, migration-first). AI generates none of these.
These rules cover: entity-relationship modeling, cardinality constraints, polymorphic data patterns, soft delete strategies, temporal data modeling, and schema evolution planning.
Rule 1: Entity-Relationship Analysis Before Schema
The rule: 'Draw an ER diagram before writing any CREATE TABLE. Identify: entities (nouns — User, Order, Product), relationships (verbs — User places Order, Order contains Product), cardinality (one User has many Orders, many Products belong to many Categories), and attributes (which fields belong to which entity — not which class they were in). The ER diagram is the blueprint; the schema implements it.'
For cardinality notation: 'One-to-one (1:1): User has one Profile — consider merging into one table unless they change at different rates. One-to-many (1:N): User has many Orders — foreign key on the many side (orders.user_id). Many-to-many (M:N): Product belongs to many Categories, Category has many Products — junction table (product_categories with product_id + category_id). Every relationship has a cardinality; every cardinality has a schema pattern.'
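The M:N junction-table pattern can be sketched library-free in TypeScript, modeling the junction as an array of rows. The names (productCategories, categoriesFor) are illustrative, not from any real schema:

```typescript
// A row of the product_categories junction "table": one (product, category) pair.
type ProductCategory = { productId: number; categoryId: number };

const productCategories: ProductCategory[] = [
  { productId: 1, categoryId: 10 },
  { productId: 1, categoryId: 20 },
  { productId: 2, categoryId: 10 },
];

// Equivalent of: SELECT category_id FROM product_categories WHERE product_id = ?
function categoriesFor(productId: number): number[] {
  return productCategories
    .filter((pc) => pc.productId === productId)
    .map((pc) => pc.categoryId);
}
```

The same pair of IDs forms the composite primary key in the real table; the in-memory version makes the shape of the relationship visible without any database.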
AI generates: tables based on the TypeScript interfaces with no relationship analysis. A User interface has an orders array, so AI creates a users table with a JSON orders column instead of a separate orders table with a foreign key — exactly the mistake an ER diagram (User 1:N Order) would have caught.
- ER diagram before schema: entities, relationships, cardinality, attributes
- 1:1 — consider merging unless entities change at different rates
- 1:N — foreign key on the many side (orders.user_id references users.id)
- M:N — junction table (product_categories with composite primary key)
- JSON columns for truly unstructured data — not for relationships
AI creates a users table with a JSON orders column instead of a separate orders table with a foreign key. An ER diagram would catch this: User 1:N Order requires a separate table. Five minutes of diagramming prevents five hours of schema refactoring.
Rule 2: Polymorphic Data Patterns
The rule: 'When multiple entity types share common fields but have type-specific fields, choose a polymorphic pattern. Single-Table Inheritance (STI): one table, discriminator column (type), nullable type-specific columns. Best for: few types, mostly shared fields. Class-Table Inheritance (CTI): base table with shared fields, type-specific tables with foreign key to base. Best for: many type-specific fields, frequent type-specific queries. Discriminated union in application: base type with type field, type narrowing in code.'
For STI example: 'A notifications table with: id, user_id, type (email, sms, push), message, email_subject (null for sms/push), phone_number (null for email/push), device_token (null for email/sms). Three types in one table. Queries: all notifications for a user (one query, no joins), all email notifications (WHERE type = 'email'). Trade-off: nullable columns for type-specific fields, wasted space, but simple queries and no joins.'
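The STI rows above map naturally onto a TypeScript discriminated union, where the `type` column becomes the discriminator field. A minimal sketch, with field names assumed from the column names in the text:

```typescript
// Discriminated union mirroring the STI notifications table.
// (Named AppNotification to avoid clashing with the DOM's global Notification.)
type AppNotification =
  | { id: number; userId: number; type: 'email'; message: string; emailSubject: string }
  | { id: number; userId: number; type: 'sms'; message: string; phoneNumber: string }
  | { id: number; userId: number; type: 'push'; message: string; deviceToken: string };

// Type narrowing: inside each case, only that variant's fields are visible,
// so the compiler enforces what the nullable columns only suggest.
function routingKey(n: AppNotification): string {
  switch (n.type) {
    case 'email': return n.emailSubject;
    case 'sms': return n.phoneNumber;
    case 'push': return n.deviceToken;
  }
}
```

Adding a new notification type means adding one union member; the compiler then flags every switch that has not handled it.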
AI generates: separate email_notifications, sms_notifications, and push_notifications tables with duplicated shared columns. Querying all notifications for a user requires UNION across three tables. Adding a new notification type means a new table and updating every query. With STI: add a new type value and its specific columns. One table, one query, extensible without schema restructuring.
Rule 3: Soft Delete Patterns
The rule: 'Use soft deletes for data that may need recovery, has audit requirements, or is referenced by other records. Pattern: add deleted_at TIMESTAMP DEFAULT NULL to the table. Active records: WHERE deleted_at IS NULL. Deleted records: WHERE deleted_at IS NOT NULL. Restore: UPDATE SET deleted_at = NULL. The application default query filter excludes soft-deleted records. Admin queries can include them explicitly.'
For implementation: 'ORM/query builder level: create a default scope that filters deleted_at IS NULL. Drizzle: create a wrapper function: const activeUsers = () => db.select().from(users).where(isNull(users.deletedAt)). Every query uses activeUsers() instead of db.select().from(users). Deleted records are invisible by default but accessible when needed. Foreign key considerations: soft-deleted parents should not cascade — the child records remain active and may reference the deleted parent.'
AI generates: DELETE FROM users WHERE id = 123 — the user is gone. Their orders reference a non-existent user. Their activity history is lost. Compliance requires: who deleted what, when, and the ability to restore. Soft delete: the user record stays with deleted_at set. Orders still reference a valid row. History is preserved. Restoration is one UPDATE away.
- deleted_at TIMESTAMP DEFAULT NULL — null means active, non-null means deleted
- Default query filter: WHERE deleted_at IS NULL on every read query
- Restore: UPDATE SET deleted_at = NULL — one query recovery
- Index on deleted_at for efficient filtering of active records
- Hard delete only for: GDPR erasure, test data cleanup, truly ephemeral data
DELETE FROM users WHERE id = 123: the user is gone, orders reference nothing, activity history is lost, compliance is violated. Soft delete (deleted_at timestamp): record stays, references valid, history preserved, restoration is one UPDATE. Hard delete only for GDPR erasure.
Rule 4: Temporal Data Modeling
The rule: 'For data that changes over time and where history matters, use temporal columns: valid_from TIMESTAMP NOT NULL, valid_to TIMESTAMP DEFAULT NULL (null means current). A price change: INSERT new row with valid_from = now, UPDATE previous row with valid_to = now. Current price: WHERE valid_to IS NULL. Price on a specific date: WHERE valid_from <= date AND (valid_to IS NULL OR valid_to > date). Full price history: ORDER BY valid_from.'
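The three temporal queries in the rule translate directly into predicates over versioned rows. A sketch with a hypothetical price history:

```typescript
type PriceRow = { price: number; validFrom: Date; validTo: Date | null };

const history: PriceRow[] = [
  { price: 19.99, validFrom: new Date('2024-01-01'), validTo: new Date('2024-06-01') },
  { price: 29.99, validFrom: new Date('2024-06-01'), validTo: null },
];

// WHERE valid_from <= t AND (valid_to IS NULL OR valid_to > t)
function priceAt(rows: PriceRow[], t: Date): number | undefined {
  return rows.find((r) => r.validFrom <= t && (r.validTo === null || r.validTo > t))?.price;
}

// Current price: WHERE valid_to IS NULL
const currentPrice = (rows: PriceRow[]): number | undefined =>
  rows.find((r) => r.validTo === null)?.price;
```

Note the half-open interval (`valid_to > t`, not `>=`): at the exact instant of a change, only the new row matches, so no timestamp ever resolves to two prices.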
For slowly changing dimensions: 'Type 1 (overwrite): UPDATE the value, lose history. Type 2 (add row): INSERT new row with version dates, keep full history. Type 3 (add column): add previous_value column, keep one level of history. Type 2 is the most common for: pricing (what was the price when the order was placed?), addresses (where did we ship to?), and roles (when was the user promoted to admin?). The temporal query answers: what was the state at time T?'
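A Type 2 change is two writes: close the currently open row, then append the new version. A minimal sketch of that transition (names are illustrative):

```typescript
type Version = { value: number; validFrom: Date; validTo: Date | null };

// Close the open row (validTo = at) and append the new version (validTo = null).
// Returns a new history rather than mutating, to keep the two steps visible.
function type2Update(rows: Version[], value: number, at: Date): Version[] {
  const closed = rows.map((r) =>
    r.validTo === null ? { ...r, validTo: at } : r
  );
  return [...closed, { value, validFrom: at, validTo: null }];
}
```

In SQL the two steps would run in one transaction, so a reader never sees zero or two open rows.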
AI generates: UPDATE products SET price = 29.99 — the previous price is overwritten. A customer disputes a charge: what was the price when they ordered? Unknown. With temporal data: the price at order time is preserved. The current price is in the row with valid_to = NULL. Historical prices are in rows with valid_to set. Every question about past state has an answer.
UPDATE products SET price = 29.99 overwrites the previous price. Customer disputes a charge: what was the price when they ordered? Unknown. Temporal columns (valid_from/valid_to): every past state is preserved. Every historical question has an answer.
Rule 5: Schema Evolution Planning
The rule: 'Design schemas for evolution: prefer additive changes (new tables, new nullable columns) over destructive changes (drop column, rename column, change type). Every schema change should pass the backward compatibility test: can the previous version of the application still work with this schema? If yes, deploy the schema first, then the code. If no, use expand-then-contract to make it backward-compatible.'
For naming conventions: 'Establish naming conventions from day one: snake_case for all identifiers (tables, columns, indexes), plural table names (users, orders, products), _id suffix for foreign keys (user_id, order_id), _at suffix for timestamps (created_at, updated_at, deleted_at), _count suffix for denormalized counts (comment_count), and is_ prefix for booleans (is_active, is_verified). Conventions prevent: naming inconsistencies that accumulate over years, confusion between similar fields, and schema documentation overhead.'
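Conventions only hold if something checks them. A sketch of a lint-style check that could run in CI against migration files; the encoded rules are the ones from the text, and the function names are illustrative:

```typescript
// snake_case: lowercase start, then lowercase letters, digits, underscores.
const isSnakeCase = (name: string): boolean => /^[a-z][a-z0-9_]*$/.test(name);

// Flag convention violations for a single column name.
function columnNameIssues(name: string): string[] {
  const issues: string[] = [];
  if (!isSnakeCase(name)) issues.push('not snake_case');
  // A bare past-tense timestamp name (created, updated, deleted) is missing _at.
  if (/(^|_)(created|updated|deleted)$/.test(name)) issues.push('timestamp should end in _at');
  return issues;
}
```

Run over every column in every new migration, this turns "we agreed on conventions" into "the build fails when they drift".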
AI generates: inconsistent naming (userId in one table, user_id in another, UserID in a third), no convention for timestamps (created vs createdAt vs creation_date), and no evolution strategy (direct ALTER TABLE in production). Conventions established on day one and enforced in migrations: every developer writes consistent schemas, every query is predictable, and schema diffs in PRs are easy to review.
Complete Data Modeling Rules Template
Consolidated rules for data modeling.
- ER diagram before schema: entities, relationships, cardinality, attributes
- Polymorphic patterns: STI for few types + mostly shared fields, CTI for many type-specific fields
- Soft delete: deleted_at timestamp, default filter, restorable, audit-preserving
- Temporal data: valid_from/valid_to for historical queries, Type 2 SCD for full history
- Additive schema evolution: new tables, nullable columns — backward compatible by default
- Naming conventions from day one: snake_case, plural tables, _id/_at/_count suffixes
- JSON columns for truly unstructured data — not a substitute for proper relationships
- Five minutes of ER diagramming prevents five hours of refactoring