AI Stores Sensitive Data in Plain Text
AI-generated code tends to store API keys as plain text (readable by anyone with database access), PII without encryption (names, emails, addresses, SSNs in clear text), payment information unprotected (card numbers and CVVs stored directly, even though PCI DSS forbids keeping the CVV after authorization at all), and secrets in configuration files (hardcoded in source code, committed to git). It also skips key management: encryption keys, if there are any, sit alongside the data they protect. A database breach exposes everything. Encryption at rest means: even if the database is stolen, the data is unreadable without the encryption keys.
Modern encryption at rest is: field-level (encrypt sensitive columns, not the entire database), algorithm-standard (AES-256-GCM with authenticated encryption), key-managed (keys in a KMS like AWS KMS, not in the codebase), envelope-encrypted (data encrypted with a data key, data key encrypted with a master key), and rotation-ready (keys rotate without re-encrypting all data). AI generates none of these.
These rules cover: field-level encryption with AES-256-GCM, KMS key management, envelope encryption pattern, automated key rotation, and data classification for deciding what to encrypt.
Rule 1: Field-Level Encryption with AES-256-GCM
The rule: 'Encrypt sensitive fields individually, not the entire database. Use AES-256-GCM (Galois/Counter Mode): it provides both confidentiality (data is unreadable) and integrity (tampered data is detected). Encryption: const cipher = crypto.createCipheriv('aes-256-gcm', key, iv), then read the auth tag with cipher.getAuthTag() after cipher.final(). Decryption: const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv), then call decipher.setAuthTag(tag) before decipher.final(), which throws if the data was tampered with. The key must be exactly 32 bytes and the IV should be 12 random bytes. Store: the encrypted data, the IV (initialization vector, unique per encryption), and the auth tag (for integrity verification).'
For what to encrypt: 'Encrypt: PII (name, email, phone, address, SSN), API keys and secrets, payment information, health records, and any field subject to compliance requirements (GDPR, HIPAA, PCI DSS). Do not encrypt: primary keys (you need to query by them), foreign keys (you need to join on them), timestamps (you need to sort by them), and status fields (you need to filter by them). Encrypt the sensitive data, leave the structural data queryable.'
AI generates: db.insert({ ssn: req.body.ssn, email: req.body.email }) — plain text. A SQL injection, database backup leak, or insider threat exposes every record. Field-level encryption means: the database stores ciphertext. Even with full database access, the attacker sees encrypted blobs, not plain text. The keys are in a separate system (KMS).
- AES-256-GCM: confidentiality (unreadable) + integrity (tamper detection)
- Unique IV per encryption — never reuse IVs with the same key
- Store: ciphertext + IV + auth tag — all three needed for decryption
- Encrypt PII, secrets, payment info — not primary keys, foreign keys, or timestamps
- Field-level, not full-database — keep structural data queryable
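The points above can be sketched with Node's built-in crypto module. This is a minimal illustration, not a production implementation: the helper names encryptField and decryptField, the base64 storage shape, and the demo row are assumptions made for the example, and the key is generated locally only so the snippet runs on its own (in practice it would come from a KMS, as Rule 2 describes).

```javascript
const crypto = require('node:crypto');

// Encrypt one field value with AES-256-GCM.
// `key` must be a 32-byte Buffer.
function encryptField(plaintext, key) {
  const iv = crypto.randomBytes(12); // fresh 96-bit IV for every encryption
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16-byte integrity tag, available after final()
  // All three parts are needed for decryption, so store them together.
  return {
    iv: iv.toString('base64'),
    tag: tag.toString('base64'),
    data: ciphertext.toString('base64'),
  };
}

// Decrypt a stored field; throws if the ciphertext, IV, or tag was modified.
function decryptField(stored, key) {
  const iv = Buffer.from(stored.iv, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(Buffer.from(stored.tag, 'base64'));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(stored.data, 'base64')),
    decipher.final(), // authentication is checked here
  ]);
  return plaintext.toString('utf8');
}

// Usage: structural columns stay plain, the sensitive column holds ciphertext.
const demoKey = crypto.randomBytes(32); // demo only, not how keys should be managed
const row = { id: 42, status: 'active', ssn: encryptField('123-45-6789', demoKey) };
```

Because the IV is random per call, encrypting the same SSN twice yields different ciphertext, which also means encrypted columns cannot be searched by equality without an additional design (for example a separate keyed hash column).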
Rule 2: KMS-Managed Encryption Keys
The rule: 'Store encryption keys in a Key Management Service (KMS), never in code or configuration files. AWS KMS, Google Cloud KMS, Azure Key Vault, or HashiCorp Vault. The KMS provides: secure key storage (hardware-backed, tamper-resistant), access control (IAM policies control who can use which keys), audit logging (every key usage is logged), and key rotation (automated, transparent to your application).'
For why not environment variables: 'Environment variables are better than hardcoded keys but still problematic: readable by anyone who can inspect the process (for example via /proc/<pid>/environ on Linux), inherited by child processes, logged by some monitoring tools, included in crash dumps, and shared across all code running in the process. A KMS releases a decrypted data key only to an authorized caller, and the master key stays in secured hardware rather than sitting in your process memory for its entire lifetime.'
AI generates: const ENCRYPTION_KEY = 'my-secret-key-123', which is hardcoded in source code, committed to git, and visible to every developer. (That string is also not a valid AES-256 key, which must be exactly 32 bytes.) const ENCRYPTION_KEY = process.env.ENCRYPTION_KEY is better but still not ideal. With a KMS, the master key never leaves the KMS hardware: your application requests encryption/decryption operations, and the KMS performs them and returns the result. The master key material never enters your application memory; with envelope encryption (Rule 3), only short-lived data keys do.
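To make that division of responsibility concrete, here is a toy in-memory model of what a KMS offers. It is not the API of AWS KMS, Vault, or any real service: the class FakeKms, its method names, and the principal strings are all hypothetical. A private class field stands in for hardware-backed storage, callers identify themselves for an access check, and every request lands in an audit log.

```javascript
const crypto = require('node:crypto');

// In-memory stand-in for a KMS, for illustration only.
class FakeKms {
  #masterKey = crypto.randomBytes(32); // no method ever returns this
  #allowed;
  auditLog = [];

  constructor(allowedPrincipals) {
    this.#allowed = new Set(allowedPrincipals);
  }

  // Access control plus audit logging: every attempt is recorded.
  #authorize(principal, operation) {
    const ok = this.#allowed.has(principal);
    this.auditLog.push({ principal, operation, ok, at: new Date().toISOString() });
    if (!ok) throw new Error(`${principal} is not allowed to ${operation}`);
  }

  // Returns an opaque blob: IV (12 bytes) + auth tag (16 bytes) + ciphertext.
  encrypt(principal, plaintext) {
    this.#authorize(principal, 'encrypt');
    const iv = crypto.randomBytes(12);
    const cipher = crypto.createCipheriv('aes-256-gcm', this.#masterKey, iv);
    const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
    return Buffer.concat([iv, cipher.getAuthTag(), body]);
  }

  decrypt(principal, blob) {
    this.#authorize(principal, 'decrypt');
    const decipher = crypto.createDecipheriv('aes-256-gcm', this.#masterKey, blob.subarray(0, 12));
    decipher.setAuthTag(blob.subarray(12, 28));
    return Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]);
  }
}

// Usage: the application only ever holds the opaque blob, never the key.
const kms = new FakeKms(['billing-service']);
const blob = kms.encrypt('billing-service', 'sk_live_example');
```

In a real deployment the authorization step is handled by the provider's access policies and the audit trail by its logging service; the point of the sketch is only that the application code contains no key material at all.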
Rule 3: Envelope Encryption Pattern
The rule: 'Use envelope encryption: generate a data encryption key (DEK) for each record or batch, encrypt the data with the DEK, then encrypt the DEK with the master key (KEK) from KMS. Store the encrypted DEK alongside the encrypted data. This pattern means: the master key encrypts only small DEKs (fast KMS calls), the DEK encrypts the actual data (fast local encryption), and key rotation only re-encrypts DEKs (not all data).'
For the performance benefit: 'Direct KMS encryption: every encrypt/decrypt operation calls the KMS API (network latency + rate limits, plus small payload limits such as 4 KB for AWS KMS). Envelope encryption: obtain a DEK and its encrypted copy from the KMS (one call with an API such as AWS KMS GenerateDataKey, or two calls if you generate the DEK locally and then ask the KMS to encrypt it), and use the DEK locally for thousands of encrypt operations (zero network calls). For a batch of 10,000 records sharing one DEK: direct = 10,000 KMS calls. Envelope = at most 2 KMS calls + 10,000 local operations. Orders of magnitude faster.'
AI generates no envelope encryption: either plain text (no encryption) or direct encryption with a single key for all data. A single key means: if the key is compromised, all data is compromised. Envelope encryption with per-record DEKs means: compromising one DEK exposes one record; with per-batch DEKs, one batch. The trade-off is KMS traffic: one DEK per record costs at least one KMS call per record, so batch-level DEKs are a common compromise for bulk writes. The master key in KMS is the highest-value target, protected by hardware and access controls.
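A sketch of the envelope pattern for a batch follows. The KMS here is an in-memory stand-in so the example runs without network access: kmsGenerateDataKey and kmsDecrypt are hypothetical helpers (loosely modeled on the "return a plaintext key plus its encrypted copy" style of API), MASTER_KEY would never be visible to a real application, and the kmsCalls counter exists only to show how few KMS round trips the batch needs.

```javascript
const crypto = require('node:crypto');

// AES-256-GCM helpers. Blob layout: IV (12) + auth tag (16) + ciphertext.
function aesGcmSeal(key, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

function aesGcmOpen(key, blob) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, blob.subarray(0, 12));
  decipher.setAuthTag(blob.subarray(12, 28));
  return Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]);
}

// --- KMS stand-in (hypothetical). In reality each call is a network request.
const MASTER_KEY = crypto.randomBytes(32);
let kmsCalls = 0;

function kmsGenerateDataKey() {
  kmsCalls += 1;
  const plaintextKey = crypto.randomBytes(32);
  return { plaintextKey, wrappedKey: aesGcmSeal(MASTER_KEY, plaintextKey) };
}

function kmsDecrypt(wrappedKey) {
  kmsCalls += 1;
  return aesGcmOpen(MASTER_KEY, wrappedKey);
}

// --- Application side
function encryptBatch(values) {
  const { plaintextKey, wrappedKey } = kmsGenerateDataKey(); // one KMS call
  const items = values.map((v) => aesGcmSeal(plaintextKey, Buffer.from(v, 'utf8'))); // local only
  plaintextKey.fill(0); // best-effort wipe of the DEK once the batch is done
  return { wrappedKey, items }; // the wrapped DEK is stored next to the data
}

function decryptBatch(batch) {
  const dek = kmsDecrypt(batch.wrappedKey); // one KMS call
  const values = batch.items.map((blob) => aesGcmOpen(dek, blob).toString('utf8'));
  dek.fill(0);
  return values;
}

const emails = Array.from({ length: 10000 }, (_, i) => `user${i}@example.com`);
const batch = encryptBatch(emails);
const roundTrip = decryptBatch(batch);
```

After this runs, kmsCalls is 2 (one to create the DEK, one to unwrap it for reading), while all 10,000 encryptions and decryptions happened locally. Switching to one DEK per record would shrink the blast radius of a leaked DEK at the cost of one KMS call per record.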
Rule 4: Automated Key Rotation
The rule: 'Rotate encryption keys on a schedule (annually at minimum, quarterly for high-security data). With envelope encryption, rotation re-encrypts DEKs with the new master key; the data itself is not re-encrypted (the DEK still decrypts it). AWS KMS supports automatic rotation (annual by default) and keeps earlier key material so existing ciphertext still decrypts. For manual rotation: generate a new key version, re-encrypt active DEKs with the new version, keep the old version available for decrypting historical data, and retire the old version after all DEKs are re-encrypted.'
For backward compatibility: 'Tag encrypted data with the key version: { data: encryptedBlob, keyVersion: 3, dek: encryptedDEK }. On decryption: look up the master key by version, decrypt the DEK, decrypt the data. This allows gradual migration — old records use old key versions, new records use the current version. Background jobs can re-encrypt old records to the latest key version over time.'
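AI generates: one key, used forever, no rotation plan. If the key is compromised, all data is exposed and remains exposed until the key is changed and the affected DEKs or data are re-encrypted. Rotation limits the exposure window: each key version protects only the DEKs written during its lifetime, so a leaked old version does not expose newer records, and once every DEK has been re-wrapped and the old version retired, it decrypts nothing. Rotating the master key does not help if a DEK itself has leaked; in that case the affected data must be re-encrypted under a new DEK.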
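The versioned record format and the DEK re-wrap step can be sketched as follows. The record shape { data, keyVersion, dek } is the one described above; everything else is an assumption for the example: masterKeys is an in-memory stand-in for key versions that would really live inside a KMS, and rotateMasterKey, encryptRecord, decryptRecord, and rewrapRecord are hypothetical helper names.

```javascript
const crypto = require('node:crypto');

// AES-256-GCM helpers. Blob layout: IV (12) + auth tag (16) + ciphertext.
function aesGcmSeal(key, plaintext) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]);
}

function aesGcmOpen(key, blob) {
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, blob.subarray(0, 12));
  decipher.setAuthTag(blob.subarray(12, 28));
  return Buffer.concat([decipher.update(blob.subarray(28)), decipher.final()]);
}

// Stand-in for master key versions held by a KMS (hypothetical, in-memory).
const masterKeys = new Map([[1, crypto.randomBytes(32)]]);
let currentVersion = 1;

function rotateMasterKey() {
  currentVersion += 1;
  masterKeys.set(currentVersion, crypto.randomBytes(32)); // old versions are kept
}

function encryptRecord(value) {
  const dek = crypto.randomBytes(32); // per-record data key
  return {
    data: aesGcmSeal(dek, Buffer.from(value, 'utf8')),
    keyVersion: currentVersion, // which master key version wrapped the DEK
    dek: aesGcmSeal(masterKeys.get(currentVersion), dek),
  };
}

function decryptRecord(record) {
  const dek = aesGcmOpen(masterKeys.get(record.keyVersion), record.dek);
  return aesGcmOpen(dek, record.data).toString('utf8');
}

// Rotation step: re-wrap only the DEK. The `data` blob is not touched.
function rewrapRecord(record) {
  if (record.keyVersion === currentVersion) return record;
  const dek = aesGcmOpen(masterKeys.get(record.keyVersion), record.dek);
  return {
    ...record,
    keyVersion: currentVersion,
    dek: aesGcmSeal(masterKeys.get(currentVersion), dek),
  };
}

const oldRecord = encryptRecord('123-45-6789'); // wrapped under version 1
rotateMasterKey();
const newRecord = encryptRecord('987-65-4321'); // wrapped under version 2
const migrated = rewrapRecord(oldRecord); // what a background job would do
```

Old records keep decrypting through their recorded keyVersion until the background job has re-wrapped them, at which point the old master key version can be retired.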
Rule 5: Data Classification Policy
The rule: 'Classify every data field into sensitivity tiers: Tier 1 (public, no encryption needed): article titles, public profiles, published content. Tier 2 (internal, access controlled): preferences, activity logs. Tier 3 (confidential, encrypted at rest): PII such as names, emails, phone numbers, and addresses, plus API keys. Tier 4 (restricted, encrypted + access audited): SSNs, payment card data, health records, financial data, legal documents. The tier determines: encryption requirements, access controls, retention policies, and audit logging.'
For compliance mapping: 'GDPR: personal data (Tier 3+) needs appropriate safeguards, with encryption named explicitly in Article 32, plus right-to-erasure support. HIPAA: health records (Tier 4) call for encryption at rest and in transit (currently an "addressable" safeguard under the Security Rule, so skipping it requires a documented justification). PCI DSS: payment card data (Tier 4) requires strong cryptography for stored card numbers, documented key management, and key rotation at the end of a defined cryptoperiod (quarterly is a policy choice rather than a PCI mandate); the CVV must not be stored after authorization at all. SOC 2: audit trail for all Tier 3+ data access. The classification drives which rules apply: not every field needs the same protection level.'
AI classifies nothing — every field gets the same treatment (plain text). When a compliance audit asks 'which fields contain PII and how are they protected?', the answer is: we do not know and they are not protected. Data classification answers both questions before the auditor asks. The classification document is as important as the encryption implementation.
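One way to keep the classification next to the code is a small machine-readable map. The sketch below is an illustration only: the column names, the pii flag, and the helpers fieldsToEncrypt and piiReport are made up for the example, and the tier requirements simply restate the four tiers described above.

```javascript
// What each tier requires, following the four-tier policy above.
const TIERS = {
  1: { label: 'public', encrypt: false, auditAccess: false },
  2: { label: 'internal', encrypt: false, auditAccess: false },
  3: { label: 'confidential', encrypt: true, auditAccess: false },
  4: { label: 'restricted', encrypt: true, auditAccess: true },
};

// Hypothetical schema: every column gets a tier and a PII flag.
const USER_FIELDS = {
  public_profile_bio: { tier: 1, pii: false },
  preferences: { tier: 2, pii: false },
  home_address: { tier: 3, pii: true },
  api_key: { tier: 3, pii: false },
  ssn: { tier: 4, pii: true },
};

// Columns that must go through field-level encryption before being stored.
function fieldsToEncrypt(schema) {
  return Object.entries(schema)
    .filter(([, meta]) => TIERS[meta.tier].encrypt)
    .map(([name]) => name);
}

// Answers "which fields contain PII and how are they protected?"
function piiReport(schema) {
  return Object.entries(schema)
    .filter(([, meta]) => meta.pii)
    .map(([name, meta]) => ({
      field: name,
      tier: TIERS[meta.tier].label,
      encrypted: TIERS[meta.tier].encrypt,
      accessAudited: TIERS[meta.tier].auditAccess,
    }));
}

const encryptedColumns = fieldsToEncrypt(USER_FIELDS);
const report = piiReport(USER_FIELDS);
```

A map like this can also drive a CI check that fails when a new column is added without a tier, which keeps the classification document from drifting away from the schema.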
Complete Encryption at Rest Rules Template
Consolidated rules for encryption at rest.
- AES-256-GCM for field-level encryption — confidentiality + integrity
- Unique IV per encryption operation — never reuse with the same key
- KMS-managed keys (AWS KMS, Vault): never hardcoded, and ideally not in environment variables either
- Envelope encryption: DEK encrypts data, KEK encrypts DEK — fast + rotatable
- Key rotation at least annually: re-encrypt DEKs, not all data
- Key versioning: tag encrypted data with key version for backward compatibility
- Data classification: 4 tiers mapping to encryption, access, retention, and audit requirements
- Compliance-driven: GDPR, HIPAA, PCI DSS, SOC 2 map to classification tiers