AI Does Everything in the Request Handler
AI generates request handling with: synchronous processing of everything (resize image, send email, update analytics, generate PDF — all in the POST handler), no decoupling (the API server does the work of 5 different services), no retry on failure (email sending fails, the user gets a 500 error for the entire request), no ordering guarantees (concurrent requests process in unpredictable order), and no backpressure (1000 concurrent requests all try to process simultaneously — server runs out of memory). The user waits 10 seconds while the handler does work that should happen in the background.
Message queues solve this by: decoupling producers from consumers (the API handler enqueues work and returns immediately), enabling async processing (consumers process work at their own pace, not at the pace of incoming requests), providing retry and dead letter handling (failed messages are retried, permanently failed messages go to a DLQ), guaranteeing ordering when needed (FIFO queues process messages in order), and controlling concurrency (process 10 messages at a time, not 1000). AI generates none of these.
These rules cover: queue selection criteria, producer/consumer patterns, dead letter queues, message ordering, delivery guarantees, and backpressure handling.
Rule 1: Queue Selection Criteria
The rule: 'Choose a queue based on your infrastructure and requirements. BullMQ (Redis-backed): best for Node.js applications, rich job features (delayed jobs, rate limiting, priorities, repeatable jobs), local development friendly. AWS SQS: best for AWS-native applications, managed (zero infrastructure), scales to effectively unlimited throughput, integrates with Lambda. RabbitMQ: best for complex routing (topic exchanges, headers routing, fanout), self-hosted, supports multiple protocols (AMQP, MQTT, STOMP). For most Node.js web applications: BullMQ is the default choice.'
For the decision matrix: 'Under 10 queues, Node.js: BullMQ (simplest, Redis you probably already have). AWS-native with Lambda consumers: SQS (native integration, zero management). Complex routing needs (messages to multiple consumers based on content): RabbitMQ (exchange routing). High throughput event streaming (millions of messages, replay, consumer groups): Kafka or Redpanda. Do not over-engineer: if you need to send emails in the background, BullMQ with Redis is the right choice. Kafka for an email queue is a Lamborghini for a grocery run.'
AI generates: synchronous processing or a custom in-memory queue (array.push / array.shift) that loses all messages on server restart. BullMQ: persistent (messages survive restart), retryable (failed jobs are retried automatically), observable (job progress, events, metrics), and concurrent (process N jobs simultaneously with controlled parallelism). A real queue replaces fragile custom code with a battle-tested system.
- BullMQ: Node.js + Redis, rich features, local dev friendly — default for web apps
- AWS SQS: managed, unlimited scale, Lambda integration — AWS-native applications
- RabbitMQ: complex routing, exchanges, self-hosted — content-based routing needs
- Kafka/Redpanda: event streaming, replay, consumer groups — millions of messages
- Match queue to need: BullMQ for email, not Kafka — avoid over-engineering
BullMQ with Redis for background emails: simple, proven, Redis you already have. Kafka for an email queue: massive operational overhead for a simple use case. Match the queue to the need. Most web apps need BullMQ, not a distributed streaming platform.
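To make the contrast concrete, here is a minimal sketch of the fragile pattern described above — an in-memory array posing as a queue. It runs fine in a demo, but it has none of a real queue's guarantees: no persistence (a restart wipes every pending job), no retry, no observability.

```javascript
// Anti-pattern: the in-memory "queue" AI tends to generate.
const jobs = [];

function enqueue(job) {
  jobs.push(job); // lives only in this process's memory — a restart wipes it
}

function drain(handler) {
  while (jobs.length > 0) {
    handler(jobs.shift()); // no retry: a thrown error loses the job for good
  }
}

// Works in the happy path — until the server restarts or a handler throws.
enqueue({ type: 'send-email', to: 'a@example.com' });
enqueue({ type: 'send-email', to: 'b@example.com' });

const sent = [];
drain((job) => sent.push(job.to));
```

Every behavior missing here (persistence, retry, concurrency control) is exactly what BullMQ, SQS, or RabbitMQ provide out of the box.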
Rule 2: Producer/Consumer Decoupling
The rule: 'The producer (API handler) creates a message and returns immediately. The consumer (worker process) processes the message independently. The producer does not know or care: when the message is processed, how long it takes, whether it succeeds on the first try, or which consumer instance handles it. This decoupling means: the API response time is independent of the processing time (user gets 202 Accepted in 50ms, email sends in 5 seconds), the API server and worker scale independently, and the worker can be in a different language or service.'
For BullMQ implementation: 'Producer: const queue = new Queue("email"); const job = await queue.add("send-welcome", { to: user.email, template: "welcome" }); return res.status(202).json({ jobId: job.id }). Consumer: const worker = new Worker("email", async (job) => { await sendEmail(job.data.to, job.data.template); }, { concurrency: 5 }). The producer adds to the queue and returns. The worker processes up to 5 emails concurrently. If the worker crashes: the message stays in Redis, another worker picks it up. No messages lost.'
AI generates: await sendEmail(user.email, 'welcome') in the registration handler. Email service is slow (3 seconds): user waits 3 seconds to see the registration success page. Email service is down: registration fails with 500 (even though the user was created successfully). With a queue: registration returns in 50ms, email sends in the background. Email service down? The message waits in the queue and sends when the service recovers. The user experience is decoupled from the email service health.
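A toy, in-memory illustration of the decoupling (registerUser and the fake email delay are invented for the sketch; a real system would put BullMQ and Redis between the two halves):

```javascript
const events = [];
const queue = [];

// Producer: the "API handler". Enqueues and returns immediately —
// response time is independent of email-service latency.
function registerUser(email) {
  queue.push({ to: email, template: 'welcome' });
  events.push('202 returned'); // the user already has their response
  return { status: 202 };
}

// Consumer: the "worker". Drains the queue later, at its own pace.
async function processQueue() {
  while (queue.length > 0) {
    const job = queue.shift();
    await new Promise((r) => setTimeout(r, 10)); // stand-in for a slow sendEmail
    events.push(`email sent to ${job.to}`);
  }
}

registerUser('user@example.com'); // returns instantly
const done = processQueue();      // email goes out in the background
```

The key property: `registerUser` finishes before the first email is sent, so the user's wait never includes the email service's latency.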
Rule 3: Dead Letter Queues for Failed Messages
The rule: 'Configure a dead letter queue (DLQ) for messages that fail after max retries. BullMQ: on job failure after attempts: 3, the job moves to the failed state (queryable, removable, retryable). SQS: configure a DLQ with maxReceiveCount: 3 — after 3 failed receives, the message moves to the DLQ. Monitor the DLQ: alert when messages arrive (a message in the DLQ means something is persistently failing), inspect message content (understand why it failed), and replay after fixing (move messages back to the main queue).'
For DLQ monitoring: 'Alert on DLQ depth > 0: this is not a normal state. A message in the DLQ means: the consumer crashed on this message 3 times (bug in the consumer), the downstream service is permanently failing (not transient), or the message is malformed (bad data that will never process successfully). Investigate immediately: check consumer logs for the error, check the message content for anomalies, and fix the root cause before replaying. Replaying without fixing = the same messages fail again and return to the DLQ.'
AI generates: no DLQ, no retry tracking. A message fails: it is gone. The email never sends, the webhook never fires, the report never generates — and nobody knows. With a DLQ: the failed message sits in a monitored queue. An alert fires. The team investigates, fixes the bug, replays the messages. Every failed message is recoverable. Nothing is silently lost.
- DLQ for messages that fail after max retries (3-5 attempts)
- Alert on DLQ depth > 0: not normal, investigate immediately
- Inspect message content: understand the failure before replaying
- Fix root cause first, then replay: replaying without fixing = same failures
- BullMQ failed jobs are queryable and retryable from the dashboard
Without a DLQ: failed message disappears. Email never sends, nobody knows. With DLQ: failed message sits in a monitored queue, alert fires, team investigates, fixes the bug, replays. Every failed operation is recoverable. Zero silent data loss.
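A sketch of the retry-then-DLQ flow — this simulates in memory what BullMQ's attempts: 3 plus the failed state, or SQS's maxReceiveCount redrive, do for you. The message shape and the always-failing handler are invented for illustration:

```javascript
const MAX_ATTEMPTS = 3;
const dlq = [];

async function processWithRetry(message, handler) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await handler(message);
      return 'ok';
    } catch (err) {
      message.lastError = String(err.message || err); // keep context for triage
    }
  }
  dlq.push(message); // retries exhausted: park the message, never drop it
  return 'dead-lettered';
}

// A persistently failing downstream — the message survives in the DLQ
// with its last error attached, ready to inspect and replay after a fix.
const outcome = processWithRetry(
  { id: 'msg-1', body: 'send-welcome' },
  async () => { throw new Error('SMTP connection refused'); }
);
```

The point is the final state: nothing vanishes; the failed message sits in a queryable place with enough context (`lastError`) to debug it.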
Rule 4: Message Ordering Guarantees
The rule: 'Most queues provide at-least-once delivery with no ordering guarantee. Messages may be processed out of order, especially with concurrent consumers. When ordering matters: SQS FIFO queues (guaranteed order within a MessageGroupId), BullMQ with a single worker at concurrency: 1 (sequential processing — adding a second worker breaks the guarantee), or RabbitMQ with a single consumer per queue. When ordering does not matter (most cases): use standard queues with parallel consumers for maximum throughput.'
For when ordering matters: 'User actions on the same resource must be ordered: create order, then update order, then cancel order. If cancel processes before create: the cancel fails (order does not exist). Use a message group ID matching the resource ID: all messages for order-123 are processed in order. Messages for different orders process in parallel. BullMQ: use a queue per resource, or message groups (a BullMQ Pro feature). SQS: MessageGroupId = orderId ensures per-order ordering while allowing cross-order parallelism.'
AI generates: concurrent consumers with no ordering consideration. Two messages: create-user and update-user-role arrive. If update-user-role processes first (concurrent consumer picks it up faster): the user does not exist yet, the update fails. With FIFO ordering per user ID: create-user always processes before update-user-role. Cross-user messages still process in parallel. Ordering where needed, parallelism everywhere else.
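The per-group ordering idea can be sketched with promise chains: one chain per group key (mirroring SQS's MessageGroupId), so messages within a group run FIFO while different groups run in parallel. An in-memory illustration only, not a substitute for a real FIFO queue:

```javascript
const tails = new Map(); // groupId -> tail of that group's promise chain

function enqueueOrdered(groupId, task) {
  const prev = tails.get(groupId) || Promise.resolve();
  const next = prev.then(task);             // FIFO: waits for the group's tail
  tails.set(groupId, next.catch(() => {})); // keep the chain alive on failure
  return next;
}

const log = [];
const slow = (label, ms) => () =>
  new Promise((resolve) => setTimeout(() => { log.push(label); resolve(); }, ms));

// order-123: create is slow, but update still waits for it (FIFO in-group).
enqueueOrdered('order-123', slow('create order-123', 30));
enqueueOrdered('order-123', slow('update order-123', 1));
// order-456 is an independent group — it finishes first (cross-group parallelism).
enqueueOrdered('order-456', slow('create order-456', 1));
```

Even though update is far faster than create, it never overtakes it, because both share the `order-123` chain; the unrelated `order-456` message is not held up.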
Rule 5: Backpressure and Concurrency Control
The rule: 'Control consumer concurrency to prevent resource exhaustion. BullMQ: new Worker("queue", processor, { concurrency: 10 }) — process at most 10 jobs simultaneously. Without concurrency limits: 1000 messages in the queue, the worker tries to process all 1000 at once, each opening a database connection, each allocating memory — the worker crashes. With concurrency: 10: processes 10 at a time, takes 100 batches to clear the queue, steady resource usage throughout.'
For rate limiting: 'BullMQ rate limiter (configured on the Worker): new Worker("api-calls", processor, { limiter: { max: 100, duration: 60000 } }) — process at most 100 jobs per minute. Use for: external API calls with rate limits (Stripe: 100 requests/second, SendGrid: 600/minute), resource-intensive operations (image processing, PDF generation), and database-heavy operations (batch imports). The queue absorbs bursts (1000 messages arrive at once), the rate limiter smooths the processing (100/minute steady output). The burst is absorbed, not rejected.'
AI generates: process.on('message', async (msg) => { await processMessage(msg); }) with no concurrency limit. 1000 messages arrive: 1000 concurrent processMessage calls. Each opens a database connection (1000 connections — database limit exceeded), each allocates memory (1000 in-flight operations — OOM). With concurrency: 10 and rate limiting: steady processing, predictable resource usage, no crashes, and the queue absorbs the burst. The system processes at its capacity, not at the ingress rate.
1000 messages arrive instantly. Without queuing: 1000 concurrent operations, OOM crash. With BullMQ concurrency: 10 + rate limit 100/min: steady processing, predictable resources, queue drains in 10 minutes. The burst is absorbed, not rejected.
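A minimal sketch of the bounded-concurrency behavior that BullMQ's concurrency option provides: N worker "lanes" pull from a shared queue, so at most N jobs are ever in flight regardless of how many are waiting. The job list and handler are invented for the sketch:

```javascript
// Process all jobs with at most `limit` in flight at once.
async function processAll(jobs, handler, limit) {
  let inFlight = 0;
  let peak = 0; // highest observed concurrency — should never exceed `limit`
  const results = [];

  const run = async (job) => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    try { results.push(await handler(job)); }
    finally { inFlight--; }
  };

  const queue = [...jobs];
  // `limit` lanes, each pulling the next job as soon as its current one finishes.
  const lanes = Array.from({ length: limit }, async () => {
    while (queue.length > 0) await run(queue.shift());
  });
  await Promise.all(lanes);
  return { results, peak };
}

// 50 queued jobs, but only 10 ever run at the same time.
const outcome = processAll(
  Array.from({ length: 50 }, (_, i) => i),
  async (n) => { await new Promise((r) => setTimeout(r, 5)); return n * 2; },
  10
);
```

The system processes at its own capacity (10 lanes), not at the ingress rate (50 waiting jobs) — which is exactly why resource usage stays flat during a burst.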
Complete Message Queue Rules Template
Consolidated rules for message queues.
- BullMQ for Node.js web apps, SQS for AWS-native, RabbitMQ for complex routing
- Producer/consumer decoupling: API returns 202, worker processes independently
- Dead letter queue: failed after max retries, monitored, inspectable, replayable
- FIFO ordering per resource ID: create before update before delete — parallel across resources
- At-least-once delivery: consumers must be idempotent (same message processed twice = same result)
- Concurrency control: process N at a time, not all at once — prevent resource exhaustion
- Rate limiting: smooth bursts to steady output matching downstream capacity
- Queue absorbs bursts: 1000 messages arrive instantly, process at 100/minute steadily
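The idempotency bullet above can be sketched as a consumer that tracks processed message IDs, so an at-least-once redelivery becomes a no-op. In production the seen-set would live in Redis or the database rather than process memory; the message shape here is invented for illustration:

```javascript
const processed = new Set(); // processed message IDs (use Redis/DB in production)
let emailsSent = 0;

function handleMessage(message) {
  if (processed.has(message.id)) return 'duplicate-skipped';
  // ... do the actual work exactly once ...
  emailsSent++;
  processed.add(message.id);
  return 'processed';
}

// At-least-once delivery means the same message can arrive twice:
handleMessage({ id: 'msg-42', to: 'user@example.com' });
handleMessage({ id: 'msg-42', to: 'user@example.com' }); // redelivery — skipped
```

Same message processed twice, same result: one email. Without this check, every redelivery (worker crash mid-ack, visibility timeout, replay from the DLQ) becomes a duplicate side effect.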