AI Rules for Streaming SSR

AI Waits for Everything Before Sending Anything

AI generates SSR that: blocks on all data before sending the first byte (TTFB = slowest database query), renders the entire page server-side as one chunk (all or nothing — user sees blank until complete), shows no intermediate states (white screen for 2-3 seconds, then full page), uses getServerSideProps-style blocking (entire page waits for every data dependency), and sends the complete HTML document in one response. The user stares at a blank page while the server waits for a slow API call that only affects the comments section.

Modern streaming SSR is: shell-first (send the page layout immediately, stream data-dependent sections as they resolve), Suspense-driven (each Suspense boundary is a streaming checkpoint), out-of-order (fast sections render first regardless of document position), TTFB-optimized (first byte sent in under 200ms with the shell), and fallback-rich (skeleton loading states visible while sections stream in). AI generates none of these.

These rules cover: Suspense-based HTML streaming, shell-first architecture, out-of-order streaming, TTFB optimization, and meaningful fallback strategies.

Rule 1: Suspense-Based HTML Streaming

The rule: 'Wrap each async data dependency in a Suspense boundary. React streams the HTML progressively: the shell (layout, navigation, static content) sends immediately, each Suspense boundary streams its content when its data resolves. The browser receives and renders content incrementally — not waiting for the slowest query to finish before showing anything.'

For how streaming works: 'React sends the initial HTML with Suspense fallbacks in place and keeps the response open. As each async component resolves, React appends that section's HTML to the stream together with a small inline script that swaps it in for the fallback. The browser executes these scripts as they arrive, progressively updating the page. From the user's perspective: the page appears almost instantly (shell), then sections fill in as their data arrives.'
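To make the sequence above concrete, here is a minimal, framework-free sketch of the idea: one shell chunk, then one chunk per section in arrival order. It is a simplified model, not React's actual wire format. The names Section, streamPage, SWAP_SCRIPT, and swap are assumptions made up for illustration.

```typescript
// Simplified model of streamed SSR output. This is NOT React's actual wire
// format: the markup, ids, and swap() helper are illustrative assumptions.
interface Section {
  id: string;
  fallback: string;            // skeleton markup sent with the shell
  load: () => Promise<string>; // resolves to the section's real HTML
}

// Tiny client-side helper shipped with the shell: moves the streamed
// template's content into the slot that currently shows the fallback.
const SWAP_SCRIPT =
  "<script>function swap(id){" +
  "var t=document.querySelector('template[data-for=\"'+id+'\"]');" +
  "document.getElementById(id).replaceChildren(t.content);" +
  "}</script>";

async function streamPage(
  sections: Section[],
  write: (chunk: string) => void,
): Promise<void> {
  // 1. Send the shell immediately, with a fallback in each section's slot.
  const slots = sections
    .map((s) => `<div id="${s.id}">${s.fallback}</div>`)
    .join("");
  write(`<!DOCTYPE html><html><body>${SWAP_SCRIPT}<nav>Site</nav>${slots}`);

  // 2. Start every load at once; each section writes its own chunk the
  //    moment its data resolves, independent of the others.
  await Promise.all(
    sections.map(async (s) => {
      const html = await s.load();
      write(
        `<template data-for="${s.id}">${html}</template>` +
          `<script>swap(${JSON.stringify(s.id)})</script>`,
      );
    }),
  );

  // 3. Close the document once every section has been sent.
  write("</body></html>");
}
```

In a real server, write would be the response stream's write method. The first call happens before any load has resolved, which is why TTFB tracks the shell rather than the data.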

AI generates: async function Page() { const data = await slowQuery(); return <FullPage data={data} />; } so the entire page blocks on slowQuery. With streaming: the shell sends immediately, and <Suspense fallback={<Skeleton />}><SlowSection /></Suspense> streams when its data is ready. If slowQuery takes 2 seconds and the shell renders in 50ms, TTFB drops from roughly 2 seconds to roughly 50ms. The user sees the page structure right away.

Each Suspense boundary = one streaming checkpoint — resolves independently
Shell (layout, nav, static) sends immediately — TTFB under 200ms
Inline scripts replace fallbacks as data resolves — progressive update
Browser renders incrementally — no waiting for complete HTML document
TTFB = shell render time, not slowest data fetch time

💡 TTFB: 2s to 50ms

Without streaming: TTFB = slowest data fetch (2 seconds of blank screen in this example). With streaming: TTFB = shell render time (50ms). The shell sends immediately with Suspense fallbacks. Data streams in as it resolves. Total load time is the same, but the user sees the page structure about 2 seconds earlier.

Rule 2: Shell-First Page Architecture

The rule: 'Design pages as a static shell with dynamic holes. The shell includes: navigation, page layout, headings, sidebar structure, and footer — everything that does not depend on data. The dynamic holes are Suspense boundaries: article content, comments, related articles, user-specific data. The shell renders in single-digit milliseconds; the holes stream as data arrives.'

For layout component design: 'In Next.js App Router: layout.tsx is the shell (keep it synchronous, with no data fetching), page.tsx contains the data-dependent content (async, fetches data), and loading.tsx is the automatic Suspense fallback (Next.js wraps page.tsx in a Suspense boundary and shows loading.tsx while it resolves). This three-file pattern implements shell-first streaming without writing any explicit Suspense boundaries yourself; the framework handles it.'

AI generates: data fetching in layout.tsx — the shell itself blocks on data. The navigation, sidebar, and footer wait for a database query they do not need. Move data fetching out of the layout and into the page or specific components. The layout is the shell; it must render instantly. Data dependencies belong in page.tsx or deeper components wrapped in Suspense.
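One way to picture "static shell with dynamic holes" without any framework is to encode the split in types. The sketch below is only a model of the three-file pattern, not Next.js code. Shell, Fallback, Hole, Route, firstFlush, and resolvedPage are hypothetical names. Because Shell returns a plain string, an async layout would fail to type-check, which mirrors the rule that the layout must not wait on data.

```typescript
// Framework-free sketch of "static shell with dynamic holes".
// Shell, Fallback, and Hole are hypothetical types, not Next.js APIs.
type Shell = (slot: string) => string; // synchronous: cannot await data
type Fallback = () => string;          // synchronous skeleton markup
type Hole = () => Promise<string>;     // asynchronous: fetches its own data

interface Route {
  layout: Shell;     // plays the role of layout.tsx
  loading: Fallback; // plays the role of loading.tsx
  page: Hole;        // plays the role of page.tsx
}

// The first flush is built without calling `page`, so it cannot block on data.
function firstFlush(route: Route): string {
  return route.layout(route.loading());
}

// The final markup is only available once the page's data has resolved.
async function resolvedPage(route: Route): Promise<string> {
  return route.layout(await route.page());
}

const articleRoute: Route = {
  layout: (slot) => `<nav>Home</nav><main>${slot}</main><footer>About</footer>`,
  loading: () => `<div class="skeleton">Loading article</div>`,
  page: async () => `<article>Full article text</article>`,
};

const shellHtml = firstFlush(articleRoute); // available synchronously
```

If a data fetch were moved into layout, its return type would become Promise<string> and the assignment to Shell would be rejected at compile time. That is the same mistake the paragraph above warns about.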

Rule 3: Out-of-Order Streaming

The rule: 'React streaming supports out-of-order resolution — a Suspense boundary lower in the document can resolve and stream before one higher up. The sidebar (fast query, 100ms) streams before the main content (slow query, 800ms) even though the sidebar is below the content in the HTML. The browser inserts each chunk at the correct position regardless of arrival order.'

For optimization: 'Give each quick data dependency its own Suspense boundary so it can stream as soon as it resolves, even if it sits below slower sections in the DOM. Do not nest Suspense boundaries that should resolve independently: content inside a nested boundary cannot appear until its parent boundary has resolved. Sibling Suspense boundaries stream independently. Structure your component tree so independent data dependencies are siblings, not parent-child.'
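The sibling-versus-nested difference can be checked with a small timing model. This sketch assumes every fetch starts at time zero and that a boundary appears only once its own data is ready and its parent has appeared. If a nested fetch only starts after the parent renders, the nested case is worse than shown. Boundary and revealTimes are hypothetical names, and the timings are the illustrative ones used in this rule.

```typescript
// A Suspense-like boundary: its own data is ready at `readyMs`, and any
// boundaries rendered inside it are listed in `children`.
interface Boundary {
  name: string;
  readyMs: number;
  children?: Boundary[];
}

// A boundary can be revealed only after its own data is ready AND the
// boundary that contains it has been revealed.
function revealTimes(
  boundaries: Boundary[],
  parentRevealMs = 0,
  out: Record<string, number> = {},
): Record<string, number> {
  for (const b of boundaries) {
    const revealMs = Math.max(b.readyMs, parentRevealMs);
    out[b.name] = revealMs;
    revealTimes(b.children ?? [], revealMs, out);
  }
  return out;
}

// Siblings: each boundary appears as soon as its own data is ready.
const siblings = revealTimes([
  { name: "content", readyMs: 800 },
  { name: "sidebar", readyMs: 100 },
]);
// siblings: { content: 800, sidebar: 100 }

// Nested: the sidebar is held back until the content boundary resolves.
const nested = revealTimes([
  { name: "content", readyMs: 800, children: [{ name: "sidebar", readyMs: 100 }] },
]);
// nested: { content: 800, sidebar: 800 }
```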

AI generates: everything in a single data fetch — no opportunity for out-of-order streaming. The fast sidebar query (100ms) waits for the slow content query (800ms) because they are in the same await. Separate Suspense boundaries: sidebar streams at 100ms, content at 800ms. The user sees the sidebar 700ms earlier — and can start navigating while the main content loads.

⚠️ Sidebar Waits for Nothing

Sidebar query: 100ms. Content query: 800ms. In one big await: sidebar waits 800ms for no reason. Separate Suspense boundaries: sidebar streams at 100ms, content at 800ms. The user sees the sidebar 700ms earlier and can start navigating while content loads.

Rule 4: TTFB Under 200ms

The rule: 'Time to First Byte should be under 200ms. With streaming SSR, TTFB = shell render time (not data fetch time). Optimize the shell: no async operations in the layout, no database queries before the first flush, minimal server-side computation before streaming begins. The first chunk contains: DOCTYPE, head (with critical CSS), and the page shell with Suspense fallbacks. Data-dependent content streams later.'

For critical CSS: 'Inline critical CSS in the initial chunk so the shell renders with correct styling immediately. Next.js extracts CSS and adds the stylesheets a route needs to the <head> for you; whether critical rules are also inlined depends on your version and configuration. For custom setups: extract above-the-fold CSS and inline it in the <head>. Non-critical CSS loads asynchronously via <link rel="stylesheet" media="print" onload="this.media='all'">. The shell looks correct from the first paint, with no flash of unstyled content.'
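For a custom setup, the head of the first chunk might be assembled along these lines. This is a sketch under stated assumptions: buildHead and its parameters are hypothetical, the CSS and href are placeholders, and both inputs are assumed to be trusted strings produced at build time (no escaping is done). Note that the value assigned in onload must be the quoted string 'all'.

```typescript
// Builds the <head> for the first streamed chunk. The function name and
// parameters are hypothetical; the media="print" + onload swap is a common
// pattern for loading a stylesheet without blocking first paint.
function buildHead(criticalCss: string, deferredStylesheetHref: string): string {
  return [
    "<head>",
    // Above-the-fold rules are inlined so the shell is styled on first paint.
    `<style>${criticalCss}</style>`,
    // Requested as a print stylesheet so it does not block rendering,
    // then switched to all media once it has downloaded.
    `<link rel="stylesheet" href="${deferredStylesheetHref}" media="print" onload="this.media='all'">`,
    // Fallback for browsers with JavaScript disabled.
    `<noscript><link rel="stylesheet" href="${deferredStylesheetHref}"></noscript>`,
    "</head>",
  ].join("\n");
}

const head = buildHead("body{margin:0}nav{height:56px}", "/styles/main.css");
```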

AI generates: SSR that computes everything server-side before sending the first byte. TTFB = 2-3 seconds (the sum of all data fetches when they run one after another, or the slowest fetch when they run in parallel). With streaming: TTFB = 50ms (shell only). Total page complete time is the same, but perceived performance is dramatically better. The user sees the page structure about 2 seconds earlier.
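The arithmetic behind those numbers can be written out as a rough model. TtfbEstimate and estimateTtfb are hypothetical names, the timings are illustrative, and network latency is ignored.

```typescript
// Rough TTFB model. All numbers are illustrative, in milliseconds.
interface TtfbEstimate {
  blockingSequential: number; // awaits run one after another
  blockingParallel: number;   // awaits run together (e.g. Promise.all)
  streaming: number;          // shell flushed before any data resolves
}

function estimateTtfb(shellMs: number, fetchesMs: number[]): TtfbEstimate {
  const sum = fetchesMs.reduce((total, ms) => total + ms, 0);
  const slowest = fetchesMs.reduce((max, ms) => Math.max(max, ms), 0);
  return {
    blockingSequential: shellMs + sum,
    blockingParallel: shellMs + slowest,
    streaming: shellMs,
  };
}

const estimate = estimateTtfb(50, [100, 800, 2000]);
// estimate.blockingSequential → 2950
// estimate.blockingParallel   → 2050
// estimate.streaming          → 50
```

Only the streaming figure is independent of the data fetches, which is why it is the one that can stay under the 200ms target as a page gains more data dependencies.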

TTFB target: under 200ms — shell only, no data fetching before first flush
First chunk: DOCTYPE + head + critical CSS + shell with fallbacks
Inline critical CSS — shell renders with correct styling immediately
No async operations in layout components — layout is the zero-data shell
Same total time, 2 seconds earlier perceived load — streaming advantage

Rule 5: Meaningful Fallback Loading States

The rule: 'Every Suspense fallback should be a layout-matching skeleton, not a generic spinner. The skeleton preserves the page layout during streaming — when content replaces it, there is zero layout shift. Design skeletons that match: the content dimensions (same height, same width), the content structure (text lines, image placeholders, button shapes), and the content density (correct number of list items, card count).'

For loading.tsx in Next.js: 'The loading.tsx file is the automatic Suspense fallback for a route segment. Design it to match the page.tsx layout exactly: same grid structure, same card count, same section heights. When the page resolves and replaces loading.tsx, the transition is seamless. The user sees: shell + skeletons (instant), then content replacing skeletons (progressive). Zero layout shift throughout.'

AI generates: fallback={<div>Loading...</div>}, a text string that takes up one line and then gets replaced by 500px of content (massive layout shift). Or fallback={null}: nothing shown, then content pops in. With skeleton fallbacks, the page structure is visible from the first frame and content fills in progressively. CLS stays at or near zero as long as the skeleton's dimensions match the content. The user knows exactly what is loading and where it will appear.
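A skeleton that reserves the content's space can be sketched as plain markup generation. The class names, the fixed card height, and the helpers cardListSkeleton and verticalShiftPx are assumptions for illustration. The shift calculation is a simplification that only compares heights; it is not the browser's CLS formula.

```typescript
// Builds a card-list skeleton that reserves the same space as the real
// content. Class names and the fixed card height are illustrative assumptions.
function cardListSkeleton(cardCount: number, cardHeightPx: number): string {
  const cards: string[] = [];
  for (let i = 0; i < cardCount; i++) {
    cards.push(
      `<li class="skeleton-card" style="height:${cardHeightPx}px" aria-hidden="true"></li>`,
    );
  }
  return `<ul class="card-list" aria-busy="true">${cards.join("")}</ul>`;
}

// How far (in px) everything below the boundary moves when the real
// content replaces the fallback.
function verticalShiftPx(fallbackHeightPx: number, contentHeightPx: number): number {
  return Math.abs(contentHeightPx - fallbackHeightPx);
}

const skeleton = cardListSkeleton(5, 100);            // reserves 5 x 100px
const shiftFromTextFallback = verticalShiftPx(24, 500);  // → 476
const shiftFromSkeleton = verticalShiftPx(5 * 100, 500); // → 0
```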

ℹ️ Skeleton = Zero CLS

fallback={<div>Loading...</div>} takes one line, then 500px of content pops in — massive layout shift. Skeleton matching the content layout: same height, same structure. Content replaces skeleton with zero shift. The page structure is visible from the first frame.

Complete Streaming SSR Rules Template

Consolidated rules for streaming SSR.

Suspense boundaries for each data dependency — independent streaming checkpoints
Shell-first: layout sends instantly, data streams into Suspense holes
Out-of-order: fast sections stream first regardless of DOM position
Sibling Suspense for independent data — avoid nesting independent boundaries
TTFB under 200ms: no data fetching before first flush, shell only
Critical CSS inlined in first chunk — shell renders styled immediately
Skeleton fallbacks matching content layout — zero CLS during streaming
loading.tsx matches page.tsx layout — seamless skeleton-to-content transition