AI Generates Pages That Search Engines Cannot Read
AI generates web pages that look great in the browser but are invisible to search engines: client-side rendered content (Googlebot may delay or skip JavaScript execution), no meta description (search results show random page text), no OpenGraph tags (social shares show a blank preview), no structured data (no rich snippets in search results), and no sitemap (crawlers do not discover all pages). Every missing element costs organic traffic.
SEO in code is not about content — it is about making content accessible to crawlers, social platforms, and AI assistants. The best blog post in the world ranks nowhere if it is client-rendered with no meta tags. The technical foundation must be correct for the content to have any chance.
These rules cover the code-level SEO patterns that AI must follow: server-side rendering, metadata management, structured data, sitemaps, canonical URLs, and Core Web Vitals optimization.
Rule 1: Server-Side Rendering for Indexable Content
The rule: 'All content pages must be server-rendered: the HTML returned by the server contains the full page content. Googlebot executes JavaScript but with: delays (hours to days for initial indexing), resource limits (may not execute all JS), and rendering budget (complex SPAs may timeout). Server-rendered HTML is indexed immediately, reliably, and completely.'
For framework support: 'Next.js: use Server Components and generateStaticParams for static pages, server components for dynamic pages. Astro: static by default. Nuxt: server-rendered by default. SvelteKit: server-rendered by default. React SPA (Vite): use prerendering or a separate SSR layer for SEO-critical pages. Pure client-side rendering is acceptable only for authenticated app pages (dashboards, settings) that should not be indexed.'
AI generates React SPAs where blog posts, product pages, and landing pages are client-rendered. These pages may eventually be indexed by Google (hours or days later, maybe) but are completely invisible to: Bing, social media crawlers, AI assistants, and any crawler that does not execute JavaScript.
- Server-render all public content pages — HTML contains full content
- Client rendering only for authenticated pages (dashboard, settings)
- Googlebot executes JS but with delays, limits, and timeouts
- Other crawlers (Bing, social, AI) may not execute JS at all
- Next.js RSC, Astro, Nuxt, SvelteKit — all server-render by default
Pure client-rendered pages may take hours to days for Google to index — and are completely invisible to Bing, social crawlers, and AI assistants that do not execute JavaScript. Server-render all public content.
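The Server Component pattern from Rule 1 can be sketched as below. The post store and `getPost` helper are hypothetical stand-ins for a CMS or database, and the page returns a plain HTML string where a real Next.js page would return JSX, so the sketch stays framework-free:

```typescript
// Hypothetical post store standing in for a CMS or database (assumption).
type Post = { slug: string; title: string; body: string };

const posts: Post[] = [
  { slug: "hello-seo", title: "Hello SEO", body: "Full article body, present in server HTML." },
];

async function getPost(slug: string): Promise<Post | undefined> {
  return posts.find((p) => p.slug === slug);
}

// Next.js App Router convention: tell the build which slugs to pre-render
// as static pages.
export async function generateStaticParams() {
  return posts.map((p) => ({ slug: p.slug }));
}

// Server Component: runs on the server, so the response HTML already
// contains the full content — no JavaScript execution needed to read it.
export default async function PostPage({ params }: { params: { slug: string } }) {
  const post = await getPost(params.slug);
  if (!post) throw new Error("Not found");
  // A real page returns JSX; a plain string keeps this sketch testable.
  return `<article><h1>${post.title}</h1><p>${post.body}</p></article>`;
}
```

The point of the sketch: whatever the framework, the string the server sends must already contain the content a crawler needs.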
Rule 2: Dynamic Metadata on Every Page
The rule: 'Every page has unique metadata: <title> (under 60 characters, includes primary keyword), <meta name="description"> (150-160 characters, compelling summary), OpenGraph tags (og:title, og:description, og:image — for social shares), and Twitter Card tags (twitter:card, twitter:title, twitter:image). Use the framework metadata API — never hardcode in the HTML template.'
For Next.js: 'Use generateMetadata for dynamic pages: export async function generateMetadata({ params }) { const post = await getPost(params.slug); return { title: post.title, description: post.excerpt, openGraph: { title: post.title, images: [post.image] } }; }. For static metadata: export const metadata: Metadata = { title: "...", description: "..." }.'
AI generates pages with: no title (browser tab shows the URL), no description (Google shows random page text), and no OG tags (social shares show a blank preview). Three elements — title, description, og:image — determine how your page appears in: search results, social media shares, chat app link previews, and AI assistant citations.
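The generateMetadata pattern quoted above can be factored into a pure, testable function. The `Post` shape and the truncation helpers are assumptions for this sketch — only the final commented wiring is the actual Next.js API:

```typescript
// Shape of a CMS post (assumption — adapt to your data source).
type Post = { title: string; excerpt: string; image: string };

// The subset of page metadata this rule covers.
type PageMetadata = {
  title: string;
  description: string;
  openGraph: { title: string; description: string; images: string[] };
};

// Build per-page metadata, enforcing the length budgets from the rule:
// titles under 60 characters, descriptions capped near 160.
function buildMetadata(post: Post): PageMetadata {
  const title = post.title.slice(0, 60);
  const description = post.excerpt.slice(0, 160);
  return {
    title,
    description,
    openGraph: { title, description, images: [post.image] },
  };
}

// In Next.js this plugs into the framework metadata API:
// export async function generateMetadata({ params }) {
//   const post = await getPost(params.slug);
//   return buildMetadata(post);
// }
```

Keeping the mapping in a plain function makes the length limits enforceable in a unit test instead of a manual review.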
Rule 3: JSON-LD Structured Data for Rich Snippets
The rule: 'Add JSON-LD structured data to content pages: <script type="application/ld+json">{JSON.stringify(structuredData)}</script>. Use Article schema for blog posts, Product for e-commerce, FAQ for FAQ pages, HowTo for tutorials, BreadcrumbList for navigation, and Organization for the about page. Structured data enables: rich snippets (star ratings, prices, FAQ accordion), knowledge panels, and AI assistant citations.'
For common schemas: 'Article: @type, headline, author, datePublished, dateModified, image, publisher. Product: @type, name, description, image, offers (price, availability, currency). FAQ: @type, mainEntity (array of Question + acceptedAnswer). Use Google Rich Results Test to validate your structured data before deploying.'
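A minimal Article builder following the fields listed above. The property names come from schema.org/Article; the `ArticleInput` shape is an assumption for this sketch, and real pages should still be validated with the Rich Results Test:

```typescript
// Input shape for this sketch (assumption — map from your CMS fields).
type ArticleInput = {
  headline: string;
  author: string;
  datePublished: string; // ISO 8601
  image: string;
};

// Build an Article object using schema.org property names.
function articleJsonLd(a: ArticleInput): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: a.headline,
    author: { "@type": "Person", name: a.author },
    datePublished: a.datePublished,
    image: [a.image],
  };
}

// Render as the script tag crawlers look for.
function jsonLdScript(data: Record<string, unknown>): string {
  return `<script type="application/ld+json">${JSON.stringify(data)}</script>`;
}
```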
AI generates zero structured data — pages appear as plain blue links in search results. Competitors with structured data get: star ratings, product prices, FAQ expandable sections, and recipe cards. Rich snippets have 20-40% higher click-through rates than plain results — from code that takes 10 minutes to add.
Ten minutes of JSON-LD turns a plain blue link into a rich result — star ratings, prices, FAQ accordions in search results — with a measurably higher click-through rate.
Rule 4: Sitemaps, Canonical URLs, and Robots
The rule: 'Generate a sitemap.xml listing all public pages: use next-sitemap (Next.js), @astrojs/sitemap (Astro), or generate manually from your CMS/database. Include: loc (URL), lastmod (last modification date), changefreq (how often it changes), and priority (relative importance). Submit to Google Search Console and Bing Webmaster Tools.'
For canonical URLs: 'Set <link rel="canonical" href="https://example.com/page"> on every page. Canonical URLs prevent duplicate content issues: if the same content is accessible at /page, /page?ref=twitter, and /page?utm_source=newsletter, the canonical tells Google which URL to index. Without it, link equity is diluted across multiple URLs.'
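A sketch of the canonicalization described above — strip query strings and fragments so every variant of a URL resolves to one canonical form. The function name is an assumption; extend the logic if some query params are content-significant (e.g. `?page=2` on paginated listings):

```typescript
// Derive the canonical URL for a page: /page, /page?ref=twitter and
// /page?utm_source=newsletter all canonicalize to the same URL.
function canonicalUrl(raw: string): string {
  const u = new URL(raw);
  u.search = ""; // drop tracking and referral query params
  u.hash = "";   // fragments never change the indexed document
  return u.toString();
}
```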
For robots: 'Create a robots.txt: allow all crawlers on public pages, disallow on authenticated pages (/dashboard/, /api/, /admin/). Set <meta name="robots" content="noindex"> on pages that should not be indexed: search results pages, filtered views, and paginated pages beyond page 1. Never noindex your important content pages — deindexing happens fast, and recovering lost rankings after removing the tag can take weeks.'
- sitemap.xml: all public pages — loc, lastmod, changefreq, priority
- Submit to Google Search Console and Bing Webmaster Tools
- Canonical URL on every page — prevents duplicate content dilution
- robots.txt: allow public, disallow dashboard/api/admin
- noindex on: search results, filtered views, paginated beyond page 1
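When a plugin like next-sitemap or @astrojs/sitemap is not available, the sitemap can be generated by hand. A sketch, with the `SitemapEntry` shape mirroring the four fields of the sitemap protocol listed above:

```typescript
// One entry per public page, mirroring the sitemap protocol fields.
type SitemapEntry = {
  loc: string;
  lastmod?: string; // ISO date
  changefreq?: "always" | "hourly" | "daily" | "weekly" | "monthly" | "yearly" | "never";
  priority?: number; // 0.0 - 1.0
};

// Serialize entries into a sitemap.xml document.
function buildSitemap(entries: SitemapEntry[]): string {
  const urls = entries
    .map((e) => {
      const parts = [`<loc>${e.loc}</loc>`];
      if (e.lastmod) parts.push(`<lastmod>${e.lastmod}</lastmod>`);
      if (e.changefreq) parts.push(`<changefreq>${e.changefreq}</changefreq>`);
      if (e.priority !== undefined) parts.push(`<priority>${e.priority}</priority>`);
      return `  <url>${parts.join("")}</url>`;
    })
    .join("\n");
  return `<?xml version="1.0" encoding="UTF-8"?>\n<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n${urls}\n</urlset>`;
}
```

Serve the result from a route handler (or write it to the build output), then submit the URL to Search Console and Bing Webmaster Tools.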
Rule 5: Core Web Vitals Optimization
The rule: 'Optimize for the three Core Web Vitals — they are a Google ranking signal. LCP (Largest Contentful Paint) < 2.5s: optimize the largest visible element (hero image, heading). INP (Interaction to Next Paint) < 200ms: keep the main thread unblocked — no long tasks. CLS (Cumulative Layout Shift) < 0.1: specify dimensions on images/ads, use font-display: swap for web fonts.'
For LCP: 'Preload the hero image: <link rel="preload" as="image" href="hero.webp">. Use next/image (Next.js) or <Image /> (Astro) for automatic optimization. Serve WebP/AVIF format. Set fetchpriority="high" on the hero image. Avoid lazy-loading above-the-fold images — they should load immediately.'
For CLS: 'Set explicit width and height on all images: <img width="800" height="600">. Use aspect-ratio CSS for responsive containers. Reserve space for ads and dynamic content with min-height. Use font-display: swap and preload web fonts to prevent layout shift from font loading. Never inject content above existing content after load.'
- LCP < 2.5s: preload hero image, WebP/AVIF, fetchpriority='high'
- INP < 200ms: no long main thread tasks — split heavy computation
- CLS < 0.1: width/height on images, font-display: swap, reserve ad space
- Measure: Lighthouse, PageSpeed Insights, CrUX dashboard
- Monitor real-user data: Vercel Analytics, web-vitals npm package
The single most common CLS cause: images without width/height attributes. The browser reserves zero space, then shifts everything when the image loads. <img width='800' height='600'> — two attributes prevent the most common layout shift.
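CLS is not a raw sum of shifts: it is the largest "session window" of layout shifts — shifts less than 1s apart, inside a window capped at 5s — and shifts right after user input are excluded. A sketch of that scoring, assuming layout-shift entries of the shape a PerformanceObserver reports (in production, use the web-vitals npm package rather than hand-rolling this):

```typescript
// Layout-shift entry, as reported by PerformanceObserver("layout-shift").
type LayoutShift = { value: number; startTime: number; hadRecentInput: boolean };

// Score CLS via session windows: a shift joins the current window if it is
// within 1s of the previous shift and within 5s of the window start;
// otherwise it opens a new window. CLS is the largest window sum.
function computeCLS(entries: LayoutShift[]): number {
  let cls = 0;
  let windowValue = 0;
  let windowStart = 0;
  let lastTime = -Infinity;

  for (const e of entries) {
    if (e.hadRecentInput) continue; // shifts after user input do not count
    const newWindow =
      e.startTime - lastTime >= 1000 || e.startTime - windowStart >= 5000;
    if (newWindow) {
      windowValue = 0;
      windowStart = e.startTime;
    }
    windowValue += e.value;
    lastTime = e.startTime;
    cls = Math.max(cls, windowValue);
  }
  return cls;
}
```

Understanding the windowing explains why one late-loading ad can dominate the score: a single large shift opens its own window and becomes the reported CLS.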
Complete SEO in Code Rules Template
Consolidated rules for SEO optimization in code.
- Server-render all public content — client rendering only for authenticated pages
- Unique metadata per page: title (<60 chars), description (150-160 chars), og:image
- JSON-LD structured data: Article, Product, FAQ — validate with Rich Results Test
- sitemap.xml for all public pages — submit to Search Console
- Canonical URL on every page — robots.txt allow/disallow — noindex for filtered views
- LCP < 2.5s: preload hero, WebP, fetchpriority — INP < 200ms: no long tasks
- CLS < 0.1: image dimensions, font-display: swap, reserved space for dynamic content
- next-sitemap / @astrojs/sitemap — generateMetadata for dynamic pages