The Div Soup Problem in AI-Generated HTML
AI assistants generate HTML that's structurally flat: divs inside divs inside divs, with class names as the only hint of purpose. There's no <nav> for navigation, no <main> for primary content, no <article> for articles, no <aside> for sidebars. The page looks right visually but is meaningless to screen readers and other assistive technology, and it gives search engines little structure to work with.
This isn't a minor issue. Over 1 billion people worldwide have some form of disability. Web accessibility isn't optional: it's a legal requirement in many jurisdictions (the ADA and Section 508 in the US, the European Accessibility Act in the EU) and a fundamental aspect of quality software. AI-generated div soup routinely fails accessibility audits.
The fix is simple: a few rules that tell the AI to use HTML the way it was designed to be used. Semantic elements exist for a reason — they carry meaning that div never will.
Rule 1: Semantic HTML Elements
The rule: 'Use semantic HTML elements for their intended purpose. <nav> for navigation. <main> for primary content (one per page). <article> for self-contained content. <section> for thematic groupings with a heading. <aside> for tangential content. <header> and <footer> for page or section headers/footers. <figure> and <figcaption> for media with captions. Never use <div> when a semantic element fits.'
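As a rough illustration of this rule, here is a minimal page skeleton that uses each landmark for its intended purpose. The page content, URLs, file names, and date are placeholders invented for the example, not a template from any particular project.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Why semantic HTML matters</title>
</head>
<body>
  <!-- Page header containing the primary navigation -->
  <header>
    <nav aria-label="Primary">
      <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/posts">Posts</a></li>
      </ul>
    </nav>
  </header>

  <!-- Exactly one main element for the primary content -->
  <main>
    <article>
      <h1>Why semantic HTML matters</h1>
      <p>Published <time datetime="2024-01-15">January 15, 2024</time></p>

      <!-- A section is a thematic group and carries its own heading -->
      <section>
        <h2>Landmarks</h2>
        <p>Section text goes here.</p>
      </section>

      <figure>
        <img src="/img/landmarks.png"
             alt="Accessibility tree listing the banner, navigation, main, and contentinfo landmarks">
        <figcaption>The landmarks exposed by this page.</figcaption>
      </figure>
    </article>
  </main>

  <!-- Tangential content, kept outside main -->
  <aside aria-label="Related posts">
    <ul>
      <li><a href="/posts/headings">Heading hierarchy basics</a></li>
    </ul>
  </aside>

  <footer>
    <p>Placeholder footer text.</p>
  </footer>
</body>
</html>
```

Note that the only purely presentational wrappers a layout like this would still need (for example a grid container) are where a div remains appropriate.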
For interactive elements: 'Use <button> for clickable actions — never a styled <div> or <span> with an onClick handler. Use <a> for navigation — links go somewhere, buttons do something. Use <input>, <select>, <textarea> for form inputs — never custom divs that mimic form controls. The browser provides focus management, keyboard handling, and screen reader announcements for free with native elements.'
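The difference is easiest to see side by side. The sketch below contrasts a clickable div with the native equivalents; saveDraft and the URLs are hypothetical names used only for illustration.

```html
<!-- Avoid: not focusable, no Enter/Space activation, no button role announced -->
<div class="btn" onclick="saveDraft()">Save draft</div>

<!-- Prefer: a button does something on the current page -->
<button type="button" onclick="saveDraft()">Save draft</button>

<!-- Prefer: a link goes somewhere -->
<a href="/drafts">View all drafts</a>

<!-- Prefer: a native form control instead of a div-based imitation -->
<label for="category">Category</label>
<select id="category" name="category">
  <option value="news">News</option>
  <option value="guides">Guides</option>
</select>

<script>
  // Hypothetical action handler, defined here only so the example is complete.
  function saveDraft() {
    console.log("Draft saved");
  }
</script>
```

Making the div version behave like the button would take a role, a tabindex, and key handlers for Enter and Space, which is exactly the work the native element already does.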
The semantic vs div decision is simple: if the element has a purpose (navigation, button, article, header), use the semantic element. If it's purely a styling container with no semantic meaning, use div. AI defaults to div for everything — this rule inverts that default.
- <nav> for navigation — <main> for primary content (one per page)
- <article> for self-contained content — <section> for thematic groups
- <button> for actions — <a> for navigation — never clickable divs
- <header>/<footer> for page/section boundaries
- <figure>/<figcaption> for media — <time> for dates
- <div> only when no semantic element fits — it's the last resort
Use <button> for actions, <a> for navigation. Never make a div clickable. Native elements give you focus management, keyboard handling, and screen reader announcements for free.
Rule 2: Heading Hierarchy and Document Outline
The rule: 'Use headings in order: <h1> once per page (the page title), <h2> for major sections, <h3> for subsections, and so on. Never skip heading levels — don't jump from <h1> to <h3>. Never use headings for visual styling — if you want big text that isn't a heading, use CSS. The heading hierarchy creates the document outline that screen readers use for navigation.'
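A small sketch of what this looks like in markup. The heading text and the text-sm class name are made up for the example; the indentation of the first block is only there to make the outline visible.

```html
<!-- A well-formed outline: one h1, levels descend one step at a time -->
<h1>Accessible HTML guide</h1>
  <h2>Semantic elements</h2>
    <h3>Landmarks</h3>
    <h3>Interactive elements</h3>
  <h2>Headings</h2>

<!-- Avoid: h4 picked for its smaller default size, skipping h3 -->
<h2>Forms</h2>
<h4>Labels</h4>

<!-- Prefer: the correct level, with the size adjusted in CSS -->
<h2>Forms</h2>
<h3 class="text-sm">Labels</h3>

<style>
  /* Hypothetical utility class: changes appearance, not document structure */
  .text-sm { font-size: 1rem; }
</style>
```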
For components: 'Components that render headings should accept a heading level prop so they can be composed at any level in the document hierarchy. A Card component that always renders <h3> breaks the outline when used in a context that expects <h4>.'
AI assistants frequently use headings based on visual size rather than document structure. The rule forces correct hierarchy — if the AI wants to change the visual size, it uses CSS, not the wrong heading level.
Rule 3: ARIA Attributes — Use Sparingly and Correctly
The rule: 'The first rule of ARIA is don't use ARIA. Native HTML elements (button, input, nav, dialog) carry built-in ARIA roles. Only add ARIA when native elements can't express the semantics — custom widgets, dynamic content regions, and application states. When ARIA is needed, use the correct role, and always pair it with the required states and properties.'
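To make the distinction concrete, the sketch below shows two cases where ARIA adds nothing and one where it is arguably needed. It assumes a toggle switch as the custom widget, since HTML has no element with switch semantics; the id and label are illustrative.

```html
<!-- Redundant: nav already has the navigation role -->
<nav role="navigation" aria-label="Primary"></nav>

<!-- Avoid: re-creating a button with ARIA instead of using the native element -->
<div role="button" tabindex="0">Save</div>
<button type="button">Save</button>

<!-- ARIA needed: role="switch" requires the aria-checked state to be present
     and kept up to date. Building on button keeps focus and keyboard
     activation native. -->
<button type="button" role="switch" aria-checked="false" id="dark-mode">
  Dark mode
</button>

<script>
  const toggle = document.getElementById("dark-mode");
  toggle.addEventListener("click", () => {
    // Flip the state that assistive technology reads.
    const isOn = toggle.getAttribute("aria-checked") === "true";
    toggle.setAttribute("aria-checked", String(!isOn));
  });
</script>
```

A role without its required state (here, role="switch" with no aria-checked) is the kind of incorrect ARIA that misleads assistive technology.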
For images: 'All <img> elements must have an alt attribute. Decorative images use alt="" (empty, not missing). Informative images use descriptive alt text that conveys the information the image provides, not what the image looks like. Complex images (charts, diagrams) need both alt text and a longer description.'
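The three cases can be sketched as follows. File paths and the chart figures are placeholder data invented for the example, and using figcaption with aria-describedby is one possible way to supply the longer description, not the only one.

```html
<!-- Decorative: empty alt, so screen readers skip the image -->
<img src="/img/divider.svg" alt="">

<!-- Informative: alt conveys the information, not the appearance -->
<img src="/img/status-badge.png" alt="Build passing">

<!-- Complex: short alt plus a longer description associated with the image -->
<figure>
  <img src="/img/signups-chart.png"
       alt="Bar chart of monthly signups, January to March"
       aria-describedby="signups-desc">
  <figcaption id="signups-desc">
    Signups rose each month: 120 in January, 180 in February, and 260 in March.
  </figcaption>
</figure>
```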
For dynamic content: 'Use aria-live="polite" for content that updates without user action (notifications, status messages). Use aria-expanded for collapsible sections. Use aria-hidden="true" for decorative elements that should be hidden from screen readers. Use role="alert" for important messages that need immediate attention.'
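A minimal sketch of these attributes working together, assuming a save-status message, a form error, and a single FAQ disclosure. All ids, text, and the announceSaved helper are hypothetical.

```html
<!-- Polite live region: present in the DOM from page load, so later text
     updates are announced without interrupting the user -->
<div id="save-status" aria-live="polite"></div>

<!-- Alert: for messages that need immediate attention -->
<div id="form-error" role="alert"></div>

<!-- Collapsible section: aria-expanded reflects the current state -->
<button type="button" id="faq-toggle" aria-expanded="false" aria-controls="faq-answer">
  <!-- Decorative arrow, hidden from screen readers -->
  <span aria-hidden="true">&#9656;</span>
  What is div soup?
</button>
<div id="faq-answer" hidden>
  Markup built almost entirely from div elements.
</div>

<script>
  const faqToggle = document.getElementById("faq-toggle");
  const faqAnswer = document.getElementById("faq-answer");

  faqToggle.addEventListener("click", () => {
    const isOpen = faqToggle.getAttribute("aria-expanded") === "true";
    // Keep the announced state and the visible state in sync.
    faqToggle.setAttribute("aria-expanded", String(!isOpen));
    faqAnswer.hidden = isOpen;
  });

  // Hypothetical helper: writing text into the live region triggers the announcement.
  function announceSaved() {
    document.getElementById("save-status").textContent = "Changes saved";
  }
</script>
```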
Don't use ARIA where a native element already does the job. Native HTML elements carry built-in roles. Only add ARIA for custom widgets where no native element exists. Wrong ARIA is worse than no ARIA: it actively misleads assistive technology.
Rule 5: WCAG 2.1 AA Compliance Checklist
The rule: 'All generated HTML must meet WCAG 2.1 Level AA. This means: color contrast ratio of at least 4.5:1 for normal text and 3:1 for large text (at least 18pt, or 14pt bold). Text resizable up to 200% without loss of content. All functionality available via keyboard. No content flashes more than 3 times per second. All form inputs have associated labels. Error messages are descriptive and programmatically associated with inputs.'
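For the contrast and resize requirements, a stylesheet along these lines would pass. The colors are example values whose ratios against white were computed with the WCAG relative-luminance formula and rounded; the class names are hypothetical.

```html
<style>
  /* Relative units, so text scales with the user's font-size settings */
  html { font-size: 100%; }

  body {
    background: #ffffff;
    color: #595959;        /* about 7.0:1 on white, passes 4.5:1 */
    font-size: 1rem;
    line-height: 1.5;
  }

  .muted {
    color: #767676;        /* about 4.54:1 on white, just passes for normal text */
  }

  .hero-title {
    font-size: 1.5rem;     /* 24px at default settings, which counts as large text */
    color: #949494;        /* about 3.03:1 on white, passes 3:1 for large text only */
  }
</style>

<p>Body text at roughly 7:1 contrast.</p>
<p class="muted">Secondary text close to the 4.5:1 minimum.</p>
<p class="hero-title">Large text close to the 3:1 minimum.</p>
```

Colors this close to the threshold leave no margin, so in practice it is safer to aim well above the minimum and confirm with a contrast checker.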
For forms: 'Every input must have a visible <label> with a for/htmlFor attribute matching the input's id. Use <fieldset> and <legend> for related input groups. Error messages use aria-describedby to associate with the input. Required fields are marked with the native required attribute (or aria-required="true" on custom widgets) and a visual indicator.'
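Put together, a form following these rules might look like the sketch below. The field names, the action URL, and the error text are placeholders, and the email field is shown in its error state to illustrate the association.

```html
<form action="/signup" method="post" novalidate>
  <!-- Visible label tied to the input by for/id; the asterisk is the visual
       indicator and is hidden from screen readers because the required
       attribute already conveys the state (aria-required="true" is the
       equivalent for custom controls) -->
  <label for="email">Email <span aria-hidden="true">*</span></label>
  <input id="email" name="email" type="email" required
         aria-invalid="true" aria-describedby="email-error">
  <!-- Descriptive error, programmatically associated via aria-describedby -->
  <p id="email-error">Enter an email address in the format name@example.com.</p>

  <!-- Related inputs grouped with fieldset and legend -->
  <fieldset>
    <legend>Contact preference</legend>
    <input type="radio" id="pref-email" name="pref" value="email">
    <label for="pref-email">Email</label>
    <input type="radio" id="pref-phone" name="pref" value="phone">
    <label for="pref-phone">Phone</label>
  </fieldset>

  <button type="submit">Sign up</button>
</form>
```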
For testing: 'Use axe-core in CI to catch accessibility violations. Use Lighthouse accessibility audit as a baseline. Test with a keyboard (unplug the mouse for 10 minutes). Test with a screen reader (VoiceOver on Mac, NVDA on Windows) at least once per major feature.'
axe-core catches only part of the problem: by Deque's own estimate, it finds about 57% of WCAG issues automatically on average. Add it to CI as a gate, so that if it fails, the build fails. Then supplement with manual keyboard and screen reader testing.
Complete HTML & Accessibility Rules Template
Consolidated rules for accessible HTML. These apply regardless of your framework — React, Vue, Angular, Svelte, or static HTML.
- Semantic elements: nav, main, article, section, button, a — div only as last resort
- Heading hierarchy: h1 once, h2-h6 in order, never skip levels
- ARIA sparingly: native elements first, ARIA only for custom widgets
- alt on all images: descriptive for informative, empty for decorative
- Keyboard accessible: native elements, visible focus, modal focus traps
- WCAG 2.1 AA: 4.5:1 contrast, 200% text resize, labeled inputs, skip links
- Form labels: visible <label> + for, fieldset/legend for groups, aria-describedby for errors
- axe-core in CI — Lighthouse audit — manual keyboard + screen reader testing