Visibility, Not Enforcement
Rule compliance monitoring serves two purposes: visibility (leadership knows which teams have adopted current rules and which have not) and improvement (the platform team learns which rules cause friction, which repos are falling behind, and which patterns need attention). Monitoring is NOT enforcement: it does not block deployments, punish teams, or create compliance tickets. It provides data that informs decisions.
The monitoring mindset: compliance data helps the platform team ask better questions. A repo with outdated rules: is the team aware of the update? Do they need help? Is the update causing a conflict with their project-specific rules? A rule with high override rate: is the rule too rigid? Does it need an exception? Is there a misunderstanding? The data: triggers investigation, not punishment. AI rule: 'Monitoring data answers: where are we? Where should we be? What is blocking progress? It does not answer: who should be punished for non-compliance.'
What to monitor: rule freshness (are repos using the current rule version?), rule adoption (do repos have rules assigned and pulled?), override rate (which rules are frequently overridden?), and developer satisfaction (do developers find the rules helpful?). These four dimensions: provide a complete picture of rule health across the organization.
Step 1: Rule Freshness Tracking
Freshness: is the CLAUDE.md in each repo generated from the current ruleset version? If the current version is v2.5.0 and a repo has v2.3.0: the repo is 2 versions behind. Freshness tracking: compares each repo's rule version against the current published version. Implementation: rulesync status --json (outputs the current version and the repo's version as JSON). A script: collects this from all repos and generates a freshness report.
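A minimal sketch of the freshness comparison. The JSON field names ("current", "repo") are assumptions — check the actual output shape of `rulesync status --json` for your version. The gap calculation assumes a MAJOR.MINOR.PATCH scheme where each ruleset release bumps MINOR and both versions share a MAJOR.

```python
import json

# Hypothetical output of `rulesync status --json` for one repo;
# the real field names may differ.
SAMPLE = '{"current": "2.5.0", "repo": "2.3.0"}'

def versions_behind(current: str, repo: str) -> int:
    """How many ruleset releases behind the repo is. Assumes each
    release bumps MINOR and both versions share the same MAJOR."""
    cur = [int(p) for p in current.lstrip("v").split(".")]
    rep = [int(p) for p in repo.lstrip("v").split(".")]
    return max(cur[1] - rep[1], 0)

status = json.loads(SAMPLE)
print(versions_behind(status["current"], status["repo"]))  # prints 2
```

A collection script would run this per repo (from CI or a clone of each repo) and feed the results into the freshness report.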
Freshness dashboard: a simple dashboard showing: green (current version), yellow (1 version behind — acceptable, team may be testing before adopting), red (2+ versions behind — needs attention), and gray (no rules assigned — the project may need onboarding). The dashboard: updated daily (from scheduled CI runs) or in real-time (from RuleSync webhook events). Display: at the organization level (how many repos are green/yellow/red) and per-team (which specific repos need attention).
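The color mapping above can be sketched directly; repo names here are made up for illustration.

```python
from collections import Counter

def freshness_color(versions_behind):
    """Map a repo's version gap to the dashboard color.
    None means no rules are assigned at all."""
    if versions_behind is None:
        return "gray"   # needs onboarding
    if versions_behind == 0:
        return "green"  # current
    if versions_behind == 1:
        return "yellow" # acceptable, may be testing before adopting
    return "red"        # 2+ behind: needs attention

# Organization-level rollup over per-repo gaps (hypothetical repos):
gaps = {"payments": 0, "billing": 1, "search": 3, "new-svc": None}
print(Counter(freshness_color(g) for g in gaps.values()))
```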
Freshness alerts: for repos that fall 2+ versions behind: send a Slack notification to the team's channel. Message: 'Your project [name] is using AI rules v2.3.0. The current version is v2.5.0. Run rulesync pull to update. Changes since your version: [link to changelog].' The alert: informational, not blocking. The team: updates when ready. If the same repo is 3+ versions behind for more than 2 weeks: escalate to the EM. AI rule: 'Alerts for 2+ versions behind. Escalation for 3+ versions over 2 weeks. The goal: awareness, not enforcement. Most teams update promptly when notified.'
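The alert and escalation thresholds can be expressed as a small decision function. This is a sketch: the action names are made up, and the Slack delivery (channel routing, webhook) is left to whatever integration the platform team already has.

```python
def alert_action(versions_behind: int, weeks_behind: float) -> str:
    """Decide what, if anything, to send. Informational only --
    nothing here blocks a deployment."""
    if versions_behind >= 3 and weeks_behind > 2:
        return "escalate-to-em"
    if versions_behind >= 2:
        return "notify-slack"
    return "none"

def slack_message(project, repo_version, current, changelog_url):
    # Mirrors the alert wording above; the changelog URL is whatever
    # your ruleset repo publishes.
    return (
        f"Your project {project} is using AI rules v{repo_version}. "
        f"The current version is v{current}. Run rulesync pull to update. "
        f"Changes since your version: {changelog_url}"
    )
```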
The executive asks: 'How is AI rule adoption going?' Answer: '85% green (current), 10% yellow (1 version behind, updating this sprint), 5% red (2+ behind, investigating).' The executive: understands immediately. No technical details needed. The color coding: universally understood. If the executive wants to drill down: the team view shows which teams are yellow/red. But most of the time: the single percentage + color distribution is the complete answer.
Step 2: Override Rate Monitoring
Override tracking: when developers override an AI rule (ignoring the AI's suggestion, adding a rule-specific exception), the frequency reveals rule health. A rule overridden 3% of the time: working well (overrides are genuine edge cases). A rule overridden 25% of the time: needs revision (the rule does not fit real-world usage). Override tracking: identifies the specific rules that cause the most friction.
Implementation: if the AI tool supports override logging (Claude Code, Cursor may provide analytics): collect override data per rule. If not: use a quarterly survey asking developers: 'Which rules do you override most often and why?' The survey: less precise than automated tracking but still valuable. Combine: automated data (if available) + survey data (for context — why developers override). AI rule: 'Override rate per rule: the most actionable compliance metric. High override rate: the rule is the problem, not the developers. Low override rate: the rule is working as intended.'
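If automated override logs are available, the per-rule rate is a simple aggregation. A sketch, assuming each log entry records which rule fired and whether the developer overrode it (the event shape is an assumption, not a specific tool's export format):

```python
from collections import Counter

def override_rates(events):
    """events: iterable of (rule_id, overridden) pairs, one per rule
    firing. Returns rule_id -> fraction of firings overridden."""
    fired, overridden = Counter(), Counter()
    for rule_id, was_overridden in events:
        fired[rule_id] += 1
        if was_overridden:
            overridden[rule_id] += 1
    return {rule: overridden[rule] / fired[rule] for rule in fired}

events = [
    ("rule-7", True), ("rule-7", False), ("rule-7", False), ("rule-7", True),
    ("rule-3", False), ("rule-3", False),
]
print(override_rates(events))  # rule-7 at 0.5, rule-3 at 0.0
```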
Acting on override data: for rules with >20% override rate: investigate. Common causes: the rule is too rigid for certain contexts (add an exception clause), the rule conflicts with another rule (resolve the conflict), developers misunderstand the rule (clarify the wording), or the rule is genuinely wrong for the codebase (revise or remove). The investigation: always leads to a rule improvement, not a developer reprimand. AI rule: 'High override rate → improve the rule. Never: high override rate → stricter enforcement. Stricter enforcement of a bad rule: increases resentment without improving quality.'
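Flagging rules over the 20% threshold is a one-liner over the rate map; the rule IDs and rates below are illustrative, not real data.

```python
def rules_to_investigate(rates, threshold=0.20):
    """Return (rule_id, rate) pairs above the threshold, worst first.
    Output feeds investigation -- never a developer reprimand."""
    flagged = [(rule, rate) for rule, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

rates = {"rule-7": 0.25, "rule-12": 0.03, "rule-2": 0.40}
for rule, rate in rules_to_investigate(rates):
    print(f"{rule}: {rate:.0%} -- investigate the rule, not the developers")
```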
The same data, two uses. A dashboard showing: 'Team A: 100%. Team B: 95%. Team C: 60%.' If used for ranking: Team C feels ashamed, avoids the platform team, and may fake compliance (an empty CLAUDE.md that passes the freshness check). If used for support: 'Team C is at 60% — let us investigate. Do they have a technical blocker? A ruleset conflict? A new project without assigned rules?' Investigation: identifies the obstacle. Support: removes it. Ranking: creates resistance. Support: creates adoption.
Step 3: Building the Compliance Dashboard
Dashboard architecture: data sources (rulesync status from each repo, override tracking, developer surveys) → data aggregation (a script or service that collects and normalizes data) → dashboard display (a web page, Grafana dashboard, or RuleSync's built-in dashboard). For most teams: the RuleSync dashboard provides: adoption metrics, freshness status, and rule version distribution. For custom metrics: export data and build a Grafana dashboard.
Dashboard views: executive view (one number: 85% of repos are on current rules. Trend: up from 72% last month), team view (per-team compliance: Team A: 100%, Team B: 80%, Team C: 60% — Team C needs support), and detailed view (per-repo: which specific repos are outdated, when they were last updated, and which ruleset version they are on). Each view: serves a different audience. Executives: the number. Managers: per-team. Platform team: per-repo.
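The executive and team views are both rollups of the same per-repo data. A sketch with hypothetical repos, using integer percentages as the executive summary does:

```python
# Per-repo freshness data (names and teams are made up).
repos = [
    {"name": "payments", "team": "Team A", "behind": 0},
    {"name": "billing",  "team": "Team A", "behind": 0},
    {"name": "search",   "team": "Team B", "behind": 1},
    {"name": "ingest",   "team": "Team C", "behind": 3},
]

def pct_current(rows):
    """Integer percentage of repos on the current ruleset version."""
    return 100 * sum(r["behind"] == 0 for r in rows) // len(rows)

def executive_view(rows):
    return f"{pct_current(rows)}% of repos are on current rules"

def team_view(rows):
    teams = {}
    for r in rows:
        teams.setdefault(r["team"], []).append(r)
    return {team: pct_current(members) for team, members in teams.items()}

print(executive_view(repos))  # 50% of repos are on current rules
print(team_view(repos))       # Team A: 100, Team B: 0, Team C: 0
```

The detailed view is just the `repos` list itself, sorted by `behind` descending; per-repo granularity stays inside the platform team.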
Dashboard anti-patterns: displaying individual developer compliance (creates a blame culture), blocking deployments based on dashboard status (creates bypass culture), and using the dashboard as a ranking system (teams compete on metrics rather than on actual code quality). AI rule: 'The dashboard shows organizational health, not individual compliance. Per-team granularity is the lowest level that is productive. Per-developer: counterproductive.'
A worked example. Rule #7 (error handling): 25% override rate. The instinct: 'Developers are not following the rules. We need stricter enforcement.' The reality: the rule does not account for Express middleware error handling (which requires a different pattern). 25% of developers work on Express middleware. They override because the rule is wrong for their context. Fix: add an exception clause for Express middleware. Override rate drops to 3%. The rule was the problem. Not the developers.
Compliance Monitoring Summary
Summary of monitoring AI rule compliance.
- Purpose: visibility and improvement, not enforcement and punishment
- Freshness: green (current), yellow (1 behind), red (2+ behind), gray (no rules)
- Alerts: Slack notification at 2+ versions behind. EM escalation at 3+ for 2+ weeks
- Override rate: per-rule tracking. >20% = investigate the rule. Never: stricter enforcement of bad rules
- Acting on data: high override → improve the rule. Outdated repo → notify and support the team
- Dashboard views: executive (one number), team (per-team %), detailed (per-repo status)
- Anti-patterns: no individual compliance tracking, no deployment blocking, no team ranking
- Data sources: rulesync status, override logs, quarterly developer surveys