AI Rules for Platform Engineering Teams

Platform Engineering: Scaling Developer Experience

Platform engineering teams build the Internal Developer Platform (IDP): the set of tools, templates, and self-service capabilities that enable product teams to build, deploy, and operate software independently. The platform team's goal is to make the right thing the easy thing. It does this through golden paths: pre-configured, opinionated workflows that guide developers toward best practices. An AI that generates code for platform services must follow and enforce these golden paths.

AI rules for platform engineering encode three things: infrastructure patterns (how services are provisioned, configured, and connected), service standards (naming conventions, API patterns, observability requirements), and automation conventions (CI/CD templates, deployment strategies, rollback procedures). When a developer asks the AI to create a new service, the AI should generate code that follows the platform's golden path, not a custom approach.

The platform team has a dual relationship with AI rules: it both consumes them (using them to build platform services) and produces them (the golden path patterns become AI rules for product teams). AI rule: 'Platform team AI rules serve two purposes: guide the platform team's own development AND define the patterns that product teams should follow when using the platform.'

Golden Path Patterns as AI Rules

Service creation golden path: when a developer creates a new microservice, the golden path defines: language and framework (e.g., Go with Echo, TypeScript with Fastify), project structure (cmd/, internal/, pkg/ for Go; src/routes/, src/services/ for TypeScript), configuration management (environment variables via the platform's config service), database provisioning (self-service database creation with standard naming), and CI/CD pipeline (standardized GitHub Actions or equivalent). AI rule: 'New service: follow the golden path template. Generate the standard project structure, standard configuration loading, standard health check endpoint, and standard CI/CD pipeline. Do not invent a custom project structure.'
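The service creation rule above can be sketched as a small scaffold check. This is a minimal illustration, not the platform's actual template: the directory names come from the examples in the text, while `GOLDEN_PATH_LAYOUTS`, `COMMON_FILES`, `scaffold_paths`, and `find_deviations` are hypothetical names, and the common file list is an assumption.

```python
from pathlib import PurePosixPath

# Hypothetical golden-path layouts, using the directory names mentioned in the text.
GOLDEN_PATH_LAYOUTS = {
    "go": ["cmd", "internal", "pkg"],
    "typescript": ["src/routes", "src/services"],
}

# Files every new service starts with (assumed names, for illustration only).
COMMON_FILES = ["Dockerfile", ".github/workflows/ci.yml"]


def scaffold_paths(service_name: str, language: str) -> list[str]:
    """Return the relative paths a new golden-path service should start with."""
    if language not in GOLDEN_PATH_LAYOUTS:
        raise ValueError(f"no golden path template for language: {language}")
    root = PurePosixPath(service_name)
    # Directories are marked with a trailing slash so they are easy to tell apart.
    dirs = [str(root / d) + "/" for d in GOLDEN_PATH_LAYOUTS[language]]
    files = [str(root / f) for f in COMMON_FILES]
    return dirs + files


def find_deviations(existing: set[str], service_name: str, language: str) -> list[str]:
    """List golden-path paths that are missing from an existing project."""
    return [p for p in scaffold_paths(service_name, language) if p not in existing]
```

A check like this could run before the AI proposes any new files, so that a missing `internal/` directory or CI workflow is reported as a deviation instead of being silently replaced by a custom layout.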

Deployment golden path: the platform defines how services are deployed. This covers containerization (a Dockerfile following the platform's base image and multi-stage build pattern), Kubernetes manifests (a Helm chart or Kustomize overlay following platform standards), deployment strategy (blue-green, canary, or rolling, as defined by the platform), and observability (standard metrics, logging, and tracing configuration). AI rule: 'Deployment configuration: use the platform's base image, follow the Helm chart template, configure the standard deployment strategy, and include observability hooks. Do not create custom Dockerfiles or deployment scripts that diverge from the platform standard.'
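One way to enforce the Dockerfile part of this rule is a lint step that inspects `FROM` instructions. The sketch below is an assumption-laden example: `PLATFORM_BASE_PREFIX` is a made-up registry path, `check_dockerfile` is a hypothetical helper, and it only checks the final stage's image and the number of stages.

```python
import re

# Hypothetical registry prefix for platform base images; substitute your own.
PLATFORM_BASE_PREFIX = "registry.example.internal/platform/"

# Matches the image reference of each FROM instruction, skipping an optional --platform flag.
_FROM_PATTERN = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--platform=\S+[ \t]+)?(\S+)",
    flags=re.MULTILINE | re.IGNORECASE,
)


def check_dockerfile(text: str) -> list[str]:
    """Return golden-path violations found in a Dockerfile (empty list if none)."""
    images = _FROM_PATTERN.findall(text)
    if not images:
        return ["no FROM instruction found"]
    problems = []
    if len(images) < 2:
        problems.append("expected a multi-stage build (two or more FROM instructions)")
    # Only the final stage ships to production, so only that image is checked here.
    if not images[-1].startswith(PLATFORM_BASE_PREFIX):
        problems.append(f"final stage does not use the platform base image: {images[-1]}")
    return problems
```

Whether build stages must also use platform images is a policy decision the text does not settle; this sketch leaves them unconstrained.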

The golden path is opinionated by design. It trades flexibility for consistency and speed. AI rule: 'The golden path is the default. Deviations require justification. The AI should generate golden-path-compliant code unless the developer explicitly requests a deviation and provides a reason. This is how platform engineering makes the right thing the easy thing.'

💡 The Golden Path Trades Flexibility for Speed

A developer who follows the golden path creates a production-ready service in hours: template, deploy, monitor, done. A developer who deviates spends days configuring custom infrastructure, debugging non-standard deployments, and adding the observability hooks the template would have included. The golden path is not a restriction; it is an acceleration. The AI should default to the golden path and only deviate when the developer explicitly explains why the standard approach does not work for their use case.

Self-Service Infrastructure Patterns

Self-service database: product teams request a database through the platform (API, CLI, or portal). The platform provisions it with: standard naming (team-service-environment), standard configuration (connection pooling, backup schedule, monitoring), access credentials (injected via secrets management), and the connection string (available through the config service). AI rule: 'Database creation: use the platform's self-service API or Terraform module. Do not create databases manually or with custom Terraform. The platform module handles: naming, security groups, backup, monitoring, and credential rotation.'
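The `team-service-environment` naming convention is simple enough to encode directly. The following is a minimal sketch under stated assumptions: the allowed environment names and the rule that segments contain no hyphens (so names split unambiguously) are not from the platform described here, and `database_name` and `parse_database_name` are hypothetical helpers.

```python
import re

# Assumed environment names; a real platform would define its own list.
ALLOWED_ENVIRONMENTS = {"dev", "staging", "prod"}

# Assumption: segments are lowercase alphanumerics without hyphens,
# so that a name can always be split back into its three parts.
_SEGMENT = re.compile(r"^[a-z][a-z0-9]*$")


def database_name(team: str, service: str, environment: str) -> str:
    """Build a database name in the team-service-environment format."""
    for label, value in (("team", team), ("service", service)):
        if not _SEGMENT.match(value):
            raise ValueError(f"invalid {label} segment: {value!r}")
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"{team}-{service}-{environment}"


def parse_database_name(name: str) -> dict:
    """Split a standard database name back into its parts."""
    parts = name.split("-")
    if len(parts) != 3:
        raise ValueError(f"not a team-service-environment name: {name!r}")
    team, service, environment = parts
    database_name(team, service, environment)  # re-validate each segment
    return {"team": team, "service": service, "environment": environment}
```

In practice this validation would live inside the platform's Terraform module or API, so product teams never construct the name by hand.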

Self-service messaging: teams request message queues, event buses, or pub-sub topics through the platform. Standard configuration: dead letter queues, retry policies, monitoring, and alerting. AI rule: 'Messaging infrastructure: use the platform's queue/topic provisioning. Standard patterns: request-response (synchronous with timeout), event-driven (publish to topic, consumers subscribe), and command-query (separate write and read paths). Follow the platform's message schema conventions.'
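To make the retry policy and dead letter queue concrete, here is an in-memory sketch of the behavior a consumer gets from the standard configuration. It is not a client for any real broker: `QueueConsumer` is a hypothetical class, the default of three attempts is an assumption, and a production policy would add a backoff delay between attempts.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class QueueConsumer:
    """Hypothetical consumer illustrating retry-then-dead-letter handling."""

    handler: Callable[[dict], None]
    max_attempts: int = 3  # assumed default retry budget
    dead_letters: list = field(default_factory=list)

    def consume(self, message: dict) -> bool:
        """Run the handler up to max_attempts times; dead-letter the message on final failure."""
        last_error = None
        for _attempt in range(self.max_attempts):
            try:
                self.handler(message)
                return True
            except Exception as exc:  # a real consumer would retry only retryable errors
                last_error = exc
        # Retry budget exhausted: park the message for inspection instead of dropping it.
        self.dead_letters.append(
            {"message": message, "error": repr(last_error), "attempts": self.max_attempts}
        )
        return False
```

The point of the dead letter queue is visible in the last branch: a message that keeps failing is kept with its error context, which is what the platform's monitoring and alerting can then act on.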

Self-service networking: the platform manages service mesh, ingress, DNS, and certificates. Teams register their services with the platform and get: internal DNS name, mTLS certificates, load balancing, and rate limiting. AI rule: 'Networking: register with the platform's service registry. Use the provided service mesh for inter-service communication. Do not configure custom load balancers or certificates — the platform manages these.'

⚠️ No Manual Infrastructure Provisioning

When a developer manually creates a database in the AWS console, it is not tracked by the platform, not backed up on the standard schedule, not monitored by the standard alerting, and not included in the service catalog. It is invisible infrastructure that will cause problems during an incident. The AI must always use the platform's self-service provisioning (Terraform modules, platform API, or CLI). Manual infrastructure is a platform anti-pattern.

Service Catalog and Observability Standards

Service catalog: every service in the organization is registered in the catalog with: owner (team), tier (critical, standard, best-effort), dependencies (upstream and downstream services), SLOs (latency, availability, error rate targets), runbooks (how to respond to alerts), and on-call rotation. AI rule: 'New service: register in the service catalog before deploying to production. Include all required metadata. The AI should generate the catalog registration (Backstage entity file, service descriptor) alongside the service code.'
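As a sketch of what generating the catalog registration alongside the code could look like, the function below builds a Backstage-style Component entity and refuses to produce one when required metadata is missing. The `apiVersion`, `kind`, and `spec` fields follow the Backstage descriptor format; the `example.com/...` annotation keys, the input field names, and `catalog_entity` itself are assumptions for illustration. Only upstream dependencies are modeled here.

```python
import json

# Input fields this sketch requires (hypothetical names).
REQUIRED_FIELDS = ("name", "owner", "tier", "depends_on", "slos", "runbook_url", "on_call")
VALID_TIERS = {"critical", "standard", "best-effort"}


def catalog_entity(service: dict) -> dict:
    """Build a Backstage-style Component entity, or raise if metadata is incomplete."""
    missing = [f for f in REQUIRED_FIELDS if f not in service]
    if missing:
        raise ValueError(f"missing catalog metadata: {', '.join(missing)}")
    if service["tier"] not in VALID_TIERS:
        raise ValueError(f"unknown tier: {service['tier']!r}")
    return {
        "apiVersion": "backstage.io/v1alpha1",
        "kind": "Component",
        "metadata": {
            "name": service["name"],
            # Annotation keys below are hypothetical; annotation values must be strings.
            "annotations": {
                "example.com/tier": service["tier"],
                "example.com/runbook": service["runbook_url"],
                "example.com/on-call": service["on_call"],
                "example.com/slos": json.dumps(service["slos"], sort_keys=True),
            },
        },
        "spec": {
            "type": "service",
            "lifecycle": "production",
            "owner": service["owner"],
            "dependsOn": [f"component:{dep}" for dep in service["depends_on"]],
        },
    }
```

Entity files are usually written as YAML; because JSON is also valid YAML, `json.dumps(catalog_entity(...), indent=2)` is enough to produce a file for a quick experiment.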

Observability standards: the platform defines how services are monitored. This covers metrics (RED metrics: Rate, Errors, Duration for every endpoint), logging (structured JSON logs with correlation IDs and the standard fields service, timestamp, level, message, and trace_id), and tracing (OpenTelemetry instrumentation for all inter-service calls). AI rule: 'Every service: RED metrics on every endpoint, structured logging with correlation IDs, and OpenTelemetry tracing. The platform provides libraries and middleware that implement these. The AI should use the platform's observability libraries, not vendor-specific SDKs.'
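The logging standard above can be illustrated with the standard library alone. This is a stand-in for whatever the platform's logging library provides, not a replacement for it: `JsonFormatter` and `build_logger` are hypothetical names, and passing `trace_id` through `extra` is one assumed way to attach the correlation ID.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format records as JSON with the standard fields named in the text."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "service": self.service,
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # trace_id is supplied by the caller via `extra`; None when absent.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(entry)


def build_logger(service: str, stream=None) -> logging.Logger:
    """Return a logger that writes one JSON object per line to the given stream."""
    logger = logging.getLogger(service)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger("orders").info("order created", extra={"trace_id": "abc123"})` emits a single JSON line carrying all five standard fields, which is what makes logs from different services joinable by trace ID.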

SLO-driven development: every service has SLOs (Service Level Objectives), and the AI should generate code that is designed to meet them. AI rule: 'Before generating a new endpoint: understand the SLO targets (for example, p99 latency < 200ms, availability > 99.9%). Design the implementation to meet these targets: use caching for read-heavy endpoints, async processing for long operations, and circuit breakers for external dependencies. The AI generates performance-aware code, not just functionally correct code.'
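Of the techniques named in this rule, the circuit breaker is the least obvious, so here is a minimal sketch. It is an illustrative implementation, not the platform's library: `CircuitBreaker`, `CircuitOpenError`, and the default threshold and cool-down values are assumptions, and the clock is injectable only to keep the behavior easy to test.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_seconds = reset_after_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        """Invoke func unless the circuit is open; track failures to open or close it."""
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_seconds:
                # Fail fast instead of waiting on a dependency that is known to be down.
                raise CircuitOpenError("dependency circuit is open")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Failing fast is what protects a latency SLO: once the dependency is known to be unhealthy, callers get an immediate error they can handle with a fallback instead of spending the latency budget on a timeout.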

ℹ️ Register Before You Deploy

The service catalog is the organization's inventory of what is running. A service deployed to production without catalog registration cannot be found during an incident (who owns this?), has no SLO (is it degraded or is this normal?), has no runbook (how do we fix it?), and has no on-call rotation (who do we page?). The AI should generate the catalog registration (Backstage entity file) as part of service creation, not as a post-deployment afterthought.

Platform Engineering AI Rules Summary

The AI rules for platform engineering teams building internal developer platforms, in brief:

Golden path: opinionated defaults for service creation, deployment, and configuration. AI follows them
Project structure: standard templates per language/framework. No custom project layouts
Deployment: platform base images, standard Helm charts, defined deployment strategies
Self-service: databases, messaging, networking through platform APIs/modules. No manual provisioning
Service catalog: every service registered with owner, tier, SLOs, dependencies, runbooks
Observability: RED metrics, structured JSON logs, OpenTelemetry tracing. Use platform libraries
SLO-driven: code designed to meet latency and availability targets. Performance-aware by default
Platform rules: both guide platform development AND define product team patterns