Why OCaml Needs Specific AI Rules
OCaml has one of the most expressive type systems among mainstream languages: Hindley-Milner type inference, algebraic data types, parametric polymorphism, and a module system with functors. AI assistants, trained predominantly on imperative code, tend to generate OCaml that ignores these strengths: mutable references where immutable let-bindings would do, string comparisons where variant types belong, and monolithic modules instead of OCaml's signature-based abstraction.
The gap between 'OCaml that compiles' and 'OCaml that leverages the type system for safety' is enormous. A well-typed OCaml program encodes invariants in its types, so the compiler rejects entire categories of bugs before the program ever runs. AI-generated OCaml that leans on runtime checks instead of type-level safety misses much of the point of the language.
OCaml is used in finance (Jane Street), systems (MirageOS), formal verification (Coq), and developer tooling (Flow, Reason). These rules target practical OCaml with an emphasis on type-driven design.
Rule 1: Type-Driven Design with Algebraic Types
The rule: 'Model domain concepts with algebraic data types. Use variant types (sum types) for values that can be one of several forms: type shape = Circle of float | Rectangle of float * float | Triangle of float * float * float. Use record types for compound data: type user = { name: string; age: int; email: string }. Use the type system to make illegal states unrepresentable.'
For option and result: 'Use Option for values that might not exist; never use exceptions or sentinel values for expected absence. Use Result for operations that can fail: type ('a, 'e) result = Ok of 'a | Error of 'e. Pattern match on both Option and Result exhaustively, since the compiler catches missing cases.'
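A minimal sketch of these rules using only the standard library. The names find_user, parse_age, and parse_error are hypothetical and exist only for illustration; the shape type is trimmed to two constructors to keep the example short.

```ocaml
(* Hypothetical domain types, for illustration only. *)
type shape =
  | Circle of float
  | Rectangle of float * float

type user = { name : string; age : int; email : string }

(* Every constructor is handled explicitly. Adding a new shape later
   triggers a non-exhaustive match warning here until it is covered. *)
let area = function
  | Circle r -> Float.pi *. r *. r
  | Rectangle (w, h) -> w *. h

(* Expected absence is an option, not an exception. *)
let find_user (users : user list) (wanted : string) : user option =
  List.find_opt (fun u -> String.equal u.name wanted) users

(* Expected failure is a result carrying a typed error. *)
type parse_error =
  | Empty_input
  | Not_a_number of string
  | Negative of int

let parse_age (s : string) : (int, parse_error) result =
  match String.trim s with
  | "" -> Error Empty_input
  | trimmed -> (
      match int_of_string_opt trimmed with
      | Some n when n >= 0 -> Ok n
      | Some n -> Error (Negative n)
      | None -> Error (Not_a_number trimmed))
```

Because the error type is itself a variant, callers that match on parse_age's result get the same exhaustiveness checking on the failure cases as on the success case.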
For phantom types: 'Use phantom types for compile-time state tracking: type 'a file_handle (where 'a can be open_state or closed_state). This prevents calling read on a closed file handle at compile time, with no runtime check needed.'
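The phantom-type idea can be sketched as follows. This is an illustration under stated assumptions, not a real file API: the File_handle module and its open_, read, and close functions are hypothetical, and read returns a placeholder string instead of touching the file system.

```ocaml
(* Marker types with no values; they are used only as type parameters. *)
type open_state
type closed_state

module File_handle : sig
  type 'state t
  val open_ : string -> open_state t
  val read : open_state t -> string
  val close : open_state t -> closed_state t
end = struct
  (* The phantom parameter never appears in the representation. *)
  type 'state t = { path : string }

  let open_ path = { path }

  (* Placeholder: a real implementation would read from the file. *)
  let read h = "contents of " ^ h.path

  let close h = { path = h.path }
end

let () =
  let h = File_handle.open_ "notes.txt" in
  let text = File_handle.read h in
  let _closed = File_handle.close h in
  (* File_handle.read _closed would be rejected by the type checker. *)
  print_endline text
```

The type must be abstract in the signature for the check to hold. Note also that the original handle h is still usable after close in this sketch; phantom types track the state of each value, not whether an older value has been invalidated.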
- Variant types for domain modeling — make illegal states unrepresentable
- Record types for compound data — named fields over tuples for clarity
- Option for absence — never exceptions for expected 'not found'
- Result for fallible operations — never exceptions for expected failures
- Phantom types for compile-time state tracking when applicable
OCaml's variant types let you make illegal states unrepresentable at the type level. If a function can return a user or an error, encode it as a Result — not an exception. The compiler enforces handling of both cases.
Rule 2: Exhaustive Pattern Matching
The rule: 'Use pattern matching for all control flow involving variant types. Match exhaustively — never use wildcard (_) to suppress incomplete match warnings unless you can prove the remaining cases are impossible. Nested patterns are fine: match expr with Add (Int a, Int b) -> Int (a + b). Use when guards sparingly — prefer encoding conditions in the type when possible.'
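As a small illustration of the rule, here is a hypothetical expression type with an evaluator and a one-level constant folder. The expr type and the eval and simplify functions are assumptions made for this example.

```ocaml
type expr =
  | Int of int
  | Add of expr * expr
  | Mul of expr * expr
  | Neg of expr

(* One arm per constructor, no wildcard. *)
let rec eval = function
  | Int n -> n
  | Add (a, b) -> eval a + eval b
  | Mul (a, b) -> eval a * eval b
  | Neg e -> -(eval e)

(* Nested patterns fold constants one level deep. The fallback arm names
   every constructor instead of using _, so adding a constructor to expr
   makes the compiler flag this function for review. *)
let simplify = function
  | Add (Int a, Int b) -> Int (a + b)
  | Mul (Int a, Int b) -> Int (a * b)
  | Neg (Int n) -> Int (-n)
  | (Int _ | Add _ | Mul _ | Neg _) as e -> e
```

Listing the constructors in the fallback arm is more verbose than a wildcard, but it keeps the exhaustiveness check meaningful as the type evolves.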
For destructuring: 'Destructure in let bindings: let { name; age; _ } = user. Destructure in function parameters: let greet { name; _ } = Printf.printf "Hello, %s" name. Use as for binding the whole and parts: match list with (x :: _ as whole) -> ...'
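The three destructuring forms can be sketched together. The user record mirrors the one used earlier in the article; greeting, describe, and head_and_length are hypothetical helper names.

```ocaml
type user = { name : string; age : int; email : string }

(* Destructuring in a function parameter. *)
let greeting { name; _ } = Printf.sprintf "Hello, %s" name

(* Destructuring in a let binding. *)
let describe user =
  let { name; age; _ } = user in
  Printf.sprintf "%s (%d)" name age

(* `as` binds the whole list while the pattern also binds its head. *)
let head_and_length = function
  | [] -> None
  | x :: _ as whole -> Some (x, List.length whole)
```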
The compiler's exhaustiveness checker is one of OCaml's biggest strengths: it warns whenever a match leaves a constructor unhandled. A wildcard pattern (_) makes the match look complete, so when a new constructor is added later the compiler stays silent and the new case falls into the catch-all branch, where it tends to surface as wrong behavior or a runtime failure. Your rule should force the AI to handle every variant explicitly. The compiler is your ally, not an obstacle.
Rule 3: Module System and Signatures
The rule: 'Define module signatures (interfaces) for all public modules: module type USER_SERVICE = sig val find : int -> User.t option val create : string -> int -> User.t end. Hide implementation details behind signatures — only expose what consumers need. Use .mli files for module interfaces. Keep modules focused — one responsibility per module.'
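A minimal sketch of a signature hiding an implementation. The User module's fields and the In_memory_users implementation are assumptions for illustration; only the USER_SERVICE signature comes from the rule text.

```ocaml
module User = struct
  type t = { id : int; name : string; age : int }
end

module type USER_SERVICE = sig
  val find : int -> User.t option
  val create : string -> int -> User.t
end

(* Sealing with the signature hides store and next_id from consumers. *)
module In_memory_users : USER_SERVICE = struct
  let store : (int, User.t) Hashtbl.t = Hashtbl.create 16
  let next_id = ref 1

  let find id = Hashtbl.find_opt store id

  let create name age =
    let user = { User.id = !next_id; name; age } in
    incr next_id;
    Hashtbl.add store user.User.id user;
    user
end
```

Code outside the module can call In_memory_users.find and In_memory_users.create but cannot reach the table or the counter. In a multi-file project the same sealing is usually expressed with a .mli file next to the .ml file.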
For functors: 'Use functors for parameterized modules: module Make (DB : DATABASE) : USER_SERVICE. Functors enable dependency injection at the module level — the database implementation is a parameter, not a hardcoded dependency. This is OCaml's answer to interfaces and dependency injection.'
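The functor form can be sketched like this. The operations in DATABASE (get, put, fresh_id) and the Memory_db implementation are hypothetical; the rule text names only the signatures and the Make functor.

```ocaml
module User = struct
  type t = { id : int; name : string; age : int }
end

(* Hypothetical storage interface; the operations are assumptions. *)
module type DATABASE = sig
  val get : int -> User.t option
  val put : User.t -> unit
  val fresh_id : unit -> int
end

module type USER_SERVICE = sig
  val find : int -> User.t option
  val create : string -> int -> User.t
end

(* The service depends on the DATABASE signature, not on one backend. *)
module Make (DB : DATABASE) : USER_SERVICE = struct
  let find id = DB.get id

  let create name age =
    let user = { User.id = DB.fresh_id (); name; age } in
    DB.put user;
    user
end

(* An in-memory backend, convenient for tests. *)
module Memory_db : DATABASE = struct
  let table : (int, User.t) Hashtbl.t = Hashtbl.create 16
  let counter = ref 0
  let get id = Hashtbl.find_opt table id
  let put (u : User.t) = Hashtbl.replace table u.User.id u

  let fresh_id () =
    incr counter;
    !counter
end

module Users = Make (Memory_db)
```

Swapping the backend means applying Make to a different module that satisfies DATABASE; the service code itself does not change.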
For first-class modules: 'Use first-class modules (module packing) for runtime polymorphism when needed: let db = (module PostgresDB : DATABASE). Prefer static module composition (functors) over first-class modules — use first-class only when the module choice is genuinely determined at runtime.'
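A sketch of choosing a module at runtime, assuming a deliberately tiny DATABASE signature with a single get function. Postgres_db here is a stub standing in for a real driver, and the backend type, select_db, and lookup are hypothetical names.

```ocaml
module type DATABASE = sig
  val get : string -> string option
end

(* Stub standing in for a real driver; it always reports a miss. *)
module Postgres_db : DATABASE = struct
  let get _key = None
end

module Memory_db : DATABASE = struct
  let get key = if String.equal key "greeting" then Some "hello" else None
end

(* The backend is a variant, for example parsed from configuration. *)
type backend = Postgres | In_memory

(* Pack the chosen module as an ordinary value. *)
let select_db : backend -> (module DATABASE) = function
  | Postgres -> (module Postgres_db : DATABASE)
  | In_memory -> (module Memory_db : DATABASE)

(* Unpack the module value to use it. *)
let lookup (db : (module DATABASE)) (key : string) : string option =
  let module Db = (val db : DATABASE) in
  Db.get key
```

If the backend is known when the program is built, a functor application is simpler and needs no packing or unpacking.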
Rule 4: Immutability and Effect Isolation
The rule: 'Prefer immutable values (let) over mutable references (ref). Use immutable data structures from the standard library: List, Map, Set. When mutation is needed, isolate it: mutable state lives in a small, well-defined scope — not spread across the program. Use the State monad or explicit state passing for stateful computations.'
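Two small sketches of this rule, using only the standard library. word_counts and join_lines are hypothetical helpers: the first threads an immutable Map through a fold, the second confines mutation to a local Buffer that never escapes the function.

```ocaml
module String_map = Map.Make (String)

(* Immutable accumulation: each step returns a new map. *)
let word_counts (words : string list) : int String_map.t =
  List.fold_left
    (fun counts w ->
      String_map.update w
        (function None -> Some 1 | Some n -> Some (n + 1))
        counts)
    String_map.empty words

(* Isolated mutation: the buffer is created, filled, and read inside one
   function, so callers only ever see an immutable string. *)
let join_lines (lines : string list) : string =
  let buf = Buffer.create 64 in
  List.iter
    (fun line ->
      Buffer.add_string buf line;
      Buffer.add_char buf '\n')
    lines;
  Buffer.contents buf
```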
For I/O: 'Separate pure computation from I/O. Pure functions take data and return data — no side effects. I/O happens at the program boundary: reading files, network calls, database access. Use Lwt or Async for concurrent I/O — never blocking I/O in concurrent contexts.'
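The pure-core, I/O-at-the-boundary split can be sketched as below. To stay self-contained this uses the standard library's blocking channels, not Lwt or Async; summarize, read_lines, and run are hypothetical names.

```ocaml
(* Pure core: data in, data out, trivially testable. *)
let summarize (lines : string list) : string =
  let non_empty = List.filter (fun l -> String.trim l <> "") lines in
  Printf.sprintf "%d non-empty of %d lines" (List.length non_empty)
    (List.length lines)

(* I/O boundary: the expected failure (unreadable file) becomes a result. *)
let read_lines (path : string) : (string list, string) result =
  match open_in path with
  | exception Sys_error msg -> Error msg
  | ic ->
      Fun.protect
        ~finally:(fun () -> close_in_noerr ic)
        (fun () ->
          let rec loop acc =
            match input_line ic with
            | line -> loop (line :: acc)
            | exception End_of_file -> List.rev acc
          in
          Ok (loop []))

(* Entry point: the only place where effects and pure logic meet. *)
let run (path : string) : unit =
  match read_lines path with
  | Ok lines -> print_endline (summarize lines)
  | Error msg -> prerr_endline ("cannot read file: " ^ msg)
```

Because summarize never touches a channel, it can be unit tested with plain lists, and the same function could be reused unchanged behind an Lwt or Async reader.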
AI assistants reach for mutable references (ref) because they are familiar from imperative languages. In OCaml, let-bindings are immutable by default and mutation should be the exception, not the rule. Code with less mutation is easier to reason about, test, and parallelize, so your rules should redirect the AI from ref back to let.
Rule 5: Build System and Dependencies
The rule: 'Use dune for builds — every project has a dune-project file and dune files in source directories. Use opam for dependency management: create an .opam file for the project, pin dependencies in opam lock files. Use ocamlformat for formatting: commit a .ocamlformat file with the project's style. Run dune build, dune test, and dune fmt in CI.'
For project structure: 'lib/ for library code (compiled as a library), bin/ for executables, test/ for tests. Each directory has a dune file. The dune-project file at the root declares the dune language version and the project name; constrain the OCaml version in the package's depends field. Use libraries and public_name in dune for module organization.'
For testing: 'Use Alcotest or OUnit for unit tests. Use ppx_expect for inline expect tests (snapshot-style). Use ppx_inline_test for lightweight inline tests. Run tests with dune runtest. Use bisect_ppx for code coverage.'
- dune for builds — dune-project + dune files in each directory
- opam for dependencies — .opam file, lock for reproducibility
- ocamlformat for formatting — .ocamlformat committed to repo
- lib/ for libraries, bin/ for executables, test/ for tests
- Alcotest or ppx_expect for testing — bisect_ppx for coverage
- dune build + dune test + dune fmt in CI
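For orientation, a minimal dune layout matching the structure above might look like the following. The project name myapp, the module names main and test_myapp, and the version numbers are placeholders, and the four files are shown in one block separated by comments.

```dune
; dune-project (project root)
(lang dune 3.0)
(name myapp)
(generate_opam_files true)
(package
 (name myapp)
 (depends
  (ocaml (>= 4.14))
  dune
  (alcotest :with-test)))

; lib/dune
(library
 (name myapp)
 (public_name myapp))

; bin/dune
(executable
 (name main)
 (public_name myapp)
 (libraries myapp))

; test/dune
(test
 (name test_myapp)
 (libraries myapp alcotest))
```

With generate_opam_files enabled, dune writes the .opam file from the package stanza, so the dependency list lives in one place.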
Complete OCaml Rules Template
Consolidated rules for OCaml projects.
- Algebraic types for domain modeling — make illegal states unrepresentable
- Exhaustive pattern matching — no wildcard suppression of incomplete matches
- Option for absence, Result for failures — never exceptions for expected cases
- Module signatures (.mli) for all public modules — functors for parameterization
- Immutable by default — ref only in isolated, well-defined scopes
- Pure computation separated from I/O — Lwt/Async for concurrency
- dune for builds — opam for deps — ocamlformat for style
- Alcotest/ppx_expect for testing — bisect_ppx for coverage — CI runs all