CLAUDE.md for Jupyter Notebooks

Notebooks are the worst environment for reproducibility. AI rules for cell ordering, state management, parameterization, and converting notebooks to production code.

7 min read·December 22, 2025

If Restart & Run All fails, the notebook is broken — period

Linear execution, parameterization, output management, and notebook-to-production patterns

Why Jupyter Notebooks Need AI Rules

Jupyter notebooks are the most popular environment for data science and ML — and the worst environment for reproducibility. Notebooks allow out-of-order execution, hidden global state, and tangled dependencies between cells. AI assistants make these problems worse by generating cells that depend on invisible state from previous executions, creating circular dependencies, and producing notebooks that only work when cells are run in the exact right order.

The core challenge: notebooks mix exploration (trying things, visualizing data, iterating) with production intent (reproducible analysis, deployable models). AI rules can't prevent exploration — that is where notebooks earn their value. But they can ensure that once exploration is done, the notebook is left in a state anyone can reproduce.

These rules apply to Jupyter, JupyterLab, Google Colab, Kaggle notebooks, VS Code notebooks, and any IPython-based environment.

Rule 1: Linear Execution — Top to Bottom

The rule: 'Notebooks must execute correctly from top to bottom with Kernel → Restart & Run All. If a notebook doesn't produce correct results after a full restart and run, it's broken. Never rely on cells being run in a specific non-linear order. Never rely on state from a previous kernel session. Test with restart-and-run-all before committing.'

For cell ordering: 'Imports in the first cell. Configuration and constants in the second. Data loading in the third. Then transformation, analysis, and visualization, in that order. Each cell should be independently understandable — it reads its inputs from the cells that precede it, not from global state set far earlier in the notebook.'

AI assistants generate cells that work when you run them interactively but fail on restart-and-run-all. The two most common failures: a cell that depends on a variable defined in a cell below it, and a cell that was run twice, where the second execution left behind different state than the first.

  • Restart & Run All must succeed — if it doesn't, the notebook is broken
  • Cell order: imports → config → data loading → transforms → analysis → viz
  • Each cell reads from preceding cells — never from invisible prior state
  • No circular dependencies between cells
  • Clear all outputs before committing (reduces diff noise in git)
⚠️ The Restart Test

If Restart & Run All doesn't produce correct results, the notebook is broken. Full stop. This is the single test that separates a reproducible analysis from a pile of cells that happened to work once.
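The ordering above can be sketched in jupytext percent format, where each `# %%` marker is one notebook cell. Everything here is hypothetical (the inline rows stand in for a real file read so the sketch runs as-is), but the shape — imports, then config, then loading, then transforms, then output — is the rule:

```python
# %% Imports — always the first cell, nothing else in it
from pathlib import Path
import statistics

# %% Configuration — second cell: every tunable value lives here
DATA_DIR = Path("data/raw")   # would point at real data in practice
ROLLING_WINDOW = 3

# %% Data loading — third cell (inline rows stand in for a CSV read)
raw_rows = [("2025-01-01", 10), ("2025-01-02", 14), ("2025-01-03", 9)]

# %% Transformation — reads only names defined in cells above
values = [v for _, v in raw_rows]
mean_value = statistics.mean(values)

# %% Analysis/output — the final cell consumes the transformed names
print(f"mean={mean_value:.1f}")
```

Because every cell reads only from cells above it, running the file top to bottom is the same as Restart & Run All — there is no hidden state to get wrong.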

Rule 2: Cell Design and Size

The rule: 'One concept per cell. A cell either: loads data, transforms data, creates a visualization, trains a model, or evaluates results — never more than one. Keep cells under 20 lines of code. Use markdown cells between code cells to explain what the next cell does and why. Every notebook section (data loading, EDA, modeling) starts with a markdown header cell.'

For naming: 'Define variables with descriptive names in the cell that creates them. Don't reuse variable names across cells for different purposes — raw_data in cell 3 should not be overwritten with different data in cell 15. Use suffixes for transformed data: raw_df → cleaned_df → features_df → results_df.'

AI assistants generate large cells of 50+ lines that load data, transform it, and create a plot all at once. Such cells are impossible to debug, impossible to reuse, and impossible to understand when you come back to them a week later.
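The one-concept-per-cell and suffix-naming rules together look like this. The rows and field names are made up, and plain dicts stand in for a dataframe so the sketch runs without pandas — the point is that each step writes a new name and never overwrites an upstream one:

```python
# %% Load — produces raw_rows and nothing else
raw_rows = [{"city": " NYC ", "temp": "21"}, {"city": "SF", "temp": None}]

# %% Clean — a new name; raw_rows is never overwritten
cleaned_rows = [
    {"city": r["city"].strip(), "temp": int(r["temp"])}
    for r in raw_rows
    if r["temp"] is not None
]

# %% Features — again a new name, downstream of the clean step
feature_rows = [{**r, "temp_f": r["temp"] * 9 / 5 + 32} for r in cleaned_rows]

print(len(feature_rows), feature_rows[0]["city"])
```

If a later cell produces wrong results, you can inspect raw_rows, cleaned_rows, and feature_rows independently and see exactly which step introduced the problem.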

Rule 3: Parameterize Everything

The rule: 'All configurable values go in a parameters cell at the top of the notebook: file paths, date ranges, model hyperparameters, random seeds, output directories. Use papermill-compatible parameter cells (tagged with parameters) for automated execution. Never hardcode file paths, dates, or magic numbers in analysis cells.'

For file paths: 'Use pathlib.Path for all file paths. Define data directories as parameters: DATA_DIR = Path("data/raw"). Never use absolute paths. Use environment variables for paths that differ between machines: DATA_DIR = Path(os.environ.get("DATA_DIR", "data/raw")).'

For reproducibility: 'Set random seeds in the parameters cell: RANDOM_SEED = 42. Pass the seed to every random operation: np.random.seed(RANDOM_SEED), train_test_split(..., random_state=RANDOM_SEED). Document all parameters in a markdown cell: what each one controls and what reasonable values are.'

  • Parameters cell at top: file paths, dates, hyperparameters, seeds
  • papermill-compatible tagging for automated notebook execution
  • pathlib.Path for all file paths — never string concatenation
  • Random seeds set once, passed everywhere — documented in markdown
  • No magic numbers in analysis cells — all configurable from parameters
💡 Parameters Cell

All configurable values in one cell at the top: file paths, date ranges, random seeds, hyperparameters. Tag it for papermill. Changing the analysis means editing one cell, not hunting through 50 cells.
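A parameters cell following these rules might look like the sketch below (in jupytext percent format, where the tags annotation marks the cell for papermill; the dates and directory names are hypothetical):

```python
# %% tags=["parameters"]
# Papermill injects overrides for anything defined in this cell.
import os
from pathlib import Path

# Data location: env var wins, with a relative default — never an absolute path
DATA_DIR = Path(os.environ.get("DATA_DIR", "data/raw"))
OUTPUT_DIR = Path("outputs")

# Analysis window (hypothetical values)
START_DATE = "2025-01-01"
END_DATE = "2025-03-31"

# Single seed, passed to every random operation downstream
RANDOM_SEED = 42
```

With the cell tagged, an automated run can override any value from the command line, e.g. papermill analysis.ipynb runs/analysis.ipynb -p RANDOM_SEED 7 (filenames hypothetical).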

Rule 4: Output and Artifact Management

The rule: 'Clear all cell outputs before committing to version control — outputs inflate repository size and create merge conflicts. Save important figures explicitly: fig.savefig("figures/analysis_plot.png", dpi=300, bbox_inches="tight"). Save important dataframes: results.to_csv("outputs/results.csv", index=False). The notebook runs code; artifacts are saved to files.'

For version control: 'Use nbstripout as a pre-commit hook to automatically strip outputs before commit. Or use jupytext to save notebooks as .py files (percent format) alongside the .ipynb. The .py file diffs cleanly and is the source of truth; the .ipynb is generated.'

For large outputs: 'Never display full dataframes in cells — use .head(), .describe(), or .sample(). Never display high-resolution images inline — save to files and display thumbnails. Never load datasets larger than memory — use dask, vaex, or polars for out-of-core processing.'
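The "artifacts are saved to files" half of the rule is mechanical: create the output directory, write the file, and let the cell output be nothing more than a confirmation. A minimal stdlib sketch (the results and filenames are hypothetical; csv stands in for to_csv so it runs without pandas):

```python
import csv
from pathlib import Path

OUTPUT_DIR = Path("outputs")
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)  # idempotent on re-run

# Hypothetical evaluation results produced earlier in the notebook
results = [{"model": "baseline", "auc": 0.81}, {"model": "tuned", "auc": 0.87}]

out_path = OUTPUT_DIR / "results.csv"
with out_path.open("w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["model", "auc"])
    writer.writeheader()
    writer.writerows(results)

# The cell output is just a pointer to the artifact, not the data itself
print(f"wrote {out_path}")
```

The notebook can then be stripped of outputs before commit without losing anything: the results live in outputs/, not in the .ipynb JSON.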

Rule 5: From Notebook to Production Code

The rule: 'Notebooks are for exploration. Production code lives in .py files. When an analysis is finalized: extract reusable functions into a Python module (src/ directory), import them into the notebook, and have the notebook call functions rather than contain logic. The notebook becomes a thin orchestration layer that demonstrates the analysis.'

For the transition: 'Identify functions that are reusable: data loading, cleaning, feature engineering, model training, evaluation metrics. Move them to appropriately named modules: src/data.py, src/features.py, src/models.py. Add type hints and docstrings. Write tests for each function. The notebook imports and calls these modules.'
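As a sketch of the extraction step: the function below is the kind of thing that would move into a module like src/data.py (the module, function name, and data are all hypothetical), with type hints and a docstring, leaving the notebook a one-line import-and-call:

```python
# Would live in src/data.py after extraction (hypothetical module)
def clean_temperatures(rows: list[dict]) -> list[dict]:
    """Drop rows with missing temperatures; strip whitespace from city names."""
    return [
        {"city": r["city"].strip(), "temp": float(r["temp"])}
        for r in rows
        if r.get("temp") is not None
    ]


# The notebook cell then shrinks to an import and a call:
# from src.data import clean_temperatures
cleaned = clean_temperatures(
    [{"city": " NYC ", "temp": "21"}, {"city": "SF", "temp": None}]
)
print(len(cleaned), cleaned[0]["city"])
```

Once the logic lives in a module, it can be unit-tested with pytest and reused by other notebooks and pipelines, while the notebook itself stays a thin demonstration layer.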

For deployment: 'Notebooks are never deployed directly. If the analysis needs to run on a schedule, convert it to a Python script (jupyter nbconvert --to script) or a Prefect/Dagster pipeline. If it needs to serve predictions, extract the model into a FastAPI endpoint. Notebooks are development environments, not production environments.'

Notebooks → Modules

When an analysis is done, extract reusable functions to .py modules. The notebook becomes a thin orchestration layer: import, call, display. Functions are testable, reusable, and version-controlled cleanly.

Complete Jupyter Notebook Rules Template

Consolidated rules for Jupyter notebooks in data science projects.

  • Restart & Run All must succeed — test before every commit
  • Linear execution: imports → config → data → transforms → analysis → output
  • One concept per cell — under 20 lines — markdown headers between sections
  • Parameters cell at top: paths, dates, seeds, hyperparameters — papermill-compatible
  • Clear outputs before commit — nbstripout hook or jupytext .py pairing
  • Save artifacts to files — never rely on cell output for important results
  • Extract reusable logic to .py modules — notebook calls functions, doesn't define them
  • Never deploy notebooks — convert to scripts or pipelines for production