Notebooks
Interactive notebooks live in notebooks/ at the repo root. They are the primary narrative
format for investigations — code, charts, and commentary together.
Viewing notebooks
GitHub renders .ipynb files natively. Click any link below to read the notebook
in your browser without cloning.
Execution note
These notebooks load artifacts from specific experiment runs (run IDs and artifact paths
are hardcoded to the original research environment). To re-run them end-to-end, first
reproduce the relevant sweep with mech sweep --execute, then update the artifact path
constants at the top of each notebook.
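As a rough illustration of the "update the artifact path constants" step, a notebook's first cell could validate its paths before any analysis runs. This is a minimal sketch, not the notebooks' actual code: `require_artifact` is a hypothetical helper, and the path shown is borrowed from the notebook 06 instructions purely as an example.

```python
from pathlib import Path

# Example constant only; each notebook defines its own artifact paths.
REPORT_PATH = Path("artifacts/stability_layer0_128.json")


def require_artifact(path: Path) -> Path:
    """Return `path` unchanged if the file exists, otherwise raise with a hint."""
    if not path.is_file():
        raise FileNotFoundError(
            f"Artifact not found: {path}. Reproduce the sweep with "
            "`mech sweep --execute`, then update the path constant."
        )
    return path
```

Calling `require_artifact(REPORT_PATH)` at the top of a notebook would surface a stale path immediately instead of partway through a long execution.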
Notebook index
| Notebook | Investigation | Status |
|---|---|---|
| 06 — SAE Replication Crisis | Investigation #3 | Complete — pairwise cosine heatmaps, distribution histograms, 4-condition comparison |
| 07 — GPT-2 Factual Recall Story | Investigation #2 | Complete — logit lens, DLA, circuit patching, SAE feature cluster |
| 08 — Feature Splitting | Investigation #4 | Complete — 4-size comparison, fidelity bar chart |
| 09 — Refusal Audit | Investigation #1 | Complete — direction quality sweep, CAA compliance plots |
| 10 — SAE at Scale | Investigation #5 | Complete — 2048-feature dictionary, feature cluster analysis |
| 05 — Research Walkthrough | General | Tutorial — walks through a full experiment from spec to artifact |
| 01 — Circuit Patching | General | Tutorial — activation patching basics |
| 02 — Polysemanticity SAE | General | Tutorial — train and inspect a small SAE |
| 03 — ACDC Lite | General | Tutorial — automated circuit discovery |
| 04 — Agentic Loop | General | Tutorial — orchestration and iterate loop |
Reproducing a notebook from scratch
SAE Replication Crisis (notebook 06)
# 1. Train 5 SAEs at layer 0
mech sweep \
  --base experiments/polysemanticity.yaml \
  --axis "parameters.seed=1,2,3,4,5" \
  --output experiments/sweeps/sae_seed_stability_layer0.yaml \
  --execute

# 2. Generate the stability report
mech analyze-sae-stability \
  --sweep experiments/sweeps/sae_seed_stability_layer0.yaml \
  --output artifacts/stability_layer0_128.json \
  --live-only

# 3. Update REPORT_PATH in notebook cell 1 to point at your new artifact
# 4. jupyter nbconvert --execute --to notebook --inplace notebooks/06_sae_replication_crisis.ipynb
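Step 3 can also be scripted, since an .ipynb file is plain JSON. The sketch below is one possible approach under stated assumptions: `set_constant` and `patch_notebook` are hypothetical helpers, and the code assumes the constant is assigned as a plain string literal on its own line (`REPORT_PATH = "..."`) in the first code cell. If the notebook wraps the value in something else (for example `Path(...)`), edit the cell by hand instead.

```python
import json
import re
from pathlib import Path


def set_constant(notebook: dict, name: str, value: str) -> bool:
    """Rewrite `name = ...` in the first code cell; return True if a line changed."""
    pattern = re.compile(rf"^{re.escape(name)}\s*=.*$", flags=re.MULTILINE)
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format allows `source` to be a string or a list of lines.
        was_list = isinstance(source, list)
        text = "".join(source) if was_list else source
        # A callable replacement avoids backslash-escape surprises in `value`.
        new_text, count = pattern.subn(
            lambda _m: f"{name} = {value!r}", text, count=1
        )
        if count:
            cell["source"] = (
                new_text.splitlines(keepends=True) if was_list else new_text
            )
        return bool(count)  # only the first code cell is inspected
    return False


def patch_notebook(path: Path, name: str, value: str) -> bool:
    """Load a notebook file, update the constant, and write it back if changed."""
    notebook = json.loads(path.read_text(encoding="utf-8"))
    changed = set_constant(notebook, name, value)
    if changed:
        path.write_text(
            json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
            encoding="utf-8",
        )
    return changed
```

For example, `patch_notebook(Path("notebooks/06_sae_replication_crisis.ipynb"), "REPORT_PATH", "artifacts/stability_layer0_128.json")` would return `False` and leave the file untouched if no matching assignment is found, which is a cue to check the cell manually.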
GPT-2 Factual Recall (notebook 07)
# Reproduce the six runs (adjust run IDs after execution)
mech run --name logit-lens-factual
mech run --name direct-logit-attribution-factual
mech run --name attribution-patching-factual
mech run --name circuit-patching-factual
mech run --name polysemanticity-sae-layer9-factual
mech run --name causal-scrubbing-factual
# Then update run ID constants at top of notebook 07
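If you prefer to launch the six runs from a script and stop at the first failure, something like the following could work. It is a sketch only: it shells out to the same `mech run --name ...` commands listed above, `build_commands` and `run_all` are hypothetical helper names, and running the experiments sequentially in this order is an assumption rather than a documented requirement.

```python
import subprocess

# Run names copied from the commands above.
RUN_NAMES = [
    "logit-lens-factual",
    "direct-logit-attribution-factual",
    "attribution-patching-factual",
    "circuit-patching-factual",
    "polysemanticity-sae-layer9-factual",
    "causal-scrubbing-factual",
]


def build_commands(names):
    """Return one `mech run` argument list per run name."""
    return [["mech", "run", "--name", name] for name in names]


def run_all(names, runner=subprocess.run):
    """Execute each run in order; `check=True` aborts on the first non-zero exit."""
    for cmd in build_commands(names):
        runner(cmd, check=True)
```

Calling `run_all(RUN_NAMES)` executes the runs one after another. The `runner` parameter exists so the command list can be inspected or dry-run without invoking the CLI.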