Cockpit¶
The cockpit is a local FastAPI + Jinja2 web UI for browsing experiment runs, artifacts, and reports. It runs entirely on your machine, so no data leaves your environment.
Launch¶
The cockpit is implemented in src/mech_interp/cockpit.py.
Views¶
Dashboard¶
/ — Summary cards: total runs, runs today, active queue items, last-run family.
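As a rough illustration of how the four cards could be derived, here is a minimal sketch over in-memory records. The `RunRecord` shape, the `dashboard_summary` helper, and the reading of "active" as pending or running are all assumptions made for the example, not the cockpit's actual schema or code.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class RunRecord:
    # Hypothetical record shape; the real cockpit's schema may differ.
    run_id: str
    family: str
    started_at: datetime


def dashboard_summary(runs, queue_statuses, today):
    """Compute the four dashboard cards from in-memory records."""
    latest = max(runs, key=lambda r: r.started_at, default=None)
    return {
        "total_runs": len(runs),
        "runs_today": sum(1 for r in runs if r.started_at.date() == today),
        # Assumption: "active" means pending or running.
        "active_queue_items": sum(
            1 for s in queue_statuses if s in {"pending", "running"}
        ),
        "last_run_family": latest.family if latest else None,
    }


runs = [
    RunRecord("r1", "polysemanticity_sae", datetime(2024, 5, 1, 9, 0)),
    RunRecord("r2", "refusal_direction", datetime(2024, 5, 2, 14, 30)),
    RunRecord("r3", "caa_steering", datetime(2024, 5, 2, 16, 0)),
]
summary = dashboard_summary(
    runs, ["pending", "running", "completed", "failed"], date(2024, 5, 2)
)
print(summary)
```

Passing `today` explicitly, rather than calling `date.today()` inside the helper, keeps the calculation deterministic and easy to test.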
Runs¶
/runs — Paginated table of all experiment runs with filtering by family, status,
and date range. Click any run to open the detail view.
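The filtering and pagination described above can be sketched in plain Python. The dictionary keys (`family`, `status`, `date`), the helper names, and the default page size are illustrative assumptions; the actual route handler may query a database instead.

```python
from datetime import date


def filter_runs(runs, family=None, status=None, start=None, end=None):
    """Keep runs matching every filter that is set; None means 'no filter'."""
    selected = []
    for run in runs:
        if family is not None and run["family"] != family:
            continue
        if status is not None and run["status"] != status:
            continue
        if start is not None and run["date"] < start:
            continue
        if end is not None and run["date"] > end:
            continue
        selected.append(run)
    return selected


def paginate(items, page, per_page=20):
    """Return the items on a 1-indexed page and the total page count."""
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    first = (page - 1) * per_page
    return items[first:first + per_page], total_pages


runs = [
    {"run_id": "r1", "family": "polysemanticity_sae", "status": "completed", "date": date(2024, 5, 1)},
    {"run_id": "r2", "family": "refusal_direction", "status": "failed", "date": date(2024, 5, 2)},
    {"run_id": "r3", "family": "polysemanticity_sae", "status": "completed", "date": date(2024, 5, 3)},
    {"run_id": "r4", "family": "caa_steering", "status": "completed", "date": date(2024, 5, 4)},
    {"run_id": "r5", "family": "polysemanticity_sae", "status": "running", "date": date(2024, 5, 5)},
]
completed = filter_runs(runs, status="completed")
page_items, total_pages = paginate(completed, page=2, per_page=2)
```

Filtering before paginating means the page count reflects the filtered result set rather than the full table.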
Run detail¶
/runs/<run_id> — Full run metadata, metrics, spec, and artifact browser.
Shows environment provenance (Python version, package versions, git SHA).
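One way such provenance could be collected at run time is sketched below using only the standard library. The `capture_provenance` helper and the keys of the returned dictionary are hypothetical; the project may record this information differently.

```python
import subprocess
import sys
from importlib import metadata


def capture_provenance(packages):
    """Collect the Python version, selected package versions, and the git SHA."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    try:
        sha = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        sha = None  # git is missing, or we are not inside a repository
    return {
        "python_version": sys.version.split()[0],
        "package_versions": versions,
        "git_sha": sha,
    }


provenance = capture_provenance(["pip", "a-package-that-does-not-exist"])
print(provenance)
```

Recording `None` for a missing package or SHA, instead of raising, lets a run proceed even when it is launched outside a git checkout.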
Reports¶
/reports — Aggregate reports generated by mech report-runs. Each report is a
markdown-rendered HTML page with metric summaries and per-family findings.
Queue¶
/queue — The experiment queue: pending, running, completed, and failed items.
Queue items can be paused, resumed, and requeued from this view.
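The pause, resume, and requeue actions can be thought of as transitions in a small state machine. The sketch below is one plausible model: the `paused` status, the `TRANSITIONS` table, and which statuses each action accepts are assumptions made for illustration, since the source does not spell them out.

```python
# Assumed transition table: (current status, action) -> new status.
TRANSITIONS = {
    ("pending", "pause"): "paused",
    ("paused", "resume"): "pending",
    ("failed", "requeue"): "pending",
    ("completed", "requeue"): "pending",
}


def apply_action(status, action):
    """Return the new status, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"cannot {action} an item that is {status}") from None


status = apply_action("pending", "pause")   # pending -> paused
status = apply_action(status, "resume")     # paused -> pending
```

Keeping the allowed transitions in a single table makes it easy to reject invalid requests, such as pausing an item that has already completed.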
Timeline¶
/timeline — Chronological view of all runs with per-family color coding.
Useful for tracking a long sweep's progress.
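A minimal sketch of the underlying idea: sort runs by start time and give each family a stable color. The `build_timeline` helper, the `started_at` key, and the palette are illustrative assumptions, not the contents of cockpit_timeline.py.

```python
from datetime import datetime

PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]  # illustrative colors


def build_timeline(runs):
    """Sort runs by start time and attach a stable color per family."""
    families = sorted({run["family"] for run in runs})
    # Colors wrap around if there are more families than palette entries.
    color_of = {fam: PALETTE[i % len(PALETTE)] for i, fam in enumerate(families)}
    ordered = sorted(runs, key=lambda run: run["started_at"])
    return [{**run, "color": color_of[run["family"]]} for run in ordered]


timeline = build_timeline([
    {"run_id": "r2", "family": "refusal_direction", "started_at": datetime(2024, 5, 2, 14, 0)},
    {"run_id": "r1", "family": "polysemanticity_sae", "started_at": datetime(2024, 5, 1, 9, 0)},
    {"run_id": "r3", "family": "polysemanticity_sae", "started_at": datetime(2024, 5, 3, 8, 0)},
])
```

Assigning colors from the sorted family names keeps a family's color the same across page reloads, as long as the set of families does not change.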
Compare¶
/compare?a=<run_id>&b=<run_id> — Side-by-side metric diff between two runs.
Highlights changed hyperparameters and metric deltas.
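The diff itself reduces to two dictionary comparisons. The following sketch assumes each run exposes its spec and metrics as flat dictionaries; the helper names are hypothetical and not taken from cockpit_compare.py.

```python
def diff_hyperparameters(spec_a, spec_b):
    """Return {key: (value_a, value_b)} for keys whose values differ."""
    keys = sorted(set(spec_a) | set(spec_b))
    return {
        k: (spec_a.get(k), spec_b.get(k))
        for k in keys
        if spec_a.get(k) != spec_b.get(k)
    }


def metric_deltas(metrics_a, metrics_b):
    """Return b - a for every metric present in both runs."""
    shared = sorted(set(metrics_a) & set(metrics_b))
    return {k: metrics_b[k] - metrics_a[k] for k in shared}


changed = diff_hyperparameters(
    {"lr": 0.001, "layers": 12},
    {"lr": 0.0003, "layers": 12, "seed": 1},
)
deltas = metric_deltas(
    {"loss": 0.5, "accuracy": 0.75},
    {"loss": 0.25, "accuracy": 0.875},
)
```

A key that exists in only one spec shows up with `None` on the other side, so added or removed hyperparameters are highlighted alongside changed ones.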
SAE features¶
/runs/<run_id>/sae-features — For polysemanticity_sae runs: live feature count,
dead feature heatmap, top-5 activating prompts per feature, decoder cosine matrix.
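To make these quantities concrete, here is a small pure-Python sketch. It assumes a feature counts as dead when it never exceeds a threshold on any prompt, and that decoder directions are nonzero vectors; the function names and data layout are illustrative, and a real implementation would likely use an array library.

```python
import math


def dead_features(activations, threshold=0.0):
    """Indices of features that never exceed `threshold` on any prompt.

    `activations[p][f]` is the activation of feature f on prompt p.
    """
    n_features = len(activations[0])
    return [
        f for f in range(n_features)
        if all(row[f] <= threshold for row in activations)
    ]


def top_prompts(activations, prompts, feature, k=5):
    """The k prompts with the highest activation for one feature."""
    ranked = sorted(
        range(len(prompts)), key=lambda p: activations[p][feature], reverse=True
    )
    return [prompts[p] for p in ranked[:k]]


def cosine_matrix(decoder_rows):
    """Pairwise cosine similarity between (nonzero) decoder directions."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in decoder_rows]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(decoder_rows, norms)]
        for u, nu in zip(decoder_rows, norms)
    ]


activations = [[0.9, 0.0, 0.1], [0.2, 0.0, 0.8], [0.5, 0.0, 0.0]]
prompts = ["prompt a", "prompt b", "prompt c"]
dead = dead_features(activations)
best = top_prompts(activations, prompts, feature=0, k=2)
cosines = cosine_matrix([[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]])
```

Under this definition the live feature count is the total number of features minus `len(dead)`.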
Refusal direction¶
/runs/<run_id>/refusal-direction — For refusal_direction and caa_steering runs:
direction quality score, per-layer sweep plots, compliance-rate comparison.
Artifact browser¶
Every run's detail page includes a file browser for the artifacts/run-<id>/ directory.
JSON files are rendered as collapsible trees; .png files are shown inline.
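The browser's behavior can be approximated by walking the run directory and choosing a renderer from each file's suffix. In this sketch the renderer labels and the `download` fallback for other file types are assumptions, and the directory is a temporary stand-in for a real artifacts/run-<id>/ folder.

```python
import tempfile
from pathlib import Path


def renderer_for(path):
    """Pick how a file is displayed, based only on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        return "json-tree"
    if suffix == ".png":
        return "inline-image"
    return "download"  # assumed fallback for every other file type


def list_artifacts(run_dir):
    """Return sorted (relative path, renderer) pairs for files under run_dir."""
    root = Path(run_dir)
    return sorted(
        (p.relative_to(root).as_posix(), renderer_for(p))
        for p in root.rglob("*")
        if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "artifacts" / "run-demo"  # hypothetical run directory
    (run_dir / "plots").mkdir(parents=True)
    (run_dir / "metrics.json").write_text("{}")
    (run_dir / "plots" / "loss.png").write_bytes(b"")
    (run_dir / "notes.txt").write_text("hello")
    listing = list_artifacts(run_dir)
```

Returning paths relative to the run directory keeps absolute filesystem locations out of the rendered page.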
Implementation¶
| File | Purpose |
|---|---|
| src/mech_interp/cockpit.py | FastAPI app, route handlers, template context |
| src/mech_interp/cockpit_timeline.py | Timeline view logic |
| src/mech_interp/cockpit_compare.py | Compare view logic |
| src/mech_interp/templates/ | Jinja2 HTML templates |