Research Program

Mechanistic Validity

A four-paper philosophical program asking what a mechanism is, when a mechanistic claim is warranted, when a mechanism term refers across systems, and when individual discoveries compose into understanding, plus a companion paper testing when those criteria are met across independent methods.

Mechanistic Views → Mechanistic Validity → Mechanistic Reference → Mechanistic Knowledge

What is a mechanism? → Is the claim warranted? → Does the term refer across systems? → When do discoveries compose into understanding?


Mechanistic Views

Mechanistic Views: A Five-Axis Ontology for Mechanistic Interpretability

Every mechanistic claim rests on an implicit answer to a prior question: what kind of thing is a mechanism? A mechanistic view answers five questions: ontology, identity, evidence, formalism, and target. The five axes form a dependency chain. An atlas of nine views classifies common methods by their implicit commitments. Of 17 surveyed open problems, 8 dissolve once the view is specified and 7 arise structurally from ceilings the field’s default operating point cannot transcend.


Mechanistic Validity

Mechanistic Validity: A Framework for Evaluating Mechanistic Claims About Neural Networks

Five validity lenses (construct, internal, external, measurement, and interpretive) are applied to 13 published results. The consistent finding: most work skips construct validity entirely.


Mechanistic Reference

Mechanistic Reference: When Does a Mechanism Term Pick Out the Same Thing?

A transport hierarchy specifies what evidence is required for a mechanism term to refer across systems. Inferential reach is strictly ordered, and five reference failure modes are ranked by severity. The indirect object identification (IOI) circuit does not refer to the same object in GPT-2 and Pythia. Fifteen worked examples span interpretability, neuroscience, pharmacology, and genetics.


Mechanistic Knowledge

When Do Circuit Discoveries Compose Into Understanding?

Three gaps prevent composition of individual circuit discoveries into system-level understanding: no parcellation theory, no composition theory, no coverage metric. The IOI literature — ~20 published analyses with 78% head overlap — fails the composition criterion on all three dimensions.


Cross-View Invariance

Cross-View Invariance as a Realism Criterion for Mechanistic Interpretability

A mechanism claim’s invariance depth counts how many independent views support it. Validity is view-relative: the atomic unit is the triple (claim, view, evidence tier). The IOI literature provides quality-weighted coverage of ~5% of GPT-2 Small for a single task.
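The counting rule above can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: the `Assessment` record, the integer tier scale, the `min_tier` threshold, and the view names are all hypothetical, and the independence of views is assumed rather than checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assessment:
    """One atomic unit: a (claim, view, evidence tier) triple."""
    claim: str
    view: str
    evidence_tier: int  # hypothetical ordinal scale; higher means stronger evidence


def invariance_depth(assessments, claim, min_tier=1):
    """Count the distinct views that support `claim` at or above `min_tier`.

    Views are assumed to be independent; this sketch does not verify that.
    """
    supporting_views = {
        a.view
        for a in assessments
        if a.claim == claim and a.evidence_tier >= min_tier
    }
    return len(supporting_views)


# Made-up records for illustration only.
records = [
    Assessment("claim-A", "view-1", 2),
    Assessment("claim-A", "view-2", 1),
    Assessment("claim-A", "view-1", 1),  # same view again, so not counted twice
    Assessment("claim-A", "view-3", 0),  # below the default threshold
    Assessment("claim-B", "view-1", 2),
]

print(invariance_depth(records, "claim-A"))              # 2
print(invariance_depth(records, "claim-A", min_tier=0))  # 3
```

Keeping the triple as the stored unit means the same claim can carry different evidence tiers under different views, which is what makes validity view-relative in this sketch.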

Causal Geometry

The central claim: the causal structure of a trained system is encoded in the geometry of its internal representations. Subspaces live on a Grassmannian. The boundary between linear and nonlinear causal variables is sharp.
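To make "subspaces live on a Grassmannian" concrete, here is a minimal sketch of the standard way to compare two subspaces: principal angles computed from orthonormal bases, and the geodesic distance built from them. The function names are illustrative and not taken from the papers, and the inputs are assumed to have full column rank.

```python
import numpy as np


def principal_angles(A, B):
    """Principal angles (radians) between the column spans of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def grassmann_distance(A, B):
    """Geodesic distance on the Grassmannian: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(A, B)))


# Two planes in R^3 that share one direction and are orthogonal in the other.
plane_xy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
plane_xz = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])

# The same xy-plane expressed in a different (non-orthogonal) basis.
plane_xy_rebased = plane_xy @ np.array([[1.0, 1.0], [0.0, 1.0]])

d_different = grassmann_distance(plane_xy, plane_xz)
d_same = grassmann_distance(plane_xy, plane_xy_rebased)
```

The distance depends only on the subspaces, not on the bases chosen to describe them, which is the property that makes the Grassmannian the natural home for subspace-valued causal variables.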


Grokking and the Grassmannian Boundary

When Does Linear Causal Abstraction Work? Mapping the Boundary on the Grassmannian

Fourteen modular arithmetic operations partition into three classes: Always, Stochastic, and Never Grassmannian. Grokking determines which side you land on. Structured pi-SAE achieves IIA = 0.98 on IOI with cross-task transfer (IIA 0.82–0.96 on unseen templates).

Neuroscience

Population geometry and causal subspaces in brain-wide recordings. Standard similarity metrics give contradictory answers; geometric methods cut through the contradiction.


Bracket Norm

Bracket Norm Identifies Causally Important Brain Regions From Population Geometry

All 19 existing metrics collapse after controlling for neuron count. Bracket norm does not: BN/sqrt(n) is stable across a 25x range (CV = 2.0%). Top-3 and bottom-3 regions match optogenetic silencing (p = 0.0006). Validated on human ECoG.
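The stability figure is a coefficient of variation (standard deviation divided by mean) of the size-normalized metric. The sketch below shows only that bookkeeping. It does not implement the bracket norm, which is not defined on this page; the input values are synthetic, `n` is assumed to be each region's neuron count, and the use of the sample standard deviation is also an assumption.

```python
import math
import statistics


def size_normalized(metric_values, neuron_counts):
    """Divide each region's metric by sqrt(n), with n assumed to be its neuron count."""
    return [m / math.sqrt(n) for m, n in zip(metric_values, neuron_counts)]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


# Synthetic example: a metric that scales exactly as sqrt(n) over a 25x range of n.
neuron_counts = [40, 160, 1000]
raw_metric = [2.0 * math.sqrt(n) for n in neuron_counts]

cv_raw = coefficient_of_variation(raw_metric)
cv_normalized = coefficient_of_variation(size_normalized(raw_metric, neuron_counts))
```

In this synthetic case the raw metric varies widely with region size while the normalized one is constant, so `cv_normalized` is essentially zero. A small CV on real data indicates that the normalization has removed the dependence on neuron count.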


Neural Population Geometry

Neural Geometry Is Not Metric-Neutral: Dimensionality, Dissociation, and Causal Subspaces in Brain-Wide Neuropixels Recordings

CKA and Procrustes anti-correlate at rho = -0.90 (IBL replication: -0.94). Dimensionality mediates the dissociation. Structured VAE finds causal subspaces 3.4x stronger than LDA. LDA is anti-correlated with optogenetic importance (rho = -0.73).
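For readers unfamiliar with the two similarity measures being compared, here is a minimal NumPy sketch of linear CKA and a squared orthogonal Procrustes distance. The exact variants, preprocessing, and normalization used in the paper are not stated here, so these definitions are assumptions. Both functions expect matrices of shape (samples, features), and the Procrustes version additionally assumes equal feature counts.

```python
import numpy as np


def _center(X):
    """Subtract the per-feature mean."""
    return X - X.mean(axis=0, keepdims=True)


def linear_cka(X, Y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    X, Y = _center(X), _center(Y)
    numerator = np.linalg.norm(Y.T @ X, "fro") ** 2
    denominator = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return float(numerator / denominator)


def procrustes_sq_distance(X, Y):
    """Squared orthogonal Procrustes distance after centering and unit-norm scaling.

    min over orthogonal R of ||X R - Y||_F^2 = 2 - 2 * (nuclear norm of X^T Y).
    """
    X, Y = _center(X), _center(Y)
    X = X / np.linalg.norm(X, "fro")
    Y = Y / np.linalg.norm(Y, "fro")
    nuclear_norm = np.linalg.svd(X.T @ Y, compute_uv=False).sum()
    return float(max(0.0, 2.0 - 2.0 * nuclear_norm))


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
Y_rotated = X @ Q                              # same geometry, rotated axes
Z = rng.normal(size=(50, 5))                   # unrelated data

cka_rotated = linear_cka(X, Y_rotated)
proc_rotated = procrustes_sq_distance(X, Y_rotated)
cka_unrelated = linear_cka(X, Z)
proc_unrelated = procrustes_sq_distance(X, Z)
```

Both measures treat a pure rotation as identical geometry (CKA of 1, Procrustes distance of 0), so disagreement between them on real recordings has to come from how each one weights variance across dimensions.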

Clinical Epidemiology

Boundary conditions for geometric causal methods in clinical data — where they add signal over standard epidemiological tools and where they don’t.


Clinical Epidemiology

When Does Geometry Help Causal Inference? Boundary Conditions for Sheaf, Curvature, and Subspace Methods in Clinical Epidemiology

Berry phase = 1.85 (p < 0.001) reveals global inconsistency invisible to pairwise tests. H1 classifier: 14/17 Mendelian randomization pairs correct, zero false positives. Three-way DAG classification from ADNI longitudinal Alzheimer’s data. The boundary depends on dimensionality of confounding structure and nonlinearity of causal pathways.