Mechanistic Interpretability Benchmark

NeuroMIB: Benchmarking Causal Interpretability in Neural Dynamics

Official benchmark portal with EvalAI-hosted evaluation.

NeuroMIB evaluates whether interpretability methods recover causal latent variables, mechanism classes, computational support, and intervention effects from synthetic neural population dynamics whose ground truth is known by construction.

Why NeuroMIB

NeuroMIB checks whether explanation methods recover latent structure and mechanisms, not just predictive signals.

  • Mechanism-aware synthetic generators with hidden causal metadata
  • EvalAI-hosted public/private phases that guard against leaderboard overfitting
  • Intervention-heavy scoring that rewards causal validity over correlational fit
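The generator and intervention ideas above can be sketched in a few lines. Everything below is illustrative: the function name, the rotation mechanism, and the metadata fields are assumptions for exposition, not NeuroMIB's actual generator API.

```python
import random

def generate_trial(n_units=8, n_steps=50, seed=0, intervene_unit=None):
    """Toy neural population driven by a hidden two-dimensional latent.

    The hidden causal metadata (what a method must recover): which latent
    drives which unit, and the mechanism class ("rotation"). All names here
    are hypothetical, not NeuroMIB's actual schema.
    """
    rng = random.Random(seed)
    z = [1.0, 0.0]  # two latents rotating in a plane: the causal mechanism
    # Hidden support: each observed unit reads out exactly one latent.
    support = [rng.randrange(2) for _ in range(n_units)]
    trial = []
    for _ in range(n_steps):
        # Damped rotation of the latent state.
        z = [0.98 * z[0] - 0.15 * z[1], 0.15 * z[0] + 0.98 * z[1]]
        obs = [z[support[u]] + 0.05 * rng.gauss(0, 1) for u in range(n_units)]
        if intervene_unit is not None:
            obs[intervene_unit] = 0.0  # clamp: an intervention with a known effect
        trial.append(obs)
    metadata = {"mechanism": "rotation", "support": support}
    return trial, metadata
```

Because the generator keeps `metadata` alongside the trial, intervention effects can be scored against ground truth rather than against another model's predictions.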

Website Sections

  • Benchmark: tasks, metric weighting, and mechanism families
  • Data: modalities and instance/schema contract
  • Leaderboard: EvalAI-backed rankings with public/private phases
  • Docs: submission workflow, validation, and EvalAI deployment notes

Suggested flow: Benchmark -> Generate -> Submit -> Evaluate -> Compare.

Quick Start

1. Explore the Benchmark

Review tasks, mechanism families, and metric weighting before designing your interpretability method.

Open Benchmark

2. Prepare a Submission

Follow the schema and validation commands to ensure your artifacts are accepted.

Open Docs
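As a sketch of what a pre-submission check might look like, the snippet below verifies that a JSON artifact contains a set of required fields before upload. The field names and the `validate_submission` helper are hypothetical; the real schema contract and validation commands are in the Docs.

```python
import json

# Hypothetical submission fields; consult the Docs page for the real contract.
REQUIRED_KEYS = {"method_name", "latents", "mechanism",
                 "support", "intervention_effects"}

def validate_submission(path):
    """Lightweight pre-flight check on a submission artifact.

    Raises ValueError if any required field is missing, so problems
    surface locally rather than after an EvalAI upload.
    """
    with open(path) as f:
        sub = json.load(f)
    missing = REQUIRED_KEYS - sub.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return True
```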

3. Compare Performance

Track method quality across latent, mechanism, support, and intervention criteria.

Open Leaderboard
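A weighted combination like the one below is one plausible way the four criteria could be aggregated into a single leaderboard number; the weights and criterion names here are illustrative assumptions, not the benchmark's published metric weighting.

```python
# Illustrative weights only; the actual weighting is on the Benchmark page.
WEIGHTS = {"latent": 0.25, "mechanism": 0.25, "support": 0.2, "intervention": 0.3}

def overall_score(scores):
    """Combine per-criterion scores (each in [0, 1]) into one ranking score."""
    assert set(scores) == set(WEIGHTS), "exactly one score per criterion"
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
```

With weights summing to 1, a method that is perfect on every criterion scores 1.0, and intervention quality (weight 0.3 in this sketch) dominates any single other criterion.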