Mechanistic Interpretability Benchmark

NeuroMIB: Benchmarking Causal Interpretability in Neural Dynamics

Official benchmark portal with EvalAI-hosted evaluation.

NeuroMIB evaluates whether interpretability methods recover causal latent variables, mechanism classes, computational support, and intervention effects from synthetic neural population dynamics whose ground truth is known by construction.

Why NeuroMIB

NeuroMIB checks whether explanation methods recover latent structure and mechanisms, not just predictive signals.

  • Mechanism-aware synthetic generators with hidden causal metadata
  • EvalAI-hosted public/private phases that guard against leaderboard overfitting
  • Intervention-heavy scoring that rewards causal validity over correlational fit
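The generator and intervention ideas above can be sketched in a few lines. Everything below is illustrative: the function name, the rotation mechanism, and the metadata fields are assumptions for exposition, not NeuroMIB's actual generator API.

```python
import random

def generate_trial(n_units=8, n_steps=50, seed=0, intervene_unit=None):
    """Toy neural population driven by a hidden two-dimensional latent.

    The hidden causal metadata (what a method must recover): which latent
    drives which unit, and the mechanism class ("rotation"). All names here
    are hypothetical, not NeuroMIB's actual schema.
    """
    rng = random.Random(seed)
    z = [1.0, 0.0]  # two latents rotating in a plane: the causal mechanism
    # Hidden support: each observed unit reads out exactly one latent.
    support = [rng.randrange(2) for _ in range(n_units)]
    trial = []
    for _ in range(n_steps):
        # Damped rotation of the latent state.
        z = [0.98 * z[0] - 0.15 * z[1], 0.15 * z[0] + 0.98 * z[1]]
        obs = [z[support[u]] + 0.05 * rng.gauss(0, 1) for u in range(n_units)]
        if intervene_unit is not None:
            obs[intervene_unit] = 0.0  # clamp: an intervention with a known effect
        trial.append(obs)
    metadata = {"mechanism": "rotation", "support": support}
    return trial, metadata
```

Because the generator keeps `metadata` alongside the trial, intervention effects can be scored against ground truth rather than against another model's predictions.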

Website Sections

  • Benchmark: tasks, metric weighting, and mechanism families
  • Data: modalities and instance/schema contract
  • Leaderboard: EvalAI-backed rankings with public/private phases
  • Docs: submission workflow, validation, and EvalAI deployment notes

Suggested flow: Benchmark -> Generate -> Submit -> Evaluate -> Compare.

Quick Start

1. Explore the Benchmark

Review tasks, mechanism families, and metric weighting before designing your interpretability method.

Open Benchmark

2. Prepare a Submission

Follow the schema and validation commands to ensure your artifacts are accepted.

Open Docs
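As a sketch of what a pre-submission check might look like, the snippet below verifies that a JSON artifact contains a set of required fields before upload. The field names and the `validate_submission` helper are hypothetical; the real schema contract and validation commands are in the Docs.

```python
import json

# Hypothetical submission fields; consult the Docs page for the real contract.
REQUIRED_KEYS = {"method_name", "latents", "mechanism",
                 "support", "intervention_effects"}

def validate_submission(path):
    """Lightweight pre-flight check on a submission artifact.

    Raises ValueError if any required field is missing, so problems
    surface locally rather than after an EvalAI upload.
    """
    with open(path) as f:
        sub = json.load(f)
    missing = REQUIRED_KEYS - sub.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return True
```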

3. Compare Performance

Track method quality across latent, mechanism, support, and intervention criteria.

Open Leaderboard
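A weighted combination like the one below is one plausible way the four criteria could be aggregated into a single leaderboard number; the weights and criterion names here are illustrative assumptions, not the benchmark's published metric weighting.

```python
# Illustrative weights only; the actual weighting is on the Benchmark page.
WEIGHTS = {"latent": 0.25, "mechanism": 0.25, "support": 0.2, "intervention": 0.3}

def overall_score(scores):
    """Combine per-criterion scores (each in [0, 1]) into one ranking score."""
    assert set(scores) == set(WEIGHTS), "exactly one score per criterion"
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)
```

With weights summing to 1, a method that is perfect on every criterion scores 1.0, and intervention quality (weight 0.3 in this sketch) dominates any single other criterion.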