Research Unit Tests
Structured quality checks for academic research papers — analogous to unit tests in software engineering.
Tests range from deterministic checks (does the replication package run?) to judgment calls (is the contribution interesting?). Each test specifies what to check, how an agent should reason about it, and what constitutes a pass.
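
To make the shape of a test concrete, here is a minimal sketch of how one row of the tables in this catalog could be represented in code. The names `ResearchTest`, `parse_row`, and `has_failed_blocker` are hypothetical and not part of this project. The rule that only a failed `blocker` stops a submission is an assumption inferred from the severity labels.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BLOCKER = "blocker"
    WARNING = "warning"


class Clarity(Enum):
    DETERMINISTIC = "deterministic"
    HEURISTIC = "heuristic"
    JUDGMENT = "judgment"


@dataclass(frozen=True)
class ResearchTest:
    """One catalog row (hypothetical representation)."""
    name: str
    severity: Severity
    clarity: Clarity
    scope: tuple[str, ...]  # e.g. ("paper", "proposal")


def parse_row(row: str) -> ResearchTest:
    """Parse a row of the form '| Test | Severity | Clarity | Scope |'."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    name, severity, clarity, scope = cells
    return ResearchTest(
        name=name,
        severity=Severity(severity),
        clarity=Clarity(clarity),
        scope=tuple(part.strip() for part in scope.split(",")),
    )


def has_failed_blocker(results: dict[ResearchTest, bool]) -> bool:
    """Assumed rule: a failed blocker stops the paper, a failed warning does not."""
    return any(
        test.severity is Severity.BLOCKER and not passed
        for test, passed in results.items()
    )
```

For example, `parse_row("| Main results accompanied by robustness checks | warning | heuristic | paper |")` yields a test whose failure would be reported but, under the assumed rule, would not block.
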
Universal (10)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| Abstract, introduction, and results internally consistent | blocker | heuristic | paper |
| Project is feasible given stated resources and timeline | blocker | judgment | proposal |
| Contribution is interesting to the target audience | blocker | judgment | paper, proposal |
| Contribution is new relative to existing literature | blocker | judgment | paper, proposal |
| Effect sizes reported with economic significance assessment | blocker | heuristic | paper |
| OLS/correlational papers address omitted variable bias | blocker | judgment | paper |
| Replication package reproduces all main results | blocker | deterministic | replication |
| Main results accompanied by robustness checks | warning | heuristic | paper |
| Standard errors clustered at the right level | blocker | heuristic | paper |
| Regression tables report number of observations | blocker | deterministic | paper |
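
As an illustration of what a deterministic check might look like, the sketch below scans the text of a regression table for a row that reports the number of observations. It is not this project's implementation: the function name `reports_observations` and the accepted row labels ("Observations", "Obs.", "N") are assumptions, and a real check would need to handle more table layouts.

```python
import re

# Assumed row labels: "Observations", "Obs.", or "N" at the start of a line,
# followed somewhere on the same line by at least one digit.
_OBS_ROW = re.compile(
    r"^\s*(observations|obs\.?|n)\b[^\d\n]*\d",
    re.IGNORECASE | re.MULTILINE,
)


def reports_observations(table_text: str) -> bool:
    """Return True if some row of the table appears to report a sample size."""
    return bool(_OBS_ROW.search(table_text))
```
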
Difference-in-Differences (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| DiD: Pre-trends visualization shown and plausible | blocker | heuristic | paper |
| DiD: Placebo/falsification test reported | warning | heuristic | paper |
| DiD: Staggered adoption uses heterogeneity-robust estimator | blocker | heuristic | paper |
Regression Discontinuity (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| RDD: Estimates robust to bandwidth choice | blocker | heuristic | paper |
| RDD: Pre-determined covariates smooth at cutoff | blocker | heuristic | paper |
| RDD: No manipulation of running variable (density test) | blocker | heuristic | paper |
Instrumental Variables (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| IV: Exclusion restriction explicitly argued | blocker | judgment | paper, proposal |
| IV: First-stage F-statistic reported and sufficient | blocker | deterministic | paper |
| IV: Reduced form reported alongside IV estimates | warning | deterministic | paper |
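
The first-stage test is labeled deterministic because the statistic can be recomputed. The sketch below shows one way to do so for the simplest case: a single instrument, no controls, and homoskedastic errors. The function names are hypothetical. The threshold of 10 is a common rule of thumb and an assumption here, since this catalog does not state what counts as "sufficient".

```python
def first_stage_f(x: list[float], z: list[float]) -> float:
    """F-statistic for H0: b = 0 in the first stage x = a + b*z + e.

    Assumes one instrument, no controls, and homoskedastic errors.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    s_zx = sum((zi - mean_z) * (xi - mean_x) for zi, xi in zip(z, x))
    b = s_zx / s_zz
    a = mean_x - b * mean_z
    # Residual sum of squares with and without the instrument.
    rss_unrestricted = sum((xi - a - b * zi) ** 2 for zi, xi in zip(z, x))
    rss_restricted = sum((xi - mean_x) ** 2 for xi in x)
    # One restriction, n - 2 residual degrees of freedom.
    return (rss_restricted - rss_unrestricted) / (rss_unrestricted / (n - 2))


def first_stage_is_sufficient(f_stat: float, threshold: float = 10.0) -> bool:
    """Assumed rule of thumb: treat the instrument as weak below the threshold."""
    return f_stat > threshold
```

With several instruments, controls, or clustered errors, the appropriate statistic differs, so this should be read as a starting point only.
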
Synthetic Control (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| Synth: Donor pool selection justified | blocker | judgment | paper |
| Synth: In-space and/or in-time placebo tests reported | blocker | heuristic | paper |
| Synth: Single-unit design limitations acknowledged | warning | heuristic | paper |
Lab & Online Experiments (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| Experiment: Attrition and differential attrition tested | blocker | heuristic | paper |
| Experiment: Baseline covariate balance table reported | blocker | deterministic | paper |
| Experiment: Power calculation reported or MDE stated | warning | heuristic | paper, proposal |
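
For the power test, a stated MDE can be sanity-checked against the textbook formula for a two-arm design with individual-level randomization. The sketch below assumes a two-sided test, equal outcome variance in both arms, and no clustering or covariate adjustment. The function name and parameter names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(
    sigma: float,
    n: int,
    treated_share: float = 0.5,
    alpha: float = 0.05,
    power: float = 0.8,
) -> float:
    """MDE = (z_{1-alpha/2} + z_{power}) * sigma * sqrt(1 / (P * (1 - P) * n)).

    sigma: outcome standard deviation; n: total sample size;
    treated_share: fraction P of units assigned to treatment.
    """
    standard_normal = NormalDist()
    multiplier = standard_normal.inv_cdf(1 - alpha / 2) + standard_normal.inv_cdf(power)
    return multiplier * sigma * sqrt(1.0 / (treated_share * (1 - treated_share) * n))
```

With `sigma=1.0` and `n=400` split evenly, the function returns roughly 0.28 standard deviations, so a paper claiming to be powered for much smaller effects on that sample would merit a closer look.
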
Field Experiments (4)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| Field experiment: Spillover effects addressed | blocker | judgment | paper |
| Experiment: Attrition and differential attrition tested | blocker | heuristic | paper |
| Experiment: Baseline covariate balance table reported | blocker | deterministic | paper |
| Experiment: Power calculation reported or MDE stated | warning | heuristic | paper, proposal |
Theory (3)
| Test | Severity | Clarity | Scope |
|---|---|---|---|
| Theory: All model assumptions stated explicitly | blocker | heuristic | paper |
| Theory: Economic intuition provided for main results | warning | judgment | paper |
| Theory: Main results formally proven, not just stated | blocker | deterministic | paper |