Imputation DiD (Borusyak et al. 2024)

Efficient imputation estimator for staggered Difference-in-Differences.

This module implements the methodology from Borusyak, Jaravel & Spiess (2024), “Revisiting Event-Study Designs: Robust and Efficient Estimation”, Review of Economic Studies.

The estimator:

  1. Runs OLS on untreated observations to estimate unit + time fixed effects

  2. Imputes counterfactual Y(0) for treated observations

  3. Aggregates imputed treatment effects with researcher-chosen weights

Inference uses the conservative clustered variance estimator from Theorem 3.
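
The three steps above can be sketched on a toy panel without the library. This is a minimal illustration, not the module's implementation: the panel, the unit and period labels, and the constant true effect are all made up, and the fixed effects are fitted by alternating group means (which converge to the two-way OLS fit) rather than by a matrix solver.

```python
from statistics import mean

# Hypothetical toy panel: three units, four periods, no noise.
first_treat = {"A": None, "B": 2, "C": 3}      # None marks a never-treated unit
unit_fe = {"A": 1.0, "B": 2.0, "C": 3.0}
time_fe = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.5}
TRUE_EFFECT = 2.0

def is_treated(unit, t):
    g = first_treat[unit]
    return g is not None and t >= g

y = {(u, t): unit_fe[u] + time_fe[t] + (TRUE_EFFECT if is_treated(u, t) else 0.0)
     for u in unit_fe for t in time_fe}

# Step 1: fit unit + time effects on untreated observations only.
# Alternating group means converge to the OLS fit of the two-way model.
untreated = [k for k in y if not is_treated(*k)]
alpha = dict.fromkeys(unit_fe, 0.0)
beta = dict.fromkeys(time_fe, 0.0)
for _ in range(200):
    for u in alpha:
        alpha[u] = mean(y[i, t] - beta[t] for i, t in untreated if i == u)
    for t in beta:
        beta[t] = mean(y[i, s] - alpha[i] for i, s in untreated if s == t)

# Step 2: impute Y(0) for treated observations and take the difference.
tau_hat = {(u, t): y[u, t] - (alpha[u] + beta[t])
           for (u, t) in y if is_treated(u, t)}

# Step 3: aggregate, here with equal weights (the overall ATT).
att = mean(tau_hat.values())
print(round(att, 6))
```

Because the toy data are noiseless and satisfy parallel trends exactly, the imputed effects recover the constant effect built into the panel.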

When to use ImputationDiD:

  • Staggered adoption settings where treatment effects are plausibly homogeneous across cohorts and time; under homogeneity it produces ~50% shorter CIs than Callaway-Sant’Anna

  • When you want to use all untreated observations (never-treated + not-yet-treated) for maximum efficiency

  • As a complement to Callaway-Sant’Anna or Sun-Abraham: if all three agree, results are robust; if they disagree, investigate heterogeneity

Reference: Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285.

ImputationDiD

Main estimator class for imputation DiD estimation.

class diff_diff.ImputationDiD[source]

Bases: ImputationDiDBootstrapMixin

Borusyak-Jaravel-Spiess (2024) imputation DiD estimator.

This is the efficient estimator for staggered Difference-in-Differences under parallel trends. It produces shorter confidence intervals than Callaway-Sant’Anna (~50% shorter) and Sun-Abraham (2-3.5x shorter) under homogeneous treatment effects.

The estimation procedure:

  1. Run OLS on untreated observations to estimate unit + time fixed effects

  2. Impute counterfactual Y(0) for treated observations

  3. Aggregate imputed treatment effects with researcher-chosen weights

Inference uses the conservative clustered variance estimator from Theorem 3 of the paper.

Parameters:
  • anticipation (int, default=0) – Number of periods before treatment where effects may occur.

  • alpha (float, default=0.05) – Significance level for confidence intervals.

  • cluster (str, optional) – Column name for cluster-robust standard errors. If None, clusters at the unit level by default.

  • n_bootstrap (int, default=0) – Number of bootstrap iterations. If 0, uses analytical inference (conservative variance from Theorem 3).

  • bootstrap_weights (str, default="rademacher") – Type of bootstrap weights: “rademacher”, “mammen”, or “webb”.

  • seed (int, optional) – Random seed for reproducibility.

  • rank_deficient_action (str, default="warn") – Action when design matrix is rank-deficient:

      - “warn”: Issue warning and drop linearly dependent columns
      - “error”: Raise ValueError
      - “silent”: Drop columns silently

  • horizon_max (int, optional) – Maximum event-study horizon. If set, event study effects are only computed for abs(h) <= horizon_max.

  • aux_partition (str, default="cohort_horizon") – Controls the auxiliary model partition for Theorem 3 variance:

      - “cohort_horizon”: Groups by cohort x relative time (tightest SEs)
      - “cohort”: Groups by cohort only (more conservative)
      - “horizon”: Groups by relative time only (more conservative)

  • pretrends (bool, default=False) – If True, event study includes pre-treatment horizons for visual pre-trends assessment. Pre-period effects should be ~0 under parallel trends. Only affects event_study aggregation; overall ATT and group aggregation are unchanged.
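
To make the three aux_partition options concrete, the sketch below shows only how treated observations would fall into groups under each choice. The observations and the helper names (partition_key, group_obs) are hypothetical, and the code does not reproduce how the library fits the auxiliary model or computes the Theorem 3 variance.

```python
from collections import defaultdict

# Hypothetical treated observations: (unit, cohort g, period t).
treated_obs = [
    ("u1", 2, 2), ("u1", 2, 3),
    ("u2", 2, 2), ("u2", 2, 3),
    ("u3", 3, 3),
]

def partition_key(cohort, period, aux_partition):
    horizon = period - cohort          # relative time since first treatment
    if aux_partition == "cohort_horizon":
        return (cohort, horizon)
    if aux_partition == "cohort":
        return cohort
    if aux_partition == "horizon":
        return horizon
    raise ValueError(f"unknown aux_partition: {aux_partition!r}")

def group_obs(aux_partition):
    groups = defaultdict(list)
    for unit, cohort, period in treated_obs:
        key = partition_key(cohort, period, aux_partition)
        groups[key].append((unit, period))
    return dict(groups)

for option in ("cohort_horizon", "cohort", "horizon"):
    print(option, group_obs(option))
```

The finest partition ("cohort_horizon") yields the most groups, which is consistent with the description of it giving the tightest standard errors.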

results_

Estimation results after calling fit().

Type:

ImputationDiDResults

is_fitted_

Whether the model has been fitted.

Type:

bool

Examples

Basic usage:

>>> from diff_diff import ImputationDiD, generate_staggered_data
>>> data = generate_staggered_data(n_units=200, seed=42)
>>> est = ImputationDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat')
>>> results.print_summary()

With event study:

>>> est = ImputationDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat',
...                   aggregate='event_study')
>>> from diff_diff import plot_event_study
>>> plot_event_study(results)

Notes

The imputation estimator uses ALL untreated observations (never-treated + not-yet-treated periods of eventually-treated units) to estimate the counterfactual model. There is no control_group parameter because this is fundamental to the method’s efficiency.
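
As a rough illustration of which observations count as untreated, the sketch below splits a hypothetical panel into the untreated set (Ω0) and the treated set (Ω1). The unit names and periods are invented, never-treated units are coded as 0 or infinity following the first_treat convention of fit(), and anticipation is assumed to be 0.

```python
import math

# Hypothetical first-treatment periods; 0 or math.inf marks never-treated.
first_treat = {"u1": 0, "u2": math.inf, "u3": 3, "u4": 5}
periods = range(1, 7)                  # periods 1..6

def never_treated(g):
    return g == 0 or math.isinf(g)

omega_0, omega_1 = [], []              # untreated / treated (unit, period) pairs
for unit, g in first_treat.items():
    for t in periods:
        if never_treated(g) or t < g:
            omega_0.append((unit, t))  # never-treated or not-yet-treated
        else:
            omega_1.append((unit, t))

n_control_units = len({u for u, _ in omega_0})
print(len(omega_0), len(omega_1), n_control_units)
```

Eventually-treated units contribute their pre-treatment periods to Ω0, so every unit in this toy panel counts toward the units contributing to Ω0.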

References

Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285.

Methods

fit(data, outcome, unit, time, first_treat)

Fit the imputation DiD estimator.

get_params()

Get estimator parameters (sklearn-compatible).

set_params(**params)

Set estimator parameters (sklearn-compatible).

__init__(anticipation=0, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn', horizon_max=None, aux_partition='cohort_horizon', pretrends=False)[source]
Parameters:
  • anticipation (int)

  • alpha (float)

  • cluster (str | None)

  • n_bootstrap (int)

  • bootstrap_weights (str)

  • seed (int | None)

  • rank_deficient_action (str)

  • horizon_max (int | None)

  • aux_partition (str)

  • pretrends (bool)

anticipation: int
alpha: float
n_bootstrap: int
bootstrap_weights: str
seed: int | None
horizon_max: int | None
results_: ImputationDiDResults | None
fit(data, outcome, unit, time, first_treat, covariates=None, aggregate=None, balance_e=None, survey_design=None)[source]

Fit the imputation DiD estimator.

Parameters:
  • data (pd.DataFrame) – Panel data with unit and time identifiers.

  • outcome (str) – Name of outcome variable column.

  • unit (str) – Name of unit identifier column.

  • time (str) – Name of time period column.

  • first_treat (str) – Name of column indicating when unit was first treated. Use 0 (or np.inf) for never-treated units.

  • covariates (list of str, optional) – List of covariate column names.

  • aggregate (str, optional) – Aggregation mode: None/“simple” (overall ATT only), “event_study”, “group”, or “all”.

  • balance_e (int, optional) – When computing event study, restrict to cohorts observed at all relative times in [-balance_e, max_h].

  • survey_design (SurveyDesign, optional) – Survey design specification for design-based inference. Supports pweight only (aweight/fweight raise ValueError). Supports strata, PSU, and FPC for design-based variance via compute_survey_if_variance(). Strata enters survey df for t-distribution inference. Both analytical (n_bootstrap=0) and bootstrap inference are supported.

Returns:

Object containing all estimation results.

Return type:

ImputationDiDResults

Raises:

ValueError – If required columns are missing or data validation fails.

get_params()[source]

Get estimator parameters (sklearn-compatible).

Return type:

Dict[str, Any]

set_params(**params)[source]

Set estimator parameters (sklearn-compatible).

Return type:

ImputationDiD

summary()[source]

Get summary of estimation results.

Return type:

str

print_summary()[source]

Print summary to stdout.

Return type:

None

ImputationDiDResults

Results container for imputation DiD estimation.

class diff_diff.ImputationDiDResults[source]

Bases: object

Results from Borusyak-Jaravel-Spiess (2024) imputation DiD estimation.

treatment_effects

Observation-level (unit-period) treatment effects with columns: unit, time, tau_hat, weight.

Type:

pd.DataFrame

overall_att

Overall average treatment effect on the treated.

Type:

float

overall_se

Standard error of overall ATT.

Type:

float

overall_t_stat

T-statistic for overall ATT.

Type:

float

overall_p_value

P-value for overall ATT.

Type:

float

overall_conf_int

Confidence interval for overall ATT.

Type:

tuple
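
The relationship between the point estimate, standard error, t-statistic, p-value, and confidence interval can be sketched as follows. This assumes a standard normal reference distribution, which is an assumption made here for simplicity; the library may use a t-distribution instead (for example with survey degrees of freedom). The function name and the input values are hypothetical.

```python
from statistics import NormalDist

def normal_inference(att, se, alpha=0.05):
    """t-statistic, two-sided p-value, and (1 - alpha) CI under a normal approximation."""
    z = NormalDist()
    t_stat = att / se
    p_value = 2 * (1 - z.cdf(abs(t_stat)))
    crit = z.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    return t_stat, p_value, (att - crit * se, att + crit * se)

# Hypothetical estimate and standard error, for illustration only.
t_stat, p_value, (lo, hi) = normal_inference(att=2.0, se=0.5)
print(f"t = {t_stat:.2f}, p = {p_value:.5f}, CI = [{lo:.3f}, {hi:.3f}]")
```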

event_study_effects

Dictionary mapping relative time h to effect dict with keys: ‘effect’, ‘se’, ‘t_stat’, ‘p_value’, ‘conf_int’, ‘n_obs’.

Type:

dict, optional
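
A minimal sketch of how observation-level effects could be grouped by relative time h = t - g to form an event-study dictionary. The rows and effect values are invented, equal weights are assumed within each horizon, and only the 'effect' and 'n_obs' keys are filled in; standard errors and p-values are outside the scope of this sketch.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observation-level effects: (unit, cohort g, period t, tau_hat).
rows = [
    ("u1", 2, 2, 1.0), ("u1", 2, 3, 2.0),
    ("u2", 3, 3, 3.0), ("u2", 3, 4, 4.0),
]

by_horizon = defaultdict(list)
for unit, cohort, period, tau in rows:
    by_horizon[period - cohort].append(tau)   # h = t - g

# Equal-weight average within each horizon.
event_study = {h: {"effect": mean(taus), "n_obs": len(taus)}
               for h, taus in sorted(by_horizon.items())}
print(event_study)
```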

group_effects

Dictionary mapping cohort g to effect dict.

Type:

dict, optional

groups

List of treatment cohorts.

Type:

list

time_periods

List of all time periods.

Type:

list

n_obs

Total number of observations.

Type:

int

n_treated_obs

Number of treated observations (\(|\Omega_1|\)).

Type:

int

n_untreated_obs

Number of untreated observations (\(|\Omega_0|\)).

Type:

int

n_treated_units

Number of ever-treated units.

Type:

int

n_control_units

Number of units contributing to Omega_0.

Type:

int

alpha

Significance level used.

Type:

float

pretrend_results

Populated by pretrend_test().

Type:

dict, optional

bootstrap_results

Bootstrap inference results.

Type:

ImputationBootstrapResults, optional

Methods

summary([alpha])

Generate formatted summary of estimation results.

print_summary([alpha])

Print summary to stdout.

to_dataframe([level])

Convert results to DataFrame.

pretrend_test([n_leads])

Run a pre-trend test (Equation 9 of Borusyak et al. 2024).

treatment_effects: DataFrame
overall_att: float
overall_se: float
overall_t_stat: float
overall_p_value: float
overall_conf_int: Tuple[float, float]
event_study_effects: Dict[int, Dict[str, Any]] | None
group_effects: Dict[Any, Dict[str, Any]] | None
groups: List[Any]
time_periods: List[Any]
n_obs: int
n_treated_obs: int
n_untreated_obs: int
n_treated_units: int
n_control_units: int
alpha: float = 0.05
anticipation: int = 0
pretrend_results: Dict[str, Any] | None = None
bootstrap_results: ImputationBootstrapResults | None = None
survey_metadata: Any | None = None
property att: float
property se: float
property conf_int: Tuple[float, float]
property p_value: float
property t_stat: float
__repr__()[source]

Concise string representation.

Return type:

str

property coef_var: float

Coefficient of variation: SE / abs(overall ATT). NaN when ATT is 0 or SE is non-finite.

Type:

float
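
The rule stated for coef_var can be written as a small standalone function. This is a sketch of the documented behavior, not the property's source code.

```python
import math

def coef_var(att, se):
    """SE / |ATT|; NaN when ATT is 0 or SE is non-finite."""
    if att == 0 or not math.isfinite(se):
        return math.nan
    return se / abs(att)

print(coef_var(-2.0, 0.5))   # the sign of the ATT does not matter
```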

summary(alpha=None)[source]

Generate formatted summary of estimation results.

Parameters:

alpha (float, optional) – Significance level. Defaults to alpha used in estimation.

Returns:

Formatted summary.

Return type:

str

print_summary(alpha=None)[source]

Print summary to stdout.

Parameters:

alpha (float | None)

Return type:

None

to_dataframe(level='observation')[source]

Convert results to DataFrame.

Parameters:

level (str, default="observation") – Level of aggregation:

  - “observation”: Observation-level (unit-period) treatment effects
  - “event_study”: Event study effects by relative time
  - “group”: Group (cohort) effects

Returns:

Results as DataFrame.

Return type:

pd.DataFrame

pretrend_test(n_leads=None)[source]

Run a pre-trend test (Equation 9 of Borusyak et al. 2024).

Adds pre-treatment lead indicators to the Step 1 OLS and tests their joint significance via a Wald F-test (cluster-robust, or design-based survey VCV when survey_design was provided at fit).
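
For orientation, a joint Wald F-test of this kind generally takes the textbook form below, where \(\hat{\gamma}\) is the vector of \(q\) estimated lead coefficients and \(\hat{V}_{\gamma}\) is their estimated (cluster-robust or design-based) covariance matrix. The notation is introduced here for illustration, and the exact degrees-of-freedom convention used by the library is not shown.

```latex
\[
H_0 : \gamma = 0, \qquad
W = \hat{\gamma}^{\top} \hat{V}_{\gamma}^{-1} \hat{\gamma}, \qquad
F = \frac{W}{q}
\]
```

A large F (small p-value) indicates that the lead coefficients are jointly different from zero, which is evidence against parallel pre-trends.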

Parameters:

n_leads (int, optional) – Number of pre-treatment leads to include. If None, uses all available pre-treatment periods minus one (for the reference period).

Returns:

Dictionary with keys: ‘f_stat’, ‘p_value’, ‘df’, ‘n_leads’, ‘lead_coefficients’.

Return type:

dict

property is_significant: bool

Check if overall ATT is significant.

property significance_stars: str

Significance stars for overall ATT.

__init__(treatment_effects, overall_att, overall_se, overall_t_stat, overall_p_value, overall_conf_int, event_study_effects, group_effects, groups, time_periods, n_obs, n_treated_obs, n_untreated_obs, n_treated_units, n_control_units, alpha=0.05, anticipation=0, pretrend_results=None, bootstrap_results=None, _estimator_ref=None, survey_metadata=None)
Return type:

None

ImputationBootstrapResults

Bootstrap inference results.

class diff_diff.ImputationBootstrapResults[source]

Bases: object

Results from ImputationDiD bootstrap inference.

Bootstrap is a library extension beyond Borusyak et al. (2024), which proposes only analytical inference via the conservative variance estimator. Provided for consistency with CallawaySantAnna and SunAbraham.

n_bootstrap

Number of bootstrap iterations.

Type:

int

weight_type

Type of bootstrap weights: “rademacher”, “mammen”, or “webb”.

Type:

str
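
The three weight types are standard multiplier-bootstrap distributions, each with mean 0 and variance 1. The sketch below writes out their support points and probabilities and checks those two moments. The helper names are hypothetical, and drawing one weight per cluster is the usual convention rather than a statement about how the library applies the weights.

```python
import math
import random

SQRT5 = math.sqrt(5)

# (value, probability) pairs for each multiplier-weight distribution.
WEIGHT_DISTRIBUTIONS = {
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    "mammen": [
        (-(SQRT5 - 1) / 2, (SQRT5 + 1) / (2 * SQRT5)),
        ((SQRT5 + 1) / 2, (SQRT5 - 1) / (2 * SQRT5)),
    ],
    # Webb six-point distribution: +/- sqrt(1/2), +/- 1, +/- sqrt(3/2).
    "webb": [(s * math.sqrt(k / 2), 1 / 6) for s in (-1, 1) for k in (1, 2, 3)],
}

def draw_weights(kind, n_clusters, rng):
    """Draw one multiplier weight per cluster."""
    values, probs = zip(*WEIGHT_DISTRIBUTIONS[kind])
    return rng.choices(values, weights=probs, k=n_clusters)

def moment(kind, power):
    """Exact raw moment of the chosen weight distribution."""
    return sum(p * v ** power for v, p in WEIGHT_DISTRIBUTIONS[kind])

rng = random.Random(42)
for kind in WEIGHT_DISTRIBUTIONS:
    weights = draw_weights(kind, 5, rng)
    print(kind, [round(w, 3) for w in weights])
```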

alpha

Significance level used for confidence intervals.

Type:

float

overall_att_se

Bootstrap standard error for overall ATT.

Type:

float

overall_att_ci

Bootstrap confidence interval for overall ATT.

Type:

tuple

overall_att_p_value

Bootstrap p-value for overall ATT.

Type:

float

event_study_ses

Bootstrap SEs for event study effects.

Type:

dict, optional

event_study_cis

Bootstrap CIs for event study effects.

Type:

dict, optional

event_study_p_values

Bootstrap p-values for event study effects.

Type:

dict, optional

group_ses

Bootstrap SEs for group effects.

Type:

dict, optional

group_cis

Bootstrap CIs for group effects.

Type:

dict, optional

group_p_values

Bootstrap p-values for group effects.

Type:

dict, optional

bootstrap_distribution

Full bootstrap distribution of overall ATT.

Type:

np.ndarray, optional

n_bootstrap: int
weight_type: str
alpha: float
overall_att_se: float
overall_att_ci: Tuple[float, float]
overall_att_p_value: float
event_study_ses: Dict[int, float] | None = None
event_study_cis: Dict[int, Tuple[float, float]] | None = None
event_study_p_values: Dict[int, float] | None = None
group_ses: Dict[Any, float] | None = None
group_cis: Dict[Any, Tuple[float, float]] | None = None
group_p_values: Dict[Any, float] | None = None
bootstrap_distribution: ndarray | None = None
__init__(n_bootstrap, weight_type, alpha, overall_att_se, overall_att_ci, overall_att_p_value, event_study_ses=None, event_study_cis=None, event_study_p_values=None, group_ses=None, group_cis=None, group_p_values=None, bootstrap_distribution=None)
Return type:

None

Convenience Function

diff_diff.imputation_did(data, outcome, unit, time, first_treat, covariates=None, aggregate=None, balance_e=None, survey_design=None, **kwargs)[source]

Convenience function for imputation DiD estimation.

This is a shortcut for creating an ImputationDiD estimator and calling fit().

Parameters:
  • data (pd.DataFrame) – Panel data.

  • outcome (str) – Outcome variable column name.

  • unit (str) – Unit identifier column name.

  • time (str) – Time period column name.

  • first_treat (str) – Column indicating first treatment period (0 for never-treated).

  • covariates (list of str, optional) – Covariate column names.

  • aggregate (str, optional) – Aggregation mode: None, “simple”, “event_study”, “group”, “all”.

  • balance_e (int, optional) – Balance event study to cohorts observed at all relative times.

  • survey_design (SurveyDesign, optional) – Survey design specification for design-based inference. Supports pweight only (aweight/fweight raise ValueError). Supports strata, PSU, and FPC for design-based variance. Strata enters survey df for t-distribution inference. Both analytical (n_bootstrap=0) and bootstrap inference are supported.

  • **kwargs – Additional keyword arguments passed to ImputationDiD constructor.

Returns:

Estimation results.

Return type:

ImputationDiDResults

Examples

>>> from diff_diff import imputation_did, generate_staggered_data
>>> data = generate_staggered_data(seed=42)
>>> results = imputation_did(data, 'outcome', 'unit', 'period', 'first_treat',
...                          aggregate='event_study')
>>> results.print_summary()

Example Usage

Basic usage:

from diff_diff import ImputationDiD, generate_staggered_data

data = generate_staggered_data(n_units=200, seed=42)
est = ImputationDiD()
results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat')
results.print_summary()

Event study with visualization:

from diff_diff import ImputationDiD, plot_event_study

est = ImputationDiD()
results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat',
                  aggregate='event_study')
plot_event_study(results)

Pre-trend test:

results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat')
pt = results.pretrend_test(n_leads=3)
print(f"F-stat: {pt['f_stat']:.3f}, p-value: {pt['p_value']:.4f}")

Comparison with other estimators:

from diff_diff import ImputationDiD, CallawaySantAnna, SunAbraham

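# Shared column arguments (assumes all three fit() methods accept them)
cols = dict(outcome='outcome', unit='unit',
            time='period', first_treat='first_treat')

# All three should agree under homogeneous effects
imp = ImputationDiD().fit(data, **cols)
cs = CallawaySantAnna().fit(data, **cols)
sa = SunAbraham().fit(data, **cols)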

print(f"Imputation ATT: {imp.overall_att:.3f} (SE: {imp.overall_se:.3f})")
print(f"CS ATT: {cs.overall_att:.3f} (SE: {cs.overall_se:.3f})")
print(f"SA ATT: {sa.overall_att:.3f} (SE: {sa.overall_se:.3f})")