Imputation DiD (Borusyak et al. 2024)

Efficient imputation estimator for staggered Difference-in-Differences.

This module implements the methodology from Borusyak, Jaravel & Spiess (2024), “Revisiting Event-Study Designs: Robust and Efficient Estimation”, Review of Economic Studies.

The estimator:

1. Runs OLS on untreated observations to estimate unit + time fixed effects
2. Imputes counterfactual Y(0) for treated observations
3. Aggregates imputed treatment effects with researcher-chosen weights

Inference uses the conservative clustered variance estimator from Theorem 3.
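The three steps can be sketched on a toy panel. This is a minimal, dependency-free illustration under stated assumptions, not the library's implementation: the data are hypothetical and noise-free, the fixed effects are fitted by alternating unit and period means (which converges to the least-squares fit on this connected panel) rather than by a direct OLS solve, and the aggregation uses equal weights.

```python
# Hypothetical noise-free panel; first_treat == 0 marks never-treated units.
first_treat = {0: 3, 1: 4, 2: 0, 3: 0}
periods = [1, 2, 3, 4, 5]
unit_fe = {0: 1.0, 1: -0.5, 2: 0.3, 3: 2.0}
time_fe = {1: 0.0, 2: 0.4, 3: 0.1, 4: 0.9, 5: 1.5}
TRUE_EFFECT = 2.0


def is_treated(unit, period):
    """A cell is treated once the unit's first treatment period is reached."""
    g = first_treat[unit]
    return g > 0 and period >= g


y = {
    (i, t): unit_fe[i] + time_fe[t] + (TRUE_EFFECT if is_treated(i, t) else 0.0)
    for i in first_treat
    for t in periods
}
untreated = [cell for cell in y if not is_treated(*cell)]
treated = [cell for cell in y if is_treated(*cell)]

# Step 1: fit unit + period effects using untreated cells only.
cells_by_unit = {i: [c for c in untreated if c[0] == i] for i in first_treat}
cells_by_period = {t: [c for c in untreated if c[1] == t] for t in periods}
unit_hat = dict.fromkeys(first_treat, 0.0)
period_hat = dict.fromkeys(periods, 0.0)
for _ in range(500):
    for i, cells in cells_by_unit.items():
        unit_hat[i] = sum(y[c] - period_hat[c[1]] for c in cells) / len(cells)
    for t, cells in cells_by_period.items():
        period_hat[t] = sum(y[c] - unit_hat[c[0]] for c in cells) / len(cells)

# Step 2: impute Y(0) for treated cells and form cell-level effects.
tau_hat = {c: y[c] - (unit_hat[c[0]] + period_hat[c[1]]) for c in treated}

# Step 3: aggregate with equal weights (one possible researcher-chosen weighting).
att = sum(tau_hat.values()) / len(tau_hat)
print(round(att, 6))
```

Because the toy data contain no noise, the imputed effects recover TRUE_EFFECT up to numerical tolerance; with real data each tau_hat is a noisy estimate and only the weighted aggregates are meant to be interpreted.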

When to use ImputationDiD:

Staggered adoption settings where treatment effects may be homogeneous across cohorts and time; in that case the estimator produces ~50% shorter CIs than Callaway-Sant’Anna
When you want to use all untreated observations (never-treated + not-yet-treated) for maximum efficiency
As a complement to Callaway-Sant’Anna or Sun-Abraham: if all three agree, results are robust; if they disagree, investigate heterogeneity

Reference: Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285.

ImputationDiD

Main estimator class for imputation DiD estimation.

class diff_diff.ImputationDiD[source]

Bases: ImputationDiDBootstrapMixin

Borusyak-Jaravel-Spiess (2024) imputation DiD estimator.

This is the efficient estimator for staggered Difference-in-Differences under parallel trends. It produces shorter confidence intervals than Callaway-Sant’Anna (~50% shorter) and Sun-Abraham (2-3.5x shorter) under homogeneous treatment effects.

The estimation procedure:

1. Run OLS on untreated observations to estimate unit + time fixed effects
2. Impute counterfactual Y(0) for treated observations
3. Aggregate imputed treatment effects with researcher-chosen weights

Inference uses the conservative clustered variance estimator from Theorem 3 of the paper.

Parameters:

anticipation (int, default=0) – Number of periods before treatment where effects may occur.
alpha (float, default=0.05) – Significance level for confidence intervals.
cluster (str, optional) – Column name for cluster-robust standard errors. If None, clusters at the unit level by default.
n_bootstrap (int, default=0) – Number of bootstrap iterations. If 0, uses analytical inference (conservative variance from Theorem 3).
bootstrap_weights (str, default="rademacher") – Type of bootstrap weights: “rademacher”, “mammen”, or “webb”.
seed (int, optional) – Random seed for reproducibility.
rank_deficient_action (str, default="warn") – Action when the design matrix is rank-deficient:
- “warn”: Issue a warning and drop linearly dependent columns
- “error”: Raise ValueError
- “silent”: Drop columns silently
horizon_max (int, optional) – Maximum event-study horizon. If set, event study effects are only computed for abs(h) <= horizon_max.
aux_partition (str, default="cohort_horizon") – Controls the auxiliary model partition for the Theorem 3 variance:
- “cohort_horizon”: Groups by cohort x relative time (tightest SEs)
- “cohort”: Groups by cohort only (more conservative)
- “horizon”: Groups by relative time only (more conservative)
pretrends (bool, default=False) – If True, event study includes pre-treatment horizons for visual pre-trends assessment. Pre-period effects should be ~0 under parallel trends. Only affects event_study aggregation; overall ATT and group aggregation are unchanged.
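The three bootstrap_weights options name standard multiplier-bootstrap weight distributions, each with mean 0 and variance 1. The sketch below writes out their textbook definitions and draws one weight per cluster. The names WEIGHT_SCHEMES, draw_cluster_weights, and moments are hypothetical helpers for illustration; how the library generates and applies its weights internally is not shown here and may differ.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# (support points, probabilities) for each named scheme.
WEIGHT_SCHEMES = {
    # Rademacher: +1 or -1 with equal probability.
    "rademacher": ([-1.0, 1.0], [0.5, 0.5]),
    # Mammen: two-point distribution with mean 0, variance 1, third moment 1.
    "mammen": (
        [-(SQRT5 - 1) / 2, (SQRT5 + 1) / 2],
        [(SQRT5 + 1) / (2 * SQRT5), (SQRT5 - 1) / (2 * SQRT5)],
    ),
    # Webb: six-point distribution, often suggested when clusters are few.
    "webb": (
        [-math.sqrt(1.5), -1.0, -math.sqrt(0.5),
         math.sqrt(0.5), 1.0, math.sqrt(1.5)],
        [1 / 6] * 6,
    ),
}


def draw_cluster_weights(n_clusters, weight_type="rademacher", seed=None):
    """Draw one multiplier weight per cluster from the named scheme."""
    rng = random.Random(seed)
    points, probs = WEIGHT_SCHEMES[weight_type]
    return rng.choices(points, weights=probs, k=n_clusters)


def moments(weight_type):
    """Exact mean and variance implied by the support and probabilities."""
    points, probs = WEIGHT_SCHEMES[weight_type]
    mean = sum(p * x for x, p in zip(points, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(points, probs))
    return mean, var


for name in WEIGHT_SCHEMES:
    print(name, draw_cluster_weights(4, name, seed=0))
```

Passing the same seed reproduces the same draws, which mirrors the role of the seed parameter above.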

results_

Estimation results after calling fit().

Type:: ImputationDiDResults

is_fitted_

Whether the model has been fitted.

Type:: bool

Examples

Basic usage:

>>> from diff_diff import ImputationDiD, generate_staggered_data
>>> data = generate_staggered_data(n_units=200, seed=42)
>>> est = ImputationDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat')
>>> results.print_summary()

With event study:

>>> est = ImputationDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat',
...                   aggregate='event_study')
>>> from diff_diff import plot_event_study
>>> plot_event_study(results)

Notes

The imputation estimator uses ALL untreated observations (never-treated + not-yet-treated periods of eventually-treated units) to estimate the counterfactual model. There is no control_group parameter because this is fundamental to the method’s efficiency.
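To make the note concrete, the following sketch splits observations into the untreated set (never-treated units plus not-yet-treated periods) and the treated set. It assumes that never-treated units are coded 0 or infinity in first_treat, and that anticipation moves the effective treatment start earlier by that many periods; the helper names are hypothetical and the library's internal logic may differ.

```python
import math


def is_never_treated(first_treat):
    """Never-treated units are coded 0 or infinity in first_treat."""
    return first_treat == 0 or math.isinf(first_treat)


def split_observations(rows, anticipation=0):
    """Split (unit, time, first_treat) rows into untreated and treated cells."""
    omega_0, omega_1 = [], []
    for unit, time, first_treat in rows:
        # Assumption: periods within `anticipation` of treatment count as treated.
        if is_never_treated(first_treat) or time < first_treat - anticipation:
            omega_0.append((unit, time))
        else:
            omega_1.append((unit, time))
    return omega_0, omega_1


# Hypothetical panel: unit "a" is treated from period 3; "b" and "c" never are.
cohorts = [("a", 3), ("b", 0), ("c", math.inf)]
rows = [(u, t, g) for u, g in cohorts for t in (1, 2, 3, 4)]

omega_0, omega_1 = split_observations(rows)
print(len(omega_0), len(omega_1))
```

Unit "a" contributes its two pre-treatment periods to the untreated set, which is what distinguishes this design from one that uses only never-treated units as controls.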

References

Borusyak, K., Jaravel, X., & Spiess, J. (2024). Revisiting Event-Study Designs: Robust and Efficient Estimation. Review of Economic Studies, 91(6), 3253-3285.

Methods

`fit`(data, outcome, unit, time, first_treat)	Fit the imputation DiD estimator.
`get_params`()	Get estimator parameters (sklearn-compatible).
`set_params`(**params)	Set estimator parameters (sklearn-compatible).

__init__(anticipation=0, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn', horizon_max=None, aux_partition='cohort_horizon', pretrends=False)[source]

Parameters:

anticipation (int)
alpha (float)
cluster (str | None)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
rank_deficient_action (str)
horizon_max (int | None)
aux_partition (str)
pretrends (bool)

anticipation: int

alpha: float

n_bootstrap: int

bootstrap_weights: str

seed: int | None

horizon_max: int | None

results_: ImputationDiDResults | None

fit(data, outcome, unit, time, first_treat, covariates=None, aggregate=None, balance_e=None, survey_design=None)[source]

Fit the imputation DiD estimator.

Parameters:

data (pd.DataFrame) – Panel data with unit and time identifiers.
outcome (str) – Name of outcome variable column.
unit (str) – Name of unit identifier column.
time (str) – Name of time period column.
first_treat (str) – Name of column indicating when unit was first treated. Use 0 (or np.inf) for never-treated units.
covariates (list of str, optional) – List of covariate column names.
aggregate (str, optional) – Aggregation mode: None or “simple” (overall ATT only), “event_study”, “group”, or “all”.
balance_e (int, optional) – When computing event study, restrict to cohorts observed at all relative times in [-balance_e, max_h].
survey_design (SurveyDesign, optional) – Survey design specification for design-based inference. Only pweight is supported (aweight/fweight raise ValueError). Strata, PSU, and FPC are supported for design-based variance via compute_survey_if_variance(). Strata enter the survey degrees of freedom used for t-distribution inference. Both analytical (n_bootstrap=0) and bootstrap inference are supported.

Returns:

Object containing all estimation results.

Return type:

ImputationDiDResults

Raises:

ValueError – If required columns are missing or data validation fails.
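As an illustration of what aggregate="event_study" computes conceptually, the sketch below averages observation-level effects by relative time h = time - first_treat and optionally applies the abs(h) <= horizon_max filter described for the constructor. The effect values are hypothetical, equal weighting within each horizon is an assumption, and event_study_means is not a library function.

```python
from collections import defaultdict


def event_study_means(effects, first_treat, horizon_max=None):
    """Average (unit, time, tau_hat) rows by relative time h = time - first_treat."""
    by_horizon = defaultdict(list)
    for unit, time, tau_hat in effects:
        h = time - first_treat[unit]
        if horizon_max is not None and abs(h) > horizon_max:
            continue  # outside the requested event-study window
        by_horizon[h].append(tau_hat)
    return {h: sum(vals) / len(vals) for h, vals in sorted(by_horizon.items())}


# Hypothetical imputed effects for two cohorts (first treated in periods 3 and 4).
first_treat = {"a": 3, "b": 4}
effects = [
    ("a", 3, 1.0), ("a", 4, 2.0), ("a", 5, 3.0),
    ("b", 4, 1.5), ("b", 5, 2.5),
]

full = event_study_means(effects, first_treat)
windowed = event_study_means(effects, first_treat, horizon_max=1)
print(full)
print(windowed)
```

Note that horizon 2 is identified by cohort "a" only, which is the kind of composition change that balance_e is meant to guard against.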

get_params()[source]

Get estimator parameters (sklearn-compatible).

Return type:: Dict[str, Any]

set_params(**params)[source]

Set estimator parameters (sklearn-compatible).

Return type:: ImputationDiD

summary()[source]

Get summary of estimation results.

Return type:: str

print_summary()[source]

Print summary to stdout.

Return type:: None

ImputationDiDResults

Results container for imputation DiD estimation.

class diff_diff.ImputationDiDResults[source]

Bases: object

Results from Borusyak-Jaravel-Spiess (2024) imputation DiD estimation.

treatment_effects

Observation-level (unit-by-period) treatment effects with columns: unit, time, tau_hat, weight.

Type:: pd.DataFrame

overall_att

Overall average treatment effect on the treated.

Type:: float

overall_se

Standard error of overall ATT.

Type:: float

overall_t_stat

T-statistic for overall ATT.

Type:: float

overall_p_value

P-value for overall ATT.

Type:: float

overall_conf_int

Confidence interval for overall ATT.

Type:: tuple

event_study_effects

Dictionary mapping relative time h to effect dict with keys: ‘effect’, ‘se’, ‘t_stat’, ‘p_value’, ‘conf_int’, ‘n_obs’.

Type:: dict, optional

group_effects

Dictionary mapping cohort g to effect dict.

Type:: dict, optional

groups

List of treatment cohorts.

Type:: list

time_periods

List of all time periods.

Type:: list

n_obs

Total number of observations.

Type:: int

n_treated_obs

Number of treated observations (\(|\Omega_1|\)).

Type:: int

n_untreated_obs

Number of untreated observations (\(|\Omega_0|\)).

Type:: int

n_treated_units

Number of ever-treated units.

Type:: int

n_control_units

Number of units contributing to \(\Omega_0\).

Type:: int

alpha

Significance level used.

Type:: float

pretrend_results

Populated by pretrend_test().

Type:: dict, optional

bootstrap_results

Bootstrap inference results.

Type:: ImputationBootstrapResults, optional
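The inference attributes overall_t_stat, overall_p_value, and overall_conf_int are related to overall_att and overall_se in the usual way. The sketch below shows that relationship under a normal approximation, which is an assumption made here for simplicity: the library may use a t-distribution instead (for example with survey degrees of freedom), so its numbers can differ slightly. The function normal_inference is a hypothetical helper.

```python
from statistics import NormalDist


def normal_inference(att, se, alpha=0.05):
    """Return (t_stat, two-sided p_value, conf_int) under a normal approximation."""
    std_normal = NormalDist()
    t_stat = att / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(t_stat)))
    crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    conf_int = (att - crit * se, att + crit * se)
    return t_stat, p_value, conf_int


# Hypothetical point estimate and standard error.
t_stat, p_value, conf_int = normal_inference(att=1.0, se=0.5)
print(t_stat, round(p_value, 4), tuple(round(b, 3) for b in conf_int))
```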

Methods

`summary`([alpha])	Generate formatted summary of estimation results.
`print_summary`([alpha])	Print summary to stdout.
`to_dataframe`([level])	Convert results to DataFrame.
`pretrend_test`([n_leads])	Run a pre-trend test (Equation 9 of Borusyak et al. 2024).

treatment_effects: DataFrame

overall_att: float

overall_se: float

overall_t_stat: float

overall_p_value: float

overall_conf_int: Tuple[float, float]

event_study_effects: Dict[int, Dict[str, Any]] | None

group_effects: Dict[Any, Dict[str, Any]] | None

groups: List[Any]

time_periods: List[Any]

n_obs: int

n_treated_obs: int

n_untreated_obs: int

n_treated_units: int

n_control_units: int

alpha: float = 0.05

anticipation: int = 0

pretrend_results: Dict[str, Any] | None = None

bootstrap_results: ImputationBootstrapResults | None = None

survey_metadata: Any | None = None

property att: float

property se: float

property conf_int: Tuple[float, float]

property p_value: float

property t_stat: float

__repr__()[source]

Concise string representation.

Return type:: str

property coef_var: float

Coefficient of variation: SE / abs(overall ATT). NaN when the ATT is 0 or the SE is non-finite.

Type:: float
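The definition of coef_var can be written out as a small standalone function. This is a sketch of the stated rule only (SE divided by the absolute ATT, with NaN for a zero ATT or a non-finite SE), not the property's actual source.

```python
import math


def coef_var(att, se):
    """SE / |ATT|; NaN when the ATT is zero or the SE is not finite."""
    if att == 0 or not math.isfinite(se):
        return math.nan
    return se / abs(att)


print(coef_var(-2.0, 0.5))
print(coef_var(0.0, 0.5))
```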

summary(alpha=None)[source]

Generate formatted summary of estimation results.

Parameters:: alpha (float, optional) – Significance level. Defaults to alpha used in estimation.
Returns:: Formatted summary.
Return type:: str

print_summary(alpha=None)[source]

Print summary to stdout.

Parameters:: alpha (float | None)
Return type:: None

to_dataframe(level='observation')[source]

Convert results to DataFrame.

Parameters:: level (str, default="observation") – Level of aggregation:
- “observation”: Unit-level treatment effects
- “event_study”: Event study effects by relative time
- “group”: Group (cohort) effects
Returns:: Results as DataFrame.
Return type:: pd.DataFrame

pretrend_test(n_leads=None)[source]

Run a pre-trend test (Equation 9 of Borusyak et al. 2024).

Adds pre-treatment lead indicators to the Step 1 OLS and tests their joint significance via a Wald F-test (cluster-robust, or design-based survey VCV when survey_design was provided at fit).

Parameters:: n_leads (int, optional) – Number of pre-treatment leads to include. If None, uses all available pre-treatment periods minus one (for the reference period).
Returns:: Dictionary with keys: ‘f_stat’, ‘p_value’, ‘df’, ‘n_leads’, ‘lead_coefficients’.
Return type:: dict
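The joint test can be sketched as a Wald statistic on the lead coefficients, F = b' V^{-1} b / q, where b holds the q lead coefficients and V is their variance-covariance matrix. The example below computes only the statistic, using a small hand-written linear solver so that it has no dependencies; the coefficient and covariance values are hypothetical, and the helper names are not part of the library. A p-value would additionally require the F distribution's survival function (for example scipy.stats.f.sf).

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def wald_f_stat(coefs, vcov):
    """Wald F statistic for H0: all lead coefficients are zero."""
    q = len(coefs)
    v_inv_b = solve(vcov, coefs)
    wald = sum(b * w for b, w in zip(coefs, v_inv_b))
    return wald / q


# Hypothetical lead coefficients and their (cluster-robust) covariance matrix.
lead_coefs = [0.1, -0.2]
lead_vcov = [[0.04, 0.0], [0.0, 0.01]]
f_stat = wald_f_stat(lead_coefs, lead_vcov)
print(round(f_stat, 3))
```

With a diagonal covariance matrix the statistic reduces to the average of the squared t-ratios of the individual leads, which is a useful sanity check.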

property is_significant: bool: Check if overall ATT is significant.

property significance_stars: str: Significance stars for overall ATT.

__init__(treatment_effects, overall_att, overall_se, overall_t_stat, overall_p_value, overall_conf_int, event_study_effects, group_effects, groups, time_periods, n_obs, n_treated_obs, n_untreated_obs, n_treated_units, n_control_units, alpha=0.05, anticipation=0, pretrend_results=None, bootstrap_results=None, _estimator_ref=None, survey_metadata=None)

Parameters:

treatment_effects (DataFrame)
overall_att (float)
overall_se (float)
overall_t_stat (float)
overall_p_value (float)
overall_conf_int (Tuple[float, float])
event_study_effects (Dict[int, Dict[str, Any]] | None)
group_effects (Dict[Any, Dict[str, Any]] | None)
groups (List[Any])
time_periods (List[Any])
n_obs (int)
n_treated_obs (int)
n_untreated_obs (int)
n_treated_units (int)
n_control_units (int)
alpha (float)
anticipation (int)
pretrend_results (Dict[str, Any] | None)
bootstrap_results (ImputationBootstrapResults | None)
_estimator_ref (Any | None)
survey_metadata (Any | None)

Return type:

None

ImputationBootstrapResults

Bootstrap inference results.

class diff_diff.ImputationBootstrapResults[source]

Bases: object

Results from ImputationDiD bootstrap inference.

Bootstrap is a library extension beyond Borusyak et al. (2024), which proposes only analytical inference via the conservative variance estimator. Provided for consistency with CallawaySantAnna and SunAbraham.

n_bootstrap

Number of bootstrap iterations.

Type:: int

weight_type

Type of bootstrap weights: “rademacher”, “mammen”, or “webb”.

Type:: str

alpha

Significance level used for confidence intervals.

Type:: float

overall_att_se

Bootstrap standard error for overall ATT.

Type:: float

overall_att_ci

Bootstrap confidence interval for overall ATT.

Type:: tuple

overall_att_p_value

Bootstrap p-value for overall ATT.

Type:: float

event_study_ses

Bootstrap SEs for event study effects.

Type:: dict, optional

event_study_cis

Bootstrap CIs for event study effects.

Type:: dict, optional

event_study_p_values

Bootstrap p-values for event study effects.

Type:: dict, optional

group_ses

Bootstrap SEs for group effects.

Type:: dict, optional

group_cis

Bootstrap CIs for group effects.

Type:: dict, optional

group_p_values

Bootstrap p-values for group effects.

Type:: dict, optional

bootstrap_distribution

Full bootstrap distribution of overall ATT.

Type:: np.ndarray, optional

n_bootstrap: int

weight_type: str

alpha: float

overall_att_se: float

overall_att_ci: Tuple[float, float]

overall_att_p_value: float

event_study_ses: Dict[int, float] | None = None

event_study_cis: Dict[int, Tuple[float, float]] | None = None

event_study_p_values: Dict[int, float] | None = None

group_ses: Dict[Any, float] | None = None

group_cis: Dict[Any, Tuple[float, float]] | None = None

group_p_values: Dict[Any, float] | None = None

bootstrap_distribution: ndarray | None = None

__init__(n_bootstrap, weight_type, alpha, overall_att_se, overall_att_ci, overall_att_p_value, event_study_ses=None, event_study_cis=None, event_study_p_values=None, group_ses=None, group_cis=None, group_p_values=None, bootstrap_distribution=None)

Parameters:

n_bootstrap (int)
weight_type (str)
alpha (float)
overall_att_se (float)
overall_att_ci (Tuple[float, float])
overall_att_p_value (float)
event_study_ses (Dict[int, float] | None)
event_study_cis (Dict[int, Tuple[float, float]] | None)
event_study_p_values (Dict[int, float] | None)
group_ses (Dict[Any, float] | None)
group_cis (Dict[Any, Tuple[float, float]] | None)
group_p_values (Dict[Any, float] | None)
bootstrap_distribution (ndarray | None)

Return type:

None

Convenience Function

diff_diff.imputation_did(data, outcome, unit, time, first_treat, covariates=None, aggregate=None, balance_e=None, survey_design=None, **kwargs)[source]

Convenience function for imputation DiD estimation.

This is a shortcut for creating an ImputationDiD estimator and calling fit().

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column name.
unit (str) – Unit identifier column name.
time (str) – Time period column name.
first_treat (str) – Column indicating first treatment period (0 for never-treated).
covariates (list of str, optional) – Covariate column names.
aggregate (str, optional) – Aggregation mode: None, “simple”, “event_study”, “group”, “all”.
balance_e (int, optional) – Balance event study to cohorts observed at all relative times.
survey_design (SurveyDesign, optional) – Survey design specification for design-based inference. Only pweight is supported (aweight/fweight raise ValueError). Strata, PSU, and FPC are supported for design-based variance. Strata enter the survey degrees of freedom used for t-distribution inference. Both analytical (n_bootstrap=0) and bootstrap inference are supported.
**kwargs – Additional keyword arguments passed to ImputationDiD constructor.

Returns:

Estimation results.

Return type:

ImputationDiDResults

Examples

>>> from diff_diff import imputation_did, generate_staggered_data
>>> data = generate_staggered_data(seed=42)
>>> results = imputation_did(data, 'outcome', 'unit', 'period', 'first_treat',
...                          aggregate='event_study')
>>> results.print_summary()

Example Usage

Basic usage:

from diff_diff import ImputationDiD, generate_staggered_data

data = generate_staggered_data(n_units=200, seed=42)
est = ImputationDiD()
results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat')
results.print_summary()

Event study with visualization:

from diff_diff import ImputationDiD, plot_event_study

est = ImputationDiD()
results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat',
                  aggregate='event_study')
plot_event_study(results)

Pre-trend test:

results = est.fit(data, outcome='outcome', unit='unit',
                  time='period', first_treat='first_treat')
pt = results.pretrend_test(n_leads=3)
print(f"F-stat: {pt['f_stat']:.3f}, p-value: {pt['p_value']:.4f}")

Comparison with other estimators:

from diff_diff import ImputationDiD, CallawaySantAnna, SunAbraham

# All three should agree under homogeneous effects
imp = ImputationDiD().fit(data, ...)
cs = CallawaySantAnna().fit(data, ...)
sa = SunAbraham().fit(data, ...)

print(f"Imputation ATT: {imp.overall_att:.3f} (SE: {imp.overall_se:.3f})")
print(f"CS ATT: {cs.overall_att:.3f} (SE: {cs.overall_se:.3f})")
print(f"SA ATT: {sa.overall_att:.3f} (SE: {sa.overall_se:.3f})")