diff_diff.TwoStageDiD

class diff_diff.TwoStageDiD

Bases: TwoStageDiDBootstrapMixin

Gardner (2022) two-stage Difference-in-Differences estimator.

This estimator addresses TWFE bias under heterogeneous treatment effects in three steps:

1. Estimate unit and time fixed effects (FEs) on untreated observations only.
2. Residualize ALL outcomes using the estimated FEs.
3. Regress the residualized outcomes on treatment indicators.
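
The three steps can be sketched with NumPy alone. This is a minimal illustration of the point estimate, not the library's implementation: the helper name `two_stage_att`, the encoding of never-treated units as `first_treat = inf`, and the intercept-free second stage are assumptions made for the example, and no standard errors are computed.

```python
import numpy as np

def two_stage_att(y, unit, time, first_treat):
    """Hypothetical helper: two-stage DiD point estimate (no standard errors)."""
    # Assumption: never-treated units carry first_treat = inf.
    treated = time >= first_treat
    _, u_idx = np.unique(unit, return_inverse=True)
    _, t_idx = np.unique(time, return_inverse=True)
    n, n_u, n_t = len(y), u_idx.max() + 1, t_idx.max() + 1

    # Dummy design: every unit dummy plus time dummies for all but the first period.
    X = np.zeros((n, n_u + n_t - 1))
    X[np.arange(n), u_idx] = 1.0
    later = np.flatnonzero(t_idx > 0)
    X[later, n_u + t_idx[later] - 1] = 1.0

    # Stage 1: estimate unit + time FEs on untreated observations only.
    beta, *_ = np.linalg.lstsq(X[~treated], y[~treated], rcond=None)
    # Residualize ALL outcomes with the estimated FEs.
    resid = y - X @ beta
    # Stage 2: regress residuals on the treatment indicator (assumed: no intercept).
    d = treated.astype(float)
    return float(d @ resid / (d @ d))

# Noise-free toy panel: 4 units x 5 periods, constant treatment effect of 2.0.
unit = np.repeat(np.arange(4), 5)
time = np.tile(np.arange(5), 4)
first_treat = np.repeat([np.inf, np.inf, 2.0, 3.0], 5)
unit_fe = np.repeat([1.0, -0.5, 0.3, 2.0], 5)
time_fe = np.tile([0.0, 0.4, 0.1, 0.9, 1.5], 4)
y = unit_fe + time_fe + 2.0 * (time >= first_treat)
att = two_stage_att(y, unit, time, first_treat)
```

Because the toy outcome has no noise, the first stage recovers the fixed effects exactly and the second-stage coefficient equals the true effect up to floating-point error.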

Point estimates are identical to ImputationDiD (Borusyak et al. 2024). The key difference is the variance estimator: TwoStageDiD uses a GMM sandwich variance that accounts for first-stage estimation uncertainty, while ImputationDiD uses the conservative variance from Theorem 3.

Parameters:

anticipation (int, default=0) – Number of periods before treatment where effects may occur.
alpha (float, default=0.05) – Significance level for confidence intervals.
cluster (str, optional) – Column name for cluster-robust standard errors. If None, clusters at the unit level by default.
n_bootstrap (int, default=0) – Number of bootstrap iterations. If 0, uses analytical GMM sandwich inference.
bootstrap_weights (str, default="rademacher") – Type of bootstrap weights: “rademacher”, “mammen”, or “webb”.
seed (int, optional) – Random seed for reproducibility.
rank_deficient_action (str, default="warn") – Action when the design matrix is rank-deficient:
- “warn”: Issue a warning and drop linearly dependent columns
- “error”: Raise ValueError
- “silent”: Drop columns silently
horizon_max (int, optional) – Maximum event-study horizon. If set, event study effects are only computed for abs(h) <= horizon_max.
pretrends (bool, default=False) – If True, event study includes pre-treatment horizons for visual pre-trends assessment. Pre-period effects should be ~0 under parallel trends. Only affects event_study aggregation; overall ATT and group aggregation are unchanged.
vcov_type (str, default="hc1") – Variance estimator family. Permanently restricted to "hc1", the Gardner (2022) two-stage GMM cluster-sandwich. The analytical-sandwich families "classical", "hc2", and "hc2_bm", as well as "conley", are rejected at __init__ / fit() because the GMM-corrected meat folds first-stage estimation uncertainty into the score, leaving no single hat matrix on which hat-matrix leverage or the Bell-McCaffrey Satterthwaite DOF can be defined. Use cluster=<col> to select the cluster level; cluster=None (the default) clusters at the unit level, so the summary renders the unit-cluster CR1 label.
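
For orientation on the bootstrap_weights options, the three named distributions can be sketched as follows. The support points and probabilities are the standard textbook definitions of Rademacher, Mammen two-point, and Webb six-point weights; the helper `draw_weights` is hypothetical, and how the library draws or applies its weights internally is not shown in this page.

```python
import numpy as np

def draw_weights(kind, n, rng):
    """Hypothetical helper: draw n multiplier weights with mean 0 and variance 1."""
    if kind == "rademacher":
        # +1 or -1 with equal probability.
        values, probs = np.array([-1.0, 1.0]), np.array([0.5, 0.5])
    elif kind == "mammen":
        # Mammen two-point distribution (also matches a third moment of 1).
        s5 = np.sqrt(5.0)
        values = np.array([-(s5 - 1) / 2, (s5 + 1) / 2])
        probs = np.array([(s5 + 1) / (2 * s5), (s5 - 1) / (2 * s5)])
    elif kind == "webb":
        # Webb six-point distribution: +/- sqrt(3/2), +/- 1, +/- sqrt(1/2).
        base = np.sqrt(np.array([1.5, 1.0, 0.5]))
        values = np.concatenate([-base, base])
        probs = np.full(6, 1 / 6)
    else:
        raise ValueError(f"unknown weight type: {kind!r}")
    return rng.choice(values, size=n, p=probs)

rng = np.random.default_rng(0)
w_rademacher = draw_weights("rademacher", 1000, rng)
w_webb = draw_weights("webb", 1000, rng)
w_mammen = draw_weights("mammen", 200_000, rng)
```

Webb weights are usually suggested when the number of clusters is small, since six support points yield more distinct bootstrap draws than the two Rademacher points.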

results_

Estimation results after calling fit().

Type: TwoStageDiDResults

is_fitted_

Whether the model has been fitted.

Type: bool

Examples

Basic usage:

>>> from diff_diff import TwoStageDiD, generate_staggered_data
>>> data = generate_staggered_data(n_units=200, seed=42)
>>> est = TwoStageDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat')
>>> results.print_summary()

With event study:

>>> est = TwoStageDiD()
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat',
...                   aggregate='event_study')
>>> from diff_diff import plot_event_study
>>> plot_event_study(results)
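
To make the roles of horizon_max and pretrends concrete, the sketch below lists which event-study horizons h would be retained under the rules stated in the parameter descriptions (abs(h) <= horizon_max; pre-treatment horizons only when pretrends=True). The helper `event_horizons`, the definition h = time - first_treat, and the encoding of never-treated units as `first_treat = inf` are assumptions for illustration; the library may handle reference periods differently.

```python
import numpy as np

def event_horizons(time, first_treat, horizon_max=None, pretrends=False):
    """Hypothetical helper: sorted horizons h = time - first_treat kept for an event study."""
    h = time - first_treat
    # Assumption: never-treated units have first_treat = inf and thus no finite horizon.
    h = h[np.isfinite(h)]
    if not pretrends:
        h = h[h >= 0]                      # post-treatment horizons only
    if horizon_max is not None:
        h = h[np.abs(h) <= horizon_max]    # symmetric cap on |h|
    return np.unique(h).astype(int)

# Toy panel: 3 units x 6 periods; one never-treated, two treated at periods 2 and 4.
time = np.tile(np.arange(6), 3)
first_treat = np.repeat([np.inf, 2.0, 4.0], 6)
post_only = event_horizons(time, first_treat)
with_pre = event_horizons(time, first_treat, horizon_max=2, pretrends=True)
```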

Notes

The two-stage estimator uses ALL untreated observations (never-treated + not-yet-treated periods of eventually-treated units) to estimate the counterfactual model.
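
The set of untreated observations described above can be written as a boolean mask. This is a sketch under stated assumptions: the helper `untreated_mask` is hypothetical, never-treated units are encoded as `first_treat = inf`, and anticipation is assumed to shift the treatment onset earlier by that many periods, which the source does not spell out.

```python
import numpy as np

def untreated_mask(time, first_treat, anticipation=0):
    """Hypothetical helper: True where an observation counts as untreated."""
    # Never-treated rows (first_treat = inf) are always True;
    # eventually-treated rows are True only before their (shifted) onset.
    return time < first_treat - anticipation

# Toy panel: 3 units x 5 periods; one never-treated, two treated at periods 2 and 3.
time = np.tile(np.arange(5), 3)
first_treat = np.repeat([np.inf, 2.0, 3.0], 5)
n_untreated = int(untreated_mask(time, first_treat).sum())
n_untreated_anticip = int(untreated_mask(time, first_treat, anticipation=1).sum())
```

With anticipation=1, each eventually-treated unit contributes one fewer period to the first stage, so the counterfactual model is fit on a smaller sample.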

References

Gardner, J. (2022). Two-stage differences in differences. arXiv:2207.05943.
Butts, K. & Gardner, J. (2022). did2s: Two-Stage Difference-in-Differences. R Journal, 14(1), 162-173.

Methods

`__init__`([anticipation, alpha, cluster, ...])
`fit`(data, outcome, unit, time, first_treat)	Fit the two-stage DiD estimator.
`get_params`()	Get estimator parameters (sklearn-compatible).
`print_summary`()	Print summary to stdout.
`set_params`(**params)	Set estimator parameters (sklearn-compatible).
`summary`()	Get summary of estimation results.

Attributes

`n_bootstrap`
`bootstrap_weights`
`alpha`
`seed`
`horizon_max`
`pretrends`

__init__(anticipation=0, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn', horizon_max=None, pretrends=False, vcov_type='hc1')

Parameters:

anticipation (int)
alpha (float)
cluster (str | None)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
rank_deficient_action (str)
horizon_max (int | None)
pretrends (bool)
vcov_type (str)

classmethod __new__(*args, **kwargs)