diff_diff.StackedDiD

class diff_diff.StackedDiD

Bases: object

Stacked Difference-in-Differences estimator.

Implements Wing, Freedman & Hollingsworth (2024). Builds a stacked dataset of sub-experiments (one per adoption cohort), applies corrective Q-weights to address implicit weighting bias in naive stacked regressions, and runs a weighted event-study regression.

Parameters:

kappa_pre (int, default=1) – Number of pre-treatment event-time periods in the event window. The event window spans [-kappa_pre, …, kappa_post].
kappa_post (int, default=1) – Number of post-treatment event-time periods.
weighting (str, default="aggregate") – Target estimand weighting scheme, per Table 1 of the paper:
- "aggregate": Equal weight per adoption event (trimmed aggregate ATT)
- "population": Weight by population size of treated cohort
- "sample_share": Weight by sample share of each sub-experiment
clean_control (str, default="not_yet_treated") – How to define clean controls for the sub-experiment with adoption period a, per Appendix A of the paper (A_s is the adoption period of unit s):
- "not_yet_treated": Units with A_s > a + kappa_post
- "strict": Units with A_s > a + kappa_post + kappa_pre
- "never_treated": Only units with A_s = infinity
cluster (str, default="unit") – Clustering level for standard errors:
- "unit": Cluster on the original unit identifier
- "unit_subexp": Cluster on (unit, sub_experiment) pairs
alpha (float, default=0.05) – Significance level for confidence intervals.
anticipation (int, default=0) – Number of anticipation periods. When anticipation > 0:
- The reference period shifts from e=-1 to e=-1-anticipation
- Post-treatment includes the anticipation periods (e >= -anticipation)
- The event window expands by anticipation pre-periods
This is consistent with ImputationDiD, TwoStageDiD, and SunAbraham.
rank_deficient_action (str, default="warn") – Action when the design matrix is rank-deficient:
- "warn": Issue a warning and drop linearly dependent columns
- "error": Raise ValueError
- "silent": Drop columns silently
vcov_type ({"classical","hc1","hc2","hc2_bm"}, default="hc1") –
Analytical variance family for the stacked WLS regression. StackedDiD is intrinsically clustered (cluster is required, no cluster=None opt-out), so one-way families that don’t compose with cluster_ids are rejected at __init__:
- "hc1" (default): CR1 Liang-Zeger cluster-robust on the Q-weighted design via solve_ols(weights=composed_weights, vcov_type="hc1"). Bit-equal to the prior bake-Q-into-X output up to float64 multiplication ordering at machine precision (HC1 WLS sandwich is algebraically invariant between the two forms). Matches clubSandwich::vcovCR(lm(weights=Q,...), cluster=~unit, type="CR1S") at atol=1e-10 (target is CR1S — Stata-style G/(G-1) * (n-1)/(n-p) finite-sample correction — NOT plain CR1 which omits the (n-1)/(n-p) factor and would diverge by ~1.4%).
- "hc2_bm": CR2 Bell-McCaffrey via solve_ols(weights=composed_weights, vcov_type="hc2_bm"), routed through the clubSandwich WLS-CR2 port (matches clubSandwich::vcovCR(lm(weights=Q,...), cluster=~unit, type="CR2") + coef_test()$df_Satt at atol=1e-10). See REGISTRY.md Phase 1a hc2_bm + weights row for the algebra (W not √W in hat matrix, W² in bias term, unweighted residuals in score).
- "classical" and "hc2" are REJECTED at __init__ with a cluster-incompatibility ValueError: StackedDiD requires a cluster structure, so one-way families don’t compose with the linalg validator. Use "hc1" or "hc2_bm".
- "conley" is REJECTED at __init__ for a methodology reason (NOT plumbing): the stacked design replicates units across sub-experiments, so Conley would see same-unit copies at distance 0; no conleyreg anchor; paper-gated. Tracked in TODO.md.
Survey-design precedence: when survey_design= is supplied to fit() with vcov_type != "hc1", a NotImplementedError is raised — the survey Taylor-series linearization (or replicate-weight refit) variance overrides the analytical sandwich. Use the default vcov_type="hc1" for survey designs.
balance ({"none", "entropy"}, default="none") – Within-sub-experiment covariate balancing (Covariate-Balanced Weighted Stacked DID; Ustyuzhanin 2026). With "entropy" and a fit(..., covariates=[...]) list, each clean-control group is reweighted by entropy balancing (Hainmueller 2012) so its covariate means match the treated cohort’s (measured at the last pre-treatment period), and the resulting design weights b_sa are composed with the Wing corrective weights via the effective control mass into the final stacked weights W_sa. This is control-only reweighting, so it preserves the trimmed-aggregate-ATT estimand (it changes only how untreated trends are estimated, not the treated-cohort weights); at b_sa=1 it reduces to the paper’s unit-count weighted stacked DID, equal to weighting="aggregate" on balanced event windows. v1 requires weighting="aggregate" and balanced event windows (ragged windows raise a ValueError), and does not support survey_design=; matching-based balancing and the repeated-treatment extension are out of scope. Default "none" reproduces plain weighted stacked DID.
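The event-window and clean-control rules described for kappa_pre, kappa_post, anticipation, and clean_control can be illustrated with a small standalone sketch. It does not call the library: the helper names event_window and is_clean_control are hypothetical, never-treated units are assumed to be coded as infinity, and the window start of -kappa_pre - anticipation is one reading of "expands by anticipation pre-periods".

```python
from math import inf


def event_window(kappa_pre, kappa_post, anticipation=0):
    """Return (event_times, reference_period, post_periods).

    Hypothetical helper: mirrors the parameter descriptions above,
    not the library's internal code.
    """
    if min(kappa_pre, kappa_post, anticipation) < 0:
        raise ValueError("kappa_pre, kappa_post, anticipation must be >= 0")
    # Assumed reading: anticipation adds extra pre-periods to the window.
    start = -kappa_pre - anticipation
    event_times = list(range(start, kappa_post + 1))
    # Reference period shifts from e=-1 to e=-1-anticipation.
    reference = -1 - anticipation
    # Post-treatment includes the anticipation periods.
    post = [e for e in event_times if e >= -anticipation]
    return event_times, reference, post


def is_clean_control(first_treat, a, kappa_pre, kappa_post,
                     rule="not_yet_treated"):
    """Check whether a unit adopting at `first_treat` (A_s) is a clean
    control for the sub-experiment with adoption period `a`.

    Assumption: never-treated units are coded as math.inf.
    """
    if rule == "not_yet_treated":
        return first_treat > a + kappa_post
    if rule == "strict":
        return first_treat > a + kappa_post + kappa_pre
    if rule == "never_treated":
        return first_treat == inf
    raise ValueError(f"unknown clean_control rule: {rule!r}")
```

For example, with a=5 and kappa_pre=kappa_post=2, a unit first treated in period 8 qualifies under "not_yet_treated" (8 > 7) but not under "strict" (8 is not greater than 9).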

results_

Estimation results after calling fit().

Type: StackedDiDResults

is_fitted_

Whether the model has been fitted.

Type: bool

Examples

Basic usage:

>>> from diff_diff import StackedDiD, generate_staggered_data
>>> data = generate_staggered_data(n_units=200, seed=42)
>>> est = StackedDiD(kappa_pre=2, kappa_post=2)
>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat')
>>> results.print_summary()

With event study:

>>> results = est.fit(data, outcome='outcome', unit='unit',
...                   time='period', first_treat='first_treat',
...                   aggregate='event_study')
>>> from diff_diff import plot_event_study
>>> plot_event_study(results)

Notes

The stacked estimator addresses TWFE bias by:
1. Creating one sub-experiment per adoption cohort with clean controls
2. Applying Q-weights to reweight the stacked regression
3. Running a single event-study WLS regression on the weighted stack
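The first two steps can be sketched in plain Python to make the mechanics concrete. This is an illustration under stated assumptions, not the library's implementation: build_stack and q_weights are hypothetical names, only the "not_yet_treated" control rule is shown, never-treated units are coded as infinity, cohorts whose event window is not fully observed are dropped, and the control weight (N_a^D / N^D) / (N_a^C / N^C) is the aggregate-estimand form as understood from the paper. The WLS regression of step 3 is omitted.

```python
from collections import Counter

INF = float("inf")


def build_stack(first_treat, periods, kappa_pre, kappa_post):
    """Step 1: one sub-experiment per adoption cohort.

    first_treat: dict unit -> adoption period (INF if never treated).
    periods: iterable of observed calendar periods.
    """
    t_min, t_max = min(periods), max(periods)
    cohorts = sorted({a for a in first_treat.values() if a != INF})
    rows = []
    for a in cohorts:
        # Assumed trimming: keep cohorts whose full window is observed.
        if a - kappa_pre < t_min or a + kappa_post > t_max:
            continue
        treated = [u for u, A in first_treat.items() if A == a]
        # "not_yet_treated" rule: A_s > a + kappa_post.
        controls = [u for u, A in first_treat.items() if A > a + kappa_post]
        if not controls:
            continue  # no clean controls, so no sub-experiment
        for u in treated + controls:
            for t in range(a - kappa_pre, a + kappa_post + 1):
                rows.append({"sub_exp": a, "unit": u, "period": t,
                             "event_time": t - a, "treated": u in treated})
    return rows


def q_weights(rows):
    """Step 2: control-row weight per sub-experiment (treated rows get 1.0).

    Assumed formula: (N_a^D / N^D) / (N_a^C / N^C), with unit counts
    taken over the stack.
    """
    pairs = {(r["sub_exp"], r["unit"], r["treated"]) for r in rows}
    n_treat = Counter(a for a, _, d in pairs if d)
    n_ctrl = Counter(a for a, _, d in pairs if not d)
    total_treat, total_ctrl = sum(n_treat.values()), sum(n_ctrl.values())
    return {a: (n_treat[a] / total_treat) / (n_ctrl[a] / total_ctrl)
            for a in n_treat}


# Toy panel: units 1-2 adopt in period 3, unit 3 in period 5,
# units 4-5 are never treated; periods 1..7 are observed.
first_treat = {1: 3, 2: 3, 3: 5, 4: INF, 5: INF}
stack = build_stack(first_treat, range(1, 8), kappa_pre=1, kappa_post=1)
weights = q_weights(stack)
```

In this toy panel, unit 3 appears twice: as a clean control in the period-3 sub-experiment and as the treated unit in the period-5 sub-experiment. After weighting, each sub-experiment's share of total control weight equals its share of treated units, which is the correction the Q-weights are meant to deliver.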

References

Wing, C., Freedman, S. M., & Hollingsworth, A. (2024). Stacked Difference-in-Differences. NBER Working Paper 32054.

Methods

`__init__`([kappa_pre, kappa_post, weighting, ...])
`fit`(data, outcome, unit, time, first_treat)	Fit the stacked DiD estimator.
`get_params`()	Get estimator parameters (sklearn-compatible).
`print_summary`()	Print summary to stdout.
`set_params`(**params)	Set estimator parameters (sklearn-compatible).
`summary`()	Get summary of estimation results.

__init__(kappa_pre=1, kappa_post=1, weighting='aggregate', clean_control='not_yet_treated', cluster='unit', alpha=0.05, anticipation=0, rank_deficient_action='warn', vcov_type='hc1', balance='none')

Parameters:

kappa_pre (int)
kappa_post (int)
weighting (str)
clean_control (str)
cluster (str)
alpha (float)
anticipation (int)
rank_deficient_action (str)
vcov_type (str)
balance (str)

classmethod __new__(*args, **kwargs)