diff_diff.WooldridgeDiD
- class diff_diff.WooldridgeDiD
Bases: object
Extended Two-Way Fixed Effects (ETWFE) DiD estimator.
Implements the Wooldridge (2025) saturated cohort×time regression (Empirical Economics 69(5), 2545-2587; DOI 10.1007/s00181-025-02807-z) and Wooldridge (2023) nonlinear extensions (logit, Poisson). Produces all four jwdid_estat aggregation types: simple, group, calendar, event. Opt-in surfaces include paper W2025 Section 7 cohort-share aggregation (aggregate(weights="cohort_share"), Eqs. 7.4 + 7.6) and paper W2025 Section 8 heterogeneous cohort-specific linear trends (cohort_trends=True, Eq. 8.1; OLS path only).
- Parameters:
method ({"ols", "logit", "poisson"}) – Estimation method. "ols" is the linear baseline, valid for any response (Wooldridge 2023) and the usual choice for continuous outcomes; "logit" is for binary or fractional outcomes; "poisson" is for count data. When method="ols" is used on a binary ({0, 1}) or non-negative integer-count outcome, a UserWarning notes that a matching nonlinear model (logit / Poisson) is often the more appropriate specification: it imposes parallel trends on the link scale rather than in levels, and Wooldridge’s (2023) simulations show the linear model to be both biased and less precise for such outcomes when the nonlinear mean holds. The nonlinear model rests on a different identifying assumption than linear OLS, so it is a recommended comparison, not an automatic switch; suppress the warning via warnings.filterwarnings.
control_group ({"not_yet_treated", "never_treated"}) – Which units serve as the comparison group. "not_yet_treated" (jwdid default) uses all untreated observations at each time period; "never_treated" uses only units never treated throughout the sample.
anticipation (int) – Number of periods before treatment onset to include as treatment cells (anticipation effects). 0 means no anticipation.
demean_covariates (bool) – If True (jwdid default), xtvar covariates are demeaned within each cohort×period cell before entering the regression. Set to False to replicate jwdid’s xasis option.
alpha (float) – Significance level for confidence intervals.
cluster (str or None) – Column name to use for cluster-robust SEs. Defaults to the unit identifier passed to fit().
n_bootstrap (int) – Number of bootstrap replications. 0 disables bootstrap.
bootstrap_weights ({"rademacher", "webb", "mammen"}) – Bootstrap weight distribution.
seed (int or None) – Random seed for reproducibility.
rank_deficient_action ({"warn", "error", "silent"}) – How to handle rank-deficient design matrices.
vcov_type ({"classical", "hc1", "hc2", "hc2_bm", "conley"}, default "hc1") – Variance-covariance family for the analytical sandwich, OLS path only.
hc1 (default) preserves the prior bit-equal CR1 Liang-Zeger cluster-robust behavior via the within-transform path.
hc2_bm auto-routes to a full-dummy saturated design (intercept + treatment cells + unit dummies + time dummies): FWL preserves cohort coefficients but NOT the hat matrix, so HC2 leverage and Bell-McCaffrey Satterthwaite DOF must be computed on the full FE projection (matches clubSandwich::vcovCR(lm(...), type="CR2") + coef_test()$df_Satt).
classical / hc2 are supported via the same full-dummy route AND an auto-drop of the unit auto-cluster (one-way families don’t compose with cluster_ids per the linalg validator). Explicit cluster="X" + one-way vcov_type raises at the validator.
"conley" (Conley 1999 spatial-HAC) threads the conley_* params through solve_ols on the within-transform design. conley_lag_cutoff=0 gives within-period spatial only; >0 adds within-unit Bartlett serial. This is the panel-aware path, not pooled cross-sectional, since conley_time / conley_unit are always supplied. The unit auto-cluster is dropped (an explicit cluster= enables the spatial+cluster product kernel) and survey_design= / weights / n_bootstrap>0 are rejected. Conley is OLS-path-only; it routes through the full-dummy design when cohort_trends=True (same as the other full-dummy families), and its vcov flows through aggregate("group"|"calendar"|"event").
method in {"logit","poisson"} + vcov_type != "hc1" is REJECTED at __init__: the GLM QMLE sandwich path uses pseudo-residuals, and CR2-BM composition with QMLE on canonical-link pseudo-residuals needs derivation + R parity (tracked in TODO.md). Survey designs combined with vcov_type != "hc1" raise NotImplementedError at fit() because the survey TSL / replicate-refit variance overrides the analytical sandwich.
cohort_trends (bool, default False) – When True, adds linear dg_i · t cohort-specific trend interactions to the design matrix per paper W2025 Section 8 / Eq. 8.1. Under a heterogeneous-trends DGP this recovers τ even when parallel trends fails (paper Section 8.3).
OLS-path only: cohort_trends=True + method ∈ {"logit","poisson"} raises NotImplementedError at __init__. Auto-routes to the full-dummy design regardless of vcov_type (matching the absorb→fixed_effects auto-route).
Each treated cohort must have ≥ 2 observed pre-periods in the analysis sample for dg_i · t to be separately identified from cohort + time FE; fit() raises ValueError otherwise. On all-eventually-treated panels the last cohort’s trend column is dropped per paper Section 5.4.
cohort_trends=True + survey_design raises NotImplementedError at fit() (deferred follow-up). cohort_trends=True + control_group="never_treated" also raises NotImplementedError at fit() because the OLS + never_treated branch emits ALL (g, t) placebo cell dummies (paper Section 4.4 placebo coverage); the appended dg_i · t trend columns are linearly spanned by the per-cohort sum of those cell dummies, so the Section 8 trend specification is unidentified on this branch. Use control_group="not_yet_treated" (the default) for the cohort_trends surface.
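The method entry above mentions suppressing the OLS-on-binary UserWarning via warnings.filterwarnings. A minimal sketch of a scoped suppression follows. The function fit_linear_on_binary and its message text are hypothetical stand-ins for a fit() call that warns; they are not part of diff_diff. The filter matches on category only, because the actual warning message is not documented here.

```python
import warnings


def fit_linear_on_binary():
    # Hypothetical stand-in for a fit() call that emits the UserWarning;
    # the message text below is made up for illustration.
    warnings.warn("outcome looks binary; consider method='logit'", UserWarning)
    return "fitted"


# Baseline: with no "ignore" filter the warning is recorded.
with warnings.catch_warnings(record=True) as seen:
    warnings.simplefilter("always")
    fit_linear_on_binary()
n_unsuppressed = len(seen)

# Scoped suppression: the filter is undone when the block exits,
# so UserWarnings raised elsewhere remain visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", category=UserWarning)
    result = fit_linear_on_binary()
n_suppressed = len(caught)
```

Keeping the filter inside catch_warnings() avoids silencing unrelated UserWarnings for the rest of the session; since the warning flags a substantive modelling choice, a narrow scope is usually preferable to a global filter.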
Methods
__init__([method, control_group, ...])
fit(data, outcome, unit, time, cohort[, ...])  Fit the ETWFE model.
get_params()  Return estimator parameters (sklearn-compatible).
set_params(**params)  Set estimator parameters (sklearn-compatible).
Attributes
results_
- __init__(method='ols', control_group='not_yet_treated', anticipation=0, demean_covariates=True, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn', vcov_type='hc1', cohort_trends=False, conley_coords=None, conley_cutoff_km=None, conley_metric='haversine', conley_kernel='bartlett', conley_lag_cutoff=None)
- Parameters:
method (str)
control_group (str)
anticipation (int)
demean_covariates (bool)
alpha (float)
cluster (str | None)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
rank_deficient_action (str)
vcov_type (str)
cohort_trends (bool)
conley_cutoff_km (float | None)
conley_metric (str)
conley_kernel (str)
conley_lag_cutoff (int | None)
- Return type:
None
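The constructor-time rules stated in the parameter descriptions can be summarised as a small decision function. This is a sketch under stated assumptions, not the library's validator: check_init_config and the returned design labels are hypothetical names, and the ValueError for the nonlinear + non-hc1 combination is an assumption, since the text only says that combination is rejected at __init__.

```python
METHODS = {"ols", "logit", "poisson"}
NONLINEAR = {"logit", "poisson"}
VCOV_TYPES = {"classical", "hc1", "hc2", "hc2_bm", "conley"}
# Families the text describes as auto-routing to the full-dummy design.
FULL_DUMMY_VCOV = {"classical", "hc2", "hc2_bm"}


def check_init_config(method="ols", vcov_type="hc1", cohort_trends=False):
    """Hypothetical helper mirroring the documented __init__ rejections.

    Returns which design the documented routing would pick:
    "full_dummy" or "within_transform" (illustrative labels).
    """
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if vcov_type not in VCOV_TYPES:
        raise ValueError(f"unknown vcov_type: {vcov_type!r}")
    # Nonlinear paths only support the default hc1 sandwich.
    # The exception type here is an assumption.
    if method in NONLINEAR and vcov_type != "hc1":
        raise ValueError("logit/poisson require vcov_type='hc1'")
    # Cohort-specific trends are OLS-path only (documented exception type).
    if method in NONLINEAR and cohort_trends:
        raise NotImplementedError("cohort_trends=True is OLS-path only")
    # cohort_trends forces the full-dummy design regardless of vcov_type;
    # conley otherwise stays on the within-transform design.
    if cohort_trends or vcov_type in FULL_DUMMY_VCOV:
        return "full_dummy"
    return "within_transform"


default_design = check_init_config()
conley_trend_design = check_init_config(vcov_type="conley", cohort_trends=True)
```

The fit()-time rejections (survey designs with a non-hc1 vcov_type, cohort_trends with survey_design or control_group="never_treated", and the two-pre-period requirement) depend on the data and are deliberately left out of this sketch.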
- classmethod __new__(*args, **kwargs)