diff_diff.WooldridgeDiD

class diff_diff.WooldridgeDiD

Bases: object

Extended Two-Way Fixed Effects (ETWFE) DiD estimator.

Implements the Wooldridge (2025) saturated cohort×time regression (Empirical Economics 69(5), 2545-2587; DOI 10.1007/s00181-025-02807-z) and the Wooldridge (2023) nonlinear extensions (logit, Poisson). Produces all four jwdid_estat aggregation types: simple, group, calendar, and event. Two opt-in features follow Wooldridge (2025): Section 7 cohort-share aggregation (aggregate(weights="cohort_share"), Eqs. 7.4 and 7.6) and Section 8 heterogeneous cohort-specific linear trends (cohort_trends=True, Eq. 8.1; OLS path only).
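The four aggregation types named above differ only in how the cell-level effects τ(g, t) are grouped before averaging. The following is a minimal sketch of that grouping, not the library's implementation: the function name aggregate_cells, the use of per-cell observation counts as weights, and the restriction to post-treatment cells (t ≥ g, i.e. anticipation=0) are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_cells(cell_effects, cell_weights, kind="simple"):
    """Weighted averages of post-treatment cell effects tau[(g, t)].

    cell_effects and cell_weights are dicts keyed by (cohort g, period t).
    The weights are a hypothetical choice (here: cell observation counts).
    """
    # Each aggregation type is just a different grouping key.
    key_of = {
        "simple": lambda g, t: "all",   # one overall average
        "group": lambda g, t: g,        # one average per cohort
        "calendar": lambda g, t: t,     # one average per calendar period
        "event": lambda g, t: t - g,    # one average per event time
    }[kind]

    num = defaultdict(float)
    den = defaultdict(float)
    for (g, t), tau in cell_effects.items():
        if t < g:
            continue  # assumption: keep post-treatment cells only
        key = key_of(g, t)
        weight = cell_weights[(g, t)]
        num[key] += weight * tau
        den[key] += weight
    return {key: num[key] / den[key] for key in num}


# Toy cell effects for two cohorts (g = 2 and g = 3) observed through t = 3.
tau = {(2, 2): 1.0, (2, 3): 2.0, (3, 3): 4.0}
counts = {(2, 2): 10, (2, 3): 10, (3, 3): 20}

for kind in ("simple", "group", "calendar", "event"):
    print(kind, aggregate_cells(tau, counts, kind))
```

The actual weighting used by each aggregation (including the cohort-share weights of Section 7) is defined by the library and the paper, so treat the counts above purely as placeholders.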

Parameters:
  • method ({"ols", "logit", "poisson"}) – Estimation method. “ols” for continuous outcomes; “logit” for binary or fractional outcomes; “poisson” for count data.

  • control_group ({"not_yet_treated", "never_treated"}) – Which units serve as the comparison group. “not_yet_treated” (jwdid default) uses all untreated observations at each time period; “never_treated” uses only units never treated throughout the sample.

  • anticipation (int) – Number of periods before treatment onset to include as treatment cells (anticipation effects). 0 means no anticipation.

  • demean_covariates (bool) – If True (jwdid default), xtvar covariates are demeaned within each cohort×period cell before entering the regression. Set to False to replicate jwdid’s xasis option.

  • alpha (float) – Significance level for confidence intervals.

  • cluster (str or None) – Column name to use for cluster-robust SEs. Defaults to the unit identifier passed to fit().

  • n_bootstrap (int) – Number of bootstrap replications. 0 disables bootstrap.

  • bootstrap_weights ({"rademacher", "webb", "mammen"}) – Bootstrap weight distribution.

  • seed (int or None) – Random seed for reproducibility.

  • rank_deficient_action ({"warn", "error", "silent"}) – How to handle rank-deficient design matrices.

  • vcov_type ({"classical", "hc1", "hc2", "hc2_bm"}, default "hc1") –

    Variance-covariance family for the analytical sandwich; OLS path only.

    hc1 (default) preserves the prior bit-equal CR1 Liang-Zeger cluster-robust behavior via the within-transform path.

    hc2_bm auto-routes to a full-dummy saturated design (intercept + treatment cells + unit dummies + time dummies). The Frisch-Waugh-Lovell (FWL) within-transformation preserves the cohort coefficients but NOT the hat matrix, so HC2 leverage and the Bell-McCaffrey Satterthwaite DOF must be computed on the full FE projection. This matches clubSandwich::vcovCR(lm(...), type="CR2") + coef_test()$df_Satt.

    classical and hc2 use the same full-dummy route and additionally drop the automatic unit-level cluster, because one-way families don’t compose with cluster_ids per the linalg validator. Passing an explicit cluster="X" together with a one-way vcov_type raises at the validator.

    conley is rejected at __init__: supporting it would require threading conley_* params through solve_ols (tracked in TODO.md).

    method in {"logit", "poisson"} combined with vcov_type != "hc1" is rejected at __init__: the GLM QMLE sandwich path uses pseudo-residuals, and composing CR2-BM with QMLE on canonical-link pseudo-residuals needs a derivation and R parity (tracked in TODO.md).

    Survey designs combined with vcov_type != "hc1" raise NotImplementedError at fit(), because the survey TSL / replicate-refit variance overrides the analytical sandwich.

  • cohort_trends (bool, default False) –

    When True, adds linear dg_i · t cohort-specific trend interactions to the design matrix, following paper W2025 Section 8 / Eq. 8.1. Under a heterogeneous-trends DGP this recovers τ even when parallel trends fails (paper Section 8.3). Auto-routes to the full-dummy design regardless of vcov_type (matching the absorb→fixed_effects auto-route).

    Identification: each treated cohort must have ≥ 2 observed pre-periods in the analysis sample for dg_i · t to be separately identified from the cohort and time FE; fit() raises ValueError otherwise. On all-eventually-treated panels the last cohort’s trend column is dropped per paper Section 5.4.

    Unsupported combinations:

    OLS path only: cohort_trends=True with method in {"logit", "poisson"} raises NotImplementedError at __init__.

    cohort_trends=True with survey_design raises NotImplementedError at fit() (deferred follow-up).

    cohort_trends=True with control_group="never_treated" raises NotImplementedError at fit(). The OLS + never_treated branch emits ALL (g, t) placebo cell dummies (paper Section 4.4 placebo coverage), and the appended dg_i · t trend columns are linearly spanned by the per-cohort sum of those cell dummies, so the Section 8 trend specification is unidentified on this branch. Use control_group="not_yet_treated" (the default) for the cohort_trends surface.
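For the bootstrap_weights options listed above, the sketch below writes out the three distributions as they are usually defined in the wild-bootstrap literature, assuming the library follows those standard definitions. The names WEIGHT_DISTRIBUTIONS, moments, and draw_cluster_weights are hypothetical helpers, and drawing one weight per cluster is likewise an assumption rather than something this page states.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# Each distribution is a list of (value, probability) pairs.
WEIGHT_DISTRIBUTIONS = {
    # Two-point: -1 or +1 with equal probability.
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    # Six-point (Webb): +/- sqrt(1/2), +/- 1, +/- sqrt(3/2), each with prob 1/6.
    "webb": [
        (sign * math.sqrt(k / 2.0), 1.0 / 6.0)
        for sign in (-1.0, 1.0)
        for k in (1, 2, 3)
    ],
    # Two-point (Mammen): asymmetric values built from the golden ratio.
    "mammen": [
        (-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / (2.0 * SQRT5)),
        ((SQRT5 + 1.0) / 2.0, (SQRT5 - 1.0) / (2.0 * SQRT5)),
    ],
}


def moments(name):
    """Return the exact (mean, variance) of a weight distribution."""
    support = WEIGHT_DISTRIBUTIONS[name]
    mean = sum(value * prob for value, prob in support)
    var = sum((value - mean) ** 2 * prob for value, prob in support)
    return mean, var


def draw_cluster_weights(name, n_clusters, seed=None):
    """Draw one weight per cluster (assumed granularity) with a seeded RNG."""
    rng = random.Random(seed)
    values, probs = zip(*WEIGHT_DISTRIBUTIONS[name])
    return rng.choices(values, weights=probs, k=n_clusters)


for name in WEIGHT_DISTRIBUTIONS:
    print(name, moments(name), draw_cluster_weights(name, 4, seed=0))
```

All three distributions have mean 0 and variance 1, which is the property a wild bootstrap needs. Webb weights are typically suggested when the number of clusters is small, because the two-point Rademacher distribution then yields few distinct bootstrap draws.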

Methods

  • __init__([method, control_group, ...])

  • fit(data, outcome, unit, time, cohort[, ...]) – Fit the ETWFE model.

  • get_params() – Return estimator parameters (sklearn-compatible).

  • set_params(**params) – Set estimator parameters (sklearn-compatible).
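The "sklearn-compatible" wording refers to the scikit-learn estimator convention, in which get_params() returns the constructor arguments as a dict and set_params() updates them. The sketch below illustrates that convention on a hypothetical stand-in class (SketchEstimator, with only three of the parameters); it is not the library's code, and the return-self and unknown-name behaviors shown are assumptions borrowed from scikit-learn.

```python
import inspect


class SketchEstimator:
    """Hypothetical stand-in showing the get_params/set_params convention."""

    def __init__(self, method="ols", alpha=0.05, n_bootstrap=0):
        # Constructor arguments are stored unchanged under the same names.
        self.method = method
        self.alpha = alpha
        self.n_bootstrap = n_bootstrap

    def get_params(self):
        # Read parameter names from the constructor signature.
        names = inspect.signature(self.__init__).parameters
        return {name: getattr(self, name) for name in names}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name!r}")
            setattr(self, name, value)
        return self  # assumption: returns self, as scikit-learn does


est = SketchEstimator()
est.set_params(alpha=0.10, n_bootstrap=999)
print(est.get_params())

# A common use: clone a configuration by round-tripping the parameters.
clone = SketchEstimator(**est.get_params())
print(clone.get_params() == est.get_params())
```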

Attributes

results_

__init__(method='ols', control_group='not_yet_treated', anticipation=0, demean_covariates=True, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn', vcov_type='hc1', cohort_trends=False)
Parameters:
  • method (str)

  • control_group (str)

  • anticipation (int)

  • demean_covariates (bool)

  • alpha (float)

  • cluster (str | None)

  • n_bootstrap (int)

  • bootstrap_weights (str)

  • seed (int | None)

  • rank_deficient_action (str)

  • vcov_type (str)

  • cohort_trends (bool)

Return type:

None
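The class description lists several argument combinations that are rejected at construction time. The function below restates those rules as a standalone check so they can be read in one place. It is a sketch, not the library's validator: check_init_args is a hypothetical name, and ValueError is an assumed exception type wherever the documentation only says "rejected" (only the cohort_trends case is documented as NotImplementedError).

```python
NONLINEAR_METHODS = {"logit", "poisson"}
VALID_METHODS = {"ols"} | NONLINEAR_METHODS
VALID_VCOV_TYPES = {"classical", "hc1", "hc2", "hc2_bm"}


def check_init_args(method="ols", vcov_type="hc1", cohort_trends=False):
    """Mirror the documented __init__-time rejections (illustrative only)."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_METHODS)}")

    if vcov_type == "conley":
        # Documented as rejected; the exception type here is an assumption.
        raise ValueError("vcov_type='conley' is not supported")

    if vcov_type not in VALID_VCOV_TYPES:
        raise ValueError(f"vcov_type must be one of {sorted(VALID_VCOV_TYPES)}")

    if method in NONLINEAR_METHODS and vcov_type != "hc1":
        # Documented as rejected; the exception type here is an assumption.
        raise ValueError("logit/poisson support only vcov_type='hc1'")

    if cohort_trends and method in NONLINEAR_METHODS:
        # Documented as NotImplementedError at __init__.
        raise NotImplementedError("cohort_trends=True is OLS-path only")


# Accepted: OLS with CR2 / Bell-McCaffrey and cohort trends.
check_init_args(method="ols", vcov_type="hc2_bm", cohort_trends=True)

# Rejected combinations, caught here so the script keeps running.
for kwargs in (
    {"vcov_type": "conley"},
    {"method": "poisson", "vcov_type": "hc2"},
    {"method": "logit", "cohort_trends": True},
):
    try:
        check_init_args(**kwargs)
    except (ValueError, NotImplementedError) as exc:
        print(kwargs, "->", type(exc).__name__)
```

The remaining documented restrictions (survey designs, control_group="never_treated" with cohort_trends, and the pre-period requirement) are raised at fit() and therefore are not part of this constructor-level sketch.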

classmethod __new__(*args, **kwargs)