Wooldridge Extended Two-Way Fixed Effects (ETWFE)#
Extended Two-Way Fixed Effects estimator from Wooldridge (2021, 2023),
based on the Stata jwdid package specification (Friosavila 2021),
with documented SE/aggregation deviations noted in the Methodology Registry.
This module implements ETWFE via a single saturated regression that:
Estimates ATT(g,t) for each cohort×time treatment cell simultaneously
Supports linear (OLS), Poisson QMLE, and logit link functions
Uses ASF-based ATT for nonlinear models: E[f(η₁)] − E[f(η₀)]
Computes delta-method SEs for all aggregations (event, group, calendar, simple)
Follows the Stata jwdid specification for OLS and nonlinear paths (see Methodology Registry for documented SE/aggregation deviations)
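For the nonlinear paths, the ATT is defined on the outcome scale rather than on the index scale. The following is a minimal sketch of the ASF-based contrast E[f(η₁)] − E[f(η₀)], not the library's implementation. It assumes hypothetical fitted indices for the treated observations in one cell: eta0 is the linear index with the treatment interaction switched off, and delta is that cell's treatment coefficient, so η₁ = η₀ + δ. The function name asf_att and its arguments are illustrative only.

```python
import numpy as np

def asf_att(eta0, delta, link):
    """Average f(eta0 + delta) - f(eta0) over the treated observations in one cell."""
    # Inverse link functions f for each method (illustrative mapping, an assumption).
    inverse_link = {
        "ols": lambda eta: eta,                           # identity
        "poisson": np.exp,                                # exponential mean
        "logit": lambda eta: 1.0 / (1.0 + np.exp(-eta)),  # logistic CDF
    }
    f = inverse_link[link]
    eta0 = np.asarray(eta0, dtype=float)
    eta1 = eta0 + delta  # index with the treatment effect switched on
    return float(np.mean(f(eta1)) - np.mean(f(eta0)))

# Hypothetical indices for three treated observations in one (g, t) cell.
eta0 = np.array([0.0, 0.5, 1.0])
print(asf_att(eta0, 0.2, "ols"))      # identity link: equals delta up to rounding
print(asf_att(eta0, 0.2, "poisson"))
print(asf_att(eta0, 0.2, "logit"))
```

With the identity link the contrast collapses to the coefficient itself, which is why the OLS path needs no separate ASF step in this sketch.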
When to use WooldridgeDiD:
Staggered adoption design with heterogeneous treatment timing
Nonlinear outcomes (binary, count, non-negative continuous)
You want a single-regression approach matching Stata’s jwdid
You need event-study, group, calendar, or simple ATT aggregations
References:
Wooldridge, J. M. (2021). Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Difference-in-Differences Estimators. SSRN 3906345.
Wooldridge, J. M. (2023). Simple approaches to nonlinear difference-in-differences with panel data. The Econometrics Journal, 26(3), C31–C66.
Friosavila, F. (2021). jwdid: Stata module for ETWFE. SSC s459114.
WooldridgeDiD#
Main estimator class for Wooldridge ETWFE.
- class diff_diff.WooldridgeDiD[source]
Bases: object
Extended Two-Way Fixed Effects (ETWFE) DiD estimator.
Implements the Wooldridge (2021) saturated cohort×time regression and Wooldridge (2023) nonlinear extensions (logit, Poisson). Produces all four jwdid_estat aggregation types: simple, group, calendar, event.
- Parameters:
method ({"ols", "logit", "poisson"}) – Estimation method. “ols” for continuous outcomes; “logit” for binary or fractional outcomes; “poisson” for count data.
control_group ({"not_yet_treated", "never_treated"}) – Which units serve as the comparison group. “not_yet_treated” (jwdid default) uses all untreated observations at each time period; “never_treated” uses only units never treated throughout the sample.
anticipation (int) – Number of periods before treatment onset to include as treatment cells (anticipation effects). 0 means no anticipation.
demean_covariates (bool) – If True (jwdid default), xtvar covariates are demeaned within each cohort×period cell before entering the regression. Set to False to replicate jwdid’s xasis option.
alpha (float) – Significance level for confidence intervals.
cluster (str or None) – Column name to use for cluster-robust SEs. Defaults to the unit identifier passed to fit().
n_bootstrap (int) – Number of bootstrap replications. 0 disables bootstrap.
bootstrap_weights ({"rademacher", "webb", "mammen"}) – Bootstrap weight distribution.
seed (int or None) – Random seed for reproducibility.
rank_deficient_action ({"warn", "error", "silent"}) – How to handle rank-deficient design matrices.
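The control_group and anticipation parameters together determine which observations count as treated and which can serve as comparisons. The rules can be sketched as follows, assuming (as described for the cohort argument of fit()) that never-treated units are coded 0 or NaN. The helper names are hypothetical, and the way anticipation shifts the not-yet-treated set is an assumption of this sketch rather than a statement about the library's internals.

```python
import math

def is_never_treated(cohort):
    """Never-treated units are coded with cohort 0 or NaN."""
    return cohort == 0 or (isinstance(cohort, float) and math.isnan(cohort))

def is_treated_cell(cohort, t, anticipation=0):
    """An observation is in a treatment cell once t >= cohort - anticipation."""
    if is_never_treated(cohort):
        return False
    return t >= cohort - anticipation

def is_control_obs(cohort, t, control_group="not_yet_treated", anticipation=0):
    """Whether an observation can serve as a comparison under each rule."""
    if control_group == "never_treated":
        return is_never_treated(cohort)
    if control_group == "not_yet_treated":
        # Assumption: every observation outside a treatment cell is a valid control.
        return not is_treated_cell(cohort, t, anticipation)
    raise ValueError(f"unknown control_group: {control_group!r}")

print(is_treated_cell(2006, 2005))                  # False
print(is_treated_cell(2006, 2005, anticipation=1))  # True
print(is_control_obs(2006, 2004, "never_treated"))  # False
```

Under "never_treated" the pre-treatment observations of eventually treated cohorts are excluded from the comparison set, which is the practical difference between the two options.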
Methods
fit(data, outcome, unit, time, cohort[, ...]) – Fit the ETWFE model.
get_params() – Return estimator parameters (sklearn-compatible).
set_params(**params) – Set estimator parameters (sklearn-compatible).
- __init__(method='ols', control_group='not_yet_treated', anticipation=0, demean_covariates=True, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')[source]
- property results_: WooldridgeDiDResults
- set_params(**params)[source]
Set estimator parameters (sklearn-compatible). Returns self.
- Parameters:
params (Any)
- Return type:
- fit(data, outcome, unit, time, cohort, exovar=None, xtvar=None, xgvar=None, survey_design=None)[source]
Fit the ETWFE model. See class docstring for parameter details.
- Parameters:
data (DataFrame with panel data (long format))
outcome (outcome column name)
unit (unit identifier column)
time (time period column)
cohort (first treatment period (0 or NaN = never treated))
exovar (time-invariant covariates added without interaction/demeaning)
xtvar (time-varying covariates (demeaned within cohort×period cells when demean_covariates=True))
xgvar (covariates interacted with each cohort indicator)
survey_design (SurveyDesign, optional) – Survey design specification for complex survey data. Supports stratified, clustered, and weighted designs via Taylor Series Linearization (TSL). Replicate-weight designs raise NotImplementedError.
- Return type:
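The demeaning applied to xtvar covariates when demean_covariates=True can be illustrated with a small NumPy sketch. It subtracts the cohort×period cell mean from each observation. The function name demean_within_cells is hypothetical, the sketch is unweighted, and it assumes never-treated units are coded 0 (not NaN), so the library's actual handling of weights and missing cohorts may differ.

```python
import numpy as np

def demean_within_cells(x, cohort, time):
    """Subtract the cohort-by-period cell mean from each value of x."""
    x = np.asarray(x, dtype=float)
    # Encode each (cohort, period) pair as one integer cell id.
    cells = np.stack([np.asarray(cohort), np.asarray(time)], axis=1)
    _, cell_id = np.unique(cells, axis=0, return_inverse=True)
    cell_id = cell_id.ravel()
    # Cell means from grouped sums and counts.
    sums = np.bincount(cell_id, weights=x)
    counts = np.bincount(cell_id)
    return x - (sums / counts)[cell_id]

# Two cells in one period: cohort 2006 (mean 2.0) and never treated (mean 4.0).
x = [1.0, 3.0, 2.0, 6.0]
cohort = [2006, 2006, 0, 0]
time = [2005, 2005, 2005, 2005]
print(demean_within_cells(x, cohort, time))
```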
WooldridgeDiDResults#
Results container returned by WooldridgeDiD.fit().
- class diff_diff.wooldridge_results.WooldridgeDiDResults[source]
Bases: object
Results from WooldridgeDiD.fit().
Core output is group_time_effects: a dict keyed by (cohort_g, time_t) with per-cell ATT estimates and inference. Call .aggregate(type) to compute any of the four jwdid_estat aggregation types.
Methods
aggregate(type) – Compute and store one of the four jwdid_estat aggregation types.
summary([aggregation]) – Print formatted summary table.
- group_time_effects: Dict[Tuple[Any, Any], Dict[str, Any]]
key=(g,t), value={att, se, t_stat, p_value, conf_int}
- overall_att: float
- overall_se: float
- overall_t_stat: float
- overall_p_value: float
- method: str = 'ols'
- control_group: str = 'not_yet_treated'
- n_obs: int = 0
- n_treated_units: int = 0
- n_control_units: int = 0
- alpha: float = 0.05
- anticipation: int = 0
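To make the shape of a group_time_effects entry concrete, the sketch below builds the {att, se, t_stat, p_value, conf_int} record from a point estimate and its standard error. It assumes a standard normal reference distribution, which is an assumption of this example; the library may use a different reference distribution (for instance with survey designs). The helper cell_inference and the numbers are hypothetical.

```python
from statistics import NormalDist

def cell_inference(att, se, alpha=0.05):
    """Build an {att, se, t_stat, p_value, conf_int} record from an estimate and SE."""
    t_stat = att / se
    # Two-sided p-value under a standard normal reference distribution (assumption).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return {
        "att": att,
        "se": se,
        "t_stat": t_stat,
        "p_value": p_value,
        "conf_int": (att - z * se, att + z * se),
    }

# Hypothetical estimates keyed by (cohort g, period t).
group_time_effects = {
    (2004, 2004): cell_inference(-0.02, 0.02),
    (2004, 2005): cell_inference(-0.08, 0.03),
}
for (g, t), v in sorted(group_time_effects.items()):
    lo, hi = v["conf_int"]
    print(f"g={g} t={t} ATT={v['att']:.3f} CI=[{lo:.3f}, {hi:.3f}] p={v['p_value']:.3f}")
```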
- aggregate(type)[source]
Compute and store one of the four jwdid_estat aggregation types.
- Parameters:
type ("simple" | "group" | "calendar" | "event")
Returns self for chaining.
- Return type:
- summary(aggregation='simple')[source]
Print formatted summary table.
- Parameters:
aggregation (which aggregation to display ("simple", "group", "calendar", "event"))
- Return type:
- to_dataframe(aggregation='event')[source]
Export aggregated effects to a DataFrame.
- Parameters:
aggregation ("simple" | "group" | "calendar" | "event" | "gt") – Use “gt” to export raw group-time effects.
- Return type:
- plot_event_study(**kwargs)[source]
Event study plot. Calls aggregate(‘event’) if needed.
- Return type:
None
- property att: float
- property se: float
- __init__(group_time_effects, overall_att, overall_se, overall_t_stat, overall_p_value, overall_conf_int, group_effects=None, calendar_effects=None, event_study_effects=None, method='ols', control_group='not_yet_treated', groups=<factory>, time_periods=<factory>, n_obs=0, n_treated_units=0, n_control_units=0, alpha=0.05, anticipation=0, survey_metadata=None, _gt_weights=<factory>, _gt_vcov=None, _gt_keys=<factory>, _df_survey=None)
- Parameters:
- Return type:
None
- property p_value: float
- property t_stat: float
Example Usage#
Basic OLS (follows Stata jwdid y, ivar(unit) tvar(time) gvar(cohort)):
import pandas as pd
from diff_diff import WooldridgeDiD
df = pd.read_stata("mpdta.dta")
df['first_treat'] = df['first_treat'].astype(int)
m = WooldridgeDiD()
r = m.fit(df, outcome='lemp', unit='countyreal', time='year', cohort='first_treat')
r.aggregate('event').aggregate('group').aggregate('simple')
print(r.summary('event'))
print(r.summary('group'))
print(r.summary('simple'))
View cohort×time cell estimates (post-treatment):
for (g, t), v in sorted(r.group_time_effects.items()):
    if t >= g:
        print(f"g={g} t={t} ATT={v['att']:.4f} SE={v['se']:.4f}")
Poisson QMLE for non-negative outcomes
(follows Stata jwdid emp, method(poisson)):
import numpy as np
df['emp'] = np.exp(df['lemp'])
m_pois = WooldridgeDiD(method='poisson')
r_pois = m_pois.fit(df, outcome='emp', unit='countyreal',
                    time='year', cohort='first_treat')
r_pois.aggregate('event').aggregate('group').aggregate('simple')
print(r_pois.summary('simple'))
Logit for binary outcomes
(follows Stata jwdid y, method(logit)):
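# Assumption for illustration: hi_emp is not defined earlier in this example,
# so a binary outcome is constructed here from lemp.
df['hi_emp'] = (df['lemp'] > df['lemp'].median()).astype(int)
m_logit = WooldridgeDiD(method='logit')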
r_logit = m_logit.fit(df, outcome='hi_emp', unit='countyreal',
time='year', cohort='first_treat')
r_logit.aggregate('group').aggregate('simple')
print(r_logit.summary('group'))
Aggregation Methods#
Call .aggregate(type) before .summary(type):
| Type | Description | Stata equivalent |
|---|---|---|
| event | ATT by relative time k = t − g | estat event |
| group | ATT averaged across post-treatment periods per cohort | estat group |
| calendar | ATT averaged across cohorts per calendar period | estat calendar |
| simple | Overall weighted average ATT | estat simple |
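Each aggregation is a weighted average of the cohort×time cell estimates, and its standard error follows from the delta method. A minimal sketch for the event aggregation in the linear case is shown here: for weights w and cell estimates θ with covariance V, the aggregate is w′θ and its variance is w′Vw. The function aggregate_event, the cell weights, and the numbers are assumptions for illustration; the library's weighting scheme and its treatment of nonlinear models (where the gradient is not simply w) are documented in the Methodology Registry.

```python
import numpy as np

def aggregate_event(keys, att, vcov, cell_weights):
    """Aggregate cell ATTs by relative time k = t - g with delta-method SEs."""
    att = np.asarray(att, dtype=float)
    vcov = np.asarray(vcov, dtype=float)
    cell_weights = np.asarray(cell_weights, dtype=float)
    rel_time = np.array([t - g for g, t in keys])
    out = {}
    for k in sorted(set(rel_time.tolist())):
        # Weights are nonzero only for cells at relative time k and sum to one.
        w = np.where(rel_time == k, cell_weights, 0.0)
        w = w / w.sum()
        est = float(w @ att)
        # For a linear combination w'theta, the delta method gives Var = w' V w.
        se = float(np.sqrt(w @ vcov @ w))
        out[k] = {"att": est, "se": se}
    return out

# Hypothetical cells (g, t), estimates, covariance, and weights.
keys = [(2004, 2004), (2004, 2005), (2006, 2006)]
att = [0.1, 0.3, 0.5]
vcov = np.diag([0.04, 0.09, 0.04])
cell_weights = [1.0, 1.0, 3.0]
for k, v in aggregate_event(keys, att, vcov, cell_weights).items():
    print(f"k={k} ATT={v['att']:.4f} SE={v['se']:.4f}")
```

The group, calendar, and simple aggregations follow the same pattern in this sketch, with the cells selected by cohort, by calendar period, or across all post-treatment cells.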
Comparison with Other Staggered Estimators#
| Feature | WooldridgeDiD (ETWFE) | CallawaySantAnna | ImputationDiD |
|---|---|---|---|
| Approach | Single saturated regression | Separate 2×2 DiD per cell | Impute Y(0) via FE model |
| Nonlinear outcomes | Yes (Poisson, Logit) | No | No |
| Covariates | Via regression (linear index) | OR, IPW, DR | Supported |
| SE for aggregations | Delta method | Multiplier bootstrap | Multiplier bootstrap |
| Stata equivalent | jwdid | csdid | did_imputation |