Wooldridge Extended Two-Way Fixed Effects (ETWFE)

Extended Two-Way Fixed Effects estimator from Wooldridge (2021, 2023), based on the Stata jwdid package specification (Friosavila 2021), with documented SE/aggregation deviations noted in the Methodology Registry.

This module implements ETWFE via a single saturated regression that:

Estimates ATT(g,t) for each cohort×time treatment cell simultaneously
Supports linear (OLS), Poisson QMLE, and logit link functions
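Uses ATTs based on the average structural function (ASF) for nonlinear models: E[f(η₁)] − E[f(η₀)], where f is the inverse link, η₁ is the linear index with the treatment terms included, and η₀ is the same index with them set to zero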
Computes delta-method SEs for all aggregations (event, group, calendar, simple)
Follows the Stata jwdid specification for OLS and nonlinear paths (see Methodology Registry for documented SE/aggregation deviations)
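The ASF-based ATT in the list above can be sketched with plain NumPy. This is only an illustration of the formula E[f(η₁)] − E[f(η₀)], not the module's internal code; the function name asf_att, the index values, and the coefficient 0.4 are all made up for the example.

```python
import numpy as np

def asf_att(eta1, eta0, link="logit"):
    """Mean of f(eta1) minus mean of f(eta0) over treated observations.

    eta1: linear index with the treatment terms switched on
    eta0: the same index with the treatment terms set to zero
    (hypothetical helper, not part of diff_diff)
    """
    eta1 = np.asarray(eta1, dtype=float)
    eta0 = np.asarray(eta0, dtype=float)
    if link == "logit":
        f = lambda x: 1.0 / (1.0 + np.exp(-x))  # inverse logit
    elif link == "poisson":
        f = np.exp                               # exponential mean
    elif link == "ols":
        f = lambda x: x                          # identity link
    else:
        raise ValueError(f"unknown link: {link}")
    return float(np.mean(f(eta1)) - np.mean(f(eta0)))

# Made-up indices for three treated observations in one (g, t) cell
eta0 = np.array([-0.5, 0.0, 0.5])
eta1 = eta0 + 0.4  # 0.4 is an illustrative treatment coefficient

att_identity = asf_att(eta1, eta0, link="ols")
att_logit = asf_att(eta1, eta0, link="logit")
att_poisson = asf_att(eta1, eta0, link="poisson")
```

With the identity link the ATT equals the coefficient itself; with logit or Poisson it depends on where the observations sit on the index, which is why the nonlinear paths average over treated observations rather than reporting the coefficient.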

When to use WooldridgeDiD:

Staggered adoption design with heterogeneous treatment timing
Nonlinear outcomes (binary, count, non-negative continuous)
You want a single-regression approach matching Stata’s jwdid
You need event-study, group, calendar, or simple ATT aggregations

References:

Wooldridge, J. M. (2021). Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Difference-in-Differences Estimators. SSRN 3906345.
Wooldridge, J. M. (2023). Simple approaches to nonlinear difference-in-differences with panel data. The Econometrics Journal, 26(3), C31–C66.
Friosavila, F. (2021). jwdid: Stata module for ETWFE. SSC s459114.

WooldridgeDiD

Main estimator class for Wooldridge ETWFE.

class diff_diff.WooldridgeDiD

Bases: object

Extended Two-Way Fixed Effects (ETWFE) DiD estimator.

Implements the Wooldridge (2021) saturated cohort×time regression and Wooldridge (2023) nonlinear extensions (logit, Poisson). Produces all four jwdid_estat aggregation types: simple, group, calendar, event.

Parameters:

method ({"ols", "logit", "poisson"}) – Estimation method. “ols” for continuous outcomes; “logit” for binary or fractional outcomes; “poisson” for count data.
control_group ({"not_yet_treated", "never_treated"}) – Which units serve as the comparison group. “not_yet_treated” (jwdid default) uses all untreated observations at each time period; “never_treated” uses only units never treated throughout the sample.
anticipation (int) – Number of periods before treatment onset to include as treatment cells (anticipation effects). 0 means no anticipation.
demean_covariates (bool) – If True (jwdid default), xtvar covariates are demeaned within each cohort×period cell before entering the regression. Set to False to replicate jwdid’s xasis option.
alpha (float) – Significance level for confidence intervals.
cluster (str or None) – Column name to use for cluster-robust SEs. Defaults to the unit identifier passed to fit().
n_bootstrap (int) – Number of bootstrap replications. 0 disables bootstrap.
bootstrap_weights ({"rademacher", "webb", "mammen"}) – Bootstrap weight distribution.
seed (int or None) – Random seed for reproducibility.
rank_deficient_action ({"warn", "error", "silent"}) – How to handle rank-deficient design matrices.
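For orientation, the three bootstrap_weights options correspond to well-known wild-bootstrap weight distributions. The sketch below draws from their textbook definitions (each has mean 0 and variance 1). It assumes the library follows those definitions; draw_weights is a hypothetical helper, not a diff_diff function.

```python
import numpy as np

def draw_weights(n, kind="rademacher", seed=None):
    """Draw n wild-bootstrap weights (textbook definitions, illustration only)."""
    rng = np.random.default_rng(seed)
    if kind == "rademacher":
        # Two points, -1 and +1, each with probability 1/2
        return rng.choice([-1.0, 1.0], size=n)
    if kind == "webb":
        # Six points: +/- sqrt(3/2), +/- 1, +/- sqrt(1/2), each with probability 1/6
        pts = np.sqrt([1.5, 1.0, 0.5])
        return rng.choice(np.concatenate([-pts, pts]), size=n)
    if kind == "mammen":
        # Two points chosen so that mean = 0, variance = 1, third moment = 1
        s5 = np.sqrt(5.0)
        values = np.array([-(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0])
        probs = np.array([(s5 + 1.0) / (2.0 * s5), (s5 - 1.0) / (2.0 * s5)])
        return rng.choice(values, size=n, p=probs)
    raise ValueError(f"unknown weight distribution: {kind}")

w_rad = draw_weights(200_000, "rademacher", seed=0)
w_webb = draw_weights(200_000, "webb", seed=0)
w_mam = draw_weights(200_000, "mammen", seed=0)
```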

Methods

`fit`(data, outcome, unit, time, cohort[, ...])	Fit the ETWFE model.
`get_params`()	Return estimator parameters (sklearn-compatible).
`set_params`(**params)	Set estimator parameters (sklearn-compatible).

__init__(method='ols', control_group='not_yet_treated', anticipation=0, demean_covariates=True, alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')

Parameters:

method (str)
control_group (str)
anticipation (int)
demean_covariates (bool)
alpha (float)
cluster (str | None)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
rank_deficient_action (str)

Return type:

None
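The two control_group options differ in which observations serve as comparisons. The pandas sketch below illustrates the distinction on a tiny made-up panel, with anticipation fixed at 0. It shows the idea only and does not reproduce how the estimator builds its design matrix.

```python
import pandas as pd

# Made-up panel: unit 1 treated from t=2, unit 2 from t=3, unit 3 never treated
panel = pd.DataFrame({
    "unit":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "time":   [1, 2, 3, 1, 2, 3, 1, 2, 3],
    "cohort": [2, 2, 2, 3, 3, 3, 0, 0, 0],  # 0 = never treated
})

never = panel["cohort"] == 0
treated_now = ~never & (panel["time"] >= panel["cohort"])

# "not_yet_treated": every observation that is untreated at its time period
not_yet_treated_controls = panel[~treated_now]

# "never_treated": only observations from units never treated in the sample
never_treated_controls = panel[never]
```

Here the not-yet-treated comparison set has six observations (including the pre-treatment periods of units 1 and 2), while the never-treated set has only the three observations of unit 3.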

property results_: WooldridgeDiDResults

get_params()

Return estimator parameters (sklearn-compatible).

Return type: Dict[str, Any]

set_params(**params)

Set estimator parameters (sklearn-compatible). Returns self.

Parameters: params (Any)
Return type: WooldridgeDiD
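The get_params/set_params pair follows the scikit-learn estimator convention. The toy class below is a hypothetical stand-in, not WooldridgeDiD itself. It shows the contract: get_params returns the constructor arguments as a dict, and set_params updates them and returns the same object so that calls can be chained.

```python
class ToyEstimator:
    """Hypothetical stand-in that mimics the get_params/set_params convention."""

    def __init__(self, method="ols", alpha=0.05):
        self.method = method
        self.alpha = alpha

    def get_params(self):
        # Constructor arguments as a plain dict
        return {"method": self.method, "alpha": self.alpha}

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"unknown parameter: {name}")
            setattr(self, name, value)
        return self  # returning self allows chaining

est = ToyEstimator()
same = est.set_params(method="poisson", alpha=0.10)

# A fresh, independent copy with identical settings
clone = ToyEstimator(**est.get_params())
```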

fit(data, outcome, unit, time, cohort, exovar=None, xtvar=None, xgvar=None, survey_design=None)

Fit the ETWFE model. See class docstring for parameter details.

Parameters:

data – DataFrame with panel data (long format).
outcome – Outcome column name.
unit – Unit identifier column.
time – Time period column.
cohort – First treatment period (0 or NaN = never treated).
exovar – Time-invariant covariates added without interaction or demeaning.
xtvar – Time-varying covariates (demeaned within cohort×period cells when demean_covariates=True).
xgvar – Covariates interacted with each cohort indicator.
survey_design (SurveyDesign, optional) – Survey design specification for complex survey data. Supports stratified, clustered, and weighted designs via Taylor Series Linearization (TSL). Replicate-weight designs raise NotImplementedError.

Return type:

WooldridgeDiDResults
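The cell-wise demeaning described for xtvar can be pictured with a pandas groupby. This is a sketch of the transformation as the parameter description states it, on made-up data; the column names are illustrative and the library's internal implementation may differ.

```python
import pandas as pd

# Made-up covariate x observed in three cohort×period cells
df = pd.DataFrame({
    "cohort": [2004, 2004, 2004, 2004, 2006, 2006],
    "year":   [2003, 2003, 2004, 2004, 2003, 2003],
    "x":      [1.0, 3.0, 2.0, 6.0, 10.0, 14.0],
})

# Subtract the cell mean so that x_dm averages to zero within each cell
cell_mean = df.groupby(["cohort", "year"])["x"].transform("mean")
df["x_dm"] = df["x"] - cell_mean
```

After demeaning, the covariate interactions no longer shift the cell-level treatment coefficients, so those coefficients can still be read as effects at the cell's average covariate values.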

WooldridgeDiDResults

Results container returned by WooldridgeDiD.fit().

class diff_diff.wooldridge_results.WooldridgeDiDResults

Bases: object

Results from WooldridgeDiD.fit().

Core output is group_time_effects: a dict keyed by (cohort_g, time_t) with per-cell ATT estimates and inference. Call .aggregate(type) to compute any of the four jwdid_estat aggregation types.

Methods

`aggregate`(type)	Compute and store one of the four jwdid_estat aggregation types.
`summary`([aggregation])	Print formatted summary table.

group_time_effects: Dict[Tuple[Any, Any], Dict[str, Any]]: key=(g,t), value={att, se, t_stat, p_value, conf_int}

overall_att: float

overall_se: float

overall_t_stat: float

overall_p_value: float

overall_conf_int: Tuple[float, float]

group_effects: Dict[Any, Dict] | None = None

calendar_effects: Dict[Any, Dict] | None = None

event_study_effects: Dict[int, Dict] | None = None

method: str = 'ols'

control_group: str = 'not_yet_treated'

groups: List[Any]

time_periods: List[Any]

n_obs: int = 0

n_treated_units: int = 0

n_control_units: int = 0

alpha: float = 0.05

anticipation: int = 0

survey_metadata: Any | None = None

aggregate(type)

Compute and store one of the four jwdid_estat aggregation types.

Parameters:

type ("simple" | "group" | "calendar" | "event") – Aggregation to compute. Returns self for chaining.

Return type:

WooldridgeDiDResults
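For an aggregation that is a weighted average of cell-level ATTs, the delta method reduces to a quadratic form in the weights. The sketch below assumes weights proportional to cell sizes and uses made-up numbers; it illustrates the calculation and does not reproduce the library's exact weighting (see the Methodology Registry for documented deviations).

```python
import numpy as np

# Made-up ATT(g,t) estimates for three post-treatment cells
att = np.array([0.10, 0.20, 0.30])

# Made-up joint covariance matrix of those three estimates
vcov = np.array([
    [0.0100, 0.0020, 0.0000],
    [0.0020, 0.0400, 0.0010],
    [0.0000, 0.0010, 0.0900],
])

# Assumed weights: proportional to the number of treated observations per cell
n_cell = np.array([50, 30, 20])
w = n_cell / n_cell.sum()

# The aggregate is linear in the cell ATTs, so its gradient is w
agg_att = float(w @ att)
agg_se = float(np.sqrt(w @ vcov @ w))
```

Because the off-diagonal covariances enter the quadratic form, this SE generally differs from one computed by treating the cell estimates as independent.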

summary(aggregation='simple')

Print formatted summary table.

Parameters: aggregation – Which aggregation to display ("simple", "group", "calendar", "event").
Return type: str

to_dataframe(aggregation='event')

Export aggregated effects to a DataFrame.

Parameters: aggregation ("simple" | "group" | "calendar" | "event" | "gt") – Use "gt" to export raw group-time effects.
Return type: DataFrame

plot_event_study(**kwargs)

Event study plot. Calls aggregate('event') if needed.

Return type: None

property att: float

property se: float

__init__(group_time_effects, overall_att, overall_se, overall_t_stat, overall_p_value, overall_conf_int, group_effects=None, calendar_effects=None, event_study_effects=None, method='ols', control_group='not_yet_treated', groups=<factory>, time_periods=<factory>, n_obs=0, n_treated_units=0, n_control_units=0, alpha=0.05, anticipation=0, survey_metadata=None, _gt_weights=<factory>, _gt_vcov=None, _gt_keys=<factory>, _df_survey=None)

Parameters:

group_time_effects (Dict[Tuple[Any, Any], Dict[str, Any]])
overall_att (float)
overall_se (float)
overall_t_stat (float)
overall_p_value (float)
overall_conf_int (Tuple[float, float])
group_effects (Dict[Any, Dict] | None)
calendar_effects (Dict[Any, Dict] | None)
event_study_effects (Dict[int, Dict] | None)
method (str)
control_group (str)
groups (List[Any])
time_periods (List[Any])
n_obs (int)
n_treated_units (int)
n_control_units (int)
alpha (float)
anticipation (int)
survey_metadata (Any | None)
_gt_weights (Dict[Tuple[Any, Any], int])
_gt_vcov (ndarray | None)
_gt_keys (List[Tuple[Any, Any]])
_df_survey (int | None)

Return type:

None

property conf_int: Tuple[float, float]

property p_value: float

property t_stat: float
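The att, se, t_stat, p_value, and conf_int values are linked in the usual way. The sketch below assumes a standard normal reference distribution, which is an assumption made for illustration: the library may use a t distribution instead (for example with survey degrees of freedom). The helper normal_inference is hypothetical.

```python
from statistics import NormalDist

def normal_inference(att, se, alpha=0.05):
    """Wald statistic, two-sided p-value, and CI under a normal approximation."""
    z = NormalDist()  # standard normal; an assumption, not the library's rule
    t_stat = att / se
    p_value = 2.0 * (1.0 - z.cdf(abs(t_stat)))
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "conf_int": (att - crit * se, att + crit * se),
    }

out = normal_inference(att=0.17, se=0.10, alpha=0.05)
```

With alpha=0.05 the critical value is about 1.96, so the interval is att ± 1.96·se; a smaller alpha widens it.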

Example Usage

Basic OLS (follows Stata jwdid y, ivar(unit) tvar(time) gvar(cohort)):

import pandas as pd
from diff_diff import WooldridgeDiD

df = pd.read_stata("mpdta.dta")
df['first_treat'] = df['first_treat'].astype(int)

m = WooldridgeDiD()
r = m.fit(df, outcome='lemp', unit='countyreal', time='year', cohort='first_treat')

r.aggregate('event').aggregate('group').aggregate('simple')
print(r.summary('event'))
print(r.summary('group'))
print(r.summary('simple'))

View cohort×time cell estimates (post-treatment):

for (g, t), v in sorted(r.group_time_effects.items()):
    if t >= g:
        print(f"g={g} t={t}  ATT={v['att']:.4f}  SE={v['se']:.4f}")

Poisson QMLE for non-negative outcomes (follows Stata jwdid emp, method(poisson)):

import numpy as np
df['emp'] = np.exp(df['lemp'])

m_pois = WooldridgeDiD(method='poisson')
r_pois = m_pois.fit(df, outcome='emp', unit='countyreal',
                    time='year', cohort='first_treat')
r_pois.aggregate('event').aggregate('group').aggregate('simple')
print(r_pois.summary('simple'))

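Logit for binary outcomes (follows Stata jwdid y, method(logit)):

# The examples above never create 'hi_emp'; this above-median indicator
# is an illustrative binary outcome, not a variable shipped with the data.
df['hi_emp'] = (df['lemp'] > df['lemp'].median()).astype(int)

m_logit = WooldridgeDiD(method='logit')
r_logit = m_logit.fit(df, outcome='hi_emp', unit='countyreal',
                      time='year', cohort='first_treat')
r_logit.aggregate('group').aggregate('simple')
print(r_logit.summary('group'))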
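If mpdta.dta is not at hand, a synthetic staggered-adoption panel with the same column names can stand in for it. The generator below is a sketch: the cohort years, effect size (0.5), and noise levels are arbitrary choices, and the resulting frame is only meant to have the shape the examples above expect.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_units = 60
years = np.arange(2003, 2008)

# First treatment year per unit; 0 marks never-treated units
cohorts = rng.choice([0, 2004, 2006, 2007], size=n_units)

rows = []
for i, g in enumerate(cohorts):
    unit_fe = rng.normal()  # unit fixed effect
    for t in years:
        treated = bool(g > 0 and t >= g)
        # Outcome: unit effect + linear trend + constant effect of 0.5 + noise
        y = unit_fe + 0.1 * (t - 2003) + 0.5 * treated + rng.normal(scale=0.2)
        rows.append({"countyreal": i, "year": int(t),
                     "first_treat": int(g), "lemp": float(y)})

sim = pd.DataFrame(rows)

# Illustrative binary outcome for the logit example
sim["hi_emp"] = (sim["lemp"] > sim["lemp"].median()).astype(int)
```

The frame sim can then be passed wherever df appears in the examples, with the same outcome, unit, time, and cohort column names.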

Aggregation Methods

Call .aggregate(type) before .summary(type):

Type	Description	Stata equivalent
`'event'`	ATT by relative time k = t − g	`estat event`
`'group'`	ATT averaged across post-treatment periods per cohort	`estat group`
`'calendar'`	ATT averaged across cohorts per calendar period	`estat calendar`
`'simple'`	Overall weighted average ATT	`estat simple`
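As an illustration of the 'event' row, the sketch below groups cell-level ATTs by relative time k = t − g and takes a weighted average within each k. The dictionary mimics the (g, t) keying of group_time_effects, but the "n" field and all numbers are made up, and weighting by cell size is an assumption rather than a statement of the library's rule.

```python
from collections import defaultdict

# Made-up cell estimates keyed by (cohort g, period t); "n" is a hypothetical cell size
gt = {
    (2004, 2004): {"att": 0.10, "n": 40},
    (2004, 2005): {"att": 0.20, "n": 40},
    (2006, 2006): {"att": 0.30, "n": 60},
    (2006, 2007): {"att": 0.40, "n": 60},
}

# Accumulate weighted sums and total weights per relative time k
sums = defaultdict(lambda: [0.0, 0])
for (g, t), cell in gt.items():
    k = t - g
    sums[k][0] += cell["n"] * cell["att"]
    sums[k][1] += cell["n"]

# Weighted average ATT at each relative time
event = {k: num / den for k, (num, den) in sorted(sums.items())}
```

The 'group' and 'calendar' aggregations follow the same pattern with g or t as the grouping key instead of t − g.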

Comparison with Other Staggered Estimators

Feature	WooldridgeDiD (ETWFE)	CallawaySantAnna	ImputationDiD
Approach	Single saturated regression	Separate 2×2 DiD per cell	Impute Y(0) via FE model
Nonlinear outcomes	Yes (Poisson, Logit)	No	No
Covariates	Via regression (linear index)	OR, IPW, DR	Supported
SE for aggregations	Delta method	Multiplier bootstrap	Multiplier bootstrap
Stata equivalent	`jwdid`	`csdid`	`did_imputation`