diff_diff.TwoWayFixedEffects

class diff_diff.TwoWayFixedEffects

Bases: DifferenceInDifferences

Two-Way Fixed Effects (TWFE) estimator for panel DiD.

Extends DifferenceInDifferences to handle panel data with unit and time fixed effects.

Parameters:

robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors.
cluster (str, optional) – Column name for cluster-robust standard errors. If None, automatically clusters at the unit level (the unit parameter passed to fit()). This differs from DifferenceInDifferences where cluster=None means no clustering.
alpha (float, default=0.05) – Significance level for confidence intervals.

Notes

This estimator uses the regression:

Y_it = α_i + γ_t + β*(D_i × Post_t) + X_it’δ + ε_it

where α_i are unit fixed effects and γ_t are time fixed effects.
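To make the role of the fixed effects concrete, here is a minimal pure-Python sketch of the two-way within transformation on a balanced panel without covariates. It is an illustration under stated assumptions, not diff_diff's implementation: the helper `twfe_beta` and all data values are hypothetical. Subtracting unit means and time means (and adding back the grand mean) removes α_i and γ_t, so β can be estimated by regressing the demeaned outcome on the demeaned treatment indicator.

```python
# Illustrative only: a two-way within estimator for a balanced panel.
# `twfe_beta` is a hypothetical helper, not part of diff_diff.

def twfe_beta(y, d):
    """Estimate beta by double-demeaning y and d (both indexed [unit][time])."""
    n_units, n_times = len(y), len(y[0])

    def demean(m):
        unit_mean = [sum(row) / n_times for row in m]
        time_mean = [sum(m[i][t] for i in range(n_units)) / n_units
                     for t in range(n_times)]
        grand = sum(unit_mean) / n_units
        return [[m[i][t] - unit_mean[i] - time_mean[t] + grand
                 for t in range(n_times)] for i in range(n_units)]

    y_dm, d_dm = demean(y), demean(d)
    cells = [(i, t) for i in range(n_units) for t in range(n_times)]
    num = sum(y_dm[i][t] * d_dm[i][t] for i, t in cells)
    den = sum(d_dm[i][t] ** 2 for i, t in cells)
    return num / den

# Hypothetical noise-free data: 4 units, 4 periods; units 0 and 1 treated from t = 2.
alpha = [1.0, 2.0, 3.0, 4.0]      # unit fixed effects
gamma = [0.0, 0.5, 1.0, 1.5]      # time fixed effects
true_beta = 2.0
d = [[1.0 if (i < 2 and t >= 2) else 0.0 for t in range(4)] for i in range(4)]
y = [[alpha[i] + gamma[t] + true_beta * d[i][t] for t in range(4)]
     for i in range(4)]

beta_hat = twfe_beta(y, d)
print(round(beta_hat, 6))
```

Because the data contain no noise and a single common treatment effect, the estimate matches `true_beta` up to floating-point error.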

Warning: TWFE can be biased with staggered treatment timing and heterogeneous treatment effects. Consider using more robust estimators (e.g., Callaway-Sant’Anna) for staggered designs.
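The bias described in this warning can be reproduced on a tiny hand-built panel. The sketch below is an assumption-laden illustration, not library code: the helper `twfe_beta` and the data are hypothetical. Two units adopt treatment at different times, there is no never-treated group, and the effect grows by one unit per period since treatment.

```python
# Illustrative only: TWFE on a staggered panel with dynamic treatment effects.
# `twfe_beta` is a hypothetical helper, not part of diff_diff.

def twfe_beta(y, d):
    """Two-way within estimator for a balanced panel indexed [unit][time]."""
    n, T = len(y), len(y[0])

    def demean(m):
        u = [sum(row) / T for row in m]
        s = [sum(m[i][t] for i in range(n)) / n for t in range(T)]
        g = sum(u) / n
        return [[m[i][t] - u[i] - s[t] + g for t in range(T)] for i in range(n)]

    y_dm, d_dm = demean(y), demean(d)
    num = sum(y_dm[i][t] * d_dm[i][t] for i in range(n) for t in range(T))
    den = sum(d_dm[i][t] ** 2 for i in range(n) for t in range(T))
    return num / den

# Hypothetical design: unit 0 is treated from t = 1, unit 1 from t = 2.
first_treat = [1, 2]
periods = range(3)
d = [[1.0 if t >= g else 0.0 for t in periods] for g in first_treat]

# Effect is 1 in the first treated period and grows by 1 each period after.
# No unit effects, time effects, or noise, so the outcome equals the effect.
effect = [[(t - g + 1.0) if t >= g else 0.0 for t in periods] for g in first_treat]
y = effect

treated = [effect[i][t] for i in range(2) for t in periods if d[i][t] == 1.0]
true_att = sum(treated) / len(treated)   # average effect over treated cells
beta_hat = twfe_beta(y, d)

print(f"TWFE estimate: {beta_hat:.3f}, average effect on treated cells: {true_att:.3f}")
```

In this example the TWFE coefficient is 0.5 while the average effect across treated cells is about 1.333, because the already-treated unit serves as a control for the later-treated one while its own effect is still growing.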

__init__(robust=True, cluster=None, alpha=0.05, inference='analytical', n_bootstrap=999, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')

Parameters:

robust (bool)
cluster (str | None)
alpha (float)
inference (str)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
rank_deficient_action (str)

Methods

`__init__`([robust, cluster, alpha, ...])
`decompose`(data, outcome, unit, time, first_treat)	Perform Goodman-Bacon decomposition of TWFE estimate.
`fit`(data, outcome, treatment, time, unit[, ...])	Fit Two-Way Fixed Effects model.
`get_params`()	Get estimator parameters (sklearn-compatible).
`predict`(data)	Predict outcomes using fitted model.
`print_summary`()	Print summary to stdout.
`set_params`(**params)	Set estimator parameters (sklearn-compatible).
`summary`()	Get summary of estimation results.

fit(data, outcome, treatment, time, unit, covariates=None)

Fit Two-Way Fixed Effects model.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Name of outcome variable column.
treatment (str) – Name of treatment indicator column.
time (str) – Name of time period column.
unit (str) – Name of unit identifier column.
covariates (list, optional) – List of covariate column names.

Returns:

Estimation results.

Return type:

DiDResults

decompose(data, outcome, unit, time, first_treat, weights='approximate')

Perform Goodman-Bacon decomposition of TWFE estimate.

Decomposes the TWFE estimate into a weighted average of all possible 2x2 DiD comparisons, revealing which comparisons drive the estimate and whether problematic “forbidden comparisons” are involved.
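As a rough sketch of the building blocks involved, the snippet below computes a single 2x2 DiD from group means and then forms a weighted average of several such estimates. The function `did_2x2`, the outcome values, and the weights are all hypothetical and chosen for illustration; the actual weights used by `decompose` depend on the `weights` option and are not reproduced here.

```python
# Illustrative only: one 2x2 DiD and a weighted average of several of them.
# All names and numbers below are hypothetical.

def mean(xs):
    return sum(xs) / len(xs)

def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    return (mean(treated_post) - mean(treated_pre)) - (
        mean(control_post) - mean(control_pre)
    )

# One 2x2 comparison built from made-up outcomes.
est = did_2x2(
    treated_pre=[1.0, 1.2],
    treated_post=[3.1, 3.3],
    control_pre=[0.9, 1.1],
    control_post=[1.4, 1.6],
)

# A decomposition expresses the overall estimate as sum(weight * estimate),
# with weights summing to one. These three comparisons are invented.
estimates = [est, 1.0, -0.4]
weights = [0.5, 0.3, 0.2]
weighted_average = sum(w * b for w, b in zip(weights, estimates))

print(round(est, 3), round(weighted_average, 3))
```

A negative or outlying 2x2 estimate with non-trivial weight, like the third entry here, is the kind of pattern the decomposition is meant to surface.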

Parameters:

data (pd.DataFrame) – Panel data with unit and time identifiers.
outcome (str) – Name of outcome variable column.
unit (str) – Name of unit identifier column.
time (str) – Name of time period column.
first_treat (str) – Name of column indicating when each unit was first treated. Use 0 (or np.inf) for never-treated units.
weights (str, default="approximate") – Weight calculation method:
- "approximate": Fast simplified formula (default). Good for diagnostic purposes where relative weights are sufficient.
- "exact": Variance-based weights from Goodman-Bacon (2021) Theorem 1. Use for publication-quality decompositions.

Returns:

Decomposition results showing:
- TWFE estimate and its weighted-average breakdown
- List of all 2x2 comparisons with estimates and weights
- Total weight by comparison type (clean vs forbidden)

Return type:

BaconDecompositionResults

Examples

>>> from diff_diff import TwoWayFixedEffects
>>> twfe = TwoWayFixedEffects()
>>> decomp = twfe.decompose(
...     data, outcome='y', unit='id', time='t', first_treat='treat_year'
... )
>>> decomp.print_summary()
>>> # Check weight on forbidden comparisons
>>> if decomp.total_weight_later_vs_earlier > 0.2:
...     print("Warning: significant forbidden comparison weight")

Notes

This decomposition is essential for understanding potential TWFE bias in staggered adoption designs. The three comparison types are:

Treated vs Never-treated: Clean comparisons using never-treated units as controls. These are always valid.
Earlier vs Later treated: Uses later-treated units as controls before they receive treatment. These are valid.
Later vs Earlier treated: Uses already-treated units as controls. These “forbidden comparisons” can introduce bias when treatment effects are dynamic (changing over time since treatment).
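The three comparison types can be sketched as a small classifier over pairs of treatment cohorts. This is a hypothetical illustration of the taxonomy above, not the library's internal logic: the function names, the label strings, and the cohort years are assumptions. It follows the `first_treat` convention of coding never-treated units as 0 or infinity.

```python
# Illustrative only: label (treated cohort, control cohort) pairs by type.
# Function names, labels, and cohort years are hypothetical.
import math
from itertools import permutations

def is_never_treated(first_treat):
    """Never-treated units are coded as 0 or infinity in `first_treat`."""
    return first_treat == 0 or first_treat == math.inf

def classify(treated_first, control_first):
    """Return the comparison type for a treated cohort and its control cohort."""
    if is_never_treated(control_first):
        return "treated_vs_never"
    if treated_first < control_first:
        # Control is not yet treated during the comparison window.
        return "earlier_vs_later"
    # Control is already treated: the "forbidden" comparison.
    return "later_vs_earlier"

cohorts = [0, 2005, 2008]  # hypothetical first-treatment years; 0 = never treated

# Every ordered pair whose first element is an actually treated cohort.
pairs = [(g, c) for g, c in permutations(cohorts, 2) if not is_never_treated(g)]
labels = {pair: classify(*pair) for pair in pairs}

for pair, label in labels.items():
    print(pair, label)
```

With one never-treated group and two treated cohorts, this yields four comparisons, exactly one of which is of the later-vs-earlier type.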