diff_diff.ChaisemartinDHaultfoeuille

class diff_diff.ChaisemartinDHaultfoeuille

Bases: ChaisemartinDHaultfoeuilleBootstrapMixin

de Chaisemartin-D’Haultfoeuille (dCDH) estimator.

The only modern DiD estimator in the library that handles reversible (non-absorbing) treatments: treatment may switch on AND off over time. It computes the contemporaneous-switch DID_M from the AER 2020 paper (equivalently, DID_1 at horizon l = 1 in the dynamic companion paper, NBER WP 29873), plus the full multi-horizon event study DID_l for l = 1..L_max, requested through the L_max parameter of fit().
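As a rough illustration of the headline quantity, the sketch below computes an equal-weight DID_M on a four-group toy panel in plain Python, using the joiners/leavers construction of the AER 2020 paper. The helper name did_m and the toy data are hypothetical and not part of the library; the library's estimator also handles cell weighting, sample filtering, and inference, all of which this sketch ignores.

```python
from statistics import mean

# Toy panel: group -> (treatment path, outcome path) over three periods.
# Hypothetical data for illustration only.
panel = {
    "A": ([0, 1, 1], [1.0, 4.0, 5.0]),  # joiner at period 2
    "B": ([0, 0, 0], [1.0, 2.0, 3.0]),  # never treated
    "C": ([1, 1, 0], [5.0, 6.0, 5.0]),  # leaver at period 3
    "D": ([1, 1, 1], [5.0, 6.0, 7.0]),  # always treated
}


def did_m(panel):
    """Equal-weight DID_M: switcher-weighted average of per-period DIDs."""
    n_periods = len(next(iter(panel.values()))[0])
    weighted_sum, n_switchers = 0.0, 0
    for t in range(1, n_periods):
        # First-differenced outcomes, bucketed by (D_{t-1}, D_t).
        cells = {(0, 0): [], (0, 1): [], (1, 0): [], (1, 1): []}
        for d, y in panel.values():
            cells[(d[t - 1], d[t])].append(y[t] - y[t - 1])
        # Joiners compared with groups that stay untreated.
        if cells[(0, 1)] and cells[(0, 0)]:
            did_plus = mean(cells[(0, 1)]) - mean(cells[(0, 0)])
            weighted_sum += len(cells[(0, 1)]) * did_plus
            n_switchers += len(cells[(0, 1)])
        # Groups that stay treated compared with leavers.
        if cells[(1, 0)] and cells[(1, 1)]:
            did_minus = mean(cells[(1, 1)]) - mean(cells[(1, 0)])
            weighted_sum += len(cells[(1, 0)]) * did_minus
            n_switchers += len(cells[(1, 0)])
    return weighted_sum / n_switchers


estimate = did_m(panel)
```

On this panel the joiner contrast in period 2 and the leaver contrast in period 3 both equal 2, so the switcher-weighted average is 2.0.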

Supported:

  • Headline DID_M plus multi-horizon DID_l event study

  • Joiners-only DID_+ and leavers-only DID_- decompositions

  • Single-lag placebo DID_M^pl and dynamic placebos DID^{pl}_l (computed automatically by default; gate via placebo=False)

  • Analytical SE via the cohort-recentered plug-in formula from Web Appendix Section 3.7.3; multiplier bootstrap clustered at the group level by default via n_bootstrap; under survey_design with strictly-coarser PSUs the bootstrap automatically upgrades to PSU-level Hall-Mammen wild clustering (see REGISTRY.md ChaisemartinDHaultfoeuille Note on survey + bootstrap)

  • Normalized estimator DID^n_l, cost-benefit aggregate delta, and sup-t simultaneous confidence bands

  • Residualization-style covariate adjustment (DID^X) via controls=, group-specific linear trends (DID^{fd}) via trends_linear=True, state-set-specific trends via trends_nonparam=, heterogeneity testing, non-binary treatment, HonestDiD sensitivity integration on placebos via honest_did=True

  • Per-path event-study disaggregation via by_path=k (top-k most common observed treatment paths within the window [F_g-1, F_g-1+L_max]; requires drop_larger_lower=False; supports binary or integer-coded discrete treatment) or via paths_of_interest=[(...), ...] for an explicit user-specified path subset (Python-only API; mutually exclusive with by_path=k)

  • Survey support via survey_design=: pweight with strata/PSU/FPC via Taylor Series Linearization (analytical) or replicate-weight variance (BRR/Fay/JK1/JKn/SDR)

  • TWFE decomposition diagnostic from Theorem 1 of AER 2020

Only the aggregate argument of fit() still raises NotImplementedError.
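The multiplier bootstrap listed above perturbs per-group influence values with random mean-zero, unit-variance weights. The sketch below shows the textbook Rademacher, Mammen, and Webb weight distributions and a bootstrap standard error for a simple mean. It is an assumption that these textbook definitions match what the library draws; the names WEIGHT_DISTRIBUTIONS, draw_weights, and bootstrap_se are hypothetical, and the library's group-level and PSU-level clustering logic is not reproduced here.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# Textbook multiplier distributions as (value, probability) pairs.
# Assumed here for illustration; not read from the library source.
WEIGHT_DISTRIBUTIONS = {
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    "mammen": [
        (-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / (2.0 * SQRT5)),
        ((SQRT5 + 1.0) / 2.0, (SQRT5 - 1.0) / (2.0 * SQRT5)),
    ],
    "webb": [
        (sign * math.sqrt(v), 1.0 / 6.0)
        for sign in (-1.0, 1.0)
        for v in (0.5, 1.0, 1.5)
    ],
}


def draw_weights(kind, n, rng):
    """Draw n i.i.d. multiplier weights, one per group."""
    values, probs = zip(*WEIGHT_DISTRIBUTIONS[kind])
    return rng.choices(values, weights=probs, k=n)


def bootstrap_se(influence, kind="rademacher", n_bootstrap=999, seed=0):
    """Sample standard deviation of the weight-perturbed mean of influence values."""
    rng = random.Random(seed)
    n = len(influence)
    draws = []
    for _ in range(n_bootstrap):
        w = draw_weights(kind, n, rng)
        draws.append(sum(wi * psi for wi, psi in zip(w, influence)) / n)
    center = sum(draws) / n_bootstrap
    return math.sqrt(sum((d - center) ** 2 for d in draws) / (n_bootstrap - 1))


# Hypothetical per-group influence values.
se = bootstrap_se([0.4, -1.2, 0.9, -0.1], kind="mammen", n_bootstrap=999, seed=42)
```

All three distributions have mean zero and variance one, so they target the same variance; they differ in higher moments and in how many distinct values each group can receive, which matters mostly when the number of groups is small.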

Parameters:
  • alpha (float, default=0.05) – Significance level for confidence intervals.

  • cluster (str, optional, default=None) –

    Must be None (the default). User-specified clustering via this kwarg is not supported — passing any non-None value raises NotImplementedError at construction time (and the same gate fires from set_params). The effective clustering depends on how you call fit():

    • Default (no survey_design): clustered at the group level via the cohort-recentered influence-function plug-in (analytical SEs) and the multiplier bootstrap.

    • Under ``survey_design`` with auto-inject or explicit ``psu=group``: PSU coincides with the group and the group-level and PSU-level paths are bit-identical.

    • Under ``survey_design`` with strictly-coarser PSUs: the multiplier bootstrap automatically upgrades to PSU-level Hall-Mammen wild clustering.

    So dCDH does NOT always cluster at the group level — see REGISTRY.md ChaisemartinDHaultfoeuille Notes on cluster contract and survey + bootstrap for the full matrix. Custom user-specified clustering at a coarser or finer level than the group is a planned extension.

  • n_bootstrap (int, default=0) – Number of multiplier-bootstrap iterations. 0 (default) uses only the analytical SE. Set to 999 or higher for stable bootstrap inference.

  • bootstrap_weights (str, default="rademacher") – Type of multiplier-bootstrap weights: "rademacher", "mammen", or "webb". Ignored unless n_bootstrap > 0.

  • seed (int, optional) – Random seed for the multiplier bootstrap.

  • placebo (bool, default=True) – If True (default), automatically compute the single-lag placebo DID_M^pl (AER 2020 placebo specification) on the same data. Set to False to skip the placebo computation for speed; the results object will still expose placebo_* fields, but with NaN values and placebo_available=False.

  • twfe_diagnostic (bool, default=True) – If True (default), compute the TWFE decomposition diagnostic from Theorem 1 of AER 2020: per-(g, t) weights, fraction of treated cells with negative weights, and sigma_fe (the smallest cell-effect standard deviation that could flip the sign of the plain TWFE coefficient). The diagnostic answers “what would the plain TWFE estimator say on the data you passed in?”, so it runs on the FULL pre-filter cell sample (the same input as the standalone twowayfeweights() function), NOT on the post-filter estimation sample used by DID_M. When the ragged-panel filter or drop_larger_lower drops groups, the fitted results.twfe_* values describe a LARGER sample (pre-filter) than results.overall_att and a UserWarning is emitted to make the divergence explicit. See REGISTRY.md ChaisemartinDHaultfoeuille Note (TWFE diagnostic sample contract) for the full rationale.

  • drop_larger_lower (bool, default=True) – If True (default, matches R DIDmultiplegtDYN), drops groups whose treatment switches more than once (multi-switch groups) before estimation. This is required for the analytical variance formula to be consistent with the AER 2020 Theorem 3 point estimate — both formulas operate on the same post-drop dataset. Setting to False is supported for diagnostic comparison but produces an inconsistent estimator-variance pairing for multi-switch groups; a warning is emitted.

  • by_path (int, optional, default=None) –

    If set to a positive integer k, disaggregate the per-horizon event study by the observed treatment trajectory in the window [F_g - 1, F_g, ..., F_g - 1 + L_max], reporting ATT + SE + inference for the k most common observed paths (ties broken lexicographically on the path tuple). If k exceeds the number of observed paths, all paths are returned and a UserWarning is emitted. None (the default) disables the disaggregation.

    Requires drop_larger_lower=False (multi-switch groups are the object of interest) and L_max >= 1 (the path window depends on L_max).

    Compatible with non-binary integer-coded treatment (D in Z); path tuples become integer-state tuples like (0, 2, 2, 2). D values must be integer-valued (D == round(D)); a ValueError is raised at fit-time on continuous D.

    Compatible with survey_design for analytical Binder TSL SE and replicate-weight bootstrap; per-path SE routes through the cell-period allocator, with non-path switcher-side contributions skipped (control contributions remain unchanged, matching the joiners/leavers IF convention). n_bootstrap > 0 (multiplier bootstrap) under survey_design is not yet supported and raises NotImplementedError. Top-k path ranking under survey_design remains group-cardinality-based (unweighted), not population-weight-based: survey weights do not affect which paths are selected as “top-k”.

    Compatible with heterogeneity="<col>" — per-path heterogeneity coefficient is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. Cohort dummies absorb baseline (no R-divergence warning needed). Surfaces on results.path_heterogeneity_effects keyed {path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}} and on to_dataframe(level="by_path") via het_* columns. Mirrors R did_multiplegt_dyn(..., by_path, predict_het) per-by_level. Composes with survey_design (analytical Binder TSL + replicate-weight) via the existing cell-period IF allocator path. Incompatible with design2 and honest_did (each combination raises NotImplementedError in the current release).

    Mutually exclusive with paths_of_interest: use by_path=k for top-k automatic ranking by frequency, or paths_of_interest=[(...), ...] for an explicit user-specified path list. Setting both raises ValueError.

    Compatible with controls (DID^X residualization) – the per-baseline OLS residualization runs once on first-differenced Y BEFORE path enumeration, so per-path point estimates, bootstrap SE, per-path placebos, and per-path sup-t bands all consume the residualized Y_mat automatically (Frisch-Waugh-Lovell). Per-period effects remain unadjusted, consistent with the existing controls + per-period DID contract.

    Deviation from R on multi-baseline switcher panels: R did_multiplegt_dyn(..., by_path, controls) re-runs the per-baseline residualization on each path’s restricted subsample (path’s switchers + same-baseline not-yet-treated controls), so its residualization coefficients vary per path when switchers have different baseline values. Our global-residualization architecture coincides with R on single-baseline panels (every switcher shares the same D_{g,1}) and per-path point estimates match exactly on the one-observation-per-(g, t) regime; on multi-observation-per-cell panels the existing DID^X cell-weighting deviation from R applies (see docs/methodology/REGISTRY.md “Note (Phase 3 DID^X covariate adjustment)”; independent of the by_path lift). On multi-baseline switcher panels, point estimates can diverge; a UserWarning is emitted at fit-time when this configuration is detected. SE inherits the cross-path cohort-sharing deviation from R documented for path_effects.

    Compatible with trends_linear (DID^{fd} group-specific linear trends) – first-differencing replaces Y with Z = Y_t - Y_{t-1} once globally before path enumeration, so per-path raw second-differences DID^{fd}_{path, l} surface on path_effects[path]["horizons"][l] automatically. Per-path cumulated level effects delta_{path, l} = sum_{l'=1..l} DID^{fd}_{path, l'} are surfaced on the new results.path_cumulated_event_study[path][l] field (mirroring the global linear_trends_effects cumulation; inner dict keyed by horizon directly, no "horizons" wrapper). SE on the cumulated layer is the conservative upper bound (sum of per-horizon component SEs, NaN-consistent), matching the global linear_trends_effects SE convention.

    Path enumeration runs on the post-first-differenced N_mat_fd: switchers with F_g==2 fail the window-eligibility check and are dropped from path enumeration entirely, so a path whose switchers all have F_g < 3 is silently absent from path_effects (the existing global F_g < 3 warning still fires).

    Per-path R parity matches R did_multiplegt_dyn(..., by_path, trends_lin) on per-path cumulated point estimates under single-baseline panels with sufficient pre-window depth (F_g >= 4 for every selected-path switcher). R re-runs the per-path full pipeline on each path’s restricted subsample; same multi-baseline divergence pattern as controls (a UserWarning fires when switcher baselines take multiple values).

    F_g=3 boundary-case divergence: F_g=3 switchers have only 1 valid pre-window Z value after first-differencing and the time==1 filter, which causes Python’s global-then-disaggregate architecture to diverge from R’s per-path full-pipeline call (30%+ on point estimates observed empirically). A separate UserWarning fires at fit-time when the panel includes any F_g=3 switchers and by_path + trends_linear is set, so practitioners hitting this boundary regime see the divergence flag explicitly.

    Placebo under trends_linear returns RAW per-horizon values, not cumulated – there is no per-path placebo cumulation surface (verified empirically against R via the existing joiners_only_trends_lin parity scenario).

    Compatible with trends_nonparam (state-set trends) – the set membership column is validated and stored once globally (time-invariance, NaN rejection, partition coarseness checks unchanged); per-path analytical SE, bootstrap SE, per-path placebos, and per-path sup-t bands all inherit the set-restricted control pool automatically through the set_ids parameter threaded through the per-path IF helpers. Per-path R parity matches R did_multiplegt_dyn(..., by_path, trends_nonparam) on per-path point estimates under single-baseline panels.

    Compatible with n_bootstrap > 0 – the top-k paths are enumerated once on the observed data (paths held fixed across bootstrap draws, matching R did_multiplegt_dyn(..., by_path, bootstrap=B)) and bootstrap SE / percentile CI / percentile p-value are written to path_effects[path]["horizons"][l] in place of the analytical fields. See REGISTRY.md for the full bootstrap contract.

    Compatible with placebo=True – when both are active, per-path backward-horizon placebos DID^{pl}_{path, l} for l = 1..L_max are surfaced on results.path_placebo_event_study[path][-l] (negative-int keys mirroring placebo_event_study). The same per-path SE convention is applied backward (joiners/leavers IF precedent; cohort-recentered plug-in with path-specific divisor); the cross-path cohort-sharing deviation from R is inherited from the analytical event-study path.

    With n_bootstrap > 0, per-path joint sup-t simultaneous confidence bands are also computed across horizons 1..L_max within each path. A path-specific critical value c_p (constructed from a fresh shared-weights multiplier-bootstrap draw per path) is surfaced at top level as results.path_sup_t_bands[path] = {"crit_value", "alpha", "n_bootstrap", "method", "n_valid_horizons"}, applied per-horizon as cband_conf_int on path_effects[path]["horizons"][l], and rendered as cband_lower / cband_upper columns on results.to_dataframe(level="by_path") (mirroring the OVERALL level="event_study" schema). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Python-only library extension; R did_multiplegt_dyn provides no joint bands at any surface. See REGISTRY.md Note (Phase 3 by_path per-path joint sup-t bands).

    SE convention: per-path IF parallels the joiners / leavers construction — the switcher-side contribution is zeroed for groups not in the selected path, and the cohort structure and control pool are unchanged. Plug-in SE uses the path-specific divisor N_l_path (count of path switchers eligible at horizon l), matching how joiners_se / leavers_se use their respective counts as divisors. See REGISTRY.md ChaisemartinDHaultfoeuille Note on by_path for the full contract.

    Results are exposed on results.path_effects as a dict keyed by the path tuple, with nested "horizons" dicts per horizon l. Also available via results.to_dataframe(level="by_path").

  • paths_of_interest (list of tuple of int, optional, default=None) –

    Explicit user-specified treatment paths to disaggregate by, as an alternative to by_path=k’s top-k automatic ranking. Each path tuple must have length L_max + 1 and represents the treatment trajectory in the window [F_g - 1, F_g, ..., F_g - 1 + L_max], e.g. [(0, 1, 1, 1), (0, 1, 0, 0)] for two paths under L_max=3. Mutually exclusive with by_path; setting both raises ValueError.

    Validation:

    • Each path element must be an int (bool and np.bool_ rejected; np.integer accepted and canonicalized to Python int).

    • All paths must have the same length (uniformity validated at __init__; length match against L_max + 1 validated at fit-time).

    • Empty list raises ValueError.

    • Duplicate paths are deduplicated with a UserWarning.

    • A path with zero observed groups in the panel emits a UserWarning and is omitted from path_effects.

    Compatible with non-binary integer treatment (paths can contain integer states like (0, 2, 2)).

    Compatible with all downstream surfaces inherited by by_path: bootstrap, per-path placebos, per-path joint sup-t bands, controls, trends_linear, trends_nonparam, survey_design (analytical Binder TSL + replicate-weight; multiplier bootstrap under survey remains gated, same as by_path=k), and heterogeneity (per-path heterogeneity coefficient surfaces on results.path_heterogeneity_effects). Mechanical extension to path enumeration; no methodology change.

    Order semantics: paths appear in results.path_effects in the user-specified order, modulo deduplication and unobserved-path filtering.

    Python-only API extension; no R equivalent. R’s did_multiplegt_dyn(..., by_path=k) only accepts a positive int (top-k) or -1 (all paths); there is no list-based path selection in R.

    Results expose the same surfaces as by_path: results.path_effects (dict keyed by path tuple), results.path_placebo_event_study, results.path_sup_t_bands, results.path_cumulated_event_study (under trends_linear), and the level="by_path" DataFrame.

  • rank_deficient_action (str, default="warn") – Action when the TWFE decomposition diagnostic OLS encounters a rank-deficient design matrix: "warn", "error", or "silent". Only used when twfe_diagnostic=True.
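The validation rules listed under paths_of_interest can be sketched in plain Python as follows. This is a minimal illustration, not the library's code: the function names are hypothetical, numbers.Integral stands in for the np.integer acceptance described above (NumPy integer scalars are registered with it, while bool is excluded explicitly), and the exception type raised for a non-integer element is an assumption.

```python
import numbers
import warnings


def validate_paths_of_interest(paths):
    """Init-time checks: non-empty, integer-only, uniform length, deduplicated."""
    if not paths:
        raise ValueError("paths_of_interest must not be empty")
    canonical = []
    for path in paths:
        for state in path:
            # bool is a subclass of int, so it is rejected explicitly.
            if isinstance(state, bool) or not isinstance(state, numbers.Integral):
                raise ValueError(f"non-integer path element: {state!r}")
        # Canonicalize every accepted element to a plain Python int.
        canonical.append(tuple(int(state) for state in path))
    if len({len(path) for path in canonical}) != 1:
        raise ValueError("all paths must have the same length")
    unique = list(dict.fromkeys(canonical))  # keeps first occurrences, in order
    if len(unique) < len(canonical):
        warnings.warn("duplicate paths were dropped", UserWarning)
    return unique


def check_path_length(paths, l_max):
    """Fit-time check: every path spans a window of length L_max + 1."""
    if any(len(path) != l_max + 1 for path in paths):
        raise ValueError(f"each path must have length L_max + 1 = {l_max + 1}")


# The two example paths from the parameter description, under L_max=3.
paths = validate_paths_of_interest([(0, 1, 1, 1), (0, 1, 0, 0)])
check_path_length(paths, l_max=3)
```

Splitting the checks in two mirrors the description above: uniformity can be verified at construction, while the comparison against L_max + 1 has to wait until L_max is known at fit-time.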

results_

Estimation results after calling fit().

Type:

ChaisemartinDHaultfoeuilleResults

is_fitted_

Whether the model has been fitted.

Type:

bool

Notes

The analytical CI is conservative under Assumption 8 (independent groups) of the dynamic companion paper, and exact only under iid sampling. This is documented as a deliberate deviation from “default nominal coverage” in REGISTRY.md.
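To make the role of alpha and of a conservative SE concrete, the sketch below builds a two-sided interval from a point estimate and a standard error. It assumes a normal approximation, which the text above does not spell out, and the function name normal_ci and the numbers are hypothetical.

```python
from statistics import NormalDist


def normal_ci(estimate, se, alpha=0.05):
    """Two-sided (1 - alpha) interval under a normal approximation."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return estimate - z * se, estimate + z * se


# Hypothetical point estimate with two candidate standard errors.
nominal = normal_ci(2.0, se=0.40)
conservative = normal_ci(2.0, se=0.50)  # SE that overstates the sampling variability

# The interval built from the larger SE contains the other one.
widened = conservative[0] <= nominal[0] and nominal[1] <= conservative[1]
```

An SE that is an upper bound on the true sampling variability yields an interval at least as wide as the exact one, so coverage is at least the nominal 1 - alpha rather than exactly equal to it.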

Examples

Basic single-switch panel:

>>> from diff_diff import ChaisemartinDHaultfoeuille
>>> from diff_diff.prep_dgp import generate_reversible_did_data
>>> data = generate_reversible_did_data(n_groups=80, n_periods=6, seed=42)
>>> est = ChaisemartinDHaultfoeuille()
>>> results = est.fit(
...     data, outcome="outcome", group="group",
...     time="period", treatment="treatment",
... )
>>> abs(results.overall_att - 2.0) < 1.0  # close to the true effect
True
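Top-k path ranking, as described under by_path (most common observed paths first, ties broken lexicographically on the path tuple, a UserWarning when k exceeds the number of observed paths), can be sketched without the library. The function top_k_paths and the group labels are hypothetical, and reading "lexicographically" as ascending tuple order is an assumption.

```python
import warnings
from collections import Counter


def top_k_paths(group_paths, k):
    """Return the k most frequent path tuples, ties broken lexicographically."""
    counts = Counter(group_paths.values())
    if k > len(counts):
        warnings.warn(
            f"by_path={k} exceeds the {len(counts)} observed paths; "
            "returning all of them.",
            UserWarning,
        )
    # Sort by descending frequency, then by ascending path tuple (assumed order).
    ranked = sorted(counts, key=lambda path: (-counts[path], path))
    return ranked[:k]


# Hypothetical switcher groups and their treatment paths over the window
# [F_g - 1, ..., F_g - 1 + L_max] with L_max = 2.
group_paths = {
    "g1": (0, 1, 1),
    "g2": (0, 1, 1),
    "g3": (0, 1, 0),
    "g4": (0, 2, 2),
}
selected = top_k_paths(group_paths, k=2)
```

Here (0, 1, 1) is observed twice and ranks first; (0, 1, 0) and (0, 2, 2) tie with one group each, and the tie-break keeps (0, 1, 0).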

Methods

__init__([alpha, cluster, n_bootstrap, ...])

fit(data, outcome, group, time, treatment[, ...])

Fit the dCDH estimator on individual-level panel data.

get_params()

Return all __init__ parameters as a dictionary.

set_params(**params)

Set estimator parameters (sklearn-compatible).

Attributes

n_bootstrap

bootstrap_weights

alpha

seed

__init__(alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, placebo=True, twfe_diagnostic=True, drop_larger_lower=True, by_path=None, paths_of_interest=None, rank_deficient_action='warn')
Return type:

None

classmethod __new__(*args, **kwargs)