de Chaisemartin-D’Haultfœuille (dCDH) DiD

The only modern staggered DiD estimator in diff-diff that handles non-absorbing (reversible) treatments — treatment may switch on AND off over time.

This module implements the methodology from de Chaisemartin & D’Haultfœuille (2020/2022). The estimator ships:

  • The contemporaneous-switch path DID_M (= DID_1 at horizon l = 1)

  • The full multi-horizon event study DID_l for l = 1..L_max via the L_max parameter, with normalized estimator DID^n_l, cost-benefit aggregate delta, dynamic placebos DID^{pl}_l, and sup-t simultaneous confidence bands

  • Residualization-style covariate adjustment (controls), group-specific linear trends (trends_linear), and state-set-specific trends (trends_nonparam)

  • Heterogeneity testing, non-binary treatment, and HonestDiD sensitivity integration on placebos

  • Survey support via Taylor-series linearization (pweight + strata/PSU/FPC)

  • Per-path event-study disaggregation via by_path=k (mirrors R did_multiplegt_dyn(..., by_path=k), including per-path backward placebos and, when n_bootstrap > 0, per-path joint sup-t simultaneous bands; the bands are a Python-only extension beyond R, which provides no joint bands at any surface) or via paths_of_interest=[(...), ...] for an explicit user-specified path subset (Python-only API; mutually exclusive with by_path)

by_path supports binary or integer-coded discrete (D in Z) treatment, and composes with survey_design for analytical Binder TSL SE and replicate-weight bootstrap variance. Multiplier bootstrap under survey + by_path remains gated, and there is no R parity because R did_multiplegt_dyn does not support survey weighting.

by_path and paths_of_interest also compose with heterogeneity="<col>": the per-path heterogeneity coefficient surfaces on results.path_heterogeneity_effects (mirrors R did_multiplegt_dyn(..., by_path, predict_het) per-by_level). When combined with placebo=True, heterogeneity is also computed on backward (placebo) horizons and surfaced under negative-int keys, both globally on results.heterogeneity_effects[-l] and per-path on results.path_heterogeneity_effects[path][-l]; the placebo rows of to_dataframe(level="by_path") have populated het_* columns. The combination survey_design + placebo + heterogeneity emits a UserWarning at fit-time and falls back to forward-horizon-only heterogeneity until the pre-period cell allocator is derived; forward-horizon predict_het + survey_design continues to work unchanged.

The estimator:

  1. Aggregates individual-level panel data to (group, time) cells

  2. Drops multi-switch groups by default (matches R DIDmultiplegtDYN)

  3. Excludes singleton-baseline groups from the variance computation only (footnote 15 of the dynamic paper)

  4. Computes per-period joiner (DID_{+,t}) and leaver (DID_{-,t}) contributions via Theorem 3 of the AER 2020 paper

  5. Aggregates them into DID_M, the joiners-only DID_+, and the leavers-only DID_-

  6. Computes the single-lag placebo DID_M^pl

  7. When L_max >= 2: computes per-group DID_{g,l} building blocks, multi-horizon DID_l, dynamic placebos DID^{pl}_l, normalized DID^n_l, and cost-benefit aggregate delta

  8. Optionally computes the TWFE decomposition diagnostic from Theorem 1 (per-cell weights, fraction negative, sigma_fe)

  9. Inference uses the cohort-recentered analytical plug-in variance from Web Appendix Section 3.7.3 of the dynamic paper, optionally complemented by a multiplier bootstrap clustered at the group level (with sup-t simultaneous confidence bands when L_max >= 2)
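Steps 4 and 5 can be illustrated with a minimal pure-Python sketch. It assumes an already cell-aggregated binary panel with one equally weighted observation per (group, time) cell; the helper name did_m and the toy data are hypothetical and are not part of the diff-diff API. The library additionally handles cell weights, filtering, and variance, none of which is shown here.

```python
from collections import defaultdict


def _mean(values):
    return sum(values) / len(values)


def did_m(cells):
    """Toy DID_M on a dict {(group, time): (treatment, outcome)}.

    Assumes binary treatment and equal cell weights (illustration only).
    """
    times = sorted({t for _, t in cells})
    groups = sorted({g for g, _ in cells})
    weighted_sum = 0.0
    n_switchers = 0
    for prev, cur in zip(times, times[1:]):
        # Bucket outcome changes by the (D_{t-1}, D_t) transition.
        deltas = defaultdict(list)
        for g in groups:
            if (g, prev) in cells and (g, cur) in cells:
                d0, y0 = cells[(g, prev)]
                d1, y1 = cells[(g, cur)]
                deltas[(d0, d1)].append(y1 - y0)
        joiners, stable_untreated = deltas[(0, 1)], deltas[(0, 0)]
        leavers, stable_treated = deltas[(1, 0)], deltas[(1, 1)]
        if joiners and stable_untreated:
            # DID_{+,t}: joiners versus groups that stay untreated.
            weighted_sum += len(joiners) * (_mean(joiners) - _mean(stable_untreated))
            n_switchers += len(joiners)
        if leavers and stable_treated:
            # DID_{-,t}: groups that stay treated versus leavers.
            weighted_sum += len(leavers) * (_mean(stable_treated) - _mean(leavers))
            n_switchers += len(leavers)
    return weighted_sum / n_switchers


# Toy panel: common trend of +1 per period, true treatment effect of 2.
cells = {
    ("A", 1): (0, 1.0), ("A", 2): (0, 2.0), ("A", 3): (0, 3.0),  # never treated
    ("B", 1): (0, 1.0), ("B", 2): (1, 4.0), ("B", 3): (1, 5.0),  # joiner
    ("C", 1): (1, 3.0), ("C", 2): (1, 4.0), ("C", 3): (1, 5.0),  # always treated
    ("D", 1): (1, 3.0), ("D", 2): (0, 2.0), ("D", 3): (0, 3.0),  # leaver
}
estimate = did_m(cells)  # 2.0 on this toy panel
```

The joiner and the leaver each contribute a per-period DID of 2, so the switcher-weighted average recovers the true effect.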

When to use ChaisemartinDHaultfoeuille:

  • Treatment can switch on and off over time (e.g., marketing campaigns, seasonal promotions, on/off policy cycles)

  • You need separate joiners (DID_+) and leavers (DID_-) views, plus the aggregate DID_M

  • You want a built-in placebo and a TWFE decomposition diagnostic computed on the data you pass in (pre-filter) for direct comparison against DID_M. The fitted TWFE diagnostic uses the FULL pre-filter cell sample (matching twowayfeweights()); when fit() drops groups via the ragged-panel or drop_larger_lower filters, a UserWarning is emitted to make the divergence from the post-filter DID_M sample explicit. See REGISTRY.md ChaisemartinDHaultfoeuille Note (TWFE diagnostic sample contract) for the rationale.

  • You want a Python implementation that matches R DIDmultiplegtDYN at l = 1 on cell-aggregated input (see REGISTRY.md for documented deviations on individual-level inputs with uneven cell sizes)
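To make the TWFE decomposition diagnostic concrete, here is a small sketch of the Theorem 1 weights under strong simplifying assumptions: a balanced panel, one equally sized observation per cell, and binary treatment. In that special case the residual from regressing treatment on group and time fixed effects reduces to double demeaning. The function name twfe_weights is hypothetical; it is not the library's twowayfeweights().

```python
def twfe_weights(d):
    """Weights attached to each treated cell's effect in the TWFE coefficient.

    d maps (group, time) -> 0/1 treatment on a balanced panel with equal
    cell sizes (an assumption of this sketch). The returned weights sum to one.
    """
    groups = sorted({g for g, _ in d})
    times = sorted({t for _, t in d})
    g_mean = {g: sum(d[(g, t)] for t in times) / len(times) for g in groups}
    t_mean = {t: sum(d[(g, t)] for g in groups) / len(groups) for t in times}
    grand = sum(d.values()) / len(d)
    # Residual of D on group and time fixed effects (double demeaning).
    eps = {c: d[c] - g_mean[c[0]] - t_mean[c[1]] + grand for c in d}
    treated = [c for c in d if d[c] == 1]
    denom = sum(eps[c] for c in treated)
    return {c: eps[c] / denom for c in treated}


# Two groups, three periods: group 1 treated in period 3 only,
# group 2 treated in periods 2 and 3.
d = {
    (1, 1): 0, (1, 2): 0, (1, 3): 1,
    (2, 1): 0, (2, 2): 1, (2, 3): 1,
}
weights = twfe_weights(d)
# approximately {(1, 3): 0.5, (2, 2): 1.0, (2, 3): -0.5}
fraction_negative = sum(w < 0 for w in weights.values()) / len(weights)
```

One of the three treated cells receives a negative weight here, which is the pattern the fraction-negative diagnostic is designed to flag.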

All other staggered estimators in diff-diff (CallawaySantAnna, SunAbraham, ImputationDiD, TwoStageDiD, EfficientDiD, WooldridgeDiD) assume treatment is absorbing — once treated, stays treated. ChaisemartinDHaultfoeuille is the only library option for non-absorbing treatments.

Panel requirements (deviation from R DIDmultiplegtDYN):

  • Every group must have an observation at the first global period (the panel’s earliest time value). Groups missing this baseline raise ValueError with the offending group IDs.

  • Groups with interior period gaps (missing observations between their first and last observed period) are dropped with a UserWarning.

  • Terminal missingness (groups observed at the baseline but missing one or more later periods - early exit / right-censoring) is supported. The group contributes from its observed periods only, masked out of the missing transitions by the per-period present guard in the variance computation.

  • This is a documented deviation from R DIDmultiplegtDYN, which supports unbalanced panels with missing-treatment-before-first-switch handling. Workaround: pre-process your panel to back-fill the baseline (or drop late-entry groups before fitting), or use R until this restriction is lifted. See the Note (deviation from R DIDmultiplegtDYN) block in docs/methodology/REGISTRY.md for the rationale and the exact defensive guards that make terminal missingness safe.
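The panel requirements above can be checked before fitting. The following sketch classifies groups by their missingness pattern; the helper classify_groups and its return keys are hypothetical names chosen for illustration, and the library's own checks may differ in detail.

```python
def classify_groups(observed, all_periods):
    """Classify groups by missingness pattern.

    observed maps group -> set of observed periods; all_periods is the
    sorted list of every period in the panel.
    """
    first, last_global = all_periods[0], all_periods[-1]
    report = {
        "missing_baseline": [],   # would raise ValueError in fit()
        "interior_gap": [],       # would be dropped with a UserWarning
        "terminal_missing": [],   # supported: contributes observed periods only
        "complete": [],
    }
    for group, periods in sorted(observed.items()):
        if first not in periods:
            report["missing_baseline"].append(group)
            continue
        last_seen = max(periods)
        expected = [t for t in all_periods if t <= last_seen]
        if any(t not in periods for t in expected):
            report["interior_gap"].append(group)
        elif last_seen < last_global:
            report["terminal_missing"].append(group)
        else:
            report["complete"].append(group)
    return report


observed = {
    "a": {1, 2, 3, 4},  # fully observed
    "b": {2, 3, 4},     # late entry: no baseline observation
    "c": {1, 3, 4},     # gap at period 2
    "d": {1, 2},        # early exit
}
report = classify_groups(observed, [1, 2, 3, 4])
```

Groups listed under missing_baseline are the ones to back-fill or drop when applying the workaround described above.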

References:

  • de Chaisemartin, C. & D’Haultfœuille, X. (2020). Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects. American Economic Review, 110(9), 2964-2996.

  • de Chaisemartin, C. & D’Haultfœuille, X. (2022, revised 2024). Difference-in-Differences Estimators of Intertemporal Treatment Effects. NBER Working Paper 29873.

ChaisemartinDHaultfoeuille

Main estimator class for de Chaisemartin-D’Haultfœuille (dCDH) DiD estimation. The alias DCDH is also available.

class diff_diff.ChaisemartinDHaultfoeuille

Bases: ChaisemartinDHaultfoeuilleBootstrapMixin

de Chaisemartin-D’Haultfoeuille (dCDH) estimator.

The only modern DiD estimator in the library that handles reversible (non-absorbing) treatments - treatment may switch on AND off over time. Computes the contemporaneous-switch DiD DID_M from the AER 2020 paper (equivalently DID_1 at horizon l = 1 of the dynamic companion paper, NBER WP 29873) plus the full multi-horizon event study DID_l for l = 1..L_max via the L_max parameter on fit().

Supported:

  • Headline DID_M plus multi-horizon DID_l event study

  • Joiners-only DID_+ and leavers-only DID_- decompositions

  • Single-lag placebo DID_M^pl and dynamic placebos DID^{pl}_l (computed automatically by default; gate via placebo=False)

  • Analytical SE via the cohort-recentered plug-in formula from Web Appendix Section 3.7.3; multiplier bootstrap clustered at the group level by default via n_bootstrap; under survey_design with strictly-coarser PSUs the bootstrap automatically upgrades to PSU-level Hall-Mammen wild clustering (see REGISTRY.md ChaisemartinDHaultfoeuille Note on survey + bootstrap)

  • Normalized estimator DID^n_l, cost-benefit aggregate delta, and sup-t simultaneous confidence bands

  • Residualization-style covariate adjustment (DID^X) via controls=, group-specific linear trends (DID^{fd}) via trends_linear=True, state-set-specific trends via trends_nonparam=, heterogeneity testing, non-binary treatment, HonestDiD sensitivity integration on placebos via honest_did=True

  • Per-path event-study disaggregation via by_path=k (top-k most common observed treatment paths within the window [F_g-1, F_g-1+L_max]; requires drop_larger_lower=False; supports binary or integer-coded discrete treatment) or via paths_of_interest=[(...), ...] for an explicit user-specified path subset (Python-only API; mutex with by_path=k)

  • Survey support via survey_design=: pweight with strata/PSU/FPC via Taylor Series Linearization (analytical) or replicate-weight variance (BRR/Fay/JK1/JKn/SDR)

  • TWFE decomposition diagnostic from Theorem 1 of AER 2020

Only aggregate on fit() still raises NotImplementedError.

Parameters:
  • alpha (float, default=0.05) – Significance level for confidence intervals.

  • cluster (str, optional, default=None) –

    Must be None (the default). User-specified clustering via this kwarg is not supported — passing any non-None value raises NotImplementedError at construction time (and the same gate fires from set_params). The effective clustering depends on how you call fit():

    • Default (no survey_design): clustered at the group level via the cohort-recentered influence-function plug-in (analytical SEs) and the multiplier bootstrap.

    • Under survey_design with auto-inject or explicit psu=group: PSU coincides with the group and the group-level and PSU-level paths are bit-identical.

    • Under survey_design with strictly-coarser PSUs: the multiplier bootstrap automatically upgrades to PSU-level Hall-Mammen wild clustering.

    So dCDH does NOT always cluster at the group level — see REGISTRY.md ChaisemartinDHaultfoeuille Notes on cluster contract and survey + bootstrap for the full matrix. Custom user-specified clustering at a coarser or finer level than the group is a planned extension.

  • n_bootstrap (int, default=0) – Number of multiplier-bootstrap iterations. 0 (default) uses only the analytical SE. Set to 999 or higher for stable bootstrap inference.

  • bootstrap_weights (str, default="rademacher") – Type of multiplier-bootstrap weights: "rademacher", "mammen", or "webb". Ignored unless n_bootstrap > 0.

  • seed (int, optional) – Random seed for the multiplier bootstrap.

  • placebo (bool, default=True) – If True (default), automatically compute the single-lag placebo DID_M^pl (AER 2020 placebo specification) on the same data. Set to False to skip the placebo computation for speed; the results object will still expose placebo_* fields, but with NaN values and placebo_available=False.

  • twfe_diagnostic (bool, default=True) – If True (default), compute the TWFE decomposition diagnostic from Theorem 1 of AER 2020: per-(g, t) weights, fraction of treated cells with negative weights, and sigma_fe (the smallest cell-effect standard deviation that could flip the sign of the plain TWFE coefficient). The diagnostic answers “what would the plain TWFE estimator say on the data you passed in?”, so it runs on the FULL pre-filter cell sample (the same input as the standalone twowayfeweights() function), NOT on the post-filter estimation sample used by DID_M. When the ragged-panel filter or drop_larger_lower drops groups, the fitted results.twfe_* values describe a LARGER sample (pre-filter) than results.overall_att and a UserWarning is emitted to make the divergence explicit. See REGISTRY.md ChaisemartinDHaultfoeuille Note (TWFE diagnostic sample contract) for the full rationale.

  • drop_larger_lower (bool, default=True) – If True (default, matches R DIDmultiplegtDYN), drops groups whose treatment switches more than once (multi-switch groups) before estimation. This is required for the analytical variance formula to be consistent with the AER 2020 Theorem 3 point estimate — both formulas operate on the same post-drop dataset. Setting to False is supported for diagnostic comparison but produces an inconsistent estimator-variance pairing for multi-switch groups; a warning is emitted.

  • by_path (int, optional, default=None) –

    If set to a positive integer k, disaggregate the per-horizon event study by the observed treatment trajectory in the window [F_g - 1, F_g, ..., F_g - 1 + L_max], reporting ATT + SE + inference for the k most common observed paths (ties broken lexicographically on the path tuple). If k exceeds the number of observed paths, all paths are returned and a UserWarning is emitted. None (the default) disables the disaggregation.

    Requires drop_larger_lower=False (multi-switch groups are the object of interest) and L_max >= 1 (the path window depends on L_max). Compatible with non-binary integer-coded treatment (D in Z); path tuples become integer-state tuples like (0, 2, 2, 2). D values must be integer-valued (D == round(D)); a ValueError is raised at fit-time on continuous D. Compatible with survey_design for analytical Binder TSL SE and replicate-weight bootstrap; per-path SE routes through the cell-period allocator, with non-path switcher-side contributions skipped (control contributions remain unchanged, matching the joiners/leavers IF convention). n_bootstrap > 0 (multiplier bootstrap) under survey_design is not yet supported and raises NotImplementedError. Top-k path ranking under survey_design remains group-cardinality-based (unweighted), not population-weight-based — survey weights do not affect which paths are selected as “top-k”.

    Compatible with heterogeneity="<col>" — per-path heterogeneity coefficient is computed by re-running the Lemma 7 regression on each path-restricted switcher subsample. Cohort dummies absorb baseline (no R-divergence warning needed). Surfaces on results.path_heterogeneity_effects keyed {path: {l: {beta, se, t_stat, p_value, conf_int, n_obs}}} and on to_dataframe(level="by_path") via het_* columns. Mirrors R did_multiplegt_dyn(..., by_path, predict_het) per-by_level. Composes with survey_design (analytical Binder TSL + replicate-weight) via the existing cell-period IF allocator path. Incompatible with design2 and honest_did (each combination raises NotImplementedError in the current release).

    Mutually exclusive with paths_of_interest: use by_path=k for top-k automatic ranking by frequency, or paths_of_interest=[(...), ...] for an explicit user-specified path list. Setting both raises ValueError.

    Compatible with controls (DID^X residualization) – the per-baseline OLS residualization runs once on first-differenced Y BEFORE path enumeration, so per-path point estimates, bootstrap SE, per-path placebos, and per-path sup-t bands all consume the residualized Y_mat automatically (Frisch-Waugh-Lovell). Per-period effects remain unadjusted, consistent with the existing controls + per-period DID contract.

    Deviation from R on multi-baseline switcher panels: R did_multiplegt_dyn(..., by_path, controls) re-runs the per-baseline residualization on each path’s restricted subsample (path’s switchers + same-baseline not-yet-treated controls), so its residualization coefficients vary per path when switchers have different baseline values. Our global-residualization architecture coincides with R on single-baseline panels (every switcher shares the same D_{g,1}) and per-path point estimates match exactly on the one-observation-per-(g, t) regime; on multi-observation-per-cell panels the existing DID^X cell-weighting deviation from R applies (see docs/methodology/REGISTRY.md “Note (Phase 3 DID^X covariate adjustment)”; independent of the by_path lift). On multi-baseline switcher panels, point estimates can diverge; a UserWarning is emitted at fit-time when this configuration is detected. SE inherits the cross-path cohort-sharing deviation from R documented for path_effects.

    Compatible with trends_linear (DID^{fd} group-specific linear trends) – first-differencing replaces Y with Z = Y_t - Y_{t-1} once globally before path enumeration, so per-path raw second-differences DID^{fd}_{path, l} surface on path_effects[path]["horizons"][l] automatically. Per-path cumulated level effects delta_{path, l} = sum_{l'=1..l} DID^{fd}_{path, l'} are surfaced on the new results.path_cumulated_event_study[path][l] field (mirroring the global linear_trends_effects cumulation; inner dict keyed by horizon directly, no "horizons" wrapper). SE on the cumulated layer is the conservative upper bound (sum of per-horizon component SEs, NaN-consistent), matching the global linear_trends_effects SE convention.

    Path enumeration runs on the post-first-differenced N_mat_fd: switchers with F_g==2 fail the window-eligibility check and are dropped from path enumeration entirely, so a path whose switchers all have F_g < 3 is silently absent from path_effects (the existing global F_g < 3 warning still fires).

    Per-path R parity matches R did_multiplegt_dyn(..., by_path, trends_lin) on per-path cumulated point estimates under single-baseline panels with sufficient pre-window depth (F_g >= 4 for every selected-path switcher). R re-runs the per-path full pipeline on each path’s restricted subsample; same multi-baseline divergence pattern as controls (a UserWarning fires when switcher baselines take multiple values).

    F_g=3 boundary-case divergence: F_g=3 switchers have only 1 valid pre-window Z value after first-differencing and the time==1 filter, which causes Python’s global-then-disaggregate architecture to diverge from R’s per-path full-pipeline call (30%+ on point estimates observed empirically). A separate UserWarning fires at fit-time when the panel includes any F_g=3 switchers and by_path + trends_linear is set, so practitioners hitting this boundary regime see the divergence flag explicitly.

    Placebo under trends_linear returns RAW per-horizon values, not cumulated – there is no per-path placebo cumulation surface (verified empirically against R via the existing joiners_only_trends_lin parity scenario).

    Compatible with trends_nonparam (state-set trends) – the set membership column is validated and stored once globally (time-invariance, NaN rejection, partition coarseness checks unchanged); per-path analytical SE, bootstrap SE, per-path placebos, and per-path sup-t bands all inherit the set-restricted control pool automatically through the set_ids parameter threaded through the per-path IF helpers. Per-path R parity matches R did_multiplegt_dyn(..., by_path, trends_nonparam) on per-path point estimates under single-baseline panels.

    Compatible with n_bootstrap > 0 – the top-k paths are enumerated once on the observed data (paths held fixed across bootstrap draws, matching R did_multiplegt_dyn(..., by_path, bootstrap=B)) and bootstrap SE / percentile CI / percentile p-value are written to path_effects[path]["horizons"][l] in place of the analytical fields. See REGISTRY.md for the full bootstrap contract.

    Compatible with placebo=True – when both are active, per-path backward-horizon placebos DID^{pl}_{path, l} for l = 1..L_max are surfaced on results.path_placebo_event_study[path][-l] (negative-int keys mirroring placebo_event_study). The same per-path SE convention is applied backward (joiners/leavers IF precedent; cohort-recentered plug-in with path-specific divisor); the cross-path cohort-sharing deviation from R is inherited from the analytical event-study path.

    With n_bootstrap > 0, per-path joint sup-t simultaneous confidence bands are also computed across horizons 1..L_max within each path. A path-specific critical value c_p (constructed from a fresh shared-weights multiplier-bootstrap draw per path) is surfaced at top level as results.path_sup_t_bands[path] = {"crit_value", "alpha", "n_bootstrap", "method", "n_valid_horizons"}, applied per-horizon as cband_conf_int on path_effects[path]["horizons"][l], and rendered as cband_lower / cband_upper columns on results.to_dataframe(level="by_path") (mirroring the OVERALL level="event_study" schema). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Python-only library extension; R did_multiplegt_dyn provides no joint bands at any surface. See REGISTRY.md Note (Phase 3 by_path per-path joint sup-t bands).

    SE convention: per-path IF parallels the joiners / leavers construction — the switcher-side contribution is zeroed for groups not in the selected path, and the cohort structure and control pool are unchanged. Plug-in SE uses the path-specific divisor N_l_path (count of path switchers eligible at horizon l), matching how joiners_se / leavers_se use their respective counts as divisors. See REGISTRY.md ChaisemartinDHaultfoeuille Note on by_path for the full contract.

    Results are exposed on results.path_effects as a dict keyed by the path tuple, with nested "horizons" dicts per horizon l. Also available via results.to_dataframe(level="by_path").

  • paths_of_interest (list of tuple of int, optional, default=None) –

    Explicit user-specified treatment paths to disaggregate by, as an alternative to by_path=k’s top-k automatic ranking. Each path tuple must have length L_max + 1 and represents the treatment trajectory in the window [F_g - 1, F_g, ..., F_g - 1 + L_max], e.g. [(0, 1, 1, 1), (0, 1, 0, 0)] for two paths under L_max=3. Mutually exclusive with by_path; setting both raises ValueError.

    Validation:

    • Each path element must be an int (bool and np.bool_ rejected; np.integer accepted and canonicalized to Python int).

    • All paths must have the same length (uniformity validated at __init__; length match against L_max + 1 validated at fit-time).

    • Empty list raises ValueError.

    • Duplicate paths are deduplicated with a UserWarning.

    • A path with zero observed groups in the panel emits a UserWarning and is omitted from path_effects.

    Compatible with non-binary integer treatment (paths can contain integer states like (0, 2, 2)).

    Compatible with all downstream surfaces inherited by by_path: bootstrap, per-path placebos, per-path joint sup-t bands, controls, trends_linear, trends_nonparam, survey_design (analytical Binder TSL + replicate-weight; multiplier bootstrap under survey remains gated, same as by_path=k), and heterogeneity (per-path heterogeneity coefficient surfaces on results.path_heterogeneity_effects). Mechanical extension to path enumeration; no methodology change.

    Order semantics: paths appear in results.path_effects in the user-specified order, modulo deduplication and unobserved-path filtering.

    Python-only API extension; no R equivalent. R’s did_multiplegt_dyn(..., by_path=k) only accepts a positive int (top-k) or -1 (all paths); there is no list-based path selection in R.

    Results expose the same surfaces as by_path: results.path_effects (dict keyed by path tuple), results.path_placebo_event_study, results.path_sup_t_bands, results.path_cumulated_event_study (under trends_linear), and the level="by_path" DataFrame.

  • rank_deficient_action (str, default="warn") – Action when the TWFE decomposition diagnostic OLS encounters a rank-deficient design matrix: "warn", "error", or "silent". Only used when twfe_diagnostic=True.
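The two path-selection modes described under by_path and paths_of_interest can be sketched in plain Python. Everything below is illustrative: top_k_paths and validate_paths are hypothetical helpers, ascending tuple order is assumed for the lexicographic tie-break, NumPy integer handling is omitted, and the exception type for a non-int element is an assumption.

```python
import warnings
from collections import Counter


def top_k_paths(group_paths, k):
    """Rank observed paths by group count (by_path=k style selection).

    group_paths maps group -> observed path tuple. Ties are broken by
    ascending tuple order (an assumption about the tie-break direction).
    """
    counts = Counter(group_paths.values())
    ranked = sorted(counts, key=lambda path: (-counts[path], path))
    if k > len(ranked):
        warnings.warn("k exceeds the number of observed paths", UserWarning)
    return ranked[:k]


def validate_paths(paths):
    """Validate an explicit path list (paths_of_interest style)."""
    if not paths:
        raise ValueError("paths_of_interest must be a non-empty list")
    cleaned = []
    for path in paths:
        for state in path:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(state, bool) or not isinstance(state, int):
                raise ValueError(f"path element {state!r} is not an int")
        cleaned.append(tuple(path))
    if len({len(path) for path in cleaned}) != 1:
        raise ValueError("all paths must have the same length")
    unique = list(dict.fromkeys(cleaned))  # preserves user-specified order
    if len(unique) < len(cleaned):
        warnings.warn("duplicate paths removed", UserWarning)
    return unique


group_paths = {
    "g1": (0, 1, 1), "g2": (0, 1, 1),
    "g3": (0, 1, 0), "g4": (0, 1, 0),
    "g5": (0, 2, 2),
}
top_two = top_k_paths(group_paths, 2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    explicit = validate_paths([(0, 1, 1), (0, 1, 0), (0, 1, 1)])
```

The length check against L_max + 1 happens at fit-time in the library and is left out of this sketch.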

results_

Estimation results after calling fit().

Type:

ChaisemartinDHaultfoeuilleResults

is_fitted_

Whether the model has been fitted.

Type:

bool

Notes

The analytical CI is conservative under Assumption 8 (independent groups) of the dynamic companion paper, and exact only under iid sampling. This is documented as a deliberate deviation from “default nominal coverage” in REGISTRY.md.

Examples

Basic single-switch panel:

>>> from diff_diff import ChaisemartinDHaultfoeuille
>>> from diff_diff.prep_dgp import generate_reversible_did_data
>>> data = generate_reversible_did_data(n_groups=80, n_periods=6, seed=42)
>>> est = ChaisemartinDHaultfoeuille()
>>> results = est.fit(
...     data, outcome="outcome", group="group",
...     time="period", treatment="treatment",
... )
>>> abs(results.overall_att - 2.0) < 1.0  # close to the true effect
True

Methods

fit(data, outcome, group, time, treatment[, ...])

Fit the dCDH estimator on individual-level panel data.

get_params()

Return all __init__ parameters as a dictionary.

set_params(**params)

Set estimator parameters (sklearn-compatible).

__init__(alpha=0.05, cluster=None, n_bootstrap=0, bootstrap_weights='rademacher', seed=None, placebo=True, twfe_diagnostic=True, drop_larger_lower=True, by_path=None, paths_of_interest=None, rank_deficient_action='warn')
Return type:

None

alpha: float
n_bootstrap: int
bootstrap_weights: str
seed: int | None
results_: ChaisemartinDHaultfoeuilleResults | None
get_params()

Return all __init__ parameters as a dictionary.

Return type:

Dict[str, Any]

set_params(**params)

Set estimator parameters (sklearn-compatible).

Transactional: validation runs after the candidate mutations, and if any rule fails the estimator state is rolled back to its pre-call values before the exception is re-raised. Callers can therefore retry with corrected params on the same instance without repairing inconsistent intermediate state.
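The transactional behavior can be pictured with a toy class. This is a sketch of the general snapshot-and-rollback pattern, not the library's implementation; ToyEstimator, its parameter subset, and the alpha range check are assumptions made for illustration.

```python
class ToyEstimator:
    """Hypothetical stand-in showing snapshot-and-rollback in set_params."""

    _PARAMS = ("alpha", "by_path", "paths_of_interest")

    def __init__(self, alpha=0.05, by_path=None, paths_of_interest=None):
        self.alpha = alpha
        self.by_path = by_path
        self.paths_of_interest = paths_of_interest
        self._validate()

    def _validate(self):
        if not 0 < self.alpha < 1:  # assumed rule, for illustration
            raise ValueError("alpha must be in (0, 1)")
        if self.by_path is not None and self.paths_of_interest is not None:
            raise ValueError("by_path and paths_of_interest are mutually exclusive")

    def set_params(self, **params):
        unknown = [name for name in params if name not in self._PARAMS]
        if unknown:
            raise ValueError(f"unknown parameters: {unknown}")
        # Snapshot the values about to change, then mutate and validate.
        snapshot = {name: getattr(self, name) for name in params}
        try:
            for name, value in params.items():
                setattr(self, name, value)
            self._validate()
        except Exception:
            # Roll back to the pre-call state before re-raising.
            for name, value in snapshot.items():
                setattr(self, name, value)
            raise
        return self


est = ToyEstimator(by_path=2)
try:
    est.set_params(paths_of_interest=[(0, 1, 1)])  # conflicts with by_path=2
except ValueError:
    pass
state_after_failure = (est.by_path, est.paths_of_interest)

# Retrying on the same instance with corrected params succeeds.
est.set_params(by_path=None, paths_of_interest=[(0, 1, 1)])
```

Because the failed call leaves no partial mutation behind, the retry starts from the same state as the original call.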

Parameters:

params (Any)

Return type:

ChaisemartinDHaultfoeuille

fit(data, outcome, group, time, treatment, aggregate=None, L_max=None, controls=None, trends_linear=None, trends_nonparam=None, honest_did=False, heterogeneity=None, design2=False, survey_design=None)

Fit the dCDH estimator on individual-level panel data.

Parameters:
  • data (pd.DataFrame) – Individual-level panel. Must contain columns for outcome, group, time, and treatment. The estimator internally aggregates to (group, time) cells.

  • outcome (str) – Outcome variable column name.

  • group (str) – Group identifier column name. Treatment must be constant within each (group, time) cell after aggregation; ValueError is raised if any cell has fractional treatment after grouping (within-cell-varying treatment indicates a fuzzy design not supported in Phase 1).

  • time (str) – Time period column name. Must be sortable.

  • treatment (str) – Per-observation treatment column. Must be numeric and constant within each (group, time) cell. Both binary {0, 1} and non-binary (ordinal or continuous) treatment are supported. Non-binary treatment requires L_max >= 1.

  • aggregate (str, optional) – Reserved for Phase 3. Must be None; any other value raises NotImplementedError.

  • L_max (int, optional) – Maximum event-study horizon. When set, computes DID_l for l = 1, ..., L_max using the per-group building block from Equation 3 of the dynamic companion paper. When None (default), only the l = 1 contemporaneous-switch estimator DID_M is computed (Phase 1 behavior). Must be a positive integer not exceeding the number of post-baseline periods in the panel.

  • controls (list of str, optional) – Column names for covariate adjustment via residualization-style DID^X (Web Appendix Section 1.2). Requires L_max >= 1. One theta_hat per baseline treatment value, estimated by OLS on not-yet-treated observations. NOT doubly-robust.

  • trends_linear (bool, optional) – If True, estimate group-specific linear trends via DID^{fd} (Web Appendix Section 1.3, Lemma 6). Requires L_max >= 1 and at least 3 time periods.

  • trends_nonparam (str, optional) – Column name for state-set membership. Restricts the control pool to groups in the same set (Web Appendix Section 1.4). Requires L_max >= 1 and time-invariant values per group.

  • honest_did (bool, default=False) – Run HonestDiD sensitivity analysis (Rambachan & Roth 2023) on the placebo + event study surface. Requires L_max >= 1. Default: relative magnitudes (DeltaRM, Mbar=1.0), targeting the equal-weight average over all post-treatment horizons (l_vec=None). Results stored on results.honest_did_results; None with a warning if the solver fails. For custom parameters (e.g., targeting the on-impact effect only via l_vec), call compute_honest_did(results, ...) post-hoc instead.

  • heterogeneity (str, optional) – Column name for a time-invariant covariate to test for heterogeneous effects (Web Appendix Section 1.5, Lemma 7). Per-horizon OLS regressions are computed for forward horizons (1..L_max), and ALSO for backward (placebo) horizons (-1..-L_max) when placebo=True is set (post-2026-05-15: per-path placebo predict_het R-parity against did_multiplegt_dyn(by_path, predict_het, placebo)). Joint Wald F-test across rows is NOT computed (per-horizon inference only). Cannot be combined with controls, trends_linear, or trends_nonparam. Requires L_max >= 1. Under by_path / paths_of_interest, per-path heterogeneity coefficients also surface on results.path_heterogeneity_effects and on to_dataframe(level="by_path") via het_* columns (positive AND negative-horizon rows populated when placebo=True). Under survey_design, backward-horizon (placebo) heterogeneity is NOT computed (the pre-period Binder TSL cell allocator is deferred to a follow-up methodology PR); a UserWarning fires at fit-time and forward-horizon heterogeneity continues to compute normally.

  • design2 (bool, default=False) – If True, identify and report switch-in/switch-out (Design-2) groups. Convenience wrapper (descriptive summary, not full paper re-estimation). Requires drop_larger_lower=False to retain 2-switch groups.

  • survey_design (SurveyDesign, optional) – Survey design specification for design-based inference. Supports weight_type='pweight' with two variance paths: (1) Taylor Series Linearization using strata / PSU / FPC (analytical) via the cell-period IF allocator that attributes per-(g, t)-cell mass and aggregates through Binder (1983), and (2) replicate-weight variance using BRR / Fay / JK1 / JKn / SDR methods (analytical, closed-form). Survey weights produce weighted cell means for the point estimate.

    Under a survey design without an explicit psu, fit() auto-injects psu=<group_col> as a safe default (the group is the effective sampling unit). Strata and PSU may vary across cells of a group but must be constant within each (g, t) cell (trivially true in one-obs-per-cell panels; enforced otherwise with ValueError). Three strata configurations arise under the auto-injected psu=<group_col>: (1) strata constant within group: supported (any nest flag works); (2) strata vary within group and nest=True: supported, and the resolver re-labels the synthesized psu uniquely within strata; (3) strata vary within group and nest=False: rejected up front with a targeted ValueError; pass SurveyDesign(..., nest=True) or an explicit psu=<col> with globally-unique labels instead.

    When n_bootstrap > 0 and a survey design is supplied, the multiplier bootstrap operates at the PSU level (Hall-Mammen wild PSU bootstrap); under the default auto-inject this collapses to a group-level clustered bootstrap. Under within-group-varying PSU the bootstrap uses a cell-level wild PSU allocator: a group contributing cells to multiple PSUs receives independent multiplier draws per PSU (see the Survey + bootstrap contract Note in REGISTRY.md).

    Scope note (terminal missingness under any cell-period-allocator path): on panels where a terminally-missing group is in a cohort whose other groups still contribute at the missing period, every survey variance path that uses the cell-period allocator raises a targeted ValueError: Binder TSL with within-group-varying PSU, Rao-Wu replicate-weight ATT (which always uses the cell allocator), and the cell-level wild PSU bootstrap. Cohort-recentering leaks centered IF mass onto cells with no positive-weight obs, which the cell-period allocator cannot allocate to any observation or PSU. Pre-process the panel (drop late-exit groups or trim to a balanced sub-panel), or, for Binder TSL only, use an explicit psu=<group_col> so the analytical path routes through the legacy group-level allocator. Replicate ATT and within-group-varying-PSU bootstrap have no such allocator fallback. Replicate weights with n_bootstrap > 0 raises NotImplementedError (replicate variance is closed-form; bootstrap would double-count variance). See REGISTRY.md ChaisemartinDHaultfoeuille Notes for the full contract.
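The cell-aggregation contract described for data, group, treatment, and survey weights can be sketched as follows. This is a simplified illustration: aggregate_to_cells is a hypothetical helper, rows are plain tuples instead of a DataFrame, and the within-cell treatment check is done by comparing values directly.

```python
from collections import defaultdict


def aggregate_to_cells(rows):
    """Collapse (group, time, treatment, outcome, weight) rows to cells.

    Returns {(group, time): (treatment, weighted_mean_outcome)} and raises
    ValueError when treatment varies within a cell.
    """
    acc = defaultdict(lambda: {"treatments": set(), "wy": 0.0, "w": 0.0})
    for group, time, treatment, outcome, weight in rows:
        cell = acc[(group, time)]
        cell["treatments"].add(treatment)
        cell["wy"] += weight * outcome
        cell["w"] += weight
    cells = {}
    for key, cell in acc.items():
        if len(cell["treatments"]) != 1:
            raise ValueError(f"treatment varies within cell {key}")
        treatment = next(iter(cell["treatments"]))
        cells[key] = (treatment, cell["wy"] / cell["w"])
    return cells


rows = [
    ("a", 1, 0, 1.0, 1.0),
    ("a", 1, 0, 3.0, 3.0),  # same cell, larger sampling weight
    ("a", 2, 1, 5.0, 2.0),
]
cells = aggregate_to_cells(rows)
# cells[("a", 1)] == (0, 2.5): weighted mean (1*1 + 3*3) / 4
```

With all weights equal to one, the same function returns plain cell means.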

Return type:

ChaisemartinDHaultfoeuilleResults

Raises:
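  • ValueError – If required columns are missing, treatment varies within a (group, time) cell, treatment is non-binary while L_max is None (non-binary treatment requires L_max >= 1), or the panel has too few groups / periods.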

  • NotImplementedError – If any forward-compat parameter is set to a non-default value, with a clear pointer to the relevant ROADMAP phase.

ChaisemartinDHaultfoeuilleResults#

Results container for dCDH estimation.

class diff_diff.ChaisemartinDHaultfoeuilleResults[source]

Bases: object

Results from de Chaisemartin-D’Haultfoeuille (dCDH) Phase 1 estimation.

Phase 1 ships the contemporaneous-switch estimator DID_M (= DID_1 at horizon l = 1 of the dynamic companion paper) plus the joiners-only / leavers-only views, the single-lag placebo DID_M^pl, and optionally the TWFE decomposition diagnostic (per-cell weights, fraction negative, sigma_fe).

Notes

The analytical confidence interval is conservative under Assumption 8 (independent groups) of the dynamic companion paper, and exact only under iid sampling. This is documented as a deliberate deviation from “default nominal coverage” in the methodology registry.

For binary treatment in Phase 1, multi-switch groups (i.e., groups that switch treatment more than once) are dropped before estimation when drop_larger_lower=True (the default), matching the R DIDmultiplegtDYN reference. The number of dropped groups is exposed via n_groups_dropped_crossers.

Inference-method switch when bootstrap is enabled. The overall_p_value / overall_conf_int (and joiners/leavers analogues) fields are populated by normal-theory inference from the cohort-recentered analytical SE when n_bootstrap=0 (the default). When n_bootstrap > 0, the same fields are populated by percentile-based bootstrap inference from the multiplier bootstrap distribution computed by _compute_dcdh_bootstrap(). The t-stat (overall_t_stat, etc.) is computed from the SE in both cases, since percentile bootstrap does not define an alternative t-stat semantic. event_study_effects[1], summary(), to_dataframe(), is_significant, and significance_stars all read from these top-level fields and therefore reflect the bootstrap inference automatically. The single-period placebo (L_max=None) still has NaN bootstrap fields; multi-horizon placebos (L_max >= 1) have valid bootstrap SE/CI/p via placebo_horizon_ses/cis/p_values. See the methodology registry Note (bootstrap inference surface) for the full contract and library precedent.
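
To make the two inference modes concrete, the sketch below contrasts a normal-theory interval built from an SE with a percentile interval read off a vector of bootstrap draws. It is a generic illustration using only the standard library; normal_ci and percentile_ci are hypothetical names, the linear-interpolation quantile rule is an assumption, and the library's internal _compute_dcdh_bootstrap() may differ in detail.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def normal_ci(estimate: float, se: float, alpha: float = 0.05) -> Tuple[float, float]:
    """Normal-theory interval: estimate +/- z_{1 - alpha/2} * se."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (estimate - z * se, estimate + z * se)


def percentile_ci(draws: Sequence[float], alpha: float = 0.05) -> Tuple[float, float]:
    """Percentile interval: the alpha/2 and 1 - alpha/2 quantiles of the draws."""
    ordered = sorted(draws)
    n = len(ordered)

    def quantile(p: float) -> float:
        # Linear interpolation between the two nearest order statistics.
        pos = p * (n - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    return (quantile(alpha / 2), quantile(1 - alpha / 2))
```

The t-stat is not affected by this choice: as noted above, it is computed from the SE in both cases.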

overall_att

DID_M = DID_1: the contemporaneous-switch dCDH point estimate.

Type:

float

overall_se

Standard error of DID_M.

Type:

float

overall_t_stat
Type:

float

overall_p_value
Type:

float

overall_conf_int
Type:

tuple of float

joiners_att

DID_+: the joiners-only contribution. NaN when joiners_available is False.

Type:

float

joiners_se
Type:

float

joiners_t_stat
Type:

float

joiners_p_value
Type:

float

joiners_conf_int
Type:

tuple of float

n_joiner_cells

Total number of joiner switching (g, t) cells across all periods. Each cell counted once. Equals sum_t (#{g : D_{g,t-1}=0, D_{g,t}=1}).

Type:

int

n_joiner_obs

Total raw observation count across joiner cells, summing n_gt over the same set of cells. For balanced one-observation-per-cell panels this equals n_joiner_cells; for individual-level inputs with multiple observations per (g, t) it can be larger.

Type:

int
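
The difference between n_joiner_cells and n_joiner_obs can be seen on a toy panel. The sketch below is a standalone illustration, not library code: joiner_cells is a hypothetical helper, treatment paths are stored as lists indexed by period, and n_gt holds the raw observation count of each (g, t) cell.

```python
from typing import Dict, Hashable, List, Tuple


def joiner_cells(treatment: Dict[Hashable, List[int]]) -> List[Tuple[Hashable, int]]:
    """Return the (g, t) cells with D_{g,t-1} = 0 and D_{g,t} = 1."""
    return [
        (g, t)
        for g, path in treatment.items()
        for t in range(1, len(path))
        if path[t - 1] == 0 and path[t] == 1
    ]


# Toy binary panel: "a" joins at t=1, "b" joins at t=2, "c" is a leaver.
treatment = {"a": [0, 1, 1], "b": [0, 0, 1], "c": [1, 1, 0]}
# Raw observations per (g, t) cell (individual-level input).
n_gt = {"a": [3, 2, 2], "b": [1, 1, 4], "c": [1, 1, 1]}

cells = joiner_cells(treatment)
n_joiner_cells = len(cells)                        # each cell counted once
n_joiner_obs = sum(n_gt[g][t] for g, t in cells)   # sums n_gt over the same cells
```

Here the two joiner cells hold 2 and 4 raw observations, so the observation count exceeds the cell count.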

joiners_available

True if at least one joiner switching cell exists.

Type:

bool

leavers_att

DID_-: the leavers-only contribution. NaN when leavers_available is False.

Type:

float

leavers_se
Type:

float

leavers_t_stat
Type:

float

leavers_p_value
Type:

float

leavers_conf_int
Type:

tuple of float

n_leaver_cells

Total number of leaver switching (g, t) cells (mirror of n_joiner_cells).

Type:

int

n_leaver_obs

Total raw observation count across leaver cells (mirror of n_joiner_obs).

Type:

int

leavers_available
Type:

bool

placebo_effect

DID_M^pl: the single-lag placebo. NaN when placebo_available is False.

Type:

float

placebo_se
Type:

float

placebo_t_stat
Type:

float

placebo_p_value
Type:

float

placebo_conf_int
Type:

tuple of float

placebo_available

True when T >= 3 and at least one qualifying placebo cell exists.

Type:

bool

per_period_effects

Per-period decomposition. Keys are period values; each value is a dict with the following keys:

  • "did_plus_t" (float): joiner effect at this period (0.0 if no joiners or A11 violation)

  • "did_minus_t" (float): leaver effect at this period

  • "n_10_t" (int): joiner cell count

  • "n_01_t" (int): leaver cell count

  • "n_00_t" (int): stable-untreated cell count

  • "n_11_t" (int): stable-treated cell count

  • "did_plus_t_a11_zeroed" (bool): True when joiners exist but no stable-untreated controls (Assumption 11 violation, period contributes 0 to numerator with non-zero weight in denominator)

  • "did_minus_t_a11_zeroed" (bool): mirror for leavers

Type:

dict
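
The did_plus_t / did_plus_t_a11_zeroed pair can be illustrated with a simplified per-period calculation. The sketch below assumes one observation per cell and equal weights (the library uses cell sizes and its own bookkeeping); did_plus_t here is a hypothetical standalone function, and each input is a list of first differences Y_{g,t} - Y_{g,t-1}.

```python
from statistics import mean
from typing import Sequence, Tuple


def did_plus_t(
    joiner_changes: Sequence[float],
    stable_untreated_changes: Sequence[float],
) -> Tuple[float, bool]:
    """Return (did_plus_t, a11_zeroed) for one period.

    Simplified, equal-weight sketch: mean outcome change of joiners minus
    mean outcome change of stable-untreated controls.
    """
    if not joiner_changes:
        # No joiners at this period: reported as 0.0, no A11 flag.
        return 0.0, False
    if not stable_untreated_changes:
        # Joiners exist but no stable-untreated controls: A11 violation,
        # the period is zeroed and flagged.
        return 0.0, True
    return mean(joiner_changes) - mean(stable_untreated_changes), False
```

The leaver side (did_minus_t) mirrors this with stable-treated controls.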

twfe_weights

Per-cell TWFE decomposition weights from Theorem 1 of de Chaisemartin & D’Haultfoeuille (2020). Columns: group, time, weight. Computed on the FULL pre-filter cell sample passed by the user (the same input the standalone twowayfeweights() function uses) — NOT the post-filter estimation sample described by overall_att and groups. When fit() drops groups via the ragged-panel or drop_larger_lower filters, results.twfe_* and results.overall_att describe different samples and a UserWarning is emitted; see REGISTRY.md ChaisemartinDHaultfoeuille Note (TWFE diagnostic sample contract) for the rationale. Only populated when twfe_diagnostic=True.

Type:

pd.DataFrame, optional

twfe_fraction_negative

Fraction of treated-cell weights that are negative. > 0 is the diagnostic for the heterogeneous-treatment-effect bias of the plain TWFE estimator on the FULL pre-filter cell sample (NOT the post-filter estimation sample). See the twfe_weights docstring above for the sample contract.

Type:

float, optional
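
As a rough illustration of the diagnostic, the sketch below computes a count-based fraction of negative weights from a list of treated-cell weights. Whether the library counts cells or weights them by cell size is not stated here, so the count-based rule is an assumption, and fraction_negative is a hypothetical helper name.

```python
import math
from typing import Sequence


def fraction_negative(treated_cell_weights: Sequence[float]) -> float:
    """Share of treated-cell weights that are strictly negative (count-based)."""
    if not treated_cell_weights:
        return math.nan
    negatives = sum(1 for w in treated_cell_weights if w < 0)
    return negatives / len(treated_cell_weights)
```

Any value above zero signals that the plain TWFE coefficient can be biased under heterogeneous treatment effects.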

twfe_sigma_fe

Smallest standard deviation of per-cell treatment effects that could flip the sign of the plain TWFE estimator (Corollary 1 of the AER 2020 paper). Computed on the FULL pre-filter cell sample.

Type:

float, optional

twfe_beta_fe

The plain TWFE coefficient computed on the FULL pre-filter cell sample, for comparison with overall_att. Note that the two are computed on different samples when fit() filters drop groups — see the twfe_weights docstring above for the sample contract.

Type:

float, optional

groups

Group identifiers in the post-filter sample.

Type:

list

time_periods

Time periods in the panel.

Type:

list

n_obs

Total observations after filtering.

Type:

int

n_treated_obs

Treated observations in the post-filter sample.

Type:

int

n_switcher_cells

When L_max=None: number of switching (g, t) cells (N_S = sum_t (n_10_t + n_01_t)). When L_max >= 1: number of eligible switcher groups at horizon 1 (N_1). Previously this field always held the cell count; for L_max >= 1 it was repurposed to hold the per-group count that matches the DID_1 estimand. In the L_max=None case, each cell is counted once regardless of how many original observations fed into it: this is the N_S denominator of DID_M per AER 2020 Theorem 3 (cell counts, not within-cell observation counts).

Type:

int

n_cohorts

Distinct cohorts (D_{g,1}, F_g, S_g) after filtering.

Type:

int

n_groups_dropped_crossers

Number of groups dropped because they were multi-switch (matches R’s drop_larger_lower=TRUE behavior). 0 when drop_larger_lower=False or no crossers exist.

Type:

int

n_groups_dropped_singleton_baseline

Number of groups whose baseline D_{g,1} is unique in the post-drop panel (footnote 15 of the dynamic paper). They are excluded from the cohort-recentered VARIANCE computation only — they remain in the point-estimate sample as period-based stable controls (see REGISTRY.md ChaisemartinDHaultfoeuille for the period-vs-cohort deviation that makes this distinction matter).

Type:

int

n_groups_dropped_never_switching

Number of groups with S_g = 0 (never switched). Reported for backwards compatibility only. Per the Round 2 full influence-function fix, never-switching groups are NOT excluded from the variance: they contribute via their stable-control roles in the per-period IF formula. The field name retains “dropped” for API stability but no actual exclusion happens.

Type:

int

alpha

Significance level used for confidence intervals.

Type:

float

event_study_effects

Populated with horizon 1 when L_max=None, or horizons 1..L_max when L_max >= 1. When L_max >= 1, uses the per-group DID_{g,l} path; when L_max=None, uses the per-period DID_M path.

Type:

dict, optional

normalized_effects

Normalized estimator DID^n_l. Populated when L_max >= 1.

Type:

dict, optional

cost_benefit_delta

Cost-benefit aggregate delta. Populated when L_max >= 2.

Type:

dict, optional

sup_t_bands

Sup-t simultaneous confidence-band metadata for the OVERALL event-study surface. Holds {"crit_value": float, "alpha": float, "n_bootstrap": int, "method": str}. Populated when n_bootstrap > 0 AND there are at least 2 valid horizons with finite bootstrap SE > 0 AND a strict majority (more than 50%) of sup-t draws are finite. The band itself is written per-horizon as cband_conf_int on event_study_effects[l]. None otherwise. Python-only library extension; R did_multiplegt_dyn provides no joint / sup-t bands.

Type:

dict, optional
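
The two gates and the critical value described for sup_t_bands can be sketched generically. The code below is a standard sup-t construction, not the library's implementation: sup_t_crit is a hypothetical name, draws is assumed to hold centered bootstrap deviations (one row per draw, one column per horizon), and the conservative order-statistic quantile is an assumption.

```python
import math
from typing import List, Optional, Sequence


def sup_t_crit(
    draws: Sequence[Sequence[float]],
    ses: Sequence[float],
    alpha: float = 0.05,
) -> Optional[float]:
    """Sup-t critical value, or None when either gate fails."""
    # Gate 1: at least two horizons with finite bootstrap SE > 0.
    valid = [j for j, se in enumerate(ses) if math.isfinite(se) and se > 0]
    if len(valid) < 2:
        return None

    # Per draw: the largest absolute studentized deviation across horizons.
    sups: List[float] = []
    for row in draws:
        stats = [abs(row[j]) / ses[j] for j in valid]
        sups.append(max(stats) if all(math.isfinite(s) for s in stats) else math.nan)

    # Gate 2: a strict majority of sup-t draws must be finite.
    finite = sorted(s for s in sups if math.isfinite(s))
    if len(finite) * 2 <= len(sups):
        return None

    # Empirical (1 - alpha) quantile, taken as an order statistic.
    k = min(len(finite) - 1, math.ceil((1 - alpha) * len(finite)) - 1)
    return finite[max(k, 0)]
```

A band at horizon l then takes the form effect_l +/- crit * se_l, which is what cband_conf_int stores per horizon.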

covariate_residuals

DID^X first-stage diagnostics: per-baseline theta_hat, n_obs, and r_squared. Populated when controls is set.

Type:

pd.DataFrame, optional

linear_trends_effects

Cumulated DID^{fd} level effects delta^{fd}_l. Keyed by horizon. Populated when trends_linear=True.

Type:

dict, optional

heterogeneity_effects

Per-horizon heterogeneity test results beta^{het}_l. Populated when heterogeneity is set.

Type:

dict, optional

design2_effects

Design-2 switch-in/switch-out descriptive summary. Populated when design2=True.

Type:

dict, optional

path_effects

Per-path event-study effects keyed by observed treatment trajectory (tuple of int). Populated when by_path is a positive int OR paths_of_interest is a list of int tuples at estimator construction. Each entry holds {"n_groups": int, "frequency_rank": int, "horizons": {l: {"effect", "se", "t_stat", "p_value", "conf_int", "n_obs"}}} for l = 1..L_max. Under paths_of_interest, dict-insertion order matches the user-specified path order; frequency_rank is the within-selected-paths rank by descending observed-group count (decoupled from iteration order).

Type:

dict, optional

path_placebo_event_study

Per-path backward-horizon placebos DID^{pl}_{path, l} for l = 1..L_max, keyed by observed treatment trajectory (tuple of int). Inner dict keys are negative ints (-l for lag l) to mirror the placebo_event_study convention so a unified {**path_effects[p]["horizons"], **path_placebo_event_study[p]} view is well-formed across forward and backward horizons. Each inner entry holds {"effect", "se", "t_stat", "p_value", "conf_int", "n_obs"}. Populated when (by_path is a positive int OR paths_of_interest is set) AND placebo=True AND L_max >= 1. Empty-state contract mirrors path_effects: None when by_path / paths_of_interest + placebo was not requested; {} when requested but no observed path has a complete window [F_g-1, F_g-1+L_max] within the panel (the same regime where path_effects returns {}, with the same UserWarning at fit-time). Downstream callers should distinguish the two states. Inherits the cross-path cohort-sharing SE deviation from R documented for path_effects. See REGISTRY.md Note (Phase 3 by_path ...) → “Per-path placebos”.

Type:

dict, optional
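
The unified forward/backward view and the None versus {} distinction can be sketched with plain dicts. The values below are placeholders shaped like the documented schema (only "effect" is shown), not output from a real fit; unified_horizons and placebo_state are hypothetical helper names.

```python
from typing import Any, Dict, Optional, Tuple

Path = Tuple[int, ...]


def unified_horizons(
    path_effects: Dict[Path, Dict[str, Any]],
    path_placebo_event_study: Optional[Dict[Path, Dict[int, Dict[str, Any]]]],
    path: Path,
) -> Dict[int, Dict[str, Any]]:
    """Merge forward horizons (positive keys) with placebos (negative keys)."""
    forward = path_effects[path]["horizons"]
    backward = (path_placebo_event_study or {}).get(path, {})
    return dict(sorted({**forward, **backward}.items()))


def placebo_state(path_placebo_event_study: Optional[dict]) -> str:
    """Distinguish the two empty states from a populated result."""
    if path_placebo_event_study is None:
        return "not requested"
    if not path_placebo_event_study:
        return "requested, but no path has a complete window"
    return "populated"


# Placeholder inputs for one path (illustrative numbers only).
p = (0, 1, 1)
effects = {p: {"n_groups": 10, "frequency_rank": 1,
               "horizons": {1: {"effect": 0.4}, 2: {"effect": 0.6}}}}
placebos = {p: {-1: {"effect": 0.01}}}

merged = unified_horizons(effects, placebos, p)
```

Because placebo keys are negative and forward keys are positive, the merge never overwrites an entry.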

path_heterogeneity_effects

Per-path heterogeneity test results (Web Appendix Section 1.5, Lemma 7) when heterogeneity is set AND (by_path=k or paths_of_interest=[(...), ...]) is set. Inner dict keyed by horizon directly (no "horizons" wrapper); each entry holds {"beta", "se", "t_stat", "p_value", "conf_int", "n_obs"}, where beta is the heterogeneity coefficient on the path-restricted switcher subsample (plain OLS on the non-survey path, WLS-on-pweights under survey_design). Cohort dummies in the design matrix absorb baseline by construction. Empty-state contract mirrors path_effects: None when not requested; {} when requested but no path has eligible switchers. Mirrors R did_multiplegt_dyn(..., by_path, predict_het) per-by_level dispatch. See REGISTRY.md Note (Phase 3 by_path ...) → “Per-path heterogeneity testing”.

Type:

dict, optional

path_cumulated_event_study

Per-path cumulated level effects delta_{path, l} = sum_{l'=1..l} DID^{fd}_{path, l'} for l = 1..L_max, keyed by observed treatment trajectory (tuple of int). Inner dict is keyed by horizon directly (no "horizons" wrapper); each entry holds {"effect", "se", "t_stat", "p_value", "conf_int", "n_obs"}. Populated when (by_path is a positive int OR paths_of_interest is set) AND trends_linear=True AND L_max >= 1; None otherwise. Mirrors the global linear_trends_effects cumulation: SE on the cumulated layer is the conservative upper bound (sum of per-horizon component SEs from path_effects[path]["horizons"][l]["se"], NaN-consistent). Built AFTER bootstrap propagation so the cumulated SE / t / p / CI are derived from the FINAL post-bootstrap per-horizon SEs when n_bootstrap > 0. Surfaced as cumulated_effect / cumulated_se columns on to_dataframe(level="by_path") (always-present, NaN-when-None) and as a per-path “Cumulated Level Effects” sub-section in summary(). See REGISTRY.md Note (Phase 3 by_path ...) → “Per-path linear-trends DID^{fd}”.

Type:

dict, optional
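
The cumulation rule above (running sum of per-horizon effects, with the sum of per-horizon SEs as a conservative upper bound) can be sketched as follows. cumulate is a hypothetical helper; NaN handling here simply relies on floating-point propagation, which is one reading of "NaN-consistent" and may not match the library exactly.

```python
from typing import Dict, Tuple


def cumulate(
    effects: Dict[int, float],
    ses: Dict[int, float],
) -> Dict[int, Tuple[float, float]]:
    """Return {l: (cumulated_effect, cumulated_se_upper_bound)} for each horizon."""
    out: Dict[int, Tuple[float, float]] = {}
    running_effect = 0.0
    running_se = 0.0
    for horizon in sorted(effects):
        running_effect += effects[horizon]   # delta_l = sum of DID^{fd}_{l'} for l' <= l
        running_se += ses[horizon]           # upper bound: SEs add, not variances
        out[horizon] = (running_effect, running_se)
    return out
```

Summing SEs rather than variances ignores any cross-horizon covariance, which is why the result is an upper bound and not an exact SE.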

path_sup_t_bands

Per-path joint sup-t simultaneous-band metadata, keyed by observed treatment trajectory (tuple of int). Each entry holds {"crit_value": float, "alpha": float, "n_bootstrap": int, "method": str, "n_valid_horizons": int}. Populated when (by_path is a positive int OR paths_of_interest is set) AND n_bootstrap > 0. The band itself is applied per-horizon as cband_conf_int on path_effects[path]["horizons"][l] and rendered as cband_lower / cband_upper columns on to_dataframe(level="by_path"). Empty-state contract: None when not requested (no bootstrap, or both by_path and paths_of_interest are None); {} when requested but no path passed both gates (>=2 valid horizons with finite bootstrap SE > 0 AND a strict majority — more than 50% — of finite sup-t draws). Bands cover joint inference WITHIN a single path across horizons; they do NOT provide simultaneous coverage across paths. Inherits the cross-path cohort-sharing SE deviation from R documented for path_effects (the bootstrap SE used as the t-stat denominator carries the same deviation). Python-only library extension; R did_multiplegt_dyn provides no joint / sup-t bands at any surface. See REGISTRY.md Note (Phase 3 by_path per-path joint sup-t bands).

Type:

dict, optional

honest_did_results

HonestDiD sensitivity analysis bounds (Rambachan & Roth 2023). Populated when honest_did=True in fit() or by calling compute_honest_did(results) post-hoc. Contains identified set bounds, robust confidence intervals, and breakdown analysis.

Type:

HonestDiDResults, optional

survey_metadata

Populated when fit(..., survey_design=sd) is called; None otherwise. Carries the resolved survey design summary (weight_type, strata/PSU counts, df_survey, weight range, and replicate-method info when applicable). df_survey is threaded into survey-aware inference (t-distribution at all analytical surfaces) and consumed by compute_honest_did() to produce survey-aware critical values.

Type:

Any, optional

bootstrap_results

Bootstrap inference results when n_bootstrap > 0.

Type:

DCDHBootstrapResults, optional

Methods

summary([alpha])

Generate a formatted summary of dCDH estimation results.

print_summary([alpha])

Print the formatted summary to stdout.

to_dataframe([level])

Convert results to a DataFrame at the requested level of aggregation.

overall_att: float
overall_se: float
overall_t_stat: float
overall_p_value: float
overall_conf_int: Tuple[float, float]
joiners_att: float
joiners_se: float
joiners_t_stat: float
joiners_p_value: float
joiners_conf_int: Tuple[float, float]
n_joiner_cells: int
n_joiner_obs: int
joiners_available: bool
leavers_att: float
leavers_se: float
leavers_t_stat: float
leavers_p_value: float
leavers_conf_int: Tuple[float, float]
n_leaver_cells: int
n_leaver_obs: int
leavers_available: bool
placebo_effect: float
placebo_se: float
placebo_t_stat: float
placebo_p_value: float
placebo_conf_int: Tuple[float, float]
placebo_available: bool
per_period_effects: Dict[Any, Dict[str, Any]]
groups: List[Any]
time_periods: List[Any]
n_obs: int
n_treated_obs: int
n_switcher_cells: int
n_cohorts: int
n_groups_dropped_crossers: int
n_groups_dropped_singleton_baseline: int
n_groups_dropped_never_switching: int
event_study_effects: Dict[int, Dict[str, Any]] | None = None
L_max: int | None = None
placebo_event_study: Dict[int, Dict[str, Any]] | None = None
twfe_weights: pd.DataFrame | None = None
twfe_fraction_negative: float | None = None
twfe_sigma_fe: float | None = None
twfe_beta_fe: float | None = None
alpha: float = 0.05
normalized_effects: Dict[int, Dict[str, Any]] | None = None
cost_benefit_delta: Dict[str, Any] | None = None
sup_t_bands: Dict[str, Any] | None = None
covariate_residuals: pd.DataFrame | None = None
linear_trends_effects: Dict[int, Dict[str, Any]] | None = None
trends_linear: bool | None = None
heterogeneity_effects: Dict[int, Dict[str, Any]] | None = None
design2_effects: Dict[str, Any] | None = None
path_effects: Dict[Tuple[int, ...], Dict[str, Any]] | None = None
path_placebo_event_study: Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]] | None = None
path_heterogeneity_effects: Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]] | None = None
path_cumulated_event_study: Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]] | None = None
path_sup_t_bands: Dict[Tuple[int, ...], Dict[str, Any]] | None = None
honest_did_results: 'HonestDiDResults' | None = None
survey_metadata: Any | None = None
bootstrap_results: DCDHBootstrapResults | None = None
property att: float
property se: float
property conf_int: Tuple[float, float]
property p_value: float
property t_stat: float
__repr__()[source]

Concise string representation.

Return type:

str

property coef_var: float

SE / abs(DID_M); NaN when DID_M is 0 or SE non-finite.
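
The rule for coef_var can be written as a short standalone function. This is an illustrative sketch of the documented behaviour, not the property's source; the free function name coef_var and its arguments are hypothetical.

```python
import math


def coef_var(att: float, se: float) -> float:
    """SE / abs(ATT); NaN when ATT is 0 or SE is non-finite."""
    if att == 0 or not math.isfinite(se):
        return math.nan
    return se / abs(att)
```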

property is_significant: bool

True iff overall DID_M p-value is below alpha.

property significance_stars: str

Significance stars for the overall DID_M.

summary(alpha=None)[source]

Generate a formatted summary of dCDH estimation results.

Parameters:

alpha (float, optional) – Significance level for the confidence interval header. Defaults to self.alpha.

Returns:

Formatted multi-block summary including overall DID_M, joiners-only / leavers-only views, the placebo, the TWFE decomposition diagnostic, and a footer of significance codes.

Return type:

str

print_summary(alpha=None)[source]

Print the formatted summary to stdout.

Parameters:

alpha (float | None)

Return type:

None

to_dataframe(level='overall')[source]

Convert results to a DataFrame at the requested level of aggregation.

Parameters:

level (str, default="overall") –

One of:

  • "overall": single-row table with the overall estimand (DID_M when L_max=None, DID_1 when L_max=1, delta when L_max >= 2).

  • "joiners_leavers": up to three rows for the overall, DID_+, and DID_- (binary panels only).

  • "per_period": one row per time period with did_plus_t, did_minus_t, switching cell counts, and the A11-zeroed flags.

  • "event_study": one row per horizon (positive and negative/placebo), including a reference period at horizon 0. Available when L_max >= 1.

  • "normalized": one row per horizon for the normalized effects DID^n_l. Available when L_max >= 1.

  • "twfe_weights": per-(group, time) TWFE decomposition weights table. Only available when twfe_diagnostic=True was passed to fit().

  • "heterogeneity": one row per horizon for the heterogeneity test beta^{het}_l. Available when heterogeneity is passed to fit().

  • "linear_trends": one row per horizon for the cumulated trend-adjusted level effects delta^{fd}_l. Available when trends_linear=True.

  • "design2": Design-2 switch-in/switch-out descriptive summary. Available when design2=True.

  • "by_path": one row per (path, horizon) when either by_path=k or paths_of_interest=[(...), ...] was passed to the estimator. Columns: path, frequency_rank, n_groups, horizon, effect, se, t_stat, p_value, conf_int_lower, conf_int_upper, n_obs, cband_lower, cband_upper, cumulated_effect, cumulated_se, het_beta, het_se, het_t_stat, het_p_value, het_conf_int_lower, het_conf_int_upper. The horizon column takes negative ints for placebo rows when placebo=True. The cband_* columns mirror the OVERALL level="event_study" schema (joint sup-t simultaneous bands); they are populated for positive-horizon rows of paths with a finite per-path sup-t crit (n_bootstrap > 0) and NaN otherwise (placebo rows, unbanded paths, or the requested-but-empty fallback DataFrame). The cumulated_* columns mirror the global linear_trends_effects cumulation; populated for positive-horizon rows when trends_linear=True is also set, NaN for placebo rows or non-trends_linear fits (always-present, NaN-when-None — same convention as cband_*). The het_* columns surface the per-path heterogeneity coefficient (Web Appendix Section 1.5, Lemma 7) when heterogeneity="<col>" is also set. Populated for positive-horizon (forward) rows whenever heterogeneity is requested, AND for negative-horizon (placebo) rows when placebo=True is also set (post-2026-05-15: per-path placebo predict_het R-parity against did_multiplegt_dyn(by_path, predict_het, placebo)). NaN for non-heterogeneity fits / the requested-but-empty fallback DataFrame, AND for placebo rows under survey_design (forward-only fallback — backward-horizon survey predict_het is deferred until the pre-period cell allocator is derived; a UserWarning fires at fit-time when survey_design + placebo + heterogeneity are co-set). Always-present, NaN-when-None — same convention as cband_* and cumulated_*.

Return type:

pd.DataFrame

__init__(overall_att, overall_se, overall_t_stat, overall_p_value, overall_conf_int, joiners_att, joiners_se, joiners_t_stat, joiners_p_value, joiners_conf_int, n_joiner_cells, n_joiner_obs, joiners_available, leavers_att, leavers_se, leavers_t_stat, leavers_p_value, leavers_conf_int, n_leaver_cells, n_leaver_obs, leavers_available, placebo_effect, placebo_se, placebo_t_stat, placebo_p_value, placebo_conf_int, placebo_available, per_period_effects, groups, time_periods, n_obs, n_treated_obs, n_switcher_cells, n_cohorts, n_groups_dropped_crossers, n_groups_dropped_singleton_baseline, n_groups_dropped_never_switching, event_study_effects=None, L_max=None, placebo_event_study=None, twfe_weights=None, twfe_fraction_negative=None, twfe_sigma_fe=None, twfe_beta_fe=None, alpha=0.05, normalized_effects=None, cost_benefit_delta=None, sup_t_bands=None, covariate_residuals=None, linear_trends_effects=None, trends_linear=None, heterogeneity_effects=None, design2_effects=None, path_effects=None, path_placebo_event_study=None, path_heterogeneity_effects=None, path_cumulated_event_study=None, path_sup_t_bands=None, honest_did_results=None, survey_metadata=None, bootstrap_results=None, _estimator_ref=None)
Parameters:
  • overall_att (float)

  • overall_se (float)

  • overall_t_stat (float)

  • overall_p_value (float)

  • overall_conf_int (Tuple[float, float])

  • joiners_att (float)

  • joiners_se (float)

  • joiners_t_stat (float)

  • joiners_p_value (float)

  • joiners_conf_int (Tuple[float, float])

  • n_joiner_cells (int)

  • n_joiner_obs (int)

  • joiners_available (bool)

  • leavers_att (float)

  • leavers_se (float)

  • leavers_t_stat (float)

  • leavers_p_value (float)

  • leavers_conf_int (Tuple[float, float])

  • n_leaver_cells (int)

  • n_leaver_obs (int)

  • leavers_available (bool)

  • placebo_effect (float)

  • placebo_se (float)

  • placebo_t_stat (float)

  • placebo_p_value (float)

  • placebo_conf_int (Tuple[float, float])

  • placebo_available (bool)

  • per_period_effects (Dict[Any, Dict[str, Any]])

  • groups (List[Any])

  • time_periods (List[Any])

  • n_obs (int)

  • n_treated_obs (int)

  • n_switcher_cells (int)

  • n_cohorts (int)

  • n_groups_dropped_crossers (int)

  • n_groups_dropped_singleton_baseline (int)

  • n_groups_dropped_never_switching (int)

  • event_study_effects (Optional[Dict[int, Dict[str, Any]]])

  • L_max (Optional[int])

  • placebo_event_study (Optional[Dict[int, Dict[str, Any]]])

  • twfe_weights (Optional[pd.DataFrame])

  • twfe_fraction_negative (Optional[float])

  • twfe_sigma_fe (Optional[float])

  • twfe_beta_fe (Optional[float])

  • alpha (float)

  • normalized_effects (Optional[Dict[int, Dict[str, Any]]])

  • cost_benefit_delta (Optional[Dict[str, Any]])

  • sup_t_bands (Optional[Dict[str, Any]])

  • covariate_residuals (Optional[pd.DataFrame])

  • linear_trends_effects (Optional[Dict[int, Dict[str, Any]]])

  • trends_linear (Optional[bool])

  • heterogeneity_effects (Optional[Dict[int, Dict[str, Any]]])

  • design2_effects (Optional[Dict[str, Any]])

  • path_effects (Optional[Dict[Tuple[int, ...], Dict[str, Any]]])

  • path_placebo_event_study (Optional[Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]]])

  • path_heterogeneity_effects (Optional[Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]]])

  • path_cumulated_event_study (Optional[Dict[Tuple[int, ...], Dict[int, Dict[str, Any]]]])

  • path_sup_t_bands (Optional[Dict[Tuple[int, ...], Dict[str, Any]]])

  • honest_did_results (Optional['HonestDiDResults'])

  • survey_metadata (Optional[Any])

  • bootstrap_results (Optional[DCDHBootstrapResults])

  • _estimator_ref (Optional[Any])

Return type:

None

DCDHBootstrapResults#

Multiplier-bootstrap inference results, populated when n_bootstrap > 0.

class diff_diff.DCDHBootstrapResults[source]

Bases: object

Results from ChaisemartinDHaultfoeuille (dCDH) multiplier bootstrap inference.

The bootstrap is a library extension beyond the dCDH papers, which propose only the analytical cohort-recentered plug-in variance from Web Appendix Section 3.7.3 of the dynamic companion paper. Provided for consistency with CallawaySantAnna / ImputationDiD / TwoStageDiD.

Per-target SE / CI / p-value are populated for the three scalar dCDH estimands implemented in Phase 1: overall (DID_M), joiners (DID_+), and leavers (DID_-). When a target is not available in the underlying data (e.g., no leavers), the matching fields are None.

Phase 1 per-period placebo (L_max=None) bootstrap is NOT computed. The dynamic companion paper Section 3.7.3 derives the cohort-recentered analytical variance for DID_l only, not for the per-period DID_M^pl. The placebo_se / placebo_ci / placebo_p_value fields below remain None for Phase 1. Multi-horizon placebos (L_max >= 1) have valid SE via placebo_horizon_ses - this is a library extension applying the same IF/variance structure to the placebo estimand (see REGISTRY.md dynamic placebo SE Note).

n_bootstrap

Number of bootstrap iterations.

Type:

int

weight_type

Type of bootstrap weights: "rademacher", "mammen", or "webb".

Type:

str

alpha

Significance level used for confidence intervals.

Type:

float

overall_se

Bootstrap standard error for DID_M.

Type:

float

overall_ci

Bootstrap confidence interval for DID_M.

Type:

tuple of float

overall_p_value

Bootstrap p-value for DID_M.

Type:

float

joiners_se

Bootstrap SE for joiners-only DID_+ (None if no joiners).

Type:

float, optional

joiners_ci

Bootstrap CI for joiners-only DID_+.

Type:

tuple of float, optional

joiners_p_value

Bootstrap p-value for joiners-only DID_+.

Type:

float, optional

leavers_se

Bootstrap SE for leavers-only DID_- (None if no leavers).

Type:

float, optional

leavers_ci

Bootstrap CI for leavers-only DID_-.

Type:

tuple of float, optional

leavers_p_value

Bootstrap p-value for leavers-only DID_-.

Type:

float, optional

placebo_se

None for the Phase 1 single-period placebo (L_max=None). Multi-horizon placebo bootstrap SE is on placebo_horizon_ses.

Type:

float, optional

placebo_ci

None for single-period placebo. See placebo_horizon_cis.

Type:

tuple of float, optional

placebo_p_value

None for single-period placebo. See placebo_horizon_p_values.

Type:

float, optional

bootstrap_distribution

Full bootstrap distribution of the overall DID_M estimator (shape: (n_bootstrap,)). Stored for advanced diagnostics; suppressed from __repr__.

Type:

np.ndarray, optional

n_bootstrap: int
weight_type: str
alpha: float
overall_se: float
overall_ci: Tuple[float, float]
overall_p_value: float
joiners_se: float | None = None
joiners_ci: Tuple[float, float] | None = None
joiners_p_value: float | None = None
leavers_se: float | None = None
leavers_ci: Tuple[float, float] | None = None
leavers_p_value: float | None = None
placebo_se: float | None = None
placebo_ci: Tuple[float, float] | None = None
placebo_p_value: float | None = None
bootstrap_distribution: ndarray | None = None
event_study_ses: Dict[int, float] | None = None
event_study_cis: Dict[int, Tuple[float, float]] | None = None
event_study_p_values: Dict[int, float] | None = None
placebo_horizon_ses: Dict[int, float] | None = None
placebo_horizon_cis: Dict[int, Tuple[float, float]] | None = None
placebo_horizon_p_values: Dict[int, float] | None = None
cband_crit_value: float | None = None
path_ses: Dict[Tuple[int, ...], Dict[int, float]] | None = None
path_cis: Dict[Tuple[int, ...], Dict[int, Tuple[float, float]]] | None = None
path_p_values: Dict[Tuple[int, ...], Dict[int, float]] | None = None
path_placebo_ses: Dict[Tuple[int, ...], Dict[int, float]] | None = None
path_placebo_cis: Dict[Tuple[int, ...], Dict[int, Tuple[float, float]]] | None = None
path_placebo_p_values: Dict[Tuple[int, ...], Dict[int, float]] | None = None
path_cband_crit_values: Dict[Tuple[int, ...], float] | None = None
path_cband_n_valid_horizons: Dict[Tuple[int, ...], int] | None = None
__init__(n_bootstrap, weight_type, alpha, overall_se, overall_ci, overall_p_value, joiners_se=None, joiners_ci=None, joiners_p_value=None, leavers_se=None, leavers_ci=None, leavers_p_value=None, placebo_se=None, placebo_ci=None, placebo_p_value=None, bootstrap_distribution=None, event_study_ses=None, event_study_cis=None, event_study_p_values=None, placebo_horizon_ses=None, placebo_horizon_cis=None, placebo_horizon_p_values=None, cband_crit_value=None, path_ses=None, path_cis=None, path_p_values=None, path_placebo_ses=None, path_placebo_cis=None, path_placebo_p_values=None, path_cband_crit_values=None, path_cband_n_valid_horizons=None)
Return type:

None

Convenience Function#

diff_diff.chaisemartin_dhaultfoeuille(data, outcome, group, time, treatment, **kwargs)[source]#

One-shot convenience wrapper around ChaisemartinDHaultfoeuille.

Equivalent to:

ChaisemartinDHaultfoeuille(**init_kwargs).fit(
    data, outcome=..., group=..., time=..., treatment=...,
    **fit_kwargs,
)

All keyword arguments are split between __init__ and fit based on which signature accepts them. Useful for one-line use in scripts.

Parameters:
  • data (pd.DataFrame)

  • outcome (str)

  • group (str)

  • time (str)

  • treatment (str)

  • **kwargs (Any) – Forwarded to ChaisemartinDHaultfoeuille.__init__ or .fit() based on parameter name.

Return type:

ChaisemartinDHaultfoeuilleResults
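
Splitting keyword arguments by signature, as the wrapper is described as doing, can be sketched with inspect.signature. ToyEstimator and split_kwargs below are hypothetical stand-ins used only to show the mechanism; the real wrapper's handling of unknown or overlapping names is not documented here and may differ.

```python
import inspect
from typing import Any, Dict, Tuple


class ToyEstimator:
    """Stand-in class with separate constructor and fit() options."""

    def __init__(self, alpha: float = 0.05, n_bootstrap: int = 0):
        self.alpha = alpha
        self.n_bootstrap = n_bootstrap

    def fit(self, data, outcome, group, time, treatment, L_max=None):
        return {"L_max": L_max, "n_bootstrap": self.n_bootstrap}


def split_kwargs(cls: type, kwargs: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Route each keyword to __init__ or fit based on which signature accepts it."""
    init_names = set(inspect.signature(cls.__init__).parameters) - {"self"}
    fit_names = set(inspect.signature(cls.fit).parameters) - {"self"}
    init_kwargs: Dict[str, Any] = {}
    fit_kwargs: Dict[str, Any] = {}
    for name, value in kwargs.items():
        if name in init_names:
            init_kwargs[name] = value
        elif name in fit_names:
            fit_kwargs[name] = value
        else:
            raise TypeError(f"unexpected keyword argument: {name!r}")
    return init_kwargs, fit_kwargs


def toy_wrapper(data, outcome, group, time, treatment, **kwargs):
    """One-shot call: construct, then fit, with kwargs split between the two."""
    init_kwargs, fit_kwargs = split_kwargs(ToyEstimator, kwargs)
    return ToyEstimator(**init_kwargs).fit(
        data, outcome=outcome, group=group, time=time, treatment=treatment,
        **fit_kwargs,
    )
```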

Standalone TWFE Decomposition Diagnostic#

The TWFE decomposition diagnostic from Theorem 1 of de Chaisemartin & D’Haultfœuille (2020) is also available as a standalone function for users who want the diagnostic without fitting the full estimator. It returns per-cell weights, the fraction of treated cells with negative weights, and sigma_fe — the smallest standard deviation of per-cell treatment effects that could flip the sign of the plain TWFE coefficient.

diff_diff.twowayfeweights(data, outcome, group, time, treatment, rank_deficient_action='warn', survey_design=None)[source]#

Standalone TWFE decomposition diagnostic.

Computes the per-cell weights, fraction negative, and sigma_fe from Theorem 1 of de Chaisemartin & D’Haultfœuille (2020), without fitting the full dCDH estimator. Mirrors the standalone Stata twowayfeweights package.

Parameters:
  • data (pd.DataFrame) – Individual-level panel.

  • outcome (str)

  • group (str)

  • time (str)

  • treatment (str)

  • rank_deficient_action (str, default="warn") – Action when the FE design matrix is rank-deficient.

  • survey_design (SurveyDesign, optional) – If provided, cells are aggregated with survey-weighted cell means, matching fit(..., survey_design=sd).twfe_*. Pass the same design used in fit to keep the helper's output in parity with the fitted results. Only weight_type='pweight' is supported; other types raise ValueError. Replicate-weight designs (BRR/Fay/JK1/JKn/SDR) are accepted: TWFEWeightsResult has no SE field, so replicate weights affect only the cell aggregation path, and the aggregated numbers are identical to fit(..., survey_design=sd).twfe_* on the same input.

Returns:

Object with attributes weights (DataFrame), fraction_negative (float), sigma_fe (float), and beta_fe (float).

Return type:

TWFEWeightsResult
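The survey-weighted cell aggregation mentioned under survey_design can be pictured with plain pandas. The sketch below collapses individual-level rows to (group, time) cells using probability-weighted means. It illustrates the general technique only: weighted_cell_means and the column names are hypothetical, and the library's handling of strata, PSUs, and FPC is not reproduced here.

```python
import pandas as pd


def weighted_cell_means(df, outcome, treatment, group, time, weight):
    """Collapse individual rows to (group, time) cells using weighted means.

    Hypothetical helper for illustration; not part of diff-diff.
    """
    tmp = df.assign(
        _wy=df[outcome] * df[weight],
        _wd=df[treatment] * df[weight],
    )
    cells = tmp.groupby([group, time], as_index=False)[["_wy", "_wd", weight]].sum()
    # Weighted mean = sum(w * x) / sum(w) within each cell.
    cells[outcome] = cells["_wy"] / cells[weight]
    cells[treatment] = cells["_wd"] / cells[weight]
    return cells[[group, time, outcome, treatment, weight]]


# Two individuals in the same cell with unequal sampling weights.
people = pd.DataFrame(
    {
        "g": [1, 1, 2],
        "t": [1, 1, 1],
        "y": [1.0, 3.0, 5.0],
        "d": [1, 1, 0],
        "w": [1.0, 3.0, 2.0],
    }
)
cells = weighted_cell_means(people, "y", "d", "g", "t", "w")
```

For the first cell the weighted mean of y is (1·1 + 3·3) / 4 = 2.5, while the unweighted mean would be 2.0; this is the kind of difference that makes passing the same design to both fit and the helper matter for parity.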

class diff_diff.TWFEWeightsResult[source]

Bases: object

Lightweight container for the standalone twowayfeweights helper.

Returned by twowayfeweights(). Holds the same per-cell decomposition information that the dCDH estimator stores on its results object when twfe_diagnostic=True, for users who want only the diagnostic without fitting the full estimator.

__init__(weights, fraction_negative, sigma_fe, beta_fe)[source]
Return type:

None

weights: DataFrame
fraction_negative: float
sigma_fe: float
beta_fe: float

Example Usage#

Basic usage with reversible treatment:

from diff_diff import ChaisemartinDHaultfoeuille
from diff_diff.prep import generate_reversible_did_data

data = generate_reversible_did_data(
    n_groups=80, n_periods=6, pattern="single_switch", seed=42,
)

est = ChaisemartinDHaultfoeuille()
results = est.fit(
    data,
    outcome="outcome",
    group="group",
    time="period",
    treatment="treatment",
)
results.print_summary()

Joiners and leavers views:

print(f"DID_M (overall):  {results.overall_att:.3f}")
print(f"DID_+ (joiners):  {results.joiners_att:.3f}")
print(f"DID_- (leavers):  {results.leavers_att:.3f}")
print(f"Placebo (DID^pl): {results.placebo_effect:.3f}")

Per-period decomposition:

for t, cell in results.per_period_effects.items():
    print(
        f"t={t}: DID+={cell['did_plus_t']:.3f} "
        f"({cell['n_10_t']} joiners, {cell['n_00_t']} stable_0 controls)"
    )

Multiplier bootstrap inference:

est = ChaisemartinDHaultfoeuille(
    n_bootstrap=999, bootstrap_weights="rademacher", seed=42,
)
results = est.fit(
    data, outcome="outcome", group="group",
    time="period", treatment="treatment",
)
# When n_bootstrap > 0, the top-level overall_*/joiners_*/leavers_*
# p-value and conf_int fields hold percentile-based bootstrap
# inference (not normal-theory recomputations from the bootstrap SE).
# The t-stat is computed from the SE in both cases. See REGISTRY.md
# `Note (bootstrap inference surface)` for the full contract.
print(f"Top-level p-value (bootstrap): {results.overall_p_value:.4f}")
print(f"Top-level CI (bootstrap):     {results.overall_conf_int}")
print(f"bootstrap_results.overall_se: {results.bootstrap_results.overall_se:.3f}")
print(f"bootstrap_results.overall_ci: {results.bootstrap_results.overall_ci}")
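The distinction drawn in the comments above, percentile-based inference versus a normal-theory interval rebuilt from the bootstrap SE, can be sketched on a synthetic distribution. This is an illustration of the two generic constructions, not the library's code: percentile_ci, normal_ci, and the synthetic draws are assumptions made for this example, and the exact quantile and p-value conventions the estimator uses are documented in REGISTRY.md rather than here.

```python
from statistics import NormalDist

import numpy as np


def percentile_ci(draws, alpha=0.05):
    """Interval read directly off the empirical quantiles of the draws."""
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def normal_ci(estimate, draws, alpha=0.05):
    """Symmetric interval: estimate +/- z * (standard deviation of the draws)."""
    se = float(np.std(draws, ddof=1))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * se, estimate + z * se


rng = np.random.default_rng(42)
estimate = 1.0
# Synthetic, right-skewed stand-in for a bootstrap distribution of the estimate.
draws = estimate + rng.gamma(shape=2.0, scale=0.2, size=999) - 0.4

pct_lo, pct_hi = percentile_ci(draws)
nrm_lo, nrm_hi = normal_ci(estimate, draws)
```

The normal-theory interval is symmetric around the point estimate by construction, while the percentile interval follows the shape of the draws and is generally asymmetric when the distribution is skewed. The two can therefore disagree even though they share the same bootstrap SE.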

Standalone TWFE diagnostic (without fitting the full estimator):

from diff_diff import twowayfeweights

diagnostic = twowayfeweights(
    data, outcome="outcome", group="group", time="period", treatment="treatment",
)
print(f"Plain TWFE coefficient: {diagnostic.beta_fe:.3f}")
print(f"Fraction of negative weights: {diagnostic.fraction_negative:.3f}")
print(f"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}")

The DCDH alias:

from diff_diff import DCDH

est = DCDH()  # equivalent to ChaisemartinDHaultfoeuille()