de Chaisemartin-D'Haultfœuille (dCDH) DiD
============================================
The only modern staggered DiD estimator in diff-diff that handles
**non-absorbing (reversible) treatments** — treatment may switch on AND
off over time.
This module implements the methodology from de Chaisemartin & D'Haultfœuille
(2020, 2022). The estimator ships:

- the contemporaneous-switch path ``DID_M`` (= ``DID_1`` at horizon
  ``l = 1``);
- the full multi-horizon event study ``DID_l`` for ``l = 1..L_max`` via the
  ``L_max`` parameter, with the normalized estimator ``DID^n_l``, the
  cost-benefit aggregate ``delta``, dynamic placebos ``DID^{pl}_l``, and
  sup-t simultaneous confidence bands;
- residualization-style covariate adjustment (``controls``);
- group-specific linear trends (``trends_linear``);
- state-set-specific trends (``trends_nonparam``);
- heterogeneity testing;
- non-binary treatment;
- HonestDiD sensitivity integration on placebos;
- survey support via Taylor-series linearization (pweight +
  strata/PSU/FPC);
- per-path event-study disaggregation via ``by_path=k`` or, for an explicit
  user-specified path subset, via ``paths_of_interest=[(...), ...]``
  (Python-only API; mutually exclusive with ``by_path``).

``by_path=k`` mirrors R ``did_multiplegt_dyn(..., by_path=k)``, including
per-path backward placebos and, when ``n_bootstrap > 0``, per-path joint
sup-t simultaneous bands. The joint bands are a Python-only extension beyond
R, which provides no joint bands at any surface. ``by_path`` supports binary
or integer-coded discrete (D in Z) treatment, and composes with
``survey_design`` for analytical Binder TSL SE and replicate-weight bootstrap
variance. The multiplier bootstrap under survey + ``by_path`` remains gated,
and there is no R parity because R ``did_multiplegt_dyn`` does not support
survey weighting.

``by_path`` and ``paths_of_interest`` also compose with the
``heterogeneity`` argument: the per-path heterogeneity coefficient surfaces
on ``results.path_heterogeneity_effects`` (mirrors R
``did_multiplegt_dyn(..., by_path, predict_het)`` per-by_level).
The estimator:
1. Aggregates individual-level panel data to ``(group, time)`` cells
2. Drops multi-switch groups by default (matches R ``DIDmultiplegtDYN``)
3. Excludes singleton-baseline groups from the variance computation only (footnote 15 of the dynamic paper)
4. Computes per-period joiner (``DID_{+,t}``) and leaver (``DID_{-,t}``)
contributions via Theorem 3 of the AER 2020 paper
5. Aggregates them into ``DID_M``, the joiners-only ``DID_+``, and the
leavers-only ``DID_-``
6. Computes the single-lag placebo ``DID_M^pl``
7. When ``L_max >= 2``: computes per-group ``DID_{g,l}`` building blocks,
multi-horizon ``DID_l``, dynamic placebos ``DID^{pl}_l``, normalized
``DID^n_l``, and cost-benefit aggregate ``delta``
8. Optionally computes the TWFE decomposition diagnostic from Theorem 1
(per-cell weights, fraction negative, ``sigma_fe``)
9. Runs inference with the cohort-recentered analytical plug-in variance from
Web Appendix Section 3.7.3 of the dynamic paper, optionally
complemented by a multiplier bootstrap clustered at the group level
(with sup-t simultaneous confidence bands when ``L_max >= 2``)
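As a rough illustration of steps 1, 4, and 5, the sketch below aggregates a toy individual-level panel to ``(group, time)`` cell means and combines the joiner and leaver contrasts into ``DID_M``. It is plain Python, not the library's implementation: the helper names ``cell_means`` and ``did_m`` and the toy data are hypothetical, every cell is weighted equally, and periods that lack the required stable controls are simply skipped::

    from collections import defaultdict
    from statistics import mean


    def cell_means(rows):
        """Collapse (group, time, treatment, outcome) rows to one cell each."""
        acc = defaultdict(list)
        for g, t, d, y in rows:
            acc[(g, t)].append((d, y))
        # Treatment is assumed constant within a cell, so take the first value.
        return {k: (v[0][0], mean(y for _, y in v)) for k, v in acc.items()}


    def did_m(cells):
        """Switcher-weighted average of per-period joiner and leaver contrasts."""
        times = sorted({t for _, t in cells})
        groups = sorted({g for g, _ in cells})
        numerator, n_switchers = 0.0, 0
        for prev, curr in zip(times, times[1:]):
            # Outcome first differences, bucketed by (D_{t-1}, D_t).
            delta = defaultdict(list)
            for g in groups:
                if (g, prev) in cells and (g, curr) in cells:
                    d0, y0 = cells[(g, prev)]
                    d1, y1 = cells[(g, curr)]
                    delta[(d0, d1)].append(y1 - y0)
            joiners, stable_0 = delta[(0, 1)], delta[(0, 0)]
            leavers, stable_1 = delta[(1, 0)], delta[(1, 1)]
            if joiners and stable_0:
                # DID_{+,t}: joiners versus groups that stay untreated.
                numerator += len(joiners) * (mean(joiners) - mean(stable_0))
                n_switchers += len(joiners)
            if leavers and stable_1:
                # DID_{-,t}: groups that stay treated versus leavers.
                numerator += len(leavers) * (mean(stable_1) - mean(leavers))
                n_switchers += len(leavers)
        if n_switchers == 0:
            raise ValueError("no switchers with valid controls")
        return numerator / n_switchers


    # Toy panel: common trend of +1 per period, treatment effect of +2.
    paths = {
        "A": ([0, 0, 0], [1.0, 2.0, 3.0]),  # never treated
        "B": ([0, 1, 1], [1.0, 4.0, 5.0]),  # joiner at period 2
        "C": ([1, 1, 1], [3.0, 4.0, 5.0]),  # always treated
        "D": ([1, 1, 0], [3.0, 4.0, 3.0]),  # leaver at period 3
    }
    rows = [
        (g, t, d, y + noise)
        for g, (ds, ys) in paths.items()
        for t, (d, y) in enumerate(zip(ds, ys), start=1)
        for noise in (-0.5, 0.5)  # two individuals per cell
    ]
    estimate = did_m(cell_means(rows))
    print(estimate)

On this toy panel the joiner contrast at period 2 and the leaver contrast at period 3 both equal 2, so the sketch recovers the effect built into the data.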
**When to use ChaisemartinDHaultfoeuille:**
- Treatment can switch on **and** off over time (e.g., marketing campaigns,
seasonal promotions, on/off policy cycles)
- You need separate joiners (``DID_+``) and leavers (``DID_-``) views, plus
the aggregate ``DID_M``
- You want a built-in placebo and a TWFE decomposition diagnostic computed
on the data you pass in (pre-filter) for direct comparison against
``DID_M``. The fitted TWFE diagnostic uses the FULL pre-filter cell
sample (matching :func:`twowayfeweights`); when ``fit()`` drops groups
via the ragged-panel or ``drop_larger_lower`` filters, a ``UserWarning``
is emitted to make the divergence from the post-filter ``DID_M`` sample
explicit. See REGISTRY.md ``ChaisemartinDHaultfoeuille`` ``Note (TWFE
diagnostic sample contract)`` for the rationale.
- You want a Python implementation that matches R ``DIDmultiplegtDYN`` at
``l = 1`` on cell-aggregated input (see REGISTRY.md for documented
deviations on individual-level inputs with uneven cell sizes)
All other staggered estimators in diff-diff (:class:`~diff_diff.CallawaySantAnna`,
:class:`~diff_diff.SunAbraham`, :class:`~diff_diff.ImputationDiD`,
:class:`~diff_diff.TwoStageDiD`, :class:`~diff_diff.EfficientDiD`,
:class:`~diff_diff.WooldridgeDiD`) assume treatment is **absorbing** —
once treated, stays treated. ``ChaisemartinDHaultfoeuille`` is the only
library option for non-absorbing treatments.
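A quick way to see which case a panel falls into is to inspect each group's treatment path. The sketch below assumes binary treatment and uses hypothetical helpers (``count_switches`` and ``is_absorbing`` are not part of diff-diff); it also flags multi-switch groups, which the estimator drops by default::

    def count_switches(path):
        """Number of period-to-period changes in a treatment path."""
        return sum(1 for prev, curr in zip(path, path[1:]) if prev != curr)


    def is_absorbing(path):
        """True if treatment never turns off once it has turned on."""
        return all(curr >= prev for prev, curr in zip(path, path[1:]))


    # Hypothetical treatment paths, one list per group, ordered by period.
    paths = {
        "never": [0, 0, 0, 0],
        "staggered": [0, 0, 1, 1],
        "leaver": [1, 1, 0, 0],
        "on_off": [0, 1, 0, 1],
    }
    summary = {g: (is_absorbing(p), count_switches(p)) for g, p in paths.items()}
    non_absorbing = [g for g, (absorbing, _) in summary.items() if not absorbing]
    multi_switch = [g for g, (_, n) in summary.items() if n > 1]
    print(non_absorbing, multi_switch)

If ``non_absorbing`` is empty, any of the absorbing-treatment estimators listed above may be applicable; otherwise the panel contains reversals.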
**Panel requirements (deviation from R DIDmultiplegtDYN):**
- Every group must have an observation at the **first global period**
(the panel's earliest time value). Groups missing this baseline raise
``ValueError`` with the offending group IDs.
- Groups with **interior period gaps** (missing observations between
their first and last observed period) are dropped with a
``UserWarning``.
- **Terminal missingness** (groups observed at the baseline but missing
one or more later periods - early exit / right-censoring) is supported.
The group contributes from its observed periods only, masked out of
the missing transitions by the per-period ``present`` guard in the
variance computation.
- This is a documented deviation from R ``DIDmultiplegtDYN``, which
supports unbalanced panels with missing-treatment-before-first-switch
handling. **Workaround:** pre-process your panel to back-fill the
baseline (or drop late-entry groups before fitting), or use R until
this restriction is lifted. See the
``Note (deviation from R DIDmultiplegtDYN)`` block in
``docs/methodology/REGISTRY.md`` for the rationale and the exact
defensive guards that make terminal missingness safe.
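The three cases above can be checked before fitting. The following sketch is a hypothetical pre-flight helper (``classify_group`` is not a diff-diff function) that labels each group by its observation pattern, assuming the full list of panel periods is known and sorted::

    def classify_group(observed, all_periods):
        """Label one group's observation pattern against the global periods."""
        observed = set(observed)
        if all_periods[0] not in observed:
            return "missing_baseline"      # documented above as a ValueError
        last_seen = max(i for i, t in enumerate(all_periods) if t in observed)
        if any(t not in observed for t in all_periods[: last_seen + 1]):
            return "interior_gap"          # documented above as dropped
        if last_seen < len(all_periods) - 1:
            return "terminal_missingness"  # documented above as supported
        return "complete"


    periods = [1, 2, 3, 4]
    observed_periods = {
        "g1": [1, 2, 3, 4],
        "g2": [2, 3, 4],  # late entry: no observation at the first period
        "g3": [1, 2, 4],  # gap at period 3
        "g4": [1, 2],     # early exit
    }
    status = {g: classify_group(obs, periods) for g, obs in observed_periods.items()}
    # One of the workarounds named above: drop late-entry groups before fitting.
    keep = [g for g, s in status.items() if s != "missing_baseline"]
    print(status, keep)

Filtering on ``keep`` corresponds to the "drop late-entry groups" workaround; back-filling the baseline is the alternative and depends on what is known about the missing observations.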
**References:**
- de Chaisemartin, C. & D'Haultfœuille, X. (2020). Two-Way Fixed Effects
Estimators with Heterogeneous Treatment Effects. *American Economic
Review*, 110(9), 2964-2996.
- de Chaisemartin, C. & D'Haultfœuille, X. (2022, revised 2024).
Difference-in-Differences Estimators of Intertemporal Treatment
Effects. NBER Working Paper 29873.
ChaisemartinDHaultfoeuille
--------------------------
Main estimator class for de Chaisemartin-D'Haultfœuille (dCDH) DiD estimation.
The alias :class:`~diff_diff.DCDH` is also available.
.. autoclass:: diff_diff.ChaisemartinDHaultfoeuille
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:
   :inherited-members:

   .. rubric:: Methods

   .. autosummary::

      ~ChaisemartinDHaultfoeuille.fit
      ~ChaisemartinDHaultfoeuille.get_params
      ~ChaisemartinDHaultfoeuille.set_params
ChaisemartinDHaultfoeuilleResults
---------------------------------
Results container for dCDH estimation.
.. autoclass:: diff_diff.ChaisemartinDHaultfoeuilleResults
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:

   .. rubric:: Methods

   .. autosummary::

      ~ChaisemartinDHaultfoeuilleResults.summary
      ~ChaisemartinDHaultfoeuilleResults.print_summary
      ~ChaisemartinDHaultfoeuilleResults.to_dataframe
DCDHBootstrapResults
--------------------
Multiplier-bootstrap inference results, populated when ``n_bootstrap > 0``.
.. autoclass:: diff_diff.DCDHBootstrapResults
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:
Convenience Function
--------------------
.. autofunction:: diff_diff.chaisemartin_dhaultfoeuille
Standalone TWFE Decomposition Diagnostic
----------------------------------------
The TWFE decomposition diagnostic from Theorem 1 of de Chaisemartin &
D'Haultfœuille (2020) is also available as a standalone function for
users who want the diagnostic without fitting the full estimator. It
returns per-cell weights, the fraction of treated cells with negative
weights, and ``sigma_fe`` — the smallest standard deviation of per-cell
treatment effects that could flip the sign of the plain TWFE coefficient.
.. autofunction:: diff_diff.twowayfeweights
.. autoclass:: diff_diff.TWFEWeightsResult
   :no-index:
   :members:
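To make the negative-weight problem concrete, the sketch below computes Theorem-1-style weights by hand for a two-group staggered design. It is an illustration under simplifying assumptions (balanced panel, binary treatment, one unit per cell) and is not the code behind :func:`twowayfeweights`; the helper name ``twfe_weights`` is hypothetical. Each treated cell's weight is proportional to the residual from regressing treatment on group and period fixed effects::

    def twfe_weights(treatment):
        """Weights on treated cells for a balanced, equal-cell-size panel.

        ``treatment`` maps each group to its binary treatment path by period.
        """
        groups = list(treatment)
        n_periods = len(treatment[groups[0]])
        n_cells = len(groups) * n_periods
        grand = sum(sum(path) for path in treatment.values()) / n_cells
        group_mean = {g: sum(treatment[g]) / n_periods for g in groups}
        time_mean = [
            sum(treatment[g][t] for g in groups) / len(groups)
            for t in range(n_periods)
        ]
        # Residual of D on group and period fixed effects (balanced-panel form).
        eps = {
            (g, t): treatment[g][t] - group_mean[g] - time_mean[t] + grand
            for g in groups
            for t in range(n_periods)
        }
        treated = [(g, t) for (g, t) in eps if treatment[g][t] == 1]
        # Normalize so the weights average to one across treated cells.
        denom = sum(eps[cell] for cell in treated) / len(treated)
        return {cell: eps[cell] / denom for cell in treated}


    # Periods are indexed 0, 1, 2; "early" adopts one period before "late".
    weights = twfe_weights({"early": [0, 1, 1], "late": [0, 0, 1]})
    fraction_negative = sum(w < 0 for w in weights.values()) / len(weights)
    print(weights, fraction_negative)

In this design the early adopter's last-period cell receives a negative weight, so one of the three treated cells enters the TWFE coefficient with the wrong sign.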
Example Usage
-------------
Basic usage with reversible treatment::

    from diff_diff import ChaisemartinDHaultfoeuille
    from diff_diff.prep import generate_reversible_did_data

    data = generate_reversible_did_data(
        n_groups=80, n_periods=6, pattern="single_switch", seed=42,
    )

    est = ChaisemartinDHaultfoeuille()
    results = est.fit(
        data,
        outcome="outcome",
        group="group",
        time="period",
        treatment="treatment",
    )
    results.print_summary()
Joiners and leavers views::

    print(f"DID_M (overall): {results.overall_att:.3f}")
    print(f"DID_+ (joiners): {results.joiners_att:.3f}")
    print(f"DID_- (leavers): {results.leavers_att:.3f}")
    print(f"Placebo (DID^pl): {results.placebo_effect:.3f}")
Per-period decomposition::

    for t, cell in results.per_period_effects.items():
        print(
            f"t={t}: DID+={cell['did_plus_t']:.3f} "
            f"({cell['n_10_t']} joiners, {cell['n_00_t']} stable_0 controls)"
        )
Multiplier bootstrap inference::

    est = ChaisemartinDHaultfoeuille(
        n_bootstrap=999, bootstrap_weights="rademacher", seed=42,
    )
    results = est.fit(
        data, outcome="outcome", group="group",
        time="period", treatment="treatment",
    )

    # When n_bootstrap > 0, the top-level overall_*/joiners_*/leavers_*
    # p-value and conf_int fields hold percentile-based bootstrap
    # inference (not normal-theory recomputations from the bootstrap SE).
    # The t-stat is computed from the SE in both cases. See REGISTRY.md
    # `Note (bootstrap inference surface)` for the full contract.
    print(f"Top-level p-value (bootstrap): {results.overall_p_value:.4f}")
    print(f"Top-level CI (bootstrap): {results.overall_conf_int}")
    print(f"bootstrap_results.overall_se: {results.bootstrap_results.overall_se:.3f}")
    print(f"bootstrap_results.overall_ci: {results.bootstrap_results.overall_ci}")
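For intuition about what a group-clustered multiplier bootstrap with Rademacher weights and percentile intervals does, here is a minimal standard-library sketch. It is not the library's algorithm: the function name ``multiplier_bootstrap`` and the per-group contributions are hypothetical, the point estimate is taken to be their plain mean, and the cohort recentering used by the analytical variance is omitted::

    import random
    from statistics import mean


    def multiplier_bootstrap(contributions, n_bootstrap=999, seed=42):
        """Percentile CI from Rademacher-perturbed group-level contributions."""
        rng = random.Random(seed)
        estimate = mean(contributions)
        centered = [c - estimate for c in contributions]
        draws = []
        for _ in range(n_bootstrap):
            # One +1/-1 weight per group keeps the perturbation clustered
            # at the group level.
            signs = [rng.choice((-1.0, 1.0)) for _ in centered]
            draws.append(estimate + mean(s * c for s, c in zip(signs, centered)))
        draws.sort()
        lower = draws[int(0.025 * (n_bootstrap - 1))]
        upper = draws[int(0.975 * (n_bootstrap - 1))]
        return estimate, (lower, upper)


    # Hypothetical per-group contributions; their mean is the point estimate.
    contribs = [1.2, 0.4, 2.1, 1.7, 0.9, 2.6, 1.1, 1.8]
    point, ci = multiplier_bootstrap(contribs)
    print(point, ci)

Because the interval is read off the sorted draws rather than built from a standard error, it does not have to be symmetric around the point estimate, which is the distinction the comment in the example above draws against normal-theory intervals.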
Standalone TWFE diagnostic (without fitting the full estimator)::

    from diff_diff import twowayfeweights

    diagnostic = twowayfeweights(
        data, outcome="outcome", group="group", time="period", treatment="treatment",
    )
    print(f"Plain TWFE coefficient: {diagnostic.beta_fe:.3f}")
    print(f"Fraction of negative weights: {diagnostic.fraction_negative:.3f}")
    print(f"sigma_fe (sign-flipping threshold): {diagnostic.sigma_fe:.3f}")
The ``DCDH`` alias::

    from diff_diff import DCDH

    est = DCDH()  # equivalent to ChaisemartinDHaultfoeuille()