Spillover-Aware DiD (Butts 2021)
=================================
Ring-indicator spillover-aware Difference-in-Differences estimator.
This module implements the methodology from Butts, K. (2023, originally 2021),
"Difference-in-Differences with Spatial Spillovers" (arXiv:2105.03737v3).
The estimator augments two-stage Gardner (2022) DiD with ring-indicator
covariates that identify, alongside the direct effect on treated units
(``tau_total``), per-ring spillover effects on near-control units
(``delta_j``).
The standard DiD estimator is biased for ``tau_total`` by
``-tau_spill(0)`` (paper Proposition 2.1 / Equations 1-2); the
ring-augmented regression removes this bias. The "far-away" cutoff
``d_bar`` defines the boundary beyond which spillovers vanish
(Assumption 5).
**When to use SpilloverDiD:**
- Spatial-policy DiD settings where treatment may spill over onto nearby
control units (border-RD, place-based policy, neighborhood effects,
geographic policy boundaries)
- When the canonical DiD coefficient is suspected to be attenuated or
inflated by spillover bias and a far-away control group exists
- As a robustness check on TwoStageDiD / ImputationDiD when the
treatment has plausibly local geographic externalities
**Reference:** Butts, K. (2023). Difference-in-Differences with Spatial
Spillovers. *arXiv:2105.03737v3*. Gardner, J. (2022). Two-stage
differences in differences. *arXiv:2207.05943*.
.. module:: diff_diff.spillover
SpilloverDiD
------------
Main estimator class.
.. autoclass:: diff_diff.SpilloverDiD
:no-index:
:members:
:undoc-members:
:show-inheritance:
:inherited-members:
.. rubric:: Methods
.. autosummary::
~SpilloverDiD.fit
~SpilloverDiD.get_params
~SpilloverDiD.set_params
SpilloverDiDResults
-------------------
Results container with per-ring spillover-effect table.
.. autoclass:: diff_diff.SpilloverDiDResults
:no-index:
:members:
:undoc-members:
:show-inheritance:
.. rubric:: Methods
.. autosummary::
~SpilloverDiDResults.summary
~SpilloverDiDResults.to_dict
Example Usage
-------------
Non-staggered 2-period panel with synthetic spillovers::
from diff_diff import SpilloverDiD
est = SpilloverDiD(
rings=[0, 50, 100, 200], # 3 rings: [0,50), [50,100), [100,200]
conley_coords=("lat", "lon"),
vcov_type="conley",
conley_cutoff_km=200.0,
conley_lag_cutoff=0,
)
result = est.fit(
data=df,
outcome="y",
unit="unit",
time="time",
treatment="D", # binary; auto-converts to first_treat
)
print(result.summary())
# Total effect on treated (Butts tau_total):
print(f"tau_total = {result.att:.4f}")
# Per-ring spillover-on-control:
print(result.spillover_effects)
Staggered timing (Gardner-convention first_treat column)::
result = est.fit(
data=df,
outcome="y",
unit="unit",
time="time",
first_treat="first_treat", # 0 / inf = never-treated
)
Identification spec
-------------------
The stage-2 regressor for ring j is the time-varying form
.. math::
(1 - D_{it}) \cdot \mathrm{Ring}_{it,j}
where ``Ring_{it,j}`` is the indicator that unit *i*'s nearest currently-
treated unit is in distance bin *j* at time *t*, and ``D_it`` is the
treatment indicator (1 if unit *i* is treated by time *t*). For
non-staggered timing, all treated units share one onset and ``Ring`` is
unit-static post-treatment / zero pre-treatment. For staggered timing,
``Ring_{it,j}`` is unit-time-varying (a unit's nearest treated neighbor
changes as more cohorts enter treatment).
Stage 1 fits unit + time fixed effects on Butts' subsample
``Omega_0 = {D_it = 0 AND S_it = 0}`` (untreated AND unexposed) — the
clean far-away control group. Reading the literal unit-static
``(1 - D_it) * S_i`` from paper Equation 5 yields a rank-deficient
design under TWFE; the diff-diff implementation uses the time-varying
form (paper page 12's ``S_it`` notation, made explicit in Section 5
Table 2).
Estimator Comparison
--------------------
.. list-table:: SpilloverDiD vs. TwoStageDiD vs. TwoWayFixedEffects
:header-rows: 1
:widths: 25 25 25 25
* - Feature
- SpilloverDiD
- TwoStageDiD
- TwoWayFixedEffects
* - Spillover handling
- Identifies per-ring delta_j
- None (assumes SUTVA)
- None (assumes SUTVA)
* - Methodology
- Two-stage Gardner + ring covariates
- Two-stage Gardner
- Single-stage TWFE
* - Staggered timing
- Yes
- Yes
- Biased under heterogeneity
* - Stage-1 subsample
- Butts: ``D=0 AND S=0`` (untreated AND unexposed)
- ``D=0`` (untreated)
- N/A (single stage)
* - Conley spatial-HAC SE
- Yes (Wave D GMM-corrected sandwich)
- Not yet supported
- Yes
* - Cluster-robust SE
- Yes (HC1 + CR1, Wave D GMM-corrected sandwich)
- Yes (GMM sandwich + clusters)
- Yes
Restrictions and follow-ups
---------------------------
The current implementation has the following documented restrictions
and planned follow-up enhancements:
- **Gardner GMM first-stage correction at stage 2** — SHIPPED in Wave D.
Stage-2 variance now applies the influence-function-based correction
for stage-1 FE estimation uncertainty across all three ``vcov_type``
paths (HC1, Conley, cluster) on both ``event_study=False`` AND
``event_study=True``. The IF formula is
``psi_i = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i}``
with ``gamma_hat = (X_10' X_10)^{-1} (X_1' X_2)``; the meat is
``Psi' K Psi`` where ``K`` is the path-dependent kernel matrix
(identity for HC1, block-indicator for cluster, spatial kernel for
Conley). Documented synthesis of Butts (2021) Section 3.1 + Gardner
(2022) Section 4 + Conley (1999); no reference software combines all
three ingredients. Point estimates unchanged; SE values shift upward
by 1-few percent depending on first-stage residual variance.
- **Event-study mode** — ``event_study=True`` is SHIPPED in Wave C.
The per-event-time × ring decomposition (Butts Section 5 / Table 2)
emits per-event-time direct effects ``tau_k`` and per-(ring,
event-time) spillover effects ``delta_jk`` as a ``att_dynamic``
DataFrame plus MultiIndex ``spillover_effects``. The
``event_study_effects: Dict[int, Dict]`` alias mirrors
``TwoStageDiD``'s schema for ``plot_event_study`` consumption (the
plotter prefers the new ``reference_period`` attribute over the
legacy ``n_obs==0`` heuristic). ``DiagnosticReport`` routing for
``SpilloverDiDResults`` is queued as a follow-up. Reference
period ``-1 - anticipation`` (TwoStageDiD parity). ``horizon_max``
bins event-times into endpoint pools (no row drop — divergence
from TwoStageDiD's filtering semantic, intentional per
``feedback_no_silent_failures``). ``horizon_max`` must be ``>=1`` or
``None`` under ``event_study=True``; ``horizon_max=0`` is rejected
(the single bin ``k=0`` leaves no event-time pair to anchor the
reference period — for a single aggregate effect, use
``event_study=False`` instead). Scalar ``att`` becomes a
share-weighted average of post-treatment ``tau_k`` with SE
from linear-combination inference on the post-treatment vcov block.
Per-event-time SEs apply the Wave D Gardner GMM first-stage
uncertainty correction (see the "Gardner GMM first-stage correction"
entry above). Wave E.1 amendment: when ``survey_design=`` is supplied,
the per-horizon shares are **survey-weight totals** rather than raw
observation counts (``share_k = sum_i w_i * 1{K_direct_i = k AND
treated_i = 1}``); the same vector enters both the ``att`` point
estimate and ``Var(att) = w' V_subset w``. Without this, the
lincom would mix unweighted aggregation shares with weighted WLS
horizon coefficients and target the wrong estimand. See the registry
"Variance (Wave E.1)" subsection for the full survey-aggregation
contract.
- **Survey-design integration (Wave E.1 — HC1 / CR1 via Binder TSL).**
SHIPPED in Wave E.1. ``survey_design=`` is now supported on
``vcov_type ∈ {"hc1"}`` and CR1 (``cluster=
``) paths.
.. note::
Wave E.1 composes Gerber (2026, arXiv:2605.04124) Proposition 1 —
Binder Taylor Series Linearization for IF representations of smooth
functionals; explicitly derived for TwoStageDiD in the paper's
Appendix — with the Wave D Gardner GMM first-stage uncertainty
correction (Butts 2021 §3.1 + Gardner 2022 §4) applied to
SpilloverDiD's ring-indicator stage-2 design. The composition is
mechanical: SpilloverDiD's per-obs IF
``psi_i = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i}``
is aggregated to PSU level, then passed to the audited Binder TSL
meat helper. Survey weights enter via Hájek normalization at the
gamma_hat solve, eps construction, and bread inversion. Degrees of
freedom for the t-distribution lookup use
``ResolvedSurveyDesign.df_survey`` (4-way branch: PSU+strata →
``n_PSU - n_strata``; PSU only → ``n_PSU - 1``; strata only →
``n_obs - n_strata``; neither → ``n_obs - 1``). No reference
software combines all ingredients.
Restrictions:
- Replicate-weight variance (BRR / Fay / JK1 / JKn / SDR) raises
``NotImplementedError``; per Gerber (2026) Appendix A, the
IF-reweighting shortcut does not apply because ``gamma_hat`` is
weight-sensitive — follow-up requires per-replicate full re-fit.
- Singleton-stratum ``lonely_psu="remove"`` with all strata singletons
saturates ``df_survey = 0`` and the meat helper NaN-fails; SE /
t-stat / p-value / CI all NaN-propagate (no silent fallback to HC1).
- ``cluster= + survey_design.psu`` with **different groupings**
emits a ``UserWarning`` and uses PSU as the cluster (mirrors
``TwoStageDiD._resolve_effective_cluster``). When the two groupings
are identical, no warning fires; PSU still takes precedence
(inference is unchanged either way).
- ``SurveyDesign.subpopulation()`` and warn-and-drop full-design
retention SHIPPED in Wave E.3 — see "Survey-design integration
(Wave E.3)" entry below.
- **Survey-design integration (Wave E.2 — Conley × survey via
stratified-Conley sandwich on PSU totals).** SHIPPED in Wave E.2.
``vcov_type="conley" + survey_design=`` is now supported via a
per-stratum Conley sandwich applied to PSU-aggregated Wave D Gardner
GMM influence functions.
.. note::
Wave E.2 composes Conley (1999) spatial-HAC with Gerber (2026,
arXiv:2605.04124) Proposition 1 Binder TSL (the Wave E.1 foundation)
and the Wave D Gardner GMM first-stage uncertainty correction
(Butts 2021 §3.1 + Gardner 2022 §4) applied to SpilloverDiD's
ring-indicator stage-2 design. The composition is **panel-aware** —
it preserves the library's existing ``conley_lag_cutoff = 0``
semantic ("within-period spatial only, exclude cross-period pairs")
by looping over periods and aggregating Psi to PSU totals WITHIN
each period (not over the whole panel). For each period ``t``,
``S_psu_t[g] = sum_{i in PSU g, time t} psi_i``; per-PSU centroids
are panel-constant (mean of per-observation ``conley_coords``);
for each stratum the within-stratum sandwich is
``M_h_t = (1 - f_h) * n_h/(n_h-1) * sum_{j,k in PSUs_h}
K(d(centroid_j, centroid_k) / cutoff) *
(S_psu_t[j] - S_bar_h_t)(S_psu_t[k] - S_bar_h_t)'``, where K is the
Bartlett kernel (SpilloverDiD currently exposes Bartlett only and
hardcodes it at the fit-call site; the survey helper's ``kernel``
parameter can also take ``"uniform"``, but exposing that on the
SpilloverDiD constructor is a separate follow-up). Cross-stratum
kernel weights are exactly zero by sampling design (strata are
exact independence partitions). Total meat is ``sum_t sum_h M_h_t``.
Cross-period spatial pairs are excluded by construction. No
reference software combines all three ingredients on a two-stage
influence function.
Reduction semantics:
- Per-period sum invariant: ``sum_t`` of per-period within-stratum
stratified-Conley sandwiches on per-period PSU totals. Pinned at
``tests/test_spillover.py::TestSpilloverDiDWaveE2ConleySurveyDesign::test_b_panel_aware_per_period_sum_invariant``
(pure unit test on the orchestrator + helper composition).
- Single stratum (H = 1, FPC = inf): reduces to ``sum_t`` plain
Conley sandwich on per-period PSU totals (NOT on time-collapsed
PSU totals — the per-period loop preserves ``lag_cutoff = 0``
semantics).
- All PSUs singleton + ``lonely_psu="remove"``: ``df_survey = 0`` and
the stratified-Conley meat NaN-fails (matches Wave E.1 saturation
behaviour, with ``UserWarning`` template "Wave E.2 stratified-Conley
sandwich: df_survey = 0...").
Restrictions:
- Replicate-weight variance (BRR / Fay / JK1 / JKn / SDR) raises
``NotImplementedError`` (inherits Wave E.1 gate; per-replicate refit
is separate follow-up scope).
- ``cluster= + survey_design.psu + vcov_type="conley"``:
``cluster=`` is coerced to PSU per Wave E.1's warn-and-use-PSU
pattern; the Conley cluster product kernel becomes a no-op after
PSU aggregation.
- The LinearRegression-side ``vcov_type="conley" + survey_design=``
gate at ``diff_diff/linalg.py`` remains a separate roadmap (not Wave E)
— weighted spatial-HAC under probability sampling is an open
methodological question; no canonical extension of Conley (1999) exists
for the combination.
- DiagnosticReport routing for ``SpilloverDiDResults(vcov_type="conley",
survey_design=)`` is queued for a follow-up (the
``_APPLICABILITY`` / ``_PT_METHOD`` wiring must register the new
combination first).
- **Survey-design integration (Wave E.2 follow-up —
``conley_lag_cutoff > 0`` panel-block composition).** SHIPPED in
Wave E.2 follow-up. ``vcov_type="conley" + conley_lag_cutoff > 0 +
survey_design=`` is now supported by adding a within-PSU serial
Bartlett HAC term to the Wave E.2 spatial sandwich, **provided the
survey design has an effective PSU** (either explicit
``survey_design.psu`` or a ``cluster=`` argument that gets
injected as the effective PSU per Wave E.1's
``_inject_cluster_as_psu`` routing). No-effective-PSU survey designs
(weights-only / strata-only WITHOUT a cluster fallback) raise
``NotImplementedError`` at ``SpilloverDiD.fit`` post-resolution —
see the Restrictions list below.
.. note::
Wave E.2 follow-up composes Wave E.2's panel-aware stratified-Conley
spatial sandwich with within-PSU serial Bartlett HAC over time
(Newey-West 1987 form, kernel weights ``1 - |t-s|/(L+1)`` for
``|t-s| <= L, t != s``). The composition is
``meat = meat_spatial + meat_serial`` with disjoint index sets,
exactly matching the no-survey panel-block decomposition at
``diff_diff.conley._compute_conley_meat`` (Conley 1999 + Newey-West
1987 separable form, NOT Driscoll-Kraay 2D-HAC). For each stratum
``h``: ``meat_serial_h = FPC_h_panel * sum_{g in stratum h}
sum_{|t-s| <= L, t != s, both periods present for PSU g}
K_serial(|t-s|/(L+1)) * S_centered_t[g] @ S_centered_s[g]'`` where
``S_centered_t[g] = S_psu_t[g] - S_bar_h(g)_t`` is per-period
within-stratum centered (Binder TSL form — matches the spatial
helper's centering exactly), and ``|t-s|`` uses panel-wide dense
time codes (matches ``conley.py:940`` documented R deviation that
mirrors R ``conleyreg``). Serial Bartlett kernel is hardcoded
regardless of ``conley_kernel`` (mirrors ``conley.py:951-965``;
``conley_kernel`` governs spatial kernel only). FPC for serial uses
panel-wide ``n_h_panel = |unique PSUs in stratum h across active
sample|``, NOT per-period ``n_h_t`` (the serial sum is a PANEL-level
construct; standalone Newey-West composition on stratified clusters,
deliberately NOT by analogy to the cross-sectional Wave E.2 spatial
FPC convention). No reference software combines panel-block Conley +
Binder TSL + Gardner GMM correction on a two-stage influence
function.
Reduction semantics:
- ``conley_lag_cutoff = 0`` or ``None``: **bit-identical** to shipped
Wave E.2 ATT and scalar SE (orchestrator skips the serial helper
invocation when ``L = 0`` so meat_serial does not contribute; the
test_a2 mock-spy independently asserts the helper isn't invoked).
- ``conley_time is None`` or ``T = 1``: serial helper short-circuits to
zero meat (no cross-period pairs possible).
- Single stratum (H = 1, FPC = inf) with ``L > 0``: serial reduces to
Newey-West Bartlett HAC on per-period within-stratum-CENTERED per-PSU
score sequences (NOT raw scores — Binder TSL centering is retained at
H=1 because the survey path always subtracts the per-period stratum
mean before the kernel application; the panel-wide ``G/(G-1)``
survey factor replaces FPC).
- Bandwidth → 0 with ``L > 0``: spatial reduces to per-period
within-stratum HC sandwich; serial term unchanged (separable form).
- All PSUs singleton + ``lonely_psu="remove"`` with ``L > 0``: meat
NaN-fails on the saturation diagnostic (template "Wave E.2
stratified-Conley sandwich" covers both spatial-only and panel-block
cases).
Centering asymmetry vs no-survey reference:
- The no-survey panel-block path at ``conley.py:949-965`` uses RAW
scores for the serial term (no centering) because it assumes
``E[scores] = 0`` under correct specification. The survey-weighted
Binder TSL form estimates the within-stratum mean and centers
explicitly (textbook stratified-cluster sandwich); raw scores in the
survey case would inflate variance by twice the squared per-period
stratum mean and would NOT reduce to the cross-sectional Wave E.2
form at ``lag = 0``.
Restrictions:
- **Requires an effective PSU** — either explicit ``survey_design.psu``
OR ``cluster=`` injected as the effective PSU per Wave E.1's
``_inject_cluster_as_psu``. No-effective-PSU survey designs
(weights-only / strata-only WITHOUT a cluster fallback) raise
``NotImplementedError`` post-resolution at ``SpilloverDiD.fit``
because the pseudo-PSU = obs-index fallback would silently zero
the serial sum (each pseudo-PSU appears in exactly one period).
Tracked as a follow-up in ``TODO.md``.
- Replicate-weight variance (BRR / Fay / JK1 / JKn / SDR) raises
``NotImplementedError`` (inherits Wave E.1 gate).
- DiagnosticReport routing for the panel-block case is queued for the
same Wave F follow-up as the Wave E.2 cross-sectional case.
- **Survey-design integration (Wave E.3 — `SurveyDesign.subpopulation()`
/ warn-and-drop full-design retention via zero-pad scores).** SHIPPED
in Wave E.3. ``SurveyDesign.subpopulation()``-derived designs AND
warn-and-drop fits now preserve the full-domain resolved survey
design: ``n_psu`` / ``n_strata`` / ``df_survey`` and the Binder TSL
per-stratum centering reflect the FULL domain rather than the
post-``finite_mask`` fit sample.
.. note::
Wave E.3 adopts the canonical zero-pad convention from R
``survey::svyrecvar`` applied to a ``subset()`` design (Lumley
2010 §2.5 "Domains and subpopulations") — the same convention
already established in ``diff_diff/imputation.py:2175-2183``
(PreTrendsImputation lead regression) and
``diff_diff/prep.py:1401-1432`` (DCDH cell variance). The
``gamma_hat`` / ``Psi`` construction stays on SURVEY-FINITE-MASK
inputs (``X_1_sparse_fit``, ``X_10_sparse_fit``, ``eps_10_fit``
built on ``survey_finite_mask = finite_mask & survey_weights > 0``;
``X_2_kept_gamma``, ``eps_2_fit_gamma``,
``survey_weights_fit_gamma`` projected from the fit-sample frame
down to ``survey_finite_mask``) so the drop-first stage-1 FE
column space excludes zero-weight subpop rows — critical because
``_build_butts_fe_design_csr`` re-factorizes inputs via
``pd.factorize`` and drops the first unit / time code; including
zero-weight subpop rows in the factorize input would change which
unit/time code sorts first and silently shift ``gamma_hat``. The full-domain zero-pad
invariant is delivered by a new optional
``score_pad_mask=survey_finite_mask`` kwarg on
``_compute_gmm_corrected_meat``, where
``survey_finite_mask = finite_mask & (survey_weights > 0)``
under the survey path (= ``finite_mask`` on the no-survey
path). The survey-path subset removes zero-weight subpop rows
from the FE basis-construction sample so the drop-first
identification is invariant to which units the subpop mask
excluded; warn-and-dropped rows (NaN ``y_tilde``) are also
excluded since ``finite_mask`` is False for them. The helper
zero-pads the resulting survey-finite-mask ``Psi`` back to
full panel length AFTER construction but BEFORE kernel dispatch
via ``Psi_padded[score_pad_mask] = Psi``. Kernel-dispatch arrays
(``cluster_ids``, ``conley_coords``, ``conley_time``,
``conley_unit``, ``resolved_survey``) are passed at FULL
length so the meat helpers (Binder TSL / stratified-Conley /
serial Bartlett) see the full-domain PSU / strata / centroid
/ time geometry. The stage-2 OLS solve still runs on
fit-sample-aligned ``X_2_kept`` / ``y_tilde_fit`` (active
sample). No reference software combines all four ingredients
(Conley + Binder TSL + Gardner GMM + R
``svyrecvar``-style zero-pad) on a two-stage influence function.
Reduction semantics:
- When ``finite_mask.all() == True`` AND all weights ``> 0`` (no
zero-pad needed): bit-identical to shipped Wave E.2 /
E.2-follow-up baseline.
- A2 invariant: warn-and-drop and subpopulation drops are treated
identically (both apply the zero-pad mechanism).
- Subpopulation parity vs upstream-subset: ``df_survey`` matches the
full domain (``n_psu_full - n_strata_full``) regardless of how
many rows the subpopulation mask excludes. SE may differ from
``fit(data[mask], ..., survey_design=plain_design)`` by design
(subpopulation retains zero-padded PSU geometry; subset drops
PSUs entirely).
Restrictions:
- Replicate-weight variance + subpopulation continues to raise
``NotImplementedError`` at the Wave E.1 gate
(``spillover.py:2400``).
- TwoStageDiD's analogous ``finite_mask + design-subset`` pattern at
``two_stage.py:567-601`` is NOT yet adopted to Wave E.3 — separate
parity follow-up tracked in ``TODO.md``.
- **Count-of-treated-in-ring** — only the "nearest-treated ring"
specification is implemented. The "count" form re-introduces
functional-form dependence (paper Section 3.2 end) and is queued.
- **Data-driven d_bar selection** — Butts (2021b) / Butts (2023) JUE
Insight propose cross-validation under stronger parallel-trends
assumptions. Not in this PR.
- **HC2 / HC2_BM (Bell-McCaffrey / CR2) variance** — current stage-2
inference uses a generic residual-df (n - effective rank) for
t-distribution lookups. ``vcov_type="hc2"`` / ``"hc2_bm"`` require
per-coefficient BM / CR2 DOF and raise ``NotImplementedError``.
Routing stage 2 through ``LinearRegression`` (which supplies the
per-coefficient DOF metadata) is queued.
- **`vcov_type="classical"` (Wave D restriction)** — raises
``NotImplementedError``. The Wave D Gardner GMM first-stage
uncertainty correction has not been derived for the classical
homoskedastic variance (different meat structure
``sigma_hat^2 * (X_10' X_10)`` vs the Wave D IF outer product
``Psi' Psi``). Use ``vcov_type="hc1"`` for heteroskedasticity-robust
SE with the GMM correction, or combine with ``cluster=`` for
CR1 with the GMM correction; both apply the Wave D synthesis
(Butts §3.1 + Gardner §4 + Conley 1999) unconditionally.
- **Balanced panel required** — every unit must observe every period.
An unbalanced (unit, time) Ω₀ bipartite graph can produce disconnected
FE components and unidentified stage-1 residuals on treated rows.
Validator rejects unbalanced inputs.
- **Omega_0 row-level identification (period strict, unit warn-drop, plus connectivity)** —
Every period must have at least one Omega_0 row (else time FE for
that period is structurally unidentified; hard ``ValueError``). Units
lacking Omega_0 rows (e.g. baseline-treated units with ``D_it = 1`` at
every observed ``t``) are warned-and-dropped: their unit FE is NaN
and the downstream finite-mask path excludes them from stage 2.
Additionally, the supported-units bipartite graph (units linked by
shared Omega_0 periods) must form a single connected component;
``K > 1`` components raise ``ValueError`` because the FE solver would
return only component-specific constants and residualization would
silently mix them across components. Mirrors ``TwoStageDiD``'s
always-treated unit handling on the warn-drop axis.
- **One row per (unit, time) cell required** — duplicate cells silently
re-weight both stage-1 FE estimation and stage-2 OLS. Validator
rejects duplicate cells; aggregate to unique cells before fitting.
- **rings[0] must equal 0** — the partition must cover treated
locations (``d_it = 0`` belongs to Ring 1). Validator rejects rings
that start at a nonzero inner edge to prevent a silent
exposed-but-unmodeled population in ``0 <= d_it < rings[0]``.
- **conley_coords must be constant within unit** — ring construction
uses a single (lat, lon) per unit (units are stationary point
locations). Validator rejects coordinates that vary across rows of
the same unit; aggregate to one location per unit (e.g.
``df.groupby(unit)[["lat","lon"]].first()``) before fitting if your
raw data has moving coordinates. This requirement applies on every
fit path, not just ``vcov_type="conley"``, because the rings
themselves are always coordinate-derived.