Spillover-Aware DiD (Butts 2021)#

Ring-indicator spillover-aware Difference-in-Differences estimator.

This module implements the methodology from Butts, K. (2023, originally 2021), “Difference-in-Differences with Spatial Spillovers” (arXiv:2105.03737v3). The estimator augments two-stage Gardner (2022) DiD with ring-indicator covariates that identify, alongside the direct effect on treated units (tau_total), per-ring spillover effects on near-control units (delta_j).

The standard DiD estimator is biased for tau_total by -tau_spill(0) (paper Proposition 2.1 / Equations 1-2); the ring-augmented regression removes this bias. The “far-away” cutoff d_bar defines the boundary beyond which spillovers vanish (Assumption 5).

When to use SpilloverDiD:

  • Spatial-policy DiD settings where treatment may spill over onto nearby control units (border-RD, place-based policy, neighborhood effects, geographic policy boundaries)

  • When the canonical DiD coefficient is suspected to be attenuated or inflated by spillover bias and a far-away control group exists

  • As a robustness check on TwoStageDiD / ImputationDiD when the treatment has plausibly local geographic externalities

Reference: Butts, K. (2023). Difference-in-Differences with Spatial Spillovers. arXiv:2105.03737v3. Gardner, J. (2022). Two-stage differences in differences. arXiv:2207.05943.

SpilloverDiD#

Main estimator class.

class diff_diff.SpilloverDiD[source]

Bases: object

Ring-indicator spillover-aware DiD (Butts 2021).

Standalone estimator implementing two-stage Gardner (2022) methodology with ring-indicator covariates that identify the direct effect on treated units (tau_total) alongside per-ring spillover effects on near-control units (delta_j). Supports both panel non-staggered timing and Section 5 staggered timing in a single fit() entry point — non-staggered is the special case where all treated units share an onset time.

Parameters:
  • rings (list of float) – Sorted distance breakpoints with at least 2 elements. K = len(rings) - 1 rings are constructed.

  • d_bar (float, optional) – Far-away cutoff (Butts Assumption 5). Defaults to max(rings); if explicitly set, must equal max(rings). Wave B MVP does not support a d_bar strictly larger than the outermost ring edge (a “dead zone” where units satisfy rings[-1] < d_i <= d_bar but are in neither a ring nor the far-away group has no clean methodological interpretation). To use a smaller spillover bandwidth, shrink the outermost ring edge instead.

  • vcov_type (str, default="hc1") – Variance estimator. Set to "conley" and supply conley_coords/conley_cutoff_km/conley_lag_cutoff to enable Conley spatial-HAC at stage 2 (recommended per paper Section 3.1).

  • conley_coords (tuple of (str, str), optional) – (lat_col, lon_col) column names. Used for ring construction AND for the Conley vcov spatial kernel.

  • conley_metric (str or callable, default="haversine") – Distance metric used for both ring construction and the Conley spatial kernel. See diff_diff.conley for callable contract.

  • conley_cutoff_km (float, optional) – Conley spatial-HAC bandwidth. Required when vcov_type="conley".

  • conley_lag_cutoff (int, optional) – Within-unit Bartlett max lag. Required when vcov_type="conley". Use 0 to suppress the serial-component sandwich.

  • cluster (str, optional) – Column name for cluster-robust variance, or the combined Conley cluster product kernel when paired with vcov_type="conley".

  • alpha (float, default=0.05) – Significance level for confidence intervals.

  • anticipation (int, default=0) – Number of pre-treatment periods where effects may occur. Treatment and ring-membership clocks both shift by -anticipation so the stage-1 untreated-and-unexposed subsample correctly excludes anticipation rows.

  • event_study (bool, default=False) – If True, emit per-event-time × ring coefficients (Butts Table 2 staggered specification). The result’s spillover_effects DataFrame uses a MultiIndex over (ring, event_time).

  • horizon_max (int, optional) – Maximum absolute event-study horizon. Used only when event_study=True. Event-times outside [-horizon_max, +horizon_max] are binned into endpoint pools (k <= -H aggregated into a single pre-bin coefficient; k >= +H into a single post-bin coefficient), so no observations are dropped. This intentionally diverges from diff_diff.two_stage.TwoStageDiD, which filters rows with |K| > horizon_max out of the stage-2 sample. The endpoint-pool semantic honors the library’s no-silent-data-drop policy (feedback_no_silent_failures). When None, the helper auto-detects the bin set from observed K values. If ref_period = -1 - anticipation falls outside [-horizon_max, +horizon_max] the fit raises ValueError.

  • rank_deficient_action ({"warn", "error", "silent"}, default="warn") – Action when the stage-2 design is rank-deficient.

results_

Populated after fit() completes.

Type:

SpilloverDiDResults

is_fitted_
Type:

bool

Notes

The implementation uses two-stage Gardner methodology with the time-varying S_it = S_i * 1{t >= t_treat} form (paper page 12, just above Equation 5). Reading the literal unit-static (1 - D_it) * S_i from Equation 5 yields a rank-deficient design under TWFE; Section 5’s Table 2 makes the time-varying form explicit. The diff-diff implementation matches the paper’s identification argument once the S_it notation is read correctly.

For non-staggered timing, Gardner identity → stage-2 point estimates equal a single-stage TWFE with the time-varying spillover regressor.

Methods

fit(data, *, outcome, unit, time[, ...])

Fit the two-stage Gardner DiD with ring-indicator covariates.

get_params()

set_params(**params)

__init__(*, rings, d_bar=None, vcov_type='hc1', conley_coords=None, conley_metric='haversine', conley_cutoff_km=None, conley_lag_cutoff=None, cluster=None, alpha=0.05, anticipation=0, event_study=False, horizon_max=None, rank_deficient_action='warn')[source]
Parameters:
  • rings (List[float])

  • d_bar (float | None)

  • vcov_type (str)

  • conley_coords (Tuple[str, str] | None)

  • conley_metric (Literal['haversine', 'euclidean'] | ~typing.Callable[[~numpy.ndarray, ~numpy.ndarray], ~numpy.ndarray])

  • conley_cutoff_km (float | None)

  • conley_lag_cutoff (int | None)

  • cluster (str | None)

  • alpha (float)

  • anticipation (int)

  • event_study (bool)

  • horizon_max (int | None)

  • rank_deficient_action (str)

get_params()[source]
Return type:

Dict[str, Any]

set_params(**params)[source]
Parameters:

params (Any)

Return type:

SpilloverDiD

fit(data, *, outcome, unit, time, treatment=None, first_treat=None, covariates=None, survey_design=None)[source]

Fit the two-stage Gardner DiD with ring-indicator covariates.

Methodology (Butts 2021 Section 5 + Gardner 2022):

  1. Compute per-row spillover indicators from conley_coords.

  2. Build stage-1 subsample Omega_0 = {D_it=0 AND S_it=0} (untreated AND unexposed) — Butts’ clean control group.

  3. Stage 1: fit Y_it = mu_i + lambda_t + u on Omega_0.

  4. Residualize: Y_tilde = Y - mu_hat - lambda_hat for ALL rows.

  5. Stage 2: regress Y_tilde on [D_it, (1-D_it)*Ring_{it,j}] via solve_ols(), threading the configured vcov_type.

  6. Wrap as SpilloverDiDResults.

Notes

Stage-2 variance applies the Wave D Gardner (2022) GMM first-stage uncertainty correction across all supported vcov_type paths ("hc1", "conley", "cluster" via cluster=<col>). The unified IF outer-product formula is psi_i = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i} with meat = Psi' K Psi where K is path-dependent (identity for HC1, block-indicator for cluster, spatial kernel for Conley). Documented synthesis of Butts (2021) §3.1 + Gardner (2022) §4 + Conley (1999); no reference software combines all three. vcov_type="classical" raises NotImplementedError because the Wave D synthesis has not been derived for the homoskedastic meat structure sigma_hat^2 * (X_10' X_10); use "hc1" for heteroskedasticity-robust SE with the GMM correction.

Parameters:
Return type:

SpilloverDiDResults

SpilloverDiDResults#

Results container with per-ring spillover-effect table.

class diff_diff.SpilloverDiDResults[source]

Bases: DiDResults

Results from a ring-indicator spillover-aware DiD estimation (Butts 2021).

Extends DiDResults with per-ring spillover effect estimates and diagnostic counts for the identifying control population.

att

Total effect on the treated, tau_total (inherited from DiDResults).

Type:

float

spillover_effects

Per-ring spillover-on-control estimates. Columns: ["coef", "se", "t_stat", "p_value", "ci_low", "ci_high"]. Index: ring label like "[0, 50)"; for event-study output the index is a MultiIndex over (ring_label, event_time).

Type:

pd.DataFrame, optional

ring_breakpoints

Sorted distance breakpoints used to construct the rings.

Type:

list of float

d_bar

Far-away cutoff (Butts Assumption 5). Equals max(ring_breakpoints) unless explicitly overridden.

Type:

float

n_units_ever_in_ring

Counts of UNIQUE units that appear in each ring AT LEAST ONCE across observed periods. Under staggered timing the ring membership is time-varying, so a unit can be counted in multiple rings (one per period it visits). Under non-staggered timing the rings are unit-static, so this is a clean partition (each unit appears in exactly one ring or the far-away group). Includes treated units in Ring_1 by geometric construction even though the (1 - D_it) factor zeros their stage-2 contribution.

Type:

dict[str, int]

n_far_away_obs

Number of observations with D_it = 0 AND d_it > d_bar; these observations identify the counterfactual trend (Butts Assumption 5(ii)).

Type:

int

is_staggered

True if multiple distinct treatment-onset times were detected.

Type:

bool

event_study

True if per-event-time ring coefficients were emitted (i.e., self.event_study=True was set on the estimator).

Type:

bool

stage1_n_obs

Number of observations in the stage-1 untreated-and-unexposed subsample (Omega_0_butts).

Type:

int

Methods

summary([alpha])

Extended summary with ATT row, per-event-time direct block, and per-(ring, event-time) spillover block.

to_dict()

Override to serialize spillover_effects DataFrame as list-of-dicts.

spillover_effects: DataFrame | None = None
ring_breakpoints: List[float] | None = None
d_bar: float | None = None
n_units_ever_in_ring: Dict[str, int] | None = None
n_far_away_obs: int | None = None
is_staggered: bool | None = None
event_study: bool | None = None
stage1_n_obs: int | None = None
anticipation: int | None = None
att_dynamic: DataFrame | None = None
event_study_effects: Dict[int, Dict[str, Any]] | None = None
horizon_max: int | None = None
reference_period: int | None = None
summary(alpha=None)[source]

Extended summary with ATT row, per-event-time direct block, and per-(ring, event-time) spillover block.

Parameters:

alpha (float | None)

Return type:

str

to_dict()[source]

Override to serialize spillover_effects DataFrame as list-of-dicts.

Return type:

Dict[str, Any]

__init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated, n_control, alpha=0.05, coefficients=None, vcov=None, residuals=None, fitted_values=None, r_squared=None, inference_method='analytical', n_bootstrap=None, n_clusters=None, bootstrap_distribution=None, survey_metadata=None, vcov_type=None, cluster_name=None, conley_lag_cutoff=None, spillover_effects=None, ring_breakpoints=None, d_bar=None, n_units_ever_in_ring=None, n_far_away_obs=None, is_staggered=None, event_study=None, stage1_n_obs=None, anticipation=None, att_dynamic=None, event_study_effects=None, horizon_max=None, reference_period=None)
Parameters:
Return type:

None

Example Usage#

Non-staggered 2-period panel with synthetic spillovers:

from diff_diff import SpilloverDiD

est = SpilloverDiD(
    rings=[0, 50, 100, 200],          # 3 rings: [0,50), [50,100), [100,200]
    conley_coords=("lat", "lon"),
    vcov_type="conley",
    conley_cutoff_km=200.0,
    conley_lag_cutoff=0,
)
result = est.fit(
    data=df,
    outcome="y",
    unit="unit",
    time="time",
    treatment="D",                    # binary; auto-converts to first_treat
)
print(result.summary())
# Total effect on treated (Butts tau_total):
print(f"tau_total = {result.att:.4f}")
# Per-ring spillover-on-control:
print(result.spillover_effects)

Staggered timing (Gardner-convention first_treat column):

result = est.fit(
    data=df,
    outcome="y",
    unit="unit",
    time="time",
    first_treat="first_treat",        # 0 / inf = never-treated
)

Identification spec#

The stage-2 regressor for ring j is the time-varying form

\[(1 - D_{it}) \cdot \mathrm{Ring}_{it,j}\]

where Ring_{it,j} is the indicator that unit i’s nearest currently- treated unit is in distance bin j at time t, and D_it is the treatment indicator (1 if unit i is treated by time t). For non-staggered timing, all treated units share one onset and Ring is unit-static post-treatment / zero pre-treatment. For staggered timing, Ring_{it,j} is unit-time-varying (a unit’s nearest treated neighbor changes as more cohorts enter treatment).

Stage 1 fits unit + time fixed effects on Butts’ subsample Omega_0 = {D_it = 0 AND S_it = 0} (untreated AND unexposed) — the clean far-away control group. Reading the literal unit-static (1 - D_it) * S_i from paper Equation 5 yields a rank-deficient design under TWFE; the diff-diff implementation uses the time-varying form (paper page 12’s S_it notation, made explicit in Section 5 Table 2).

Estimator Comparison#

SpilloverDiD vs. TwoStageDiD vs. TwoWayFixedEffects#

Feature

SpilloverDiD

TwoStageDiD

TwoWayFixedEffects

Spillover handling

Identifies per-ring delta_j

None (assumes SUTVA)

None (assumes SUTVA)

Methodology

Two-stage Gardner + ring covariates

Two-stage Gardner

Single-stage TWFE

Staggered timing

Yes

Yes

Biased under heterogeneity

Stage-1 subsample

Butts: D=0 AND S=0 (untreated AND unexposed)

D=0 (untreated)

N/A (single stage)

Conley spatial-HAC SE

Yes (Wave D GMM-corrected sandwich)

Not yet supported

Yes

Cluster-robust SE

Yes (HC1 + CR1, Wave D GMM-corrected sandwich)

Yes (GMM sandwich + clusters)

Yes

Restrictions and follow-ups#

The current implementation has the following documented restrictions and planned follow-up enhancements:

  • Gardner GMM first-stage correction at stage 2 — SHIPPED in Wave D. Stage-2 variance now applies the influence-function-based correction for stage-1 FE estimation uncertainty across all three vcov_type paths (HC1, Conley, cluster) on both event_study=False AND event_study=True. The IF formula is psi_i = gamma_hat' * X_{10,i} * eps_{10,i} - X_{2,i} * eps_{2,i} with gamma_hat = (X_10' X_10)^{-1} (X_1' X_2); the meat is Psi' K Psi where K is the path-dependent kernel matrix (identity for HC1, block-indicator for cluster, spatial kernel for Conley). Documented synthesis of Butts (2021) Section 3.1 + Gardner (2022) Section 4 + Conley (1999); no reference software combines all three ingredients. Point estimates unchanged; SE values shift upward by 1-few percent depending on first-stage residual variance.

  • Event-study modeevent_study=True is SHIPPED in Wave C. The per-event-time × ring decomposition (Butts Section 5 / Table 2) emits per-event-time direct effects tau_k and per-(ring, event-time) spillover effects delta_jk as a att_dynamic DataFrame plus MultiIndex spillover_effects. The event_study_effects: Dict[int, Dict] alias mirrors TwoStageDiD’s schema for plot_event_study consumption (the plotter prefers the new reference_period attribute over the legacy n_obs==0 heuristic). DiagnosticReport routing for SpilloverDiDResults is queued as a follow-up. Reference period -1 - anticipation (TwoStageDiD parity). horizon_max bins event-times into endpoint pools (no row drop — divergence from TwoStageDiD’s filtering semantic, intentional per feedback_no_silent_failures). horizon_max must be >=1 or None under event_study=True; horizon_max=0 is rejected (the single bin k=0 leaves no event-time pair to anchor the reference period — for a single aggregate effect, use event_study=False instead). Scalar att becomes a sample-share-weighted average of post-treatment tau_k with SE from linear-combination inference on the post-treatment vcov block. Per-event-time SEs apply the Wave D Gardner GMM first-stage uncertainty correction (see the “Gardner GMM first-stage correction” entry above).

  • Survey-design integrationsurvey_design= raises NotImplementedError.

  • Count-of-treated-in-ring — only the “nearest-treated ring” specification is implemented. The “count” form re-introduces functional-form dependence (paper Section 3.2 end) and is queued.

  • Data-driven d_bar selection — Butts (2021b) / Butts (2023) JUE Insight propose cross-validation under stronger parallel-trends assumptions. Not in this PR.

  • HC2 / HC2_BM (Bell-McCaffrey / CR2) variance — current stage-2 inference uses a generic residual-df (n - effective rank) for t-distribution lookups. vcov_type="hc2" / "hc2_bm" require per-coefficient BM / CR2 DOF and raise NotImplementedError. Routing stage 2 through LinearRegression (which supplies the per-coefficient DOF metadata) is queued.

  • `vcov_type=”classical”` (Wave D restriction) — raises NotImplementedError. The Wave D Gardner GMM first-stage uncertainty correction has not been derived for the classical homoskedastic variance (different meat structure sigma_hat^2 * (X_10' X_10) vs the Wave D IF outer product Psi' Psi). Use vcov_type="hc1" for heteroskedasticity-robust SE with the GMM correction, or combine with cluster=<col> for CR1 with the GMM correction; both apply the Wave D synthesis (Butts §3.1 + Gardner §4 + Conley 1999) unconditionally.

  • Balanced panel required — every unit must observe every period. An unbalanced (unit, time) Ω₀ bipartite graph can produce disconnected FE components and unidentified stage-1 residuals on treated rows. Validator rejects unbalanced inputs.

  • Omega_0 row-level identification (period strict, unit warn-drop, plus connectivity) — Every period must have at least one Omega_0 row (else time FE for that period is structurally unidentified; hard ValueError). Units lacking Omega_0 rows (e.g. baseline-treated units with D_it = 1 at every observed t) are warned-and-dropped: their unit FE is NaN and the downstream finite-mask path excludes them from stage 2. Additionally, the supported-units bipartite graph (units linked by shared Omega_0 periods) must form a single connected component; K > 1 components raise ValueError because the FE solver would return only component-specific constants and residualization would silently mix them across components. Mirrors TwoStageDiD’s always-treated unit handling on the warn-drop axis.

  • One row per (unit, time) cell required — duplicate cells silently re-weight both stage-1 FE estimation and stage-2 OLS. Validator rejects duplicate cells; aggregate to unique cells before fitting.

  • rings[0] must equal 0 — the partition must cover treated locations (d_it = 0 belongs to Ring 1). Validator rejects rings that start at a nonzero inner edge to prevent a silent exposed-but-unmodeled population in 0 <= d_it < rings[0].

  • conley_coords must be constant within unit — ring construction uses a single (lat, lon) per unit (units are stationary point locations). Validator rejects coordinates that vary across rows of the same unit; aggregate to one location per unit (e.g. df.groupby(unit)[["lat","lon"]].first()) before fitting if your raw data has moving coordinates. This requirement applies on every fit path, not just vcov_type="conley", because the rings themselves are always coordinate-derived.