diff_diff.EfficientDiD

class diff_diff.EfficientDiD

Bases: EfficientDiDBootstrapMixin

Efficient DiD estimator (Chen, Sant’Anna & Xie 2025).

Without covariates, achieves the semiparametric efficiency bound for ATT(g,t) using a closed-form estimator based on within-group sample means and covariances.
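One way to read "closed-form estimator based on sample means and covariances" is as a minimum-variance combination of several estimates of the same ATT(g,t). The sketch below shows that generic idea only: the weights are w = Omega^{-1} 1 / (1' Omega^{-1} 1). The function names, the two estimates, and the covariance matrix are made up for illustration, and whether the library computes its weights exactly this way is an assumption.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def min_variance_weights(omega):
    """Weights proportional to Omega^{-1} 1, normalized to sum to one."""
    ones = [1.0] * len(omega)
    v = solve(omega, ones)
    total = sum(v)
    return [x / total for x in v]


# Two hypothetical estimates of the same ATT(g,t) with a made-up covariance.
estimates = [1.2, 0.8]
omega = [[1.0, 0.0], [0.0, 4.0]]
weights = min_variance_weights(omega)
combined = sum(w * e for w, e in zip(weights, estimates))
```

With a diagonal covariance this reduces to inverse-variance weighting, so the more precise estimate receives the larger weight.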

With covariates, uses a doubly robust path: sieve-based propensity score ratios (Eq 4.1-4.2), OLS outcome regression, sieve-estimated inverse propensities (algorithm step 4), and kernel-smoothed conditional Omega*(X) with per-unit efficient weights (Eq 3.12). The DR property ensures consistency if either the OLS outcome model or the sieve propensity ratio is correctly specified. The OLS working model for outcome regressions does not generically guarantee the semiparametric efficiency bound (see REGISTRY.md).
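To make the "kernel-smoothed" step more concrete, here is a minimal univariate sketch of Gaussian kernel smoothing with a rule-of-thumb bandwidth. It is an illustration under assumptions, not the library's code: the helper names are hypothetical, the bandwidth uses one common form of Silverman's rule (1.06 * sd * n^(-1/5)), and the library's handling of multiple covariates is not shown.

```python
import math
import statistics


def silverman_bandwidth(x):
    """One common rule-of-thumb form: 1.06 * sd * n^(-1/5) (assumed variant)."""
    return 1.06 * statistics.stdev(x) * len(x) ** (-1 / 5)


def gaussian_kernel_weights(x, x0, bandwidth):
    """Normalized Gaussian kernel weights of each x[i] around the point x0."""
    raw = [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in x]
    total = sum(raw)
    return [r / total for r in raw]


def kernel_smooth(x, y, x0, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0]."""
    w = gaussian_kernel_weights(x, x0, bandwidth)
    return sum(wi * yi for wi, yi in zip(w, y))


# Toy covariate values and a constant outcome, purely for illustration.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 1.0, 1.0, 1.0, 1.0]
h = silverman_bandwidth(x)
weights = gaussian_kernel_weights(x, 2.0, h)
smoothed = kernel_smooth(x, y, 2.0, h)
```

Observations close to the evaluation point dominate the local average, which is how a conditional quantity such as Omega*(X) can vary with X.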

Parameters:

pt_assumption (str, default "all") – Parallel trends variant: "all" (overidentified, uses all pre-treatment periods and comparison groups) or "post" (just-identified, single baseline, equivalent to the Callaway-Sant’Anna (CS) estimator).
alpha (float, default 0.05) – Significance level.
cluster (str or None) – Column name for cluster-robust SEs. When set, analytical SEs use the Liang-Zeger clustered sandwich estimator on EIF values. With n_bootstrap > 0, bootstrap weights are generated at the cluster level (all units in a cluster share the same weight).
control_group (str, default "never_treated") – Which units serve as the comparison group: "never_treated" requires a never-treated cohort (raises an error if none exists); "last_cohort" reclassifies the latest treatment cohort as pseudo-never-treated and drops periods with t >= last_g - anticipation, so the pseudo-control’s pre-treatment window excludes anticipation-contaminated periods. This is distinct from CallawaySantAnna’s "not_yet_treated"; see REGISTRY.md for details.
n_bootstrap (int, default 0) – Number of multiplier bootstrap iterations (0 = analytical only).
bootstrap_weights (str, default "rademacher") – Bootstrap weight distribution.
seed (int or None) – Random seed for reproducibility.
anticipation (int, default 0) – Number of anticipation periods: units are treated as effectively exposed from g - anticipation, so the effective treatment boundary moves earlier by this amount. When combined with control_group="last_cohort", the pseudo-control period set also drops periods with t >= last_g - anticipation (see REGISTRY.md).
sieve_k_max (int or None) – Maximum polynomial degree for sieve ratio estimation. None = auto (min(floor(n_gp^{1/5}), 5)). Only used with covariates.
sieve_criterion (str, default "bic") – Information criterion for sieve degree selection: "aic" or "bic".
ratio_clip (float, default 20.0) – Clip sieve propensity ratios to [1/ratio_clip, ratio_clip].
kernel_bandwidth (float or None) – Bandwidth for Gaussian kernel in conditional Omega* estimation. None = Silverman’s rule-of-thumb (automatic).
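The two numeric rules stated for sieve_k_max and ratio_clip can be written out directly. The sketch below is a plain restatement of those formulas with hypothetical helper names; n_gp is taken here to be the sample size used for the sieve fit, which is an assumption about the notation.

```python
import math


def auto_sieve_k_max(n_gp):
    """Automatic maximum degree: min(floor(n_gp^(1/5)), 5)."""
    return min(math.floor(n_gp ** (1 / 5)), 5)


def clip_ratio(ratio, ratio_clip=20.0):
    """Clip a propensity ratio to the interval [1/ratio_clip, ratio_clip]."""
    lower, upper = 1.0 / ratio_clip, ratio_clip
    return max(lower, min(upper, ratio))


k_small = auto_sieve_k_max(100)     # fifth root of 100 is about 2.51
k_large = auto_sieve_k_max(10_000)  # fifth root is about 6.31, capped at 5
clipped_high = clip_ratio(50.0)
clipped_low = clip_ratio(0.01)
```

The fifth-root rule grows very slowly with sample size, so the cap of 5 only binds for samples in the thousands.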

Examples

>>> from diff_diff import EfficientDiD
>>> edid = EfficientDiD(pt_assumption="all")
>>> results = edid.fit(data, outcome="y", unit="id", time="t",
...                    first_treat="first_treat", aggregate="all")
>>> results.print_summary()
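The example above leaves `data` undefined. A minimal sketch of a long-format panel with the four column names used in the call (y, id, t, first_treat) might look like the following. The values are synthetic, coding never-treated units as first_treat = 0 is an assumption, and converting the rows to whatever tabular type fit expects is left out.

```python
import random

rng = random.Random(0)

# unit id -> first treatment period; 0 marks never-treated units (assumed coding)
cohorts = {1: 0, 2: 0, 3: 3, 4: 3, 5: 4, 6: 4}

rows = []
for unit, g in cohorts.items():
    for t in range(1, 6):
        treated = g != 0 and t >= g
        # Common linear trend plus a unit treatment effect and small noise.
        y = 0.5 * t + (1.0 if treated else 0.0) + rng.gauss(0.0, 0.1)
        rows.append({"id": unit, "t": t, "first_treat": g, "y": y})
```

Each unit appears once per period and carries the same first_treat value in every row, which is the usual shape for staggered-adoption panels.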

Methods

`__init__`([pt_assumption, alpha, cluster, ...])
`fit`(data, outcome, unit, time, first_treat)	Fit the Efficient DiD estimator.
`get_params`()	Get estimator parameters (sklearn-compatible).
`hausman_pretest`(data, outcome, unit, time, ...)	Hausman pretest for PT-All vs PT-Post (Theorem A.1).
`print_summary`()	Print summary to stdout.
`set_params`(**params)	Set estimator parameters (sklearn-compatible).
`summary`()	Get summary of estimation results.
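For readers unfamiliar with the "sklearn-compatible" convention mentioned for get_params and set_params, the sketch below shows what that convention usually means: get_params returns the constructor arguments as a dict, and set_params updates them and returns the estimator itself. ParamsSketch is a hypothetical stand-in with only three parameters, not the library's class, and rejecting unknown names with ValueError is an assumption.

```python
class ParamsSketch:
    """Hypothetical stand-in illustrating the get_params/set_params convention."""

    def __init__(self, alpha=0.05, n_bootstrap=0, seed=None):
        self.alpha = alpha
        self.n_bootstrap = n_bootstrap
        self.seed = seed

    def get_params(self):
        """Return the current constructor parameters as a dict."""
        return {
            "alpha": self.alpha,
            "n_bootstrap": self.n_bootstrap,
            "seed": self.seed,
        }

    def set_params(self, **params):
        """Update known parameters and return self so calls can be chained."""
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self


est = ParamsSketch().set_params(n_bootstrap=999, seed=42)
params = est.get_params()
```

Returning self from set_params is what allows the chained construction in the last lines.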

Attributes

`n_bootstrap`
`bootstrap_weights`
`alpha`
`seed`
`anticipation`
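The n_bootstrap, bootstrap_weights, and seed attributes relate to the multiplier bootstrap, where weights are drawn at the cluster level when cluster is set. A minimal sketch of that idea for Rademacher weights follows: one draw of -1 or +1 per cluster, shared by every unit in it. The function name and cluster labels are hypothetical, and this is not the library's implementation.

```python
import random


def cluster_rademacher_weights(cluster_ids, rng):
    """Draw one Rademacher weight per cluster and broadcast it to its units."""
    draws = {c: rng.choice((-1.0, 1.0)) for c in sorted(set(cluster_ids))}
    return [draws[c] for c in cluster_ids]


# Six units in three hypothetical clusters; the seed makes the draw repeatable.
cluster_ids = ["a", "a", "b", "b", "b", "c"]
weights = cluster_rademacher_weights(cluster_ids, random.Random(123))
weights_again = cluster_rademacher_weights(cluster_ids, random.Random(123))
```

Sorting the cluster labels before drawing keeps the result reproducible for a given seed regardless of row order.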

__init__(pt_assumption='all', alpha=0.05, cluster=None, control_group='never_treated', n_bootstrap=0, bootstrap_weights='rademacher', seed=None, anticipation=0, sieve_k_max=None, sieve_criterion='bic', ratio_clip=20.0, kernel_bandwidth=None)

Parameters:

pt_assumption (str)
alpha (float)
cluster (str | None)
control_group (str)
n_bootstrap (int)
bootstrap_weights (str)
seed (int | None)
anticipation (int)
sieve_k_max (int | None)
sieve_criterion (str)
ratio_clip (float)
kernel_bandwidth (float | None)
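The interaction between control_group="last_cohort" and anticipation comes down to one rule stated in the class parameters: periods with t >= last_g - anticipation are dropped. A minimal sketch of that rule, with a hypothetical helper name and made-up cohorts and periods:

```python
def last_cohort_periods(periods, cohorts, anticipation=0):
    """Return the latest cohort and the periods kept under the trimming rule."""
    last_g = max(cohorts)
    cutoff = last_g - anticipation
    # Keep only periods strictly before the cutoff, i.e. drop t >= cutoff.
    kept = [t for t in periods if t < cutoff]
    return last_g, kept


# Periods 1..8, cohorts first treated at 4 and 6, one anticipation period.
last_g, kept = last_cohort_periods(range(1, 9), cohorts=[4, 6], anticipation=1)
```

Here the latest cohort (first treated at 6) acts as the pseudo-control, and with one anticipation period only periods 1 through 4 remain.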

classmethod __new__(*args, **kwargs)