diff_diff.EfficientDiD
- class diff_diff.EfficientDiD
Bases: EfficientDiDBootstrapMixin
Efficient DiD estimator (Chen, Sant’Anna & Xie 2025).
Without covariates, achieves the semiparametric efficiency bound for ATT(g,t) using a closed-form estimator based on within-group sample means and covariances.
With covariates, uses a doubly robust path: sieve-based propensity score ratios (Eq 4.1-4.2), OLS outcome regression, sieve-estimated inverse propensities (algorithm step 4), and kernel-smoothed conditional Omega*(X) with per-unit efficient weights (Eq 3.12). The DR property ensures consistency if either the OLS outcome model or the sieve propensity ratio is correctly specified. The OLS working model for outcome regressions does not generically guarantee the semiparametric efficiency bound (see REGISTRY.md).
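The role of the covariance matrix Omega* in forming efficient weights can be illustrated with the generic minimum-variance rule for combining several estimates of the same quantity, w = Omega^{-1} 1 / (1' Omega^{-1} 1). The sketch below is a plain-Python illustration of that textbook rule for two estimates; the function names are hypothetical, and it is not claimed to reproduce the library's internal code or the exact form of Eq 3.12.

```python
def efficient_weights_2x2(omega):
    """Minimum-variance weights w = Omega^{-1} 1 / (1' Omega^{-1} 1).

    `omega` is a 2x2 covariance matrix given as nested lists.
    Hypothetical helper for illustration only.
    """
    (a, b), (c, d) = omega
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Omega^{-1} = (1/det) * [[d, -b], [-c, a]]; multiply by the ones vector.
    inv_times_one = [(d - b) / det, (a - c) / det]
    total = sum(inv_times_one)
    return [v / total for v in inv_times_one]


def combine(estimates, weights):
    """Weighted combination of estimates that target the same parameter."""
    return sum(w * e for w, e in zip(weights, estimates))


# Two estimates of the same effect; the first is more precise.
omega = [[1.0, 0.5], [0.5, 2.0]]
weights = efficient_weights_2x2(omega)
combined = combine([1.0, 2.0], weights)
```

The weights sum to one, and the noisier estimate receives the smaller weight. That is the sense in which an overidentified set of comparisons can be pooled more efficiently than any single baseline.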
- Parameters:
pt_assumption (str, default "all") – Parallel trends variant: "all" (overidentified, uses all pre-treatment periods and comparison groups) or "post" (just-identified, single baseline, equivalent to the Callaway-Sant’Anna (CS) estimator).
alpha (float, default 0.05) – Significance level.
cluster (str or None) – Column name for cluster-robust SEs. When set, analytical SEs use the Liang-Zeger clustered sandwich estimator on efficient influence function (EIF) values. With n_bootstrap > 0, bootstrap weights are generated at the cluster level (all units in a cluster share the same weight).
control_group (str, default "never_treated") – Which units serve as the comparison group: "never_treated" requires a never-treated cohort (raises if none exist); "last_cohort" reclassifies the latest treatment cohort as pseudo-never-treated and drops periods at t >= last_g - anticipation so the pseudo-control’s pre-treatment window excludes anticipation-contaminated periods. Distinct from CallawaySantAnna’s "not_yet_treated"; see REGISTRY.md for details.
n_bootstrap (int, default 0) – Number of multiplier bootstrap iterations (0 = analytical only).
bootstrap_weights (str, default "rademacher") – Bootstrap weight distribution.
seed (int or None) – Random seed for reproducibility.
anticipation (int, default 0) – Number of anticipation periods (shifts the effective treatment boundary forward by this amount). When combined with control_group="last_cohort", also trims the pseudo-control period set at t >= last_g - anticipation (see REGISTRY.md).
sieve_k_max (int or None) – Maximum polynomial degree for sieve ratio estimation. None = auto (min(floor(n_gp^{1/5}), 5)). Only used with covariates.
sieve_criterion (str, default "bic") – Information criterion for sieve degree selection: "aic" or "bic".
ratio_clip (float, default 20.0) – Clip sieve propensity ratios to [1/ratio_clip, ratio_clip].
kernel_bandwidth (float or None) – Bandwidth for Gaussian kernel in conditional Omega* estimation. None = Silverman’s rule-of-thumb (automatic).
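Three of the rules stated in the parameter descriptions are simple enough to sketch directly: the automatic sieve degree min(floor(n_gp^{1/5}), 5), clipping of propensity ratios to [1/ratio_clip, ratio_clip], and Rademacher bootstrap weights shared within a cluster. The helper names below are hypothetical and use only the standard library. n_gp is taken as a given integer sample size, since its precise definition is not stated on this page, and none of this is claimed to be the library's own implementation.

```python
import math
import random


def default_sieve_k_max(n_gp):
    """Auto rule from the sieve_k_max description: min(floor(n_gp^(1/5)), 5)."""
    return min(math.floor(n_gp ** (1 / 5)), 5)


def clip_ratio(ratio, ratio_clip=20.0):
    """Clip a propensity ratio to the interval [1/ratio_clip, ratio_clip]."""
    return min(max(ratio, 1.0 / ratio_clip), ratio_clip)


def cluster_rademacher_weights(cluster_ids, seed=None):
    """Draw one +1/-1 weight per cluster and assign it to every unit in it."""
    rng = random.Random(seed)
    draw = {}
    for cluster in cluster_ids:
        if cluster not in draw:
            draw[cluster] = rng.choice((-1.0, 1.0))
    return [draw[cluster] for cluster in cluster_ids]


k_small = default_sieve_k_max(100)      # 100^(1/5) is about 2.51
k_large = default_sieve_k_max(10_000)   # 10000^(1/5) is about 6.31, capped at 5
clipped_high = clip_ratio(50.0)
clipped_low = clip_ratio(0.01)
unit_weights = cluster_rademacher_weights(["a", "a", "b", "b", "a"], seed=0)
```

Drawing weights per cluster rather than per unit keeps within-cluster dependence intact across bootstrap iterations, which is the point of the cluster-level scheme described under cluster.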
Examples
>>> from diff_diff import EfficientDiD
>>> edid = EfficientDiD(pt_assumption="all")
>>> results = edid.fit(data, outcome="y", unit="id", time="t",
...                    first_treat="first_treat", aggregate="all")
>>> results.print_summary()
Methods
__init__([pt_assumption, alpha, cluster, ...])
fit(data, outcome, unit, time, first_treat) – Fit the Efficient DiD estimator.
get_params() – Get estimator parameters (sklearn-compatible).
hausman_pretest(data, outcome, unit, time, ...) – Hausman pretest for PT-All vs PT-Post (Theorem A.1).
print_summary() – Print summary to stdout.
set_params(**params) – Set estimator parameters (sklearn-compatible).
summary() – Get summary of estimation results.
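"sklearn-compatible" for get_params and set_params usually means the scikit-learn convention: get_params returns a dict of constructor arguments, and set_params updates them by keyword, rejects unknown names, and returns the estimator. The toy class below is a hypothetical stand-in that illustrates the convention. It is not part of diff_diff, and details such as the return value of set_params are not confirmed by this page for EfficientDiD.

```python
class ToyEstimator:
    """Hypothetical estimator showing the get_params/set_params convention."""

    def __init__(self, alpha=0.05, n_bootstrap=0, seed=None):
        self.alpha = alpha
        self.n_bootstrap = n_bootstrap
        self.seed = seed

    def get_params(self):
        # Return constructor arguments as a plain dict.
        return {
            "alpha": self.alpha,
            "n_bootstrap": self.n_bootstrap,
            "seed": self.seed,
        }

    def set_params(self, **params):
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self  # returning self allows chained calls


est = ToyEstimator().set_params(n_bootstrap=999, seed=42)
params = est.get_params()
```

Under this convention a configuration can be copied between estimators with `other.set_params(**est.get_params())`.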
Attributes
n_bootstrap
bootstrap_weights
alpha
seed
anticipation
- __init__(pt_assumption='all', alpha=0.05, cluster=None, control_group='never_treated', n_bootstrap=0, bootstrap_weights='rademacher', seed=None, anticipation=0, sieve_k_max=None, sieve_criterion='bic', ratio_clip=20.0, kernel_bandwidth=None)
- classmethod __new__(*args, **kwargs)