Efficient Difference-in-Differences
====================================

Semiparametrically efficient ATT estimator for staggered adoption designs from
Chen, Sant'Anna & Xie (2025).

This module implements the efficiency-bound-attaining estimator that:

1. **Achieves the semiparametric efficiency bound** for ATT(g,t) estimation on
   the no-covariate path
2. **Optimally weights** across comparison groups and baselines via the
   inverse covariance matrix Ω*
3. **Supports two PT assumptions**: PT-All (overidentified, tighter SEs) and
   PT-Post (just-identified, matches CS for post-treatment effects)
4. **Uses EIF-based inference** for analytical standard errors and multiplier
   bootstrap

.. note::

   EfficientDiD supports a doubly-robust covariate path with all nuisances
   estimated nonparametrically: sieve-based propensity score ratios and a
   sieve outcome regression (polynomial basis, AIC/BIC order selection), plus
   the kernel-smoothed conditional covariance. The DR property ensures
   consistency if either the outcome regression or the PS ratio is correctly
   specified, and because the nuisances are growing sieves/kernels the
   covariate path attains the semiparametric efficiency bound asymptotically
   under the paper's regularity conditions (degree 1 reproduces a linear
   working model, and ``sieve_k_max=1`` forces all covariate-path sieves to
   degree 1). Pass column names to the ``covariates`` parameter on ``fit()``.
   See ``docs/methodology/REGISTRY.md`` for the full contract.

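
The inverse-covariance weighting named in the feature list above can be illustrated with a generic minimum-variance combination. This is a minimal sketch of the general idea, not the library's implementation: it assumes several estimators of the same ATT(g,t) with a known covariance matrix, and computes the weights ``w = Ω⁻¹1 / (1'Ω⁻¹1)``. The helper names (``solve_linear``, ``efficient_weights``, ``combined_variance``) and the numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def efficient_weights(omega):
    """Minimum-variance weights that sum to one: Omega^{-1} 1 / (1' Omega^{-1} 1)."""
    z = solve_linear(omega, [1.0] * len(omega))
    total = sum(z)
    return [v / total for v in z]


def combined_variance(omega, w):
    """Variance of the weighted combination, w' Omega w."""
    n = len(w)
    return sum(w[i] * omega[i][j] * w[j] for i in range(n) for j in range(n))


# Hypothetical covariance of two estimators of the same ATT(g,t).
omega = [[1.0, 0.5],
         [0.5, 2.0]]
w = efficient_weights(omega)
var_combined = combined_variance(omega, w)
print(w, var_combined)
```

In this toy case the combined variance (0.875) is below the variance of either single estimator (1.0 and 2.0), which is the intuition behind the tighter standard errors under PT-All.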
**When to use EfficientDiD:**

- Staggered adoption design where you want **maximum efficiency** on the
  no-covariate path
- You believe parallel trends holds across all pre-treatment periods (PT-All)
- You want tighter confidence intervals than Callaway-Sant'Anna
- You need a formal efficiency benchmark for comparing estimators

For covariate-adjusted designs, the doubly-robust path is consistent under
either outcome-regression or propensity-ratio correctness and attains the
efficiency bound under the paper's regularity conditions, with all nuisances
(sieve propensity ratio, sieve outcome regression, kernel covariance)
estimated nonparametrically.

**Reference:** Chen, X., Sant'Anna, P. H. C., & Xie, H. (2025). Efficient
Difference-in-Differences and Event Study Estimators.

.. module:: diff_diff.efficient_did

EfficientDiD
-------------

Main estimator class for Efficient Difference-in-Differences.

.. autoclass:: diff_diff.EfficientDiD
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:
   :inherited-members:

   .. rubric:: Methods

   .. autosummary::

      ~EfficientDiD.fit
      ~EfficientDiD.get_params
      ~EfficientDiD.set_params

EfficientDiDResults
-------------------

Results container for Efficient DiD estimation.

.. autoclass:: diff_diff.efficient_did_results.EfficientDiDResults
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:

   .. rubric:: Methods

   .. autosummary::

      ~EfficientDiDResults.summary
      ~EfficientDiDResults.print_summary
      ~EfficientDiDResults.to_dataframe

EDiDBootstrapResults
--------------------

Bootstrap inference results for Efficient DiD.

.. autoclass:: diff_diff.efficient_did_bootstrap.EDiDBootstrapResults
   :no-index:
   :members:
   :undoc-members:
   :show-inheritance:

Example Usage
-------------

Basic usage::

    from diff_diff import EfficientDiD, generate_staggered_data

    data = generate_staggered_data(n_units=300, n_periods=10,
                                   cohort_periods=[4, 6, 8], seed=42)

    edid = EfficientDiD(pt_assumption="all")
    results = edid.fit(data, outcome='outcome', unit='unit',
                       time='period', first_treat='first_treat',
                       aggregate='all')
    results.print_summary()

PT-Post mode (matches CS for post-treatment ATT)::

    edid_post = EfficientDiD(pt_assumption="post")
    results_post = edid_post.fit(data, outcome='outcome', unit='unit',
                                 time='period', first_treat='first_treat',
                                 aggregate='all')

    print(f"PT-All ATT:  {results.overall_att:.4f} (SE={results.overall_se:.4f})")
    print(f"PT-Post ATT: {results_post.overall_att:.4f} (SE={results_post.overall_se:.4f})")

Bootstrap inference::

    edid_boot = EfficientDiD(pt_assumption="all", n_bootstrap=999, seed=42)
    results_boot = edid_boot.fit(data, outcome='outcome', unit='unit',
                                 time='period', first_treat='first_treat',
                                 aggregate='all')

    print(f"Bootstrap SE: {results_boot.overall_se:.4f}")
    print(f"Bootstrap CI: [{results_boot.overall_conf_int[0]:.4f}, "
          f"{results_boot.overall_conf_int[1]:.4f}]")

Comparison with Other Staggered Estimators
------------------------------------------

.. list-table::
   :header-rows: 1
   :widths: 20 27 27 26

   * - Feature
     - EfficientDiD
     - CallawaySantAnna
     - ImputationDiD
   * - Approach
     - Optimal EIF-based weighting
     - Separate 2x2 DiD aggregation
     - Impute Y(0) via FE model
   * - PT assumption
     - PT-All (stronger) or PT-Post
     - Conditional PT
     - Strict exogeneity
   * - Efficiency
     - Achieves the semiparametric bound on the no-covariate path; the
       doubly-robust covariate path attains it too (under regularity
       conditions) via sieve nuisances
     - Not efficient
     - Efficient under homogeneity
   * - Covariates
     - Supported (doubly robust, sieve-based PS ratio + sieve outcome
       regression)
     - Supported (OR, IPW, DR)
     - Supported
   * - Bootstrap
     - Multiplier bootstrap (EIF)
     - Multiplier bootstrap
     - Multiplier bootstrap
   * - PT-Post equivalence
     - Matches CS post-treatment ATT(g,t)
     - Baseline
     - Different framework
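
The "Multiplier bootstrap (EIF)" entry in the comparison table can be illustrated generically. The following is a rough sketch under stated assumptions, not the code in ``diff_diff``: it takes a vector of already-centered influence-function values (simulated here), computes the analytical standard error as ``sqrt(mean(eif²) / n)``, and compares it with a multiplier bootstrap that uses Rademacher (±1) weights. The weight distribution, the function names, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def eif_standard_error(eif):
    """Analytical SE from centered influence-function values: sqrt(mean(eif^2) / n)."""
    n = len(eif)
    return math.sqrt(sum(v * v for v in eif) / n) / math.sqrt(n)


def multiplier_bootstrap_se(eif, n_bootstrap=999, seed=42):
    """SE from perturbing the EIF with Rademacher weights (an assumed choice)."""
    rng = random.Random(seed)
    n = len(eif)
    draws = []
    for _ in range(n_bootstrap):
        # Each unit's EIF value is multiplied by an independent +1 or -1.
        total = 0.0
        for v in eif:
            total += v if rng.random() < 0.5 else -v
        draws.append(total / n)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
    return math.sqrt(var)


# Simulated stand-in for per-unit EIF values, centered to have mean zero.
sim = random.Random(0)
raw = [sim.gauss(0.0, 1.0) for _ in range(200)]
center = sum(raw) / len(raw)
eif = [v - center for v in raw]

se_analytical = eif_standard_error(eif)
se_bootstrap = multiplier_bootstrap_se(eif)
print(f"analytical SE: {se_analytical:.4f}, bootstrap SE: {se_bootstrap:.4f}")
```

Because the bootstrap only reweights fixed influence-function values, no re-estimation is needed per draw, and the two standard errors should agree closely for a moderate number of draws.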