diff_diff.simulate_power
- diff_diff.simulate_power(estimator, n_units=100, n_periods=4, treatment_effect=5.0, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=500, alpha=0.05, effect_sizes=None, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)
Estimate power using Monte Carlo simulation.
This function simulates datasets with known treatment effects and estimates power as the fraction of simulations where the null hypothesis is rejected. Most built-in estimators are supported via an internal registry that selects the appropriate data-generating process and fit signature automatically.
- Parameters:
estimator (estimator object) – DiD estimator to use (e.g., DifferenceInDifferences, CallawaySantAnna).
n_units (int, default=100) – Number of units per simulation.
n_periods (int, default=4) – Number of time periods.
treatment_effect (float, default=5.0) – True treatment effect to simulate.
treatment_fraction (float, default=0.5) – Fraction of units that are treated.
treatment_period (int, default=2) – First post-treatment period (0-indexed).
sigma (float, default=1.0) – Residual standard deviation (noise level).
n_simulations (int, default=500) – Number of Monte Carlo simulations.
alpha (float, default=0.05) – Significance level for hypothesis tests.
effect_sizes (list of float, optional) – Multiple effect sizes to evaluate for a power curve. If None, only treatment_effect is used.
seed (int, optional) – Random seed for reproducibility.
data_generator (callable, optional) – Custom data generation function. When provided, bypasses the registry DGP and calls this function with the standard kwargs (n_units, n_periods, treatment_effect, etc.).
data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.
estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().
result_extractor (callable, optional) – Custom function to extract results from the estimator output. Takes the estimator result object and returns a tuple of (att, se, p_value, conf_int). Useful for unregistered estimators with non-standard result schemas.
progress (bool, default=True) – Whether to print progress updates.
survey_config (SurveyPowerConfig, optional) – When provided, generates survey-structured data via generate_survey_did_data and injects SurveyDesign into the estimator's fit(). Mutually exclusive with data_generator. Supported estimators: DiD, TWFE, MultiPeriod, CS, SA, Imputation, TwoStage, Stacked, Efficient. Unsupported: TROP, SyntheticDiD, TripleDifference. heterogeneous_te_by_strata must be False.
- Returns:
Simulation-based power analysis results.
- Return type:
Examples
Basic power simulation:
>>> from diff_diff import DifferenceInDifferences, simulate_power
>>> did = DifferenceInDifferences()
>>> results = simulate_power(
...     estimator=did,
...     n_units=100,
...     treatment_effect=5.0,
...     sigma=5.0,
...     n_simulations=500,
...     seed=42
... )
>>> print(f"Power: {results.power:.1%}")
Power curve over multiple effect sizes:
>>> results = simulate_power(
...     estimator=did,
...     effect_sizes=[1.0, 2.0, 3.0, 5.0, 7.0],
...     n_simulations=200,
...     seed=42
... )
>>> print(results.power_curve_df())
With Callaway-Sant’Anna (auto-detected, no custom DGP needed):
>>> from diff_diff import CallawaySantAnna
>>> cs = CallawaySantAnna()
>>> results = simulate_power(cs, n_simulations=200, seed=42)
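For an unregistered estimator, the result_extractor parameter expects a callable that takes the estimator's result object and returns (att, se, p_value, conf_int). The following is a minimal sketch of such a callable. The attribute names on the result object (effect, std_err, pval, ci_low, ci_high) are hypothetical and stand in for whatever a given estimator actually exposes; the stand-in object exists only so the snippet runs on its own.

```python
from types import SimpleNamespace


def extract_results(result):
    """Map a non-standard result object to (att, se, p_value, conf_int)."""
    # The attribute names read here are hypothetical placeholders;
    # replace them with the fields of the real result object.
    return (
        float(result.effect),
        float(result.std_err),
        float(result.pval),
        (float(result.ci_low), float(result.ci_high)),
    )


# Stand-in result object, used only to exercise the extractor.
fake_result = SimpleNamespace(
    effect=4.8, std_err=1.2, pval=0.0001, ci_low=2.4, ci_high=7.2
)
att, se, p_value, conf_int = extract_results(fake_result)
```

A function of this shape would then be passed as result_extractor=extract_results in the simulate_power call.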
Notes
The simulation approach:
1. Generate data with a known treatment effect.
2. Fit the estimator and record the p-value.
3. Repeat n_simulations times.
4. Power = fraction of simulations where p-value < alpha.
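The four steps can be illustrated with a toy, standard-library-only loop. This is a sketch under simplifying assumptions, not the library's implementation: it uses a two-period design, estimates the effect as the difference in mean pre-to-post changes between treated and control units, and computes the p-value from a normal approximation. The function name simulate_power_2x2 is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_power_2x2(n_units=100, treatment_effect=5.0, treatment_fraction=0.5,
                       sigma=1.0, n_simulations=500, alpha=0.05, seed=None):
    """Toy Monte Carlo power estimate for a two-period DiD (illustrative only)."""
    rng = random.Random(seed)
    n_treated = int(n_units * treatment_fraction)
    rejections = 0
    for _ in range(n_simulations):  # Step 3: repeat n_simulations times.
        # Step 1: generate data with a known effect. Each unit contributes its
        # post-minus-pre change, which differences out any unit-level constant.
        treated = [treatment_effect + rng.gauss(0, sigma) - rng.gauss(0, sigma)
                   for _ in range(n_treated)]
        control = [rng.gauss(0, sigma) - rng.gauss(0, sigma)
                   for _ in range(n_units - n_treated)]
        # Step 2: estimate the effect and a two-sided p-value
        # (normal approximation to the test statistic).
        att = mean(treated) - mean(control)
        se = math.sqrt(variance(treated) / len(treated)
                       + variance(control) / len(control))
        p_value = 2 * (1 - NormalDist().cdf(abs(att / se)))
        if p_value < alpha:
            rejections += 1
    # Step 4: power is the fraction of simulations that rejected the null.
    return rejections / n_simulations
```

Because the estimate is a proportion over n_simulations independent draws, its Monte Carlo standard error is roughly sqrt(power * (1 - power) / n_simulations), which is one way to judge whether n_simulations is large enough.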
The analytical reference formulas this Monte Carlo path complements (the Bloom 1995 normal multiplier and the Burlig et al. 2020 Eq. 2 equicorrelated panel variance) are documented in docs/methodology/REGISTRY.md, ## PowerAnalysis.
References
Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”