diff_diff.simulate_power

diff_diff.simulate_power(estimator, n_units=100, n_periods=4, treatment_effect=5.0, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=500, alpha=0.05, effect_sizes=None, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)

Estimate power using Monte Carlo simulation.

This function simulates datasets with known treatment effects and estimates power as the fraction of simulations where the null hypothesis is rejected. Most built-in estimators are supported via an internal registry that selects the appropriate data-generating process and fit signature automatically.

Parameters:

estimator (estimator object) – DiD estimator to use (e.g., DifferenceInDifferences, CallawaySantAnna).
n_units (int, default=100) – Number of units per simulation.
n_periods (int, default=4) – Number of time periods.
treatment_effect (float, default=5.0) – True treatment effect to simulate.
treatment_fraction (float, default=0.5) – Fraction of units that are treated.
treatment_period (int, default=2) – First post-treatment period (0-indexed).
sigma (float, default=1.0) – Residual standard deviation (noise level).
n_simulations (int, default=500) – Number of Monte Carlo simulations.
alpha (float, default=0.05) – Significance level for hypothesis tests.
effect_sizes (list of float, optional) – Multiple effect sizes to evaluate for power curve. If None, uses only treatment_effect.
seed (int, optional) – Random seed for reproducibility.
data_generator (callable, optional) – Custom data generation function. When provided, bypasses the registry DGP and calls this function with the standard kwargs (n_units, n_periods, treatment_effect, etc.).
data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.
estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().
result_extractor (callable, optional) – Custom function to extract results from the estimator output. Takes the estimator result object and returns a tuple of (att, se, p_value, conf_int). Useful for unregistered estimators with non-standard result schemas.
progress (bool, default=True) – Whether to print progress updates.
survey_config (SurveyPowerConfig, optional) – When provided, generates survey-structured data via generate_survey_did_data and injects SurveyDesign into estimator fit(). Mutually exclusive with data_generator. Supported estimators: DiD, TWFE, MultiPeriod, CS, SA, Imputation, TwoStage, Stacked, Efficient. Unsupported: TROP, SyntheticDiD, TripleDifference. heterogeneous_te_by_strata must be False.
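As an illustration of the result_extractor contract described above, the following is a minimal sketch of a callable that returns the (att, se, p_value, conf_int) tuple. The result object and its attribute names (estimate, std_error) are hypothetical stand-ins for an unregistered estimator's output, and the p-value and interval use a two-sided normal approximation, which is an assumption rather than what any particular estimator reports.

```python
from statistics import NormalDist
from types import SimpleNamespace


def extract_from_custom_result(result, alpha=0.05):
    """Map a non-standard result object to (att, se, p_value, conf_int).

    `result.estimate` and `result.std_error` are hypothetical attribute
    names; substitute whatever your estimator's result actually exposes.
    """
    std_normal = NormalDist()
    att = float(result.estimate)
    se = float(result.std_error)
    # Two-sided p-value under a normal approximation (an assumption).
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(att / se)))
    # Matching (1 - alpha) confidence interval.
    crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    conf_int = (att - crit * se, att + crit * se)
    return att, se, p_value, conf_int


# Stand-in for an estimator result, used only to exercise the extractor.
fake_result = SimpleNamespace(estimate=2.0, std_error=1.0)
att, se, p_value, conf_int = extract_from_custom_result(fake_result)
```

A function of this shape would be passed as result_extractor=extract_from_custom_result.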

Returns:

Simulation-based power analysis results.

Return type:

SimulationPowerResults

Examples

Basic power simulation:

>>> from diff_diff import DifferenceInDifferences, simulate_power
>>> did = DifferenceInDifferences()
>>> results = simulate_power(
...     estimator=did,
...     n_units=100,
...     treatment_effect=5.0,
...     sigma=5.0,
...     n_simulations=500,
...     seed=42
... )
>>> print(f"Power: {results.power:.1%}")

Power curve over multiple effect sizes:

>>> results = simulate_power(
...     estimator=did,
...     effect_sizes=[1.0, 2.0, 3.0, 5.0, 7.0],
...     n_simulations=200,
...     seed=42
... )
>>> print(results.power_curve_df())

With Callaway-Sant’Anna (auto-detected, no custom DGP needed):

>>> from diff_diff import CallawaySantAnna
>>> cs = CallawaySantAnna()
>>> results = simulate_power(cs, n_simulations=200, seed=42)

Notes

The simulation approach:

1. Generate data with a known treatment effect.
2. Fit the estimator and record the p-value.
3. Repeat n_simulations times.
4. Power = fraction of simulations where p-value < alpha.
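The four steps can be sketched without the library. The code below is not the implementation of simulate_power: it assumes a simple data-generating process (unit fixed effect plus Gaussian noise), a hand-rolled difference-in-differences estimate from unit-level pre/post changes, and a normal-approximation test. The function name simulate_power_sketch is hypothetical; its parameter names mirror the signature above for readability only.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_power_sketch(n_units=100, n_periods=4, treatment_effect=5.0,
                          treatment_fraction=0.5, treatment_period=2,
                          sigma=1.0, n_simulations=500, alpha=0.05, seed=None):
    """Monte Carlo power for a basic DiD design (illustrative only)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    n_treated = int(n_units * treatment_fraction)
    n_pre = treatment_period               # periods 0 .. treatment_period - 1
    n_post = n_periods - treatment_period  # periods treatment_period .. end
    rejections = 0
    for _ in range(n_simulations):
        # Step 1: generate data with a known treatment effect.
        changes = {True: [], False: []}
        for unit in range(n_units):
            treated = unit < n_treated
            unit_effect = rng.gauss(0.0, 1.0)  # removed by differencing
            bump = treatment_effect if treated else 0.0
            pre = [unit_effect + rng.gauss(0.0, sigma) for _ in range(n_pre)]
            post = [unit_effect + bump + rng.gauss(0.0, sigma)
                    for _ in range(n_post)]
            changes[treated].append(mean(post) - mean(pre))
        # Step 2: estimate the effect and record the p-value.
        att = mean(changes[True]) - mean(changes[False])
        se = math.sqrt(variance(changes[True]) / len(changes[True])
                       + variance(changes[False]) / len(changes[False]))
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(att / se)))
        # Steps 3 and 4: repeat and count rejections.
        if p_value < alpha:
            rejections += 1
    return rejections / n_simulations


power = simulate_power_sketch(treatment_effect=1.0, sigma=5.0,
                              n_simulations=200, seed=42)
print(f"Power: {power:.1%}")
```

Because power is estimated as a proportion, its Monte Carlo error shrinks with n_simulations: for a true power near 0.8 and 500 simulations, the binomial standard error sqrt(0.8 * 0.2 / 500) is about 0.018.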

The analytical reference formulas that this Monte Carlo path complements (the Bloom 1995 normal multiplier and the Burlig et al. 2020 Eq. 2 equicorrelated panel variance) are documented in the PowerAnalysis section of docs/methodology/REGISTRY.md.
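For orientation, the normal multiplier can be computed directly. The sketch below uses the standard two-sided normal approximation: the minimum detectable effect is the multiplier z_{1-alpha/2} + z_{power} times the standard error of the estimate. The helper names are hypothetical, the exact form used in REGISTRY.md is not reproduced here, and the Burlig et al. panel variance (which would supply the standard error) is deliberately left out.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def mde_multiplier(alpha=0.05, power=0.80):
    """Two-sided normal multiplier: z_{1 - alpha/2} + z_{power}."""
    return (_STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
            + _STD_NORMAL.inv_cdf(power))


def analytical_power(effect, se, alpha=0.05):
    """Power of a two-sided z-test for a given true effect and standard error."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(effect) / se
    # Probability of rejecting in either tail under the alternative.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


multiplier = mde_multiplier()              # roughly 2.80 at alpha=0.05, power=0.80
check = analytical_power(multiplier, 1.0)  # recovers roughly 0.80
```

Comparing analytical_power against a simulated power for the same effect and standard error is a quick sanity check on a simulation setup.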

References

Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”