diff_diff.simulate_power
- diff_diff.simulate_power(estimator, n_units=100, n_periods=4, treatment_effect=5.0, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=500, alpha=0.05, effect_sizes=None, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, progress=True)
Estimate power using Monte Carlo simulation.
This function simulates datasets with known treatment effects and estimates power as the fraction of simulations where the null hypothesis is rejected. This is the recommended approach for complex designs like staggered adoption.
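Because power is estimated as a fraction of rejections across simulations, the estimate itself carries Monte Carlo error that shrinks as n_simulations grows. The sketch below is not part of diff_diff; the helper name power_standard_error is hypothetical. It uses the usual binomial standard error for a proportion to gauge how precise a simulated power value is.

```python
import math


def power_standard_error(power: float, n_simulations: int) -> float:
    """Binomial standard error of a power estimate based on n_simulations draws.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must be between 0 and 1")
    if n_simulations <= 0:
        raise ValueError("n_simulations must be positive")
    return math.sqrt(power * (1.0 - power) / n_simulations)


# With the default n_simulations=500 and an estimated power of 80%:
se = power_standard_error(0.80, 500)
print(f"SE: {se:.3f}")  # prints "SE: 0.018"
```

Under these assumptions, an estimated power of 80% from 500 simulations is accurate to roughly plus or minus 3.5 percentage points at the 95% level, which can help when choosing n_simulations.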
- Parameters:
estimator (estimator object) – DiD estimator to use (e.g., DifferenceInDifferences, CallawaySantAnna).
n_units (int, default=100) – Number of units per simulation.
n_periods (int, default=4) – Number of time periods.
treatment_effect (float, default=5.0) – True treatment effect to simulate.
treatment_fraction (float, default=0.5) – Fraction of units that are treated.
treatment_period (int, default=2) – First post-treatment period (0-indexed).
sigma (float, default=1.0) – Residual standard deviation (noise level).
n_simulations (int, default=500) – Number of Monte Carlo simulations.
alpha (float, default=0.05) – Significance level for hypothesis tests.
effect_sizes (list of float, optional) – Multiple effect sizes to evaluate for a power curve. If None, only treatment_effect is used.
seed (int, optional) – Random seed for reproducibility.
data_generator (callable, optional) – Custom data generation function. Should accept the same signature as generate_did_data(). If None, generate_did_data() is used.
data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.
estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().
progress (bool, default=True) – Whether to print progress updates.
- Returns:
Simulation-based power analysis results.
- Return type:
Examples
Basic power simulation:
>>> from diff_diff import DifferenceInDifferences, simulate_power
>>> did = DifferenceInDifferences()
>>> results = simulate_power(
...     estimator=did,
...     n_units=100,
...     treatment_effect=5.0,
...     sigma=5.0,
...     n_simulations=500,
...     seed=42
... )
>>> print(f"Power: {results.power:.1%}")
Power curve over multiple effect sizes:
>>> results = simulate_power(
...     estimator=did,
...     effect_sizes=[1.0, 2.0, 3.0, 5.0, 7.0],
...     n_simulations=200,
...     seed=42
... )
>>> print(results.power_curve_df())
With Callaway-Sant’Anna for staggered designs:
>>> from diff_diff import CallawaySantAnna
>>> cs = CallawaySantAnna()
>>> # Custom data generator for staggered adoption
>>> def staggered_data(n_units, n_periods, treatment_effect, **kwargs):
...     # Your staggered data generation logic
...     ...
>>> results = simulate_power(cs, data_generator=staggered_data, ...)
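One way the placeholder generator in the staggered example might be filled in is sketched below. This is an illustration under stated assumptions, not the format diff_diff requires: the column names (unit, period, first_treat, treated, outcome), the cohorts and sigma keyword arguments, the use of None for never-treated units, and the list-of-dicts return value are all hypothetical. Check generate_did_data() for the columns and return type the library actually expects; the rows can be passed to pandas.DataFrame(rows) if a DataFrame is needed.

```python
import random


def staggered_data(n_units, n_periods, treatment_effect, **kwargs):
    """Hypothetical staggered-adoption panel, returned as a list of row dicts."""
    sigma = kwargs.get("sigma", 1.0)
    # Assumed cohort start periods (0-indexed); None marks never-treated units.
    cohorts = kwargs.get("cohorts", [None, 2, 3])
    rng = random.Random(kwargs.get("seed"))

    rows = []
    for unit in range(n_units):
        # Assign units to cohorts in round-robin order.
        first_treat = cohorts[unit % len(cohorts)]
        unit_effect = rng.gauss(0.0, 1.0)
        for period in range(n_periods):
            # A unit is treated from its cohort's start period onward.
            treated = int(first_treat is not None and period >= first_treat)
            outcome = (
                unit_effect                    # unit fixed effect
                + 0.5 * period                 # common linear time trend
                + treatment_effect * treated   # constant treatment effect
                + rng.gauss(0.0, sigma)        # idiosyncratic noise
            )
            rows.append({
                "unit": unit,
                "period": period,
                "first_treat": first_treat,
                "treated": treated,
                "outcome": outcome,
            })
    return rows
```

The sketch uses a constant treatment effect and a common linear trend for simplicity; a realistic generator would likely vary effects by cohort or by time since treatment.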
Notes
The simulation approach:
1. Generate data with a known treatment effect.
2. Fit the estimator and record the p-value.
3. Repeat n_simulations times.
4. Power = fraction of simulations where p-value < alpha.
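The four steps can be sketched with the standard library alone. This is a simplified illustration, not the diff_diff implementation: it assumes a two-period design, estimates the effect as the difference in mean pre/post changes between treated and control units, and uses a normal approximation for the p-value. The function names simulate_once and monte_carlo_power are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def simulate_once(rng, n_units, effect, sigma, treatment_fraction=0.5):
    """Steps 1-2: simulate one two-period panel and return a two-sided p-value."""
    n_treated = int(n_units * treatment_fraction)
    treated_changes, control_changes = [], []
    for unit in range(n_units):
        is_treated = unit < n_treated
        # Unit fixed effects cancel in the pre/post change, so only noise
        # and (for treated units) the treatment effect are simulated.
        pre = rng.gauss(0.0, sigma)
        post = rng.gauss(0.0, sigma) + (effect if is_treated else 0.0)
        (treated_changes if is_treated else control_changes).append(post - pre)

    # Difference-in-differences estimate and its standard error.
    estimate = mean(treated_changes) - mean(control_changes)
    std_error = math.sqrt(
        variance(treated_changes) / len(treated_changes)
        + variance(control_changes) / len(control_changes)
    )
    z = estimate / std_error
    # Normal approximation (an assumption; a t-distribution is more exact).
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def monte_carlo_power(n_units=100, effect=5.0, sigma=1.0,
                      n_simulations=500, alpha=0.05, seed=None):
    """Steps 3-4: repeat the simulation and return the rejection rate."""
    rng = random.Random(seed)
    rejections = sum(
        simulate_once(rng, n_units, effect, sigma) < alpha
        for _ in range(n_simulations)
    )
    return rejections / n_simulations


print(f"Power: {monte_carlo_power(effect=0.5, seed=42):.1%}")
```

Setting effect=0.0 gives the empirical rejection rate under the null, which should sit close to alpha and is a useful sanity check on any simulation of this kind.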
For staggered designs, you’ll need to provide a custom data_generator that creates appropriate staggered treatment timing.
References
Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”