diff_diff.PowerAnalysis
- class diff_diff.PowerAnalysis
Bases: object
Power analysis for difference-in-differences designs.
Provides analytical power calculations for basic 2x2 DiD and panel DiD designs. For complex designs like staggered adoption, use simulate_power() instead.
- Parameters:
Examples
Calculate minimum detectable effect:
>>> from diff_diff import PowerAnalysis
>>> pa = PowerAnalysis(alpha=0.05, power=0.80)
>>> results = pa.mde(n_treated=50, n_control=50, sigma=1.0)
>>> print(f"MDE: {results.mde:.3f}")
Calculate required sample size:
>>> results = pa.sample_size(effect_size=0.5, sigma=1.0)
>>> print(f"Required N: {results.required_n}")
Calculate power for given sample and effect:
>>> results = pa.power(effect_size=0.5, n_treated=50, n_control=50, sigma=1.0)
>>> print(f"Power: {results.power:.1%}")
Notes
The power calculations are based on the variance of the DiD estimator:
- For basic 2x2 DiD:
  Var(ATT) = sigma^2 * (1/n_treated_post + 1/n_treated_pre + 1/n_control_post + 1/n_control_pre)
- For panel DiD with T periods:
  Var(ATT) = sigma^2 * (1/(N_treated * T) + 1/(N_control * T)) * (1 + (T-1)*rho) / (1 + (T-1)*rho)
Where rho is the intra-cluster correlation coefficient.
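The 2x2 variance formula above can be turned into a minimum detectable effect by hand. The following is a minimal sketch, not the library's implementation: it assumes a two-sided test with a normal approximation, MDE = (z_{1-alpha/2} + z_{power}) * SE, in the spirit of Bloom (1995). The helper names did_2x2_variance and mde_two_sided are hypothetical, and the values PowerAnalysis returns may differ if it uses a different reference distribution.

```python
import math
from statistics import NormalDist


def did_2x2_variance(sigma, n_treated_pre, n_treated_post,
                     n_control_pre, n_control_post):
    """Variance of the 2x2 DiD estimate, computed from the four cell sizes."""
    return sigma ** 2 * (1 / n_treated_post + 1 / n_treated_pre
                         + 1 / n_control_post + 1 / n_control_pre)


def mde_two_sided(se, alpha=0.05, power=0.80):
    """Normal-approximation MDE for a two-sided test (an assumption here)."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * se


# 50 observations in each of the four cells, sigma = 1.0
var = did_2x2_variance(1.0, 50, 50, 50, 50)
se = math.sqrt(var)
mde = mde_two_sided(se)
print(f"Var: {var:.3f}, SE: {se:.3f}, MDE: {mde:.3f}")
```

With equal cell sizes the variance reduces to 4 * sigma^2 / n, so quadrupling the cell size halves the standard error and therefore the MDE.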
References
Bloom, H. S. (1995). “Minimum Detectable Effects.”
Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”
Methods
__init__([alpha, power, alternative])
mde(n_treated, n_control, sigma[, n_pre, ...]) – Calculate minimum detectable effect given sample size.
power(effect_size, n_treated, n_control, sigma) – Calculate statistical power for given effect size and sample.
power_curve(n_treated, n_control, sigma[, ...]) – Compute power for a range of effect sizes.
sample_size(effect_size, sigma[, n_pre, ...]) – Calculate required sample size to detect given effect.
sample_size_curve(effect_size, sigma[, ...]) – Compute power for a range of sample sizes.
- power(effect_size, n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0)
Calculate statistical power for given effect size and sample.
- Parameters:
effect_size (float) – Expected treatment effect size.
n_treated (int) – Number of treated units.
n_control (int) – Number of control units.
sigma (float) – Residual standard deviation.
n_pre (int, default=1) – Number of pre-treatment periods.
n_post (int, default=1) – Number of post-treatment periods.
rho (float, default=0.0) – Intra-cluster correlation for panel data.
- Returns:
Power analysis results.
- Return type:
Examples
>>> pa = PowerAnalysis()
>>> results = pa.power(effect_size=2.0, n_treated=50, n_control=50, sigma=5.0)
>>> print(f"Power: {results.power:.1%}")
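For intuition about what a power calculation of this kind involves, here is a rough sketch using only the standard library. It is not the library's code. It assumes the basic 2x2 case with one pre-period and one post-period observation per unit (so each cell size equals the unit count), rho = 0, and a two-sided normal-approximation test. The function name approx_power is hypothetical.

```python
import math
from statistics import NormalDist


def approx_power(effect_size, n_treated, n_control, sigma, alpha=0.05):
    """Two-sided power under a normal approximation (illustrative only)."""
    # 2x2 variance with cell sizes equal to unit counts:
    # sigma^2 * (1/n_t + 1/n_t + 1/n_c + 1/n_c)
    se = sigma * math.sqrt(2 / n_treated + 2 / n_control)
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) / se
    # Probability that the test statistic lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


p = approx_power(effect_size=2.0, n_treated=50, n_control=50, sigma=5.0)
print(f"Approximate power: {p:.1%}")
```

When the effect is zero this expression collapses to alpha, which is a quick sanity check on any power routine.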
- mde(n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0)
Calculate minimum detectable effect given sample size.
The MDE is the smallest effect size that can be detected with the specified power and significance level.
- Parameters:
n_treated (int) – Number of treated units.
n_control (int) – Number of control units.
sigma (float) – Residual standard deviation.
n_pre (int, default=1) – Number of pre-treatment periods.
n_post (int, default=1) – Number of post-treatment periods.
rho (float, default=0.0) – Intra-cluster correlation for panel data.
- Returns:
Power analysis results including MDE.
- Return type:
Examples
>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.mde(n_treated=100, n_control=100, sigma=10.0)
>>> print(f"MDE: {results.mde:.2f}")
- sample_size(effect_size, sigma, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5)
Calculate required sample size to detect given effect.
- Parameters:
effect_size (float) – Treatment effect to detect.
sigma (float) – Residual standard deviation.
n_pre (int, default=1) – Number of pre-treatment periods.
n_post (int, default=1) – Number of post-treatment periods.
rho (float, default=0.0) – Intra-cluster correlation for panel data.
treat_frac (float, default=0.5) – Fraction of units assigned to treatment.
- Returns:
Power analysis results including required sample size.
- Return type:
Examples
>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.sample_size(effect_size=5.0, sigma=10.0)
>>> print(f"Required N: {results.required_n}")
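The sample-size question is the MDE relationship solved for N. The sketch below illustrates that inversion under stated assumptions: the basic 2x2 design with one pre and one post observation per unit, rho = 0, a two-sided normal approximation, and a total of N units split by treat_frac. The name approx_required_n is hypothetical, and the number the library reports may differ, for example if it rounds the treated and control groups separately.

```python
import math
from statistics import NormalDist


def approx_required_n(effect_size, sigma, alpha=0.05, power=0.80,
                      treat_frac=0.5):
    """Total units needed so that the MDE is at most effect_size."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # With n_treated = N*f and n_control = N*(1-f):
    #   Var = sigma^2 * (2/(N*f) + 2/(N*(1-f))) = 2*sigma^2 / (N*f*(1-f))
    # Setting multiplier * sqrt(Var) = effect_size and solving for N:
    n_total = (2 * sigma ** 2 * multiplier ** 2
               / (treat_frac * (1 - treat_frac) * effect_size ** 2))
    return math.ceil(n_total)


n_total = approx_required_n(effect_size=5.0, sigma=10.0)
print(f"Approximate required N: {n_total}")
```

Because f * (1 - f) is largest at f = 0.5, an even split minimizes the required total under these assumptions. Any other treat_frac increases N.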
- power_curve(n_treated, n_control, sigma, effect_sizes=None, n_pre=1, n_post=1, rho=0.0)
Compute power for a range of effect sizes.
- Parameters:
n_treated (int) – Number of treated units.
n_control (int) – Number of control units.
sigma (float) – Residual standard deviation.
effect_sizes (list of float, optional) – Effect sizes to evaluate. If None, uses a range from 0 to 3*MDE.
n_pre (int, default=1) – Number of pre-treatment periods.
n_post (int, default=1) – Number of post-treatment periods.
rho (float, default=0.0) – Intra-cluster correlation.
- Returns:
DataFrame with columns ‘effect_size’ and ‘power’.
- Return type:
pd.DataFrame
Examples
>>> pa = PowerAnalysis()
>>> curve = pa.power_curve(n_treated=50, n_control=50, sigma=5.0)
>>> print(curve)
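To show the shape such a curve takes, the sketch below builds one by hand over a grid from 0 to 3*MDE, mirroring the documented default range. It is an illustration and not the library's implementation. It returns a plain list of dicts instead of a DataFrame, assumes the 2x2 design with rho = 0 and a two-sided normal approximation, and uses the hypothetical names approx_power_curve and n_points.

```python
import math
from statistics import NormalDist


def approx_power_curve(n_treated, n_control, sigma, alpha=0.05,
                       power=0.80, n_points=7):
    """Power at evenly spaced effect sizes between 0 and 3*MDE."""
    z = NormalDist()
    se = sigma * math.sqrt(2 / n_treated + 2 / n_control)
    z_crit = z.inv_cdf(1 - alpha / 2)
    mde = (z_crit + z.inv_cdf(power)) * se
    rows = []
    for i in range(n_points):
        effect = 3 * mde * i / (n_points - 1)
        shift = effect / se
        p = z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
        rows.append({"effect_size": effect, "power": p})
    return rows


curve = approx_power_curve(n_treated=50, n_control=50, sigma=5.0)
for row in curve:
    print(f"{row['effect_size']:6.3f}  {row['power']:.3f}")
```

The curve starts at alpha for a zero effect, passes through the target power at the MDE, and flattens toward 1 for larger effects.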
- sample_size_curve(effect_size, sigma, sample_sizes=None, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5)
Compute power for a range of sample sizes.
- Parameters:
effect_size (float) – Treatment effect size.
sigma (float) – Residual standard deviation.
sample_sizes (list of int, optional) – Total sample sizes to evaluate. If None, uses a sensible range.
n_pre (int, default=1) – Number of pre-treatment periods.
n_post (int, default=1) – Number of post-treatment periods.
rho (float, default=0.0) – Intra-cluster correlation.
treat_frac (float, default=0.5) – Fraction assigned to treatment.
- Returns:
DataFrame with columns ‘sample_size’ and ‘power’.
- Return type:
pd.DataFrame
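As a rough illustration of how power grows with total sample size, the sketch below evaluates a handful of explicit sample sizes. It is not the library's implementation. It assumes the 2x2 design with rho = 0 and a two-sided normal approximation, splits each total by treat_frac without rounding to whole units, and returns a list of dicts instead of a DataFrame. The name approx_sample_size_curve is hypothetical.

```python
import math
from statistics import NormalDist


def approx_sample_size_curve(effect_size, sigma, sample_sizes,
                             alpha=0.05, treat_frac=0.5):
    """Power at each total sample size for a fixed effect size."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    rows = []
    for n_total in sample_sizes:
        # Fractional group sizes are allowed here to keep the sketch simple
        n_treated = n_total * treat_frac
        n_control = n_total - n_treated
        se = sigma * math.sqrt(2 / n_treated + 2 / n_control)
        shift = abs(effect_size) / se
        p = z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
        rows.append({"sample_size": n_total, "power": p})
    return rows


size_curve = approx_sample_size_curve(
    effect_size=5.0, sigma=10.0, sample_sizes=[50, 100, 200, 252, 400]
)
for row in size_curve:
    print(f"{row['sample_size']:4d}  {row['power']:.3f}")
```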