diff_diff.PowerAnalysis

class diff_diff.PowerAnalysis

Bases: object

Power analysis for difference-in-differences designs.

Provides analytical power calculations for basic 2x2 DiD and panel DiD designs. For complex designs like staggered adoption, use simulate_power() instead.

Parameters:
  • alpha (float, default=0.05) – Significance level for hypothesis testing.

  • power (float, default=0.80) – Target statistical power.

  • alternative (str, default='two-sided') – Alternative hypothesis: 'two-sided', 'greater', or 'less'.

Examples

Calculate minimum detectable effect:

>>> from diff_diff import PowerAnalysis
>>> pa = PowerAnalysis(alpha=0.05, power=0.80)
>>> results = pa.mde(n_treated=50, n_control=50, sigma=1.0)
>>> print(f"MDE: {results.mde:.3f}")

Calculate required sample size:

>>> results = pa.sample_size(effect_size=0.5, sigma=1.0)
>>> print(f"Required N: {results.required_n}")

Calculate power for given sample and effect:

>>> results = pa.power(effect_size=0.5, n_treated=50, n_control=50, sigma=1.0)
>>> print(f"Power: {results.power:.1%}")

Notes

The power calculations are based on the variance of the DiD estimator:

For basic 2x2 DiD:
    Var(ATT) = sigma^2 * (1/n_treated_post + 1/n_treated_pre
                          + 1/n_control_post + 1/n_control_pre)

For panel DiD with T periods:
    Var(ATT) = sigma^2 * (1/(N_treated * T) + 1/(N_control * T))
               * (1 + (T-1)*rho) / (1 + (T-1)*rho)

Where rho is the intra-cluster correlation coefficient.
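
As a rough illustration of the basic 2x2 formula above, the sketch below computes the variance and standard error directly from the four cell sizes. The helper name did_2x2_variance and its arguments are hypothetical and are not part of diff_diff; the only assumption is a common residual standard deviation sigma across cells.

```python
import math


def did_2x2_variance(sigma, n_treated_pre, n_treated_post,
                     n_control_pre, n_control_post):
    """Variance of the 2x2 DiD estimate from the four cell sizes.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    cells = (n_treated_pre, n_treated_post, n_control_pre, n_control_post)
    if sigma < 0 or any(n <= 0 for n in cells):
        raise ValueError("sigma must be non-negative and cell sizes positive")
    # sigma^2 times the sum of reciprocal cell sizes
    return sigma ** 2 * sum(1.0 / n for n in cells)


# 50 treated and 50 control units, each observed once before and once after
var_att = did_2x2_variance(1.0, 50, 50, 50, 50)
se_att = math.sqrt(var_att)
print(f"Var(ATT) = {var_att:.4f}, SE = {se_att:.4f}")
```

With 50 observations in every cell and sigma = 1, the variance is 4/50 = 0.08 and the standard error is about 0.283.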

References

Bloom, H. S. (1995). “Minimum Detectable Effects.”

Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”

__init__(alpha=0.05, power=0.8, alternative='two-sided')

Methods

__init__([alpha, power, alternative])

mde(n_treated, n_control, sigma[, n_pre, ...])

Calculate minimum detectable effect given sample size.

power(effect_size, n_treated, n_control, sigma)

Calculate statistical power for given effect size and sample.

power_curve(n_treated, n_control, sigma[, ...])

Compute power for a range of effect sizes.

sample_size(effect_size, sigma[, n_pre, ...])

Calculate required sample size to detect given effect.

sample_size_curve(effect_size, sigma[, ...])

Compute power for a range of sample sizes.

power(effect_size, n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0)

Calculate statistical power for given effect size and sample.

Parameters:
  • effect_size (float) – Expected treatment effect size.

  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

Returns:

Power analysis results.

Return type:

PowerResults

Examples

>>> pa = PowerAnalysis()
>>> results = pa.power(effect_size=2.0, n_treated=50, n_control=50, sigma=5.0)
>>> print(f"Power: {results.power:.1%}")
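
For intuition about what a power figure means, the following is a minimal sketch of a normal-approximation power calculation given a standard error. The helper approx_power is hypothetical and not part of diff_diff, and the library may use a different reference distribution or variance formula, so its reported power can differ from this approximation.

```python
from statistics import NormalDist


def approx_power(effect_size, se, alpha=0.05, alternative="two-sided"):
    """Normal-approximation power for a test of ATT = 0.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    z = NormalDist()
    shift = effect_size / se  # mean of the z-statistic under the alternative
    if alternative == "two-sided":
        crit = z.inv_cdf(1 - alpha / 2)
        # probability of rejecting in either tail
        return z.cdf(shift - crit) + z.cdf(-shift - crit)
    crit = z.inv_cdf(1 - alpha)
    if alternative == "greater":
        return z.cdf(shift - crit)
    if alternative == "less":
        return z.cdf(-shift - crit)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Same inputs as the example above; the SE is an assumption taken from the
# basic 2x2 variance formula with 50 observations per cell: sigma * sqrt(4/50).
se = 5.0 * (4 / 50) ** 0.5
print(f"Approximate power: {approx_power(2.0, se):.1%}")
```

A useful sanity check is that power equals alpha when the effect size is zero.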

mde(n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0)

Calculate minimum detectable effect given sample size.

The MDE is the smallest effect size that can be detected with the specified power and significance level.

Parameters:
  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

Returns:

Power analysis results including MDE.

Return type:

PowerResults

Examples

>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.mde(n_treated=100, n_control=100, sigma=10.0)
>>> print(f"MDE: {results.mde:.2f}")
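
The MDE is commonly written as a multiplier times the standard error of the estimate, where the multiplier depends on alpha, the target power, and the alternative. The sketch below shows that relationship under a normal approximation. The helper approx_mde is hypothetical and not part of diff_diff, and the library's own result may differ if it uses another reference distribution.

```python
from statistics import NormalDist


def approx_mde(se, alpha=0.05, power=0.80, alternative="two-sided"):
    """Normal-approximation MDE: (z_crit + z_power) * SE.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    z = NormalDist()
    # two-sided tests split alpha across both tails
    tail = alpha / 2 if alternative == "two-sided" else alpha
    multiplier = z.inv_cdf(1 - tail) + z.inv_cdf(power)
    return multiplier * se


# Same inputs as the example above; the SE is an assumption taken from the
# basic 2x2 variance formula with 100 observations per cell: sigma * sqrt(4/100).
se = 10.0 * (4 / 100) ** 0.5
print(f"Approximate MDE: {approx_mde(se):.2f}")
```

At alpha = 0.05 and power = 0.80 the two-sided multiplier is roughly 2.80, so the MDE here is about 2.8 standard errors.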

sample_size(effect_size, sigma, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5)

Calculate required sample size to detect given effect.

Parameters:
  • effect_size (float) – Treatment effect to detect.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

  • treat_frac (float, default=0.5) – Fraction of units assigned to treatment.

Returns:

Power analysis results including required sample size.

Return type:

PowerResults

Examples

>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.sample_size(effect_size=5.0, sigma=10.0)
>>> print(f"Required N: {results.required_n}")
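
To show how a required sample size can follow from the variance formula, the sketch below inverts the basic 2x2 case (one pre and one post observation per unit) under a two-sided normal approximation. The helper approx_total_n is hypothetical and not part of diff_diff; the library may round differently (for example per group) or use another approximation, so required_n need not match exactly.

```python
import math
from statistics import NormalDist


def approx_total_n(effect_size, sigma, alpha=0.05, power=0.80, treat_frac=0.5):
    """Smallest total N for a two-sided 2x2 design, normal approximation.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    if not 0 < treat_frac < 1:
        raise ValueError("treat_frac must be strictly between 0 and 1")
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    p = treat_frac
    # With N*p treated and N*(1-p) control units, the 2x2 variance is
    # 2 * sigma^2 / (N * p * (1 - p)). Setting multiplier * SE equal to
    # the effect size and solving for N gives:
    n = 2 * (multiplier * sigma / effect_size) ** 2 / (p * (1 - p))
    return math.ceil(n)


print(approx_total_n(effect_size=5.0, sigma=10.0))
```

Because p * (1 - p) is largest at p = 0.5, an even split minimizes the required total sample size in this approximation.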

power_curve(n_treated, n_control, sigma, effect_sizes=None, n_pre=1, n_post=1, rho=0.0)

Compute power for a range of effect sizes.

Parameters:
  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • effect_sizes (list of float, optional) – Effect sizes to evaluate. If None, uses a range from 0 to 3*MDE.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

Returns:

DataFrame with columns 'effect_size' and 'power'.

Return type:

pd.DataFrame

Examples

>>> pa = PowerAnalysis()
>>> curve = pa.power_curve(n_treated=50, n_control=50, sigma=5.0)
>>> print(curve)
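
The shape of such a curve can be sketched without the library. The example below evaluates two-sided normal-approximation power on an evenly spaced grid from 0 to 3*MDE, mirroring the documented default. The helper approx_power_curve, its n_points argument, and the plain list-of-dicts return value are assumptions made for illustration; the actual method returns a DataFrame and may choose its grid differently.

```python
from statistics import NormalDist


def approx_power_curve(se, effect_sizes=None, alpha=0.05, power=0.80, n_points=7):
    """Two-sided normal-approximation power over a grid of effect sizes.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    if effect_sizes is None:
        if n_points < 2:
            raise ValueError("n_points must be at least 2")
        # default grid: evenly spaced from 0 to 3 * MDE
        mde = (crit + z.inv_cdf(power)) * se
        effect_sizes = [3 * mde * i / (n_points - 1) for i in range(n_points)]
    rows = []
    for effect in effect_sizes:
        shift = effect / se
        rows.append({"effect_size": effect,
                     "power": z.cdf(shift - crit) + z.cdf(-shift - crit)})
    return rows


# SE assumed from the basic 2x2 formula with 50 observations per cell
se = 5.0 * (4 / 50) ** 0.5
for row in approx_power_curve(se):
    print(f"{row['effect_size']:6.2f}  {row['power']:.3f}")
```

The rows can be passed to pd.DataFrame(rows) if a table with the same two column names is wanted.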

sample_size_curve(effect_size, sigma, sample_sizes=None, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5)

Compute power for a range of sample sizes.

Parameters:
  • effect_size (float) – Treatment effect size.

  • sigma (float) – Residual standard deviation.

  • sample_sizes (list of int, optional) – Total sample sizes to evaluate. If None, a default range is chosen automatically.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • treat_frac (float, default=0.5) – Fraction assigned to treatment.

Returns:

DataFrame with columns 'sample_size' and 'power'.

Return type:

pd.DataFrame
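
To see how power grows with total sample size, the following sketch evaluates two-sided normal-approximation power for a basic 2x2 design at several values of N. The helper approx_sample_size_curve is hypothetical and not part of diff_diff; it assumes one pre and one post observation per unit and splits N according to treat_frac, which may differ from how the library allocates units.

```python
from statistics import NormalDist


def approx_sample_size_curve(effect_size, sigma, sample_sizes,
                             alpha=0.05, treat_frac=0.5):
    """Two-sided normal-approximation power for each total sample size.

    Hypothetical helper for illustration; not a diff_diff function.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    rows = []
    for n_total in sample_sizes:
        n_treated = round(n_total * treat_frac)
        n_control = n_total - n_treated
        if n_treated <= 0 or n_control <= 0:
            raise ValueError(f"N={n_total} leaves an empty group")
        # 2x2 variance: each group contributes a pre and a post cell
        se = sigma * (2 / n_treated + 2 / n_control) ** 0.5
        shift = effect_size / se
        rows.append({"sample_size": n_total,
                     "power": z.cdf(shift - crit) + z.cdf(-shift - crit)})
    return rows


for row in approx_sample_size_curve(5.0, 10.0, [50, 100, 200, 400]):
    print(f"N={row['sample_size']:4d}  power={row['power']:.3f}")
```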