Power Analysis#

Power analysis for DiD study design.

Overview#

Power analysis helps researchers design studies with adequate statistical power to detect meaningful treatment effects. This module provides:

  1. Analytical Power Calculations: Fast closed-form power for standard DiD designs

  2. Minimum Detectable Effect (MDE): Smallest effect detectable at target power

  3. Sample Size Calculations: Required sample size for target power

  4. Simulation-Based Power: Monte Carlo power for any DiD estimator

PowerAnalysis#

Main class for analytical power calculations.

class diff_diff.PowerAnalysis[source]

Bases: object

Power analysis for difference-in-differences designs.

Provides analytical power calculations for basic 2x2 DiD and panel DiD designs. For complex designs like staggered adoption, use simulate_power() instead.

Parameters:
  • alpha (float, default=0.05) – Significance level for hypothesis testing.

  • power (float, default=0.80) – Target statistical power.

  • alternative (str, default='two-sided') – Alternative hypothesis: ‘two-sided’, ‘greater’, or ‘less’.

Examples

Calculate minimum detectable effect:

>>> from diff_diff import PowerAnalysis
>>> pa = PowerAnalysis(alpha=0.05, power=0.80)
>>> results = pa.mde(n_treated=50, n_control=50, sigma=1.0)
>>> print(f"MDE: {results.mde:.3f}")

Calculate required sample size:

>>> results = pa.sample_size(effect_size=0.5, sigma=1.0)
>>> print(f"Required N: {results.required_n}")

Calculate power for given sample and effect:

>>> results = pa.power(effect_size=0.5, n_treated=50, n_control=50, sigma=1.0)
>>> print(f"Power: {results.power:.1%}")

Notes

The power calculations are based on the variance of the DiD estimator:

For basic 2x2 DiD:
Var(ATT) = sigma^2 * (1/n_treated_post + 1/n_treated_pre
                      + 1/n_control_post + 1/n_control_pre)

For panel DiD with T periods:
Var(ATT) = sigma^2 * (1/(N_treated * T) + 1/(N_control * T))
           * (1 + (T-1)*rho) / (1 + (T-1)*rho)

Where rho is the intra-cluster correlation coefficient.
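As a rough illustration of how the 2x2 variance formula above feeds a power calculation, the sketch below uses only the Python standard library. It assumes a normal approximation and a two-sided test; the helper names `did_variance_2x2` and `two_sided_power` are hypothetical and are not part of diff_diff, whose internal implementation may differ.

```python
from statistics import NormalDist


def did_variance_2x2(sigma, n_treated_pre, n_treated_post,
                     n_control_pre, n_control_post):
    """Variance of the 2x2 DiD estimate from the four cell sizes."""
    return sigma**2 * (
        1 / n_treated_post + 1 / n_treated_pre
        + 1 / n_control_post + 1 / n_control_pre
    )


def two_sided_power(effect_size, se, alpha=0.05):
    """Normal-approximation power of a two-sided z-test (assumed form)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(effect_size) / se
    # Probability of rejecting in either tail when the true effect is nonzero.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


var = did_variance_2x2(sigma=1.0, n_treated_pre=50, n_treated_post=50,
                       n_control_pre=50, n_control_post=50)
se = var ** 0.5
power = two_sided_power(effect_size=0.5, se=se)
print(f"SE: {se:.3f}, power: {power:.1%}")
```

With a zero effect the function returns alpha, which is a quick sanity check on the rejection rule.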

References

Bloom, H. S. (1995). “Minimum Detectable Effects.”

Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”

Methods

power(effect_size, n_treated, n_control, sigma)

Calculate statistical power for given effect size and sample.

mde(n_treated, n_control, sigma[, n_pre, ...])

Calculate minimum detectable effect given sample size.

sample_size(effect_size, sigma[, n_pre, ...])

Calculate required sample size to detect given effect.

power_curve(n_treated, n_control, sigma[, ...])

Compute power for a range of effect sizes.

sample_size_curve(effect_size, sigma[, ...])

Compute power for a range of sample sizes.

__init__(alpha=0.05, power=0.8, alternative='two-sided')[source]
power(effect_size, n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0, deff=1.0)[source]

Calculate statistical power for given effect size and sample.

Parameters:
  • effect_size (float) – Expected treatment effect size.

  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor). Not redundant with rho: rho models within-unit serial correlation, deff models survey clustering/weighting.

Returns:

Power analysis results.

Return type:

PowerResults

Examples

>>> pa = PowerAnalysis()
>>> results = pa.power(effect_size=2.0, n_treated=50, n_control=50, sigma=5.0)
>>> print(f"Power: {results.power:.1%}")
mde(n_treated, n_control, sigma, n_pre=1, n_post=1, rho=0.0, deff=1.0)[source]

Calculate minimum detectable effect given sample size.

The MDE is the smallest effect size that can be detected with the specified power and significance level.

Parameters:
  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

Power analysis results including MDE.

Return type:

PowerResults
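The MDE definition can be made concrete with the textbook multiplier from Bloom (1995): for a two-sided test, the MDE is approximately the standard error times the sum of two normal quantiles. The sketch below is a standard-library approximation under that assumption, using the basic 2x2 variance with one pre and one post period; `mde_two_sided` is a hypothetical helper, not a diff_diff function, and the library's own result may differ slightly.

```python
from statistics import NormalDist


def mde_two_sided(se, alpha=0.05, power=0.80):
    """Approximate MDE: (z_{1-alpha/2} + z_{power}) * SE."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return multiplier * se


# Assumed design: 100 treated and 100 control units, one pre and one
# post period, so each of the four cells holds 100 observations.
sigma = 10.0
se = (sigma**2 * (1 / 100 + 1 / 100 + 1 / 100 + 1 / 100)) ** 0.5
mde = mde_two_sided(se)
print(f"SE: {se:.2f}, approximate MDE: {mde:.2f}")
```

The multiplier is about 2.8 at alpha=0.05 and power=0.80, so halving the standard error halves the MDE.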

Examples

>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.mde(n_treated=100, n_control=100, sigma=10.0)
>>> print(f"MDE: {results.mde:.2f}")
sample_size(effect_size, sigma, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5, deff=1.0)[source]

Calculate required sample size to detect given effect.

Parameters:
  • effect_size (float) – Treatment effect to detect.

  • sigma (float) – Residual standard deviation.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation for panel data.

  • treat_frac (float, default=0.5) – Fraction of units assigned to treatment.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

Power analysis results including required sample size.

Return type:

PowerResults
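To show how treat_frac enters a sample size calculation, the following sketch inverts the basic 2x2 variance formula for the total number of units N. It assumes one pre and one post period, a two-sided normal-approximation test, and that every unit is observed in both periods; `required_total_n` is a hypothetical helper, and the library's rounding and handling of rho or deff are not reproduced here.

```python
import math
from statistics import NormalDist


def required_total_n(effect_size, sigma, alpha=0.05, power=0.80,
                     treat_frac=0.5):
    """Smallest total N whose approximate MDE is at most effect_size."""
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # With n_treated = f*N and n_control = (1-f)*N, the 2x2 variance is
    # sigma^2 * (2/(f*N) + 2/((1-f)*N)) = 2*sigma^2 / (N * f * (1-f)).
    n = 2 * sigma**2 * multiplier**2 / (
        treat_frac * (1 - treat_frac) * effect_size**2
    )
    return math.ceil(n)


balanced = required_total_n(effect_size=0.5, sigma=1.0)
unbalanced = required_total_n(effect_size=0.5, sigma=1.0, treat_frac=0.2)
print(f"Balanced design: N = {balanced}")
print(f"20% treated:     N = {unbalanced}")
```

Because f*(1-f) peaks at f=0.5, a balanced split minimizes the required N under these assumptions.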

Examples

>>> pa = PowerAnalysis(power=0.80)
>>> results = pa.sample_size(effect_size=5.0, sigma=10.0)
>>> print(f"Required N: {results.required_n}")
power_curve(n_treated, n_control, sigma, effect_sizes=None, n_pre=1, n_post=1, rho=0.0, deff=1.0)[source]

Compute power for a range of effect sizes.

Parameters:
  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • effect_sizes (list of float, optional) – Effect sizes to evaluate. If None, uses a range from 0 to 3*MDE.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

DataFrame with columns ‘effect_size’ and ‘power’.

Return type:

pd.DataFrame
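The default grid described for effect_sizes (0 to 3*MDE) can be sketched without the library as follows. This is an illustrative approximation: the number of grid points, the helper names `power_at` and `power_curve`, and the normal-approximation power formula are assumptions, and the result is a list of tuples rather than the DataFrame the method returns.

```python
from statistics import NormalDist

Z = NormalDist()


def power_at(effect, se, alpha=0.05):
    """Normal-approximation power of a two-sided test at one effect size."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    shift = abs(effect) / se
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)


def power_curve(se, alpha=0.05, target_power=0.80, n_points=7):
    """Evaluate power on an evenly spaced grid from 0 to 3 * MDE."""
    mde = (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(target_power)) * se
    grid = [3 * mde * i / (n_points - 1) for i in range(n_points)]
    return [(effect, power_at(effect, se, alpha)) for effect in grid]


curve = power_curve(se=1.0)
for effect, power in curve:
    print(f"effect_size={effect:6.3f}  power={power:.3f}")
```

The curve starts at alpha for a zero effect, crosses the target power at the MDE, and approaches 1 toward the right edge of the grid.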

Examples

>>> pa = PowerAnalysis()
>>> curve = pa.power_curve(n_treated=50, n_control=50, sigma=5.0)
>>> print(curve)
sample_size_curve(effect_size, sigma, sample_sizes=None, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5, deff=1.0)[source]

Compute power for a range of sample sizes.

Parameters:
  • effect_size (float) – Treatment effect size.

  • sigma (float) – Residual standard deviation.

  • sample_sizes (list of int, optional) – Total sample sizes to evaluate. If None, uses a sensible range.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • treat_frac (float, default=0.5) – Fraction assigned to treatment.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

DataFrame with columns ‘sample_size’ and ‘power’.

Return type:

pd.DataFrame

Example#

from diff_diff import PowerAnalysis

pa = PowerAnalysis(alpha=0.05, power=0.80)

# Compute power
result = pa.power(effect_size=0.5, n_treated=100, n_control=100, sigma=1.0)
print(f"Power: {result.power:.2%}")

# Compute MDE at 80% power
result = pa.mde(n_treated=100, n_control=100, sigma=1.0)
print(f"MDE: {result.mde:.3f}")

# Required sample size
result = pa.sample_size(effect_size=0.5, sigma=1.0)
print(f"Required N: {result.required_n}")

PowerResults#

Results from power analysis.

class diff_diff.PowerResults[source]

Bases: object

Results from analytical power analysis.

power

Statistical power (probability of rejecting H0 when effect exists).

Type:

float

mde

Minimum detectable effect size.

Type:

float

required_n

Required total sample size (treated + control).

Type:

int

effect_size

Effect size used in calculation.

Type:

float

alpha

Significance level.

Type:

float

alternative

Alternative hypothesis (‘two-sided’, ‘greater’, ‘less’).

Type:

str

n_treated

Number of treated units.

Type:

int

n_control

Number of control units.

Type:

int

n_pre

Number of pre-treatment periods.

Type:

int

n_post

Number of post-treatment periods.

Type:

int

sigma

Residual standard deviation.

Type:

float

rho

Intra-cluster correlation (for panel data).

Type:

float

deff

Survey design effect (variance inflation factor).

Type:

float

design

Study design type (‘basic_did’, ‘panel’, ‘staggered’).

Type:

str

power: float
mde: float
required_n: int
effect_size: float
alpha: float
alternative: str
n_treated: int
n_control: int
n_pre: int
n_post: int
sigma: float
rho: float = 0.0
deff: float = 1.0
design: str = 'basic_did'
__repr__()[source]

Concise string representation.

Return type:

str

summary()[source]

Generate a formatted summary of power analysis results.

Returns:

Formatted summary table.

Return type:

str

print_summary()[source]

Print the summary to stdout.

Return type:

None

to_dict()[source]

Convert results to a dictionary.

Returns:

Dictionary containing all power analysis results.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:

DataFrame with power analysis results.

Return type:

pd.DataFrame

__init__(power, mde, required_n, effect_size, alpha, alternative, n_treated, n_control, n_pre, n_post, sigma, rho=0.0, deff=1.0, design='basic_did')
Return type:

None

SimulationPowerResults#

Results from simulation-based power analysis.

class diff_diff.SimulationPowerResults[source]

Bases: object

Results from simulation-based power analysis.

power

Estimated power (proportion of simulations rejecting H0).

Type:

float

power_se

Standard error of power estimate.

Type:

float

power_ci

Confidence interval for power estimate.

Type:

Tuple[float, float]

rejection_rate

Proportion of simulations with p-value < alpha.

Type:

float

mean_estimate

Mean treatment effect estimate across simulations.

Type:

float

std_estimate

Standard deviation of estimates across simulations.

Type:

float

mean_se

Mean standard error across simulations.

Type:

float

coverage

Proportion of CIs containing true effect.

Type:

float

n_simulations

Number of simulations performed (successful count; see n_simulation_failures for failed-replicate count).

Type:

int

n_simulation_failures

Number of simulations at the primary effect size whose estimator.fit (or result extraction) raised an exception and was skipped. Lets callers programmatically detect fragile DGP/estimator pairings; a proportional warning is also emitted above a 10% failure rate.

Type:

int

effect_sizes

Effect sizes tested (if multiple).

Type:

List[float]

powers

Power at each effect size (if multiple).

Type:

List[float]

true_effect

True treatment effect used in simulation.

Type:

float

alpha

Significance level.

Type:

float

estimator_name

Name of the estimator used.

Type:

str

effective_n_units

Effective sample size when it differs from the requested n_units (e.g., due to DDD grid rounding). None when no rounding occurred.

Type:

int or None

power: float
power_se: float
power_ci: Tuple[float, float]
rejection_rate: float
mean_estimate: float
std_estimate: float
mean_se: float
coverage: float
n_simulations: int
effect_sizes: List[float]
powers: List[float]
true_effect: float
alpha: float
estimator_name: str
bias: float
rmse: float
simulation_results: List[Dict[str, Any]] | None = None
effective_n_units: int | None = None
survey_config: Any | None = None
mean_deff: float | None = None
mean_icc_realized: float | None = None
n_simulation_failures: int = 0
__post_init__()[source]

Compute derived statistics.

__repr__()[source]

Concise string representation.

Return type:

str

summary()[source]

Generate a formatted summary of simulation power results.

Returns:

Formatted summary table.

Return type:

str

print_summary()[source]

Print the summary to stdout.

Return type:

None

to_dict()[source]

Convert results to a dictionary.

Returns:

Dictionary containing simulation power results.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:

DataFrame with simulation power results.

Return type:

pd.DataFrame

power_curve_df()[source]

Get power curve data as a DataFrame.

Returns:

DataFrame with effect_size and power columns.

Return type:

pd.DataFrame

__init__(power, power_se, power_ci, rejection_rate, mean_estimate, std_estimate, mean_se, coverage, n_simulations, effect_sizes, powers, true_effect, alpha, estimator_name, simulation_results=None, effective_n_units=None, survey_config=None, mean_deff=None, mean_icc_realized=None, n_simulation_failures=0)
Return type:

None

SimulationMDEResults#

Results from simulation-based MDE search.

class diff_diff.SimulationMDEResults[source]

Bases: object

Results from simulation-based minimum detectable effect search.

mde

Minimum detectable effect (smallest effect achieving target power).

Type:

float

power_at_mde

Power achieved at the MDE.

Type:

float

target_power

Target power used in the search.

Type:

float

alpha

Significance level.

Type:

float

n_units

Sample size used.

Type:

int

n_simulations_per_step

Number of simulations per bisection step.

Type:

int

n_steps

Number of bisection steps performed.

Type:

int

search_path

Diagnostic trace of {effect_size, power} at each step.

Type:

list of dict

estimator_name

Name of the estimator used.

Type:

str

effective_n_units

Effective sample size when it differs from the requested n_units (e.g., due to DDD grid rounding). None when no rounding occurred.

Type:

int or None

mde: float
power_at_mde: float
target_power: float
alpha: float
n_units: int
n_simulations_per_step: int
n_steps: int
search_path: List[Dict[str, float]]
estimator_name: str
effective_n_units: int | None = None
survey_config: Any | None = None
summary()[source]

Generate a formatted summary.

Return type:

str

to_dict()[source]

Convert results to a dictionary.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a single-row DataFrame.

Return type:

DataFrame

__init__(mde, power_at_mde, target_power, alpha, n_units, n_simulations_per_step, n_steps, search_path, estimator_name, effective_n_units=None, survey_config=None)
Return type:

None

SimulationSampleSizeResults#

Results from simulation-based sample size search.

class diff_diff.SimulationSampleSizeResults[source]

Bases: object

Results from simulation-based sample size search.

required_n

Required number of units to achieve target power.

Type:

int

power_at_n

Power achieved at the required N.

Type:

float

target_power

Target power used in the search.

Type:

float

alpha

Significance level.

Type:

float

effect_size

Effect size used in the search.

Type:

float

n_simulations_per_step

Number of simulations per bisection step.

Type:

int

n_steps

Number of bisection steps performed.

Type:

int

search_path

Diagnostic trace of {n_units, power} at each step.

Type:

list of dict

estimator_name

Name of the estimator used.

Type:

str

effective_n_units

Effective sample size when it differs from required_n (e.g., due to DDD grid rounding). None when no rounding occurred or when the search already snapped to the estimator’s grid.

Type:

int or None

required_n: int
power_at_n: float
target_power: float
alpha: float
effect_size: float
n_simulations_per_step: int
n_steps: int
search_path: List[Dict[str, float]]
estimator_name: str
effective_n_units: int | None = None
survey_config: Any | None = None
summary()[source]

Generate a formatted summary.

Return type:

str

to_dict()[source]

Convert results to a dictionary.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a single-row DataFrame.

Return type:

DataFrame

__init__(required_n, power_at_n, target_power, alpha, effect_size, n_simulations_per_step, n_steps, search_path, estimator_name, effective_n_units=None, survey_config=None)
Return type:

None

Convenience Functions#

compute_power#

Quick power computation.

diff_diff.compute_power(effect_size, n_treated, n_control, sigma, alpha=0.05, n_pre=1, n_post=1, rho=0.0, deff=1.0)[source]#

Convenience function to compute power for given effect and sample.

Parameters:
  • effect_size (float) – Expected treatment effect.

  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • alpha (float, default=0.05) – Significance level.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

Statistical power.

Return type:

float

Examples

>>> power = compute_power(effect_size=5.0, n_treated=50, n_control=50, sigma=10.0)
>>> print(f"Power: {power:.1%}")

compute_mde#

Compute minimum detectable effect.

diff_diff.compute_mde(n_treated, n_control, sigma, power=0.8, alpha=0.05, n_pre=1, n_post=1, rho=0.0, deff=1.0)[source]#

Convenience function to compute minimum detectable effect.

Parameters:
  • n_treated (int) – Number of treated units.

  • n_control (int) – Number of control units.

  • sigma (float) – Residual standard deviation.

  • power (float, default=0.80) – Target statistical power.

  • alpha (float, default=0.05) – Significance level.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

Minimum detectable effect size.

Return type:

float

Examples

>>> mde = compute_mde(n_treated=50, n_control=50, sigma=10.0)
>>> print(f"MDE: {mde:.2f}")

compute_sample_size#

Compute required sample size.

diff_diff.compute_sample_size(effect_size, sigma, power=0.8, alpha=0.05, n_pre=1, n_post=1, rho=0.0, treat_frac=0.5, deff=1.0)[source]#

Convenience function to compute required sample size.

Parameters:
  • effect_size (float) – Treatment effect to detect.

  • sigma (float) – Residual standard deviation.

  • power (float, default=0.80) – Target statistical power.

  • alpha (float, default=0.05) – Significance level.

  • n_pre (int, default=1) – Number of pre-treatment periods.

  • n_post (int, default=1) – Number of post-treatment periods.

  • rho (float, default=0.0) – Intra-cluster correlation.

  • treat_frac (float, default=0.5) – Fraction assigned to treatment.

  • deff (float, default=1.0) – Survey design effect (variance inflation factor).

Returns:

Required total sample size.

Return type:

int

Examples

>>> n = compute_sample_size(effect_size=5.0, sigma=10.0)
>>> print(f"Required N: {n}")

simulate_power#

Simulation-based power for any DiD estimator.

diff_diff.simulate_power(estimator, n_units=100, n_periods=4, treatment_effect=5.0, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=500, alpha=0.05, effect_sizes=None, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)[source]#

Estimate power using Monte Carlo simulation.

This function simulates datasets with known treatment effects and estimates power as the fraction of simulations where the null hypothesis is rejected. Most built-in estimators are supported via an internal registry that selects the appropriate data-generating process and fit signature automatically.

Parameters:
  • estimator (estimator object) – DiD estimator to use (e.g., DifferenceInDifferences, CallawaySantAnna).

  • n_units (int, default=100) – Number of units per simulation.

  • n_periods (int, default=4) – Number of time periods.

  • treatment_effect (float, default=5.0) – True treatment effect to simulate.

  • treatment_fraction (float, default=0.5) – Fraction of units that are treated.

  • treatment_period (int, default=2) – First post-treatment period (0-indexed).

  • sigma (float, default=1.0) – Residual standard deviation (noise level).

  • n_simulations (int, default=500) – Number of Monte Carlo simulations.

  • alpha (float, default=0.05) – Significance level for hypothesis tests.

  • effect_sizes (list of float, optional) – Multiple effect sizes to evaluate for power curve. If None, uses only treatment_effect.

  • seed (int, optional) – Random seed for reproducibility.

  • data_generator (callable, optional) – Custom data generation function. When provided, bypasses the registry DGP and calls this function with the standard kwargs (n_units, n_periods, treatment_effect, etc.).

  • data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.

  • estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().

  • result_extractor (callable, optional) – Custom function to extract results from the estimator output. Takes the estimator result object and returns a tuple of (att, se, p_value, conf_int). Useful for unregistered estimators with non-standard result schemas.

  • progress (bool, default=True) – Whether to print progress updates.

  • survey_config (SurveyPowerConfig, optional) – When provided, generates survey-structured data via generate_survey_did_data and injects SurveyDesign into estimator fit(). Mutually exclusive with data_generator. Supported estimators: DiD, TWFE, MultiPeriod, CS, SA, Imputation, TwoStage, Stacked, Efficient. Unsupported: TROP, SyntheticDiD, TripleDifference. heterogeneous_te_by_strata must be False.

Returns:

Simulation-based power analysis results.

Return type:

SimulationPowerResults
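A result_extractor only has to return the tuple (att, se, p_value, conf_int) described in the parameter list. The sketch below shows the shape of such a function for a made-up result object; the attribute names `estimate`, `std_error`, `pval`, `ci_lower`, and `ci_upper` are hypothetical and stand in for whatever a non-standard estimator actually exposes.

```python
from types import SimpleNamespace


def extract_result(result):
    """Map a non-standard result object to (att, se, p_value, conf_int)."""
    return (
        result.estimate,                      # point estimate of the ATT
        result.std_error,                     # its standard error
        result.pval,                          # p-value for H0: ATT = 0
        (result.ci_lower, result.ci_upper),   # confidence interval bounds
    )


# Stand-in for the object an unregistered estimator might return.
fake_result = SimpleNamespace(
    estimate=4.8, std_error=1.1, pval=0.0001, ci_lower=2.6, ci_upper=7.0
)
att, se, p_value, conf_int = extract_result(fake_result)
print(att, se, p_value, conf_int)
```

A function like this would be passed as result_extractor=extract_result alongside the custom estimator.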

Examples

Basic power simulation:

>>> from diff_diff import DifferenceInDifferences, simulate_power
>>> did = DifferenceInDifferences()
>>> results = simulate_power(
...     estimator=did,
...     n_units=100,
...     treatment_effect=5.0,
...     sigma=5.0,
...     n_simulations=500,
...     seed=42
... )
>>> print(f"Power: {results.power:.1%}")

Power curve over multiple effect sizes:

>>> results = simulate_power(
...     estimator=did,
...     effect_sizes=[1.0, 2.0, 3.0, 5.0, 7.0],
...     n_simulations=200,
...     seed=42
... )
>>> print(results.power_curve_df())

With Callaway-Sant’Anna (auto-detected, no custom DGP needed):

>>> from diff_diff import CallawaySantAnna
>>> cs = CallawaySantAnna()
>>> results = simulate_power(cs, n_simulations=200, seed=42)

Notes

The simulation approach:

  1. Generate data with known treatment effect

  2. Fit the estimator and record the p-value

  3. Repeat n_simulations times

  4. Power = fraction of simulations where p-value < alpha
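The four steps can be sketched end to end with the standard library. This is a simplified stand-in, not the library's data-generating process or estimator: it assumes a two-period design with independent normal noise, a difference-of-mean-changes estimator, and a z-test, and the names `simulate_once` and `monte_carlo_power` are hypothetical. The standard error of the power estimate uses the binomial formula.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_once(rng, n_treated, n_control, effect, sigma):
    """Return the two-sided p-value of a 2x2 DiD on one simulated dataset."""
    # Step 1: post-minus-pre change per unit; treated units gain `effect`.
    d_treated = [effect + rng.gauss(0, sigma) - rng.gauss(0, sigma)
                 for _ in range(n_treated)]
    d_control = [rng.gauss(0, sigma) - rng.gauss(0, sigma)
                 for _ in range(n_control)]
    # Step 2: estimate the effect and compute its p-value.
    att = mean(d_treated) - mean(d_control)
    se = (variance(d_treated) / n_treated
          + variance(d_control) / n_control) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(att) / se))


def monte_carlo_power(n_simulations=500, alpha=0.05, seed=42, **design):
    """Steps 3 and 4: repeat and take the rejection fraction."""
    rng = random.Random(seed)
    rejections = sum(
        simulate_once(rng, **design) < alpha for _ in range(n_simulations)
    )
    power = rejections / n_simulations
    power_se = (power * (1 - power) / n_simulations) ** 0.5
    return power, power_se


power, power_se = monte_carlo_power(
    n_treated=50, n_control=50, effect=0.5, sigma=1.0
)
print(f"Simulated power: {power:.1%} (SE {power_se:.1%})")
```

The binomial standard error shrinks with the square root of n_simulations, which is why small simulation counts give only a coarse power estimate.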

References

Burlig, F., Preonas, L., & Woerman, M. (2020). “Panel Data and Experimental Design.”

simulate_mde#

Simulation-based MDE for any DiD estimator.

diff_diff.simulate_mde(estimator, n_units=100, n_periods=4, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=200, power=0.8, alpha=0.05, effect_range=None, tol=0.02, max_steps=15, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)[source]#

Find the minimum detectable effect via simulation-based bisection search.

Searches over effect sizes to find the smallest effect that achieves the target power, using simulate_power() at each step.
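The bisection idea can be illustrated with a deterministic power function standing in for simulate_power(). The sketch below is an assumption-laden outline: it stops when power is within tol of the target or when max_steps is reached, returns the last midpoint, and records a search path shaped like the documented search_path. The library's exact stopping rule, bracketing logic, and returned point are not specified in this page, and `bisect_mde` and `analytic_power` are hypothetical names.

```python
from statistics import NormalDist


def analytic_power(effect, se=1.0, alpha=0.05):
    """Stand-in for a simulated power estimate at a given effect size."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect / se - z_crit) + z.cdf(-effect / se - z_crit)


def bisect_mde(power_fn, lo, hi, target=0.80, tol=0.02, max_steps=15):
    """Bisect on effect size until power is within tol of the target."""
    path = []
    mid = hi
    for _ in range(max_steps):
        mid = (lo + hi) / 2
        power = power_fn(mid)
        path.append({"effect_size": mid, "power": power})
        if abs(power - target) < tol:
            break
        if power < target:
            lo = mid   # effect too small: search larger effects
        else:
            hi = mid   # effect larger than needed: search smaller effects
    return mid, path


mde, path = bisect_mde(analytic_power, lo=0.0, hi=10.0)
for step in path:
    print(f"effect_size={step['effect_size']:.4f}  power={step['power']:.3f}")
print(f"Approximate MDE: {mde:.3f}")
```

With a simulated power function each evaluation is noisy, so tol should not be set much tighter than the Monte Carlo standard error at the chosen n_simulations.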

Parameters:
  • estimator (estimator object) – DiD estimator to use.

  • n_units (int, default=100) – Number of units per simulation.

  • n_periods (int, default=4) – Number of time periods.

  • treatment_fraction (float, default=0.5) – Fraction of units that are treated.

  • treatment_period (int, default=2) – First post-treatment period (0-indexed).

  • sigma (float, default=1.0) – Residual standard deviation.

  • n_simulations (int, default=200) – Simulations per bisection step.

  • power (float, default=0.80) – Target power.

  • alpha (float, default=0.05) – Significance level.

  • effect_range (tuple of (float, float), optional) – (lo, hi) bracket for the search. If None, auto-brackets.

  • tol (float, default=0.02) – Convergence tolerance on power.

  • max_steps (int, default=15) – Maximum bisection steps.

  • seed (int, optional) – Random seed for reproducibility.

  • data_generator (callable, optional) – Custom data generation function.

  • data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.

  • estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().

  • result_extractor (callable, optional) – Custom function to extract results from the estimator output. Forwarded to simulate_power().

  • progress (bool, default=True) – Whether to print progress updates.

  • survey_config (SurveyPowerConfig, optional) – Survey-aware simulation config. Forwarded to simulate_power(). See simulate_power() for details and constraints.

Returns:

Results including the MDE and search diagnostics.

Return type:

SimulationMDEResults

Examples

>>> from diff_diff import simulate_mde, DifferenceInDifferences
>>> result = simulate_mde(DifferenceInDifferences(), n_simulations=100, seed=42)
>>> print(f"MDE: {result.mde:.3f}")

simulate_sample_size#

Simulation-based sample size for any DiD estimator.

diff_diff.simulate_sample_size(estimator, treatment_effect=5.0, n_periods=4, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=200, power=0.8, alpha=0.05, n_range=None, max_steps=15, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)[source]#

Find the required sample size via simulation-based bisection search.

Searches over n_units to find the smallest N that achieves the target power, using simulate_power() at each step.
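A search over n_units works the same way but on integers. The sketch below finds the smallest N whose power meets the target, using an analytic power function in place of simulate_power(). It assumes power is monotone in N, that the starting bracket satisfies power(lo) < target <= power(hi), and a one pre, one post period design; `power_at_n` and `bisect_sample_size` are hypothetical helpers, and the library's auto-bracketing and grid snapping are not modeled.

```python
from statistics import NormalDist


def power_at_n(n_units, effect=0.5, sigma=1.0, alpha=0.05, treat_frac=0.5):
    """Stand-in for simulated power at a given total sample size."""
    n_treated = n_units * treat_frac
    n_control = n_units - n_treated
    se = (sigma**2 * (2 / n_treated + 2 / n_control)) ** 0.5
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect / se - z_crit) + z.cdf(-effect / se - z_crit)


def bisect_sample_size(power_fn, lo, hi, target=0.80, max_steps=15):
    """Integer bisection; invariant: power(lo) < target <= power(hi)."""
    path = []
    for _ in range(max_steps):
        if hi - lo <= 1:
            break
        mid = (lo + hi) // 2
        power = power_fn(mid)
        path.append({"n_units": mid, "power": power})
        if power >= target:
            hi = mid   # mid is sufficient: try smaller samples
        else:
            lo = mid   # mid is insufficient: try larger samples
    return hi, path


required_n, path = bisect_sample_size(power_at_n, lo=10, hi=2000)
print(f"Required N: {required_n} after {len(path)} steps")
```

Each step halves the bracket, so a range of a few thousand units is resolved in roughly a dozen power evaluations.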

Parameters:
  • estimator (estimator object) – DiD estimator to use.

  • treatment_effect (float, default=5.0) – True treatment effect to simulate.

  • n_periods (int, default=4) – Number of time periods.

  • treatment_fraction (float, default=0.5) – Fraction of units that are treated.

  • treatment_period (int, default=2) – First post-treatment period (0-indexed).

  • sigma (float, default=1.0) – Residual standard deviation.

  • n_simulations (int, default=200) – Simulations per bisection step.

  • power (float, default=0.80) – Target power.

  • alpha (float, default=0.05) – Significance level.

  • n_range (tuple of (int, int), optional) – (lo, hi) bracket for sample size. If None, auto-brackets.

  • max_steps (int, default=15) – Maximum bisection steps.

  • seed (int, optional) – Random seed for reproducibility.

  • data_generator (callable, optional) – Custom data generation function.

  • data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.

  • estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().

  • result_extractor (callable, optional) – Custom function to extract results from the estimator output. Forwarded to simulate_power().

  • progress (bool, default=True) – Whether to print progress updates.

  • survey_config (SurveyPowerConfig, optional) – Survey-aware simulation config. Forwarded to simulate_power(). When set, the bisection floor is raised to survey_config.min_viable_n to ensure viable survey structure. See simulate_power() for details and constraints.

Returns:

Results including the required N and search diagnostics.

Return type:

SimulationSampleSizeResults

Examples

>>> from diff_diff import simulate_sample_size, DifferenceInDifferences
>>> result = simulate_sample_size(
...     DifferenceInDifferences(), treatment_effect=5.0, n_simulations=100, seed=42
... )
>>> print(f"Required N: {result.required_n}")

plot_power_curve#

Visualization for a power curve from PowerAnalysis results.

diff_diff.plot_power_curve(results=None, *, effect_sizes=None, powers=None, mde=None, target_power=0.8, plot_type='effect', figsize=(10, 6), title=None, xlabel=None, ylabel='Power', color='#2563eb', mde_color='#dc2626', target_color='#22c55e', linewidth=2.0, show_mde_line=True, show_target_line=True, show_grid=True, ax=None, show=True, backend='matplotlib')[source]#

Create a power curve visualization.

Shows how statistical power changes with effect size or sample size, helping researchers understand the trade-offs in study design.

Parameters:
  • results (PowerResults, SimulationPowerResults, or DataFrame, optional) – Results object from PowerAnalysis or simulate_power(), or a DataFrame with columns ‘effect_size’ and ‘power’ (or ‘sample_size’ and ‘power’). If None, must provide effect_sizes and powers directly.

  • effect_sizes (list of float, optional) – Effect sizes (x-axis values). Required if results is None.

  • powers (list of float, optional) – Power values (y-axis values). Required if results is None.

  • mde (float, optional) – Minimum detectable effect to mark on the plot.

  • target_power (float, default=0.80) – Target power level to show as horizontal line.

  • plot_type (str, default="effect") – Type of power curve: “effect” (power vs effect size) or “sample” (power vs sample size).

  • figsize (tuple, default=(10, 6)) – Figure size (width, height) in inches.

  • title (str, optional) – Plot title. If None, uses a sensible default.

  • xlabel (str, optional) – X-axis label. If None, uses a sensible default.

  • ylabel (str, default="Power") – Y-axis label.

  • color (str, default="#2563eb") – Color for the power curve line.

  • mde_color (str, default="#dc2626") – Color for the MDE vertical line.

  • target_color (str, default="#22c55e") – Color for the target power horizontal line.

  • linewidth (float, default=2.0) – Line width for the power curve.

  • show_mde_line (bool, default=True) – Whether to show vertical line at MDE.

  • show_target_line (bool, default=True) – Whether to show horizontal line at target power.

  • show_grid (bool, default=True) – Whether to show grid lines.

  • ax (matplotlib.axes.Axes, optional) – Axes to plot on. If None, creates new figure.

  • show (bool, default=True) – Whether to call plt.show() at the end.

  • backend (str, default="matplotlib") – Plotting backend: "matplotlib" or "plotly".

Returns:

The axes object (matplotlib) or figure (plotly).

Return type:

matplotlib.axes.Axes or plotly.graph_objects.Figure

Examples

From PowerAnalysis results:

>>> from diff_diff import PowerAnalysis, plot_power_curve
>>> pa = PowerAnalysis(power=0.80)
>>> curve_df = pa.power_curve(n_treated=50, n_control=50, sigma=5.0)
>>> mde_result = pa.mde(n_treated=50, n_control=50, sigma=5.0)
>>> plot_power_curve(curve_df, mde=mde_result.mde)

From simulation results:

>>> from diff_diff import simulate_power, DifferenceInDifferences
>>> results = simulate_power(
...     DifferenceInDifferences(),
...     effect_sizes=[1, 2, 3, 5, 7, 10],
...     n_simulations=200
... )
>>> plot_power_curve(results)

Manual data:

>>> plot_power_curve(
...     effect_sizes=[1, 2, 3, 4, 5],
...     powers=[0.2, 0.5, 0.75, 0.90, 0.97],
...     mde=2.5,
...     target_power=0.80
... )

Complete Example#

from diff_diff import (
    PowerAnalysis,
    compute_mde,
    simulate_power,
    simulate_mde,
    DifferenceInDifferences,
)

# Quick MDE calculation
mde = compute_mde(
    n_treated=50,
    n_control=50,
    n_pre=4,
    n_post=4,
    sigma=1.0,
    rho=0.5,
    power=0.80,
    alpha=0.05
)
print(f"MDE: {mde:.3f}")

# Simulation-based power for DiD estimator
sim_results = simulate_power(
    estimator=DifferenceInDifferences(),
    treatment_effect=5.0,
    n_units=100,
    n_periods=4,
    treatment_period=2,
    sigma=1.0,
    n_simulations=20,
)
print(f"Simulated power: {sim_results.power:.2%}")

# Simulation-based MDE
mde_results = simulate_mde(
    estimator=DifferenceInDifferences(),
    n_units=100,
    n_simulations=10,
    max_steps=5,
)
print(f"Simulated MDE: {mde_results.mde:.3f}")

See Also#