diff_diff.simulate_sample_size

diff_diff.simulate_sample_size(estimator, treatment_effect=5.0, n_periods=4, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=200, power=0.8, alpha=0.05, n_range=None, max_steps=15, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)

Find the required sample size via simulation-based bisection search.

Searches over n_units to find the smallest N that achieves the target power, using simulate_power() at each step.
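
The search described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `toy_power` and `smallest_n` are hypothetical names, and an analytic z-test power curve stands in for the simulated power that simulate_power() would return at each step. The sketch assumes power is non-decreasing in the sample size.

```python
from statistics import NormalDist


def toy_power(n_units, effect=0.5, sigma=1.0, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test, n_units split evenly.

    Hypothetical stand-in for a simulation-based power estimate.
    """
    z = NormalDist()
    se = sigma * (4.0 / n_units) ** 0.5  # SE of a difference in means, n/2 per arm
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(effect / se - z_crit)  # the opposite tail is ignored as negligible


def smallest_n(power_fn, target=0.8, lo=4, hi=4096, max_steps=15):
    """Bisect [lo, hi] for the smallest n with power_fn(n) >= target."""
    if power_fn(lo) >= target:
        return lo
    if power_fn(hi) < target:
        raise ValueError("upper bracket does not reach the target power")
    # Invariant: power_fn(lo) < target <= power_fn(hi).
    for _ in range(max_steps):
        if hi - lo <= 1:
            break
        mid = (lo + hi) // 2
        if power_fn(mid) >= target:
            hi = mid  # mid is large enough; tighten from above
        else:
            lo = mid  # mid is too small; tighten from below
    return hi


n = smallest_n(toy_power)
print(n, round(toy_power(n), 3))
```

With a simulated rather than analytic power estimate, the comparison at each midpoint is noisy, so the returned N is approximate near the boundary; more simulations per step reduce that noise.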

Parameters:

estimator (estimator object) – DiD estimator to use.
treatment_effect (float, default=5.0) – True treatment effect to simulate.
n_periods (int, default=4) – Number of time periods.
treatment_fraction (float, default=0.5) – Fraction of units that are treated.
treatment_period (int, default=2) – First post-treatment period (0-indexed).
sigma (float, default=1.0) – Residual standard deviation.
n_simulations (int, default=200) – Simulations per bisection step.
power (float, default=0.8) – Target power.
alpha (float, default=0.05) – Significance level.
n_range (tuple of (int, int), optional) – (lo, hi) bracket for the sample-size search. If None, the bracket is chosen automatically.
max_steps (int, default=15) – Maximum bisection steps.
seed (int, optional) – Random seed for reproducibility.
data_generator (callable, optional) – Custom data generation function.
data_generator_kwargs (dict, optional) – Additional keyword arguments passed to the data generator.
estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().
result_extractor (callable, optional) – Custom function to extract results from the estimator output. Forwarded to simulate_power().
progress (bool, default=True) – Whether to print progress updates.
survey_config (SurveyPowerConfig, optional) – Survey-aware simulation config. Forwarded to simulate_power(). When set, the bisection floor is raised to survey_config.min_viable_n to ensure viable survey structure. See simulate_power() for details and constraints.
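
To make the design parameters above concrete, the following sketch shows one way they could map onto a simulated panel and a Monte Carlo power estimate. It is an assumption-laden illustration, not the library's default data generator or estimator: `simulate_panel`, `did_rejects`, and `mc_power` are hypothetical helpers, the outcome is pure noise plus a constant effect on treated units in post-treatment periods, and the test uses a normal approximation on unit-level pre/post changes.

```python
from statistics import NormalDist

import numpy as np


def simulate_panel(n_units, n_periods=4, treatment_fraction=0.5,
                   treatment_period=2, treatment_effect=5.0, sigma=1.0, rng=None):
    """Return (y, treated, post) for a simple two-group panel (hypothetical DGP)."""
    rng = np.random.default_rng() if rng is None else rng
    treated = np.zeros(n_units, dtype=bool)
    treated[: int(round(n_units * treatment_fraction))] = True
    post = np.arange(n_periods) >= treatment_period  # periods are 0-indexed
    y = rng.normal(0.0, sigma, size=(n_units, n_periods))
    y[np.ix_(treated, post)] += treatment_effect  # effect only on treated, post
    return y, treated, post


def did_rejects(y, treated, post, alpha=0.05):
    """Two-sided test of the DiD estimate using unit-level post-minus-pre changes."""
    change = y[:, post].mean(axis=1) - y[:, ~post].mean(axis=1)
    t, c = change[treated], change[~treated]
    estimate = t.mean() - c.mean()
    se = np.sqrt(t.var(ddof=1) / t.size + c.var(ddof=1) / c.size)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return bool(abs(estimate / se) > z_crit)


def mc_power(n_units, n_simulations=200, alpha=0.05, seed=None, **design):
    """Share of simulated panels in which the null of no effect is rejected."""
    rng = np.random.default_rng(seed)
    hits = sum(
        did_rejects(*simulate_panel(n_units, rng=rng, **design), alpha=alpha)
        for _ in range(n_simulations)
    )
    return hits / n_simulations


print(mc_power(40, treatment_effect=0.5, seed=0))
```

A function like `mc_power` is the kind of noisy, roughly monotone quantity that a bisection over the number of units would evaluate at each step.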

Returns:

Results including the required N and search diagnostics.

Return type:

SimulationSampleSizeResults

Examples

>>> from diff_diff import simulate_sample_size, DifferenceInDifferences
>>> result = simulate_sample_size(
...     DifferenceInDifferences(), treatment_effect=5.0, n_simulations=100, seed=42
... )
>>> print(f"Required N: {result.required_n}")