diff_diff.simulate_sample_size
- diff_diff.simulate_sample_size(estimator, treatment_effect=5.0, n_periods=4, treatment_fraction=0.5, treatment_period=2, sigma=1.0, n_simulations=200, power=0.8, alpha=0.05, n_range=None, max_steps=15, seed=None, data_generator=None, data_generator_kwargs=None, estimator_kwargs=None, result_extractor=None, progress=True, survey_config=None)
Find the required sample size via simulation-based bisection search.
Searches over n_units to find the smallest N that achieves the target power, using simulate_power() at each step.
- Parameters:
estimator (estimator object) – DiD estimator to use.
treatment_effect (float, default=5.0) – True treatment effect to simulate.
n_periods (int, default=4) – Number of time periods.
treatment_fraction (float, default=0.5) – Fraction of units that are treated.
treatment_period (int, default=2) – First post-treatment period (0-indexed).
sigma (float, default=1.0) – Residual standard deviation.
n_simulations (int, default=200) – Simulations per bisection step.
power (float, default=0.80) – Target power.
alpha (float, default=0.05) – Significance level.
n_range (tuple of (int, int), optional) – (lo, hi) bracket for sample size. If None, auto-brackets.
max_steps (int, default=15) – Maximum bisection steps.
seed (int, optional) – Random seed for reproducibility.
data_generator (callable, optional) – Custom data generation function.
data_generator_kwargs (dict, optional) – Additional keyword arguments for data generator.
estimator_kwargs (dict, optional) – Additional keyword arguments for estimator.fit().
result_extractor (callable, optional) – Custom function to extract results from the estimator output. Forwarded to simulate_power().
progress (bool, default=True) – Whether to print progress updates.
survey_config (SurveyPowerConfig, optional) – Survey-aware simulation config. Forwarded to simulate_power(). When set, the bisection floor is raised to survey_config.min_viable_n to ensure viable survey structure. See simulate_power() for details and constraints.
- Returns:
Results including the required N and search diagnostics.
- Return type:
Examples
>>> from diff_diff import simulate_sample_size, DifferenceInDifferences
>>> result = simulate_sample_size(
...     DifferenceInDifferences(), treatment_effect=5.0, n_simulations=100, seed=42
... )
>>> print(f"Required N: {result.required_n}")
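
The bisection idea described above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: `bisect_sample_size` and `approx_power` are hypothetical helpers, and `approx_power` substitutes a closed-form two-sided z-test approximation for the simulated power that simulate_power() would estimate at each step. The sketch also assumes power is non-decreasing in N, which a noisy Monte Carlo estimate only satisfies approximately.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_units, effect=0.5, sigma=1.0, alpha=0.05, treatment_fraction=0.5):
    """Closed-form stand-in for a simulated power estimate.

    Hypothetical helper (not part of diff_diff): power of a two-sided z-test
    on a difference in means between treated and control units.
    """
    n_treated = max(1, round(n_units * treatment_fraction))
    n_control = max(1, n_units - n_treated)
    se = sigma * sqrt(1.0 / n_treated + 1.0 / n_control)
    normal = NormalDist()
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    shift = effect / se
    # Probability of rejecting in either tail when the true effect is `effect`.
    return (1.0 - normal.cdf(z_crit - shift)) + normal.cdf(-z_crit - shift)


def bisect_sample_size(power_fn, lo, hi, target=0.8, max_steps=15):
    """Search [lo, hi] for the smallest n with power_fn(n) >= target.

    Assumes power_fn is non-decreasing in n. Returns None if even `hi` misses
    the target. If max_steps runs out before the bracket closes, the returned
    value still meets the target but may not be the smallest such n.
    """
    if power_fn(hi) < target:
        return None
    if power_fn(lo) >= target:
        return lo
    # Invariant from here on: power_fn(lo) < target <= power_fn(hi).
    for _ in range(max_steps):
        if hi - lo <= 1:
            break
        mid = (lo + hi) // 2
        if power_fn(mid) >= target:
            hi = mid  # mid is large enough; tighten the upper bound
        else:
            lo = mid  # mid is too small; raise the lower bound
    return hi


required = bisect_sample_size(approx_power, lo=4, hi=2000)
print(f"Required N (approximation): {required}")
```

With a simulated power estimate, each call to `power_fn` is expensive and noisy, which is why the real function exposes n_simulations (cost and precision per step) and max_steps (a cap on the number of steps) as separate knobs.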