diff_diff.generate_continuous_did_data

diff_diff.generate_continuous_did_data(n_units=500, n_periods=4, cohort_periods=None, never_treated_frac=0.3, dose_distribution='lognormal', dose_params=None, att_function='linear', att_slope=2.0, att_intercept=1.0, unit_fe_sd=2.0, time_trend=0.5, noise_sd=1.0, seed=None)

Generate synthetic data for continuous DiD analysis with known dose-response.

Creates a balanced panel with continuous treatment doses and a known ATT(d) function. The data satisfy strong parallel trends by construction.

Parameters:
  • n_units (int, default=500) – Number of units in the panel.

  • n_periods (int, default=4) – Number of time periods (1-indexed).

  • cohort_periods (list of int, optional) – Treatment cohort periods. Default: [2] (single cohort).

  • never_treated_frac (float, default=0.3) – Fraction of units that are never-treated.

  • dose_distribution (str, default="lognormal") – Distribution from which doses are drawn: one of "lognormal", "uniform", or "exponential".

  • dose_params (dict, optional) – Distribution-specific parameters. Defaults: lognormal: {"mean": 0.5, "sigma": 0.5}; uniform: {"low": 0.5, "high": 5.0}; exponential: {"scale": 2.0}.

  • att_function (str, default="linear") – Functional form of ATT(d): one of "linear", "quadratic", or "log".

  • att_slope (float, default=2.0) – Slope parameter for ATT function.

  • att_intercept (float, default=1.0) – Intercept parameter for ATT function.

  • unit_fe_sd (float, default=2.0) – Standard deviation of unit fixed effects.

  • time_trend (float, default=0.5) – Linear time trend coefficient.

  • noise_sd (float, default=1.0) – Standard deviation of idiosyncratic noise.

  • seed (int, optional) – Random seed for reproducibility.
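
The dose and ATT parameters above can be illustrated with a small standard-library sketch. It is not the library's implementation. The helper names `draw_dose` and `att` are hypothetical. The exact functional forms for "quadratic" and "log", the reading of the lognormal "mean" as the mean of the underlying normal, and the merging of user-supplied `dose_params` over the defaults are all assumptions, since this page does not spell them out.

```python
import math
import random

# Documented defaults for dose_params, keyed by dose_distribution.
DEFAULT_DOSE_PARAMS = {
    "lognormal": {"mean": 0.5, "sigma": 0.5},
    "uniform": {"low": 0.5, "high": 5.0},
    "exponential": {"scale": 2.0},
}


def draw_dose(rng, dose_distribution="lognormal", dose_params=None):
    """Draw one dose (hypothetical helper, not part of diff_diff)."""
    if dose_distribution not in DEFAULT_DOSE_PARAMS:
        raise ValueError(f"unknown dose_distribution: {dose_distribution!r}")
    # ASSUMPTION: user-supplied keys override the documented defaults.
    params = {**DEFAULT_DOSE_PARAMS[dose_distribution], **(dose_params or {})}
    if dose_distribution == "lognormal":
        # ASSUMPTION: "mean" is the mean of the underlying normal.
        return rng.lognormvariate(params["mean"], params["sigma"])
    if dose_distribution == "uniform":
        return rng.uniform(params["low"], params["high"])
    # random.expovariate takes a rate, which is 1 / scale.
    return rng.expovariate(1.0 / params["scale"])


def att(dose, att_function="linear", att_slope=2.0, att_intercept=1.0):
    """Evaluate an ASSUMED ATT(d) for each documented att_function value."""
    if att_function == "linear":
        return att_intercept + att_slope * dose
    if att_function == "quadratic":
        return att_intercept + att_slope * dose ** 2
    if att_function == "log":
        return att_intercept + att_slope * math.log1p(dose)
    raise ValueError(f"unknown att_function: {att_function!r}")


rng = random.Random(42)
dose = draw_dose(rng, "uniform")
print(dose, att(dose, "linear"))
```

Under these assumed forms, the default `att_slope=2.0` and `att_intercept=1.0` give a linear ATT of 3.0 at a dose of 1.0.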

Returns:

Panel data with columns: unit, period, outcome, first_treat, dose, true_att.

Return type:

pd.DataFrame
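
To make the returned columns concrete, the following is a rough standard-library sketch of how a panel of this shape could be assembled. It is not the library's code: `sketch_panel` is a hypothetical name, it returns a list of dicts rather than a `pd.DataFrame`, and it fixes a lognormal dose and a linear ATT(d). Coding never-treated units with `first_treat = 0` and `dose = 0`, and setting `true_att` to zero before treatment, are assumptions. In this sketch the untreated outcome path (unit effect plus a common linear trend plus noise) does not depend on dose, which is one simple way to obtain strong parallel trends by construction.

```python
import random


def sketch_panel(n_units=10, n_periods=4, cohort_periods=(2,),
                 never_treated_frac=0.3, att_slope=2.0, att_intercept=1.0,
                 unit_fe_sd=2.0, time_trend=0.5, noise_sd=1.0, seed=None):
    """Build rows with the documented columns (illustrative only)."""
    rng = random.Random(seed)
    n_never = int(round(n_units * never_treated_frac))
    rows = []
    for unit in range(n_units):
        unit_fe = rng.gauss(0.0, unit_fe_sd)
        if unit < n_never:
            # ASSUMPTION: never-treated units are coded with zeros.
            first_treat, dose = 0, 0.0
        else:
            first_treat = rng.choice(list(cohort_periods))
            dose = rng.lognormvariate(0.5, 0.5)
        for period in range(1, n_periods + 1):  # periods are 1-indexed
            treated_now = first_treat > 0 and period >= first_treat
            # ASSUMPTION: linear ATT(d), zero before treatment starts.
            true_att = att_intercept + att_slope * dose if treated_now else 0.0
            # The untreated path is the same for every dose level.
            outcome = (unit_fe + time_trend * period + true_att
                       + rng.gauss(0.0, noise_sd))
            rows.append({
                "unit": unit,
                "period": period,
                "outcome": outcome,
                "first_treat": first_treat,
                "dose": dose,
                "true_att": true_att,
            })
    return rows


rows = sketch_panel(seed=0)
print(len(rows), sorted(rows[0]))
```

Because every unit appears in every period, the panel is balanced: the row count equals `n_units * n_periods`.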