Results Classes

Dataclass containers for estimation results from various estimators.

DiDResults

Results from basic DifferenceInDifferences estimation.

class diff_diff.DiDResults[source]

Bases: object

Results from a Difference-in-Differences estimation.

Provides easy access to coefficients, standard errors, confidence intervals, and summary statistics in a Pythonic way.

att

Average Treatment effect on the Treated (ATT).

Type:

float

se

Standard error of the ATT estimate.

Type:

float

t_stat

T-statistic for the ATT estimate.

Type:

float

p_value

P-value for the null hypothesis that ATT = 0.

Type:

float

conf_int

Confidence interval for the ATT.

Type:

tuple[float, float]

n_obs

Number of observations used in estimation.

Type:

int

n_treated

Number of treated units/observations.

Type:

int

n_control

Number of control units/observations.

Type:

int
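The inferential fields above are linked by the usual Wald relationships. The sketch below recomputes t_stat, p_value, and conf_int from att and se under a normal approximation. That reference distribution is an assumption made for illustration: this page does not say which distribution the estimator uses, and a t distribution with finite degrees of freedom would give slightly wider intervals. The helper name wald_summary is hypothetical and is not part of diff_diff.

```python
from statistics import NormalDist


def wald_summary(att, se, alpha=0.05):
    """Hypothetical helper: Wald-style inference under a normal approximation."""
    z = NormalDist()  # standard normal (mean 0, sd 1)
    t_stat = att / se
    p_value = 2.0 * (1.0 - z.cdf(abs(t_stat)))  # two-sided test of ATT = 0
    crit = z.inv_cdf(1.0 - alpha / 2.0)  # critical value for a (1 - alpha) interval
    conf_int = (att - crit * se, att + crit * se)
    return t_stat, p_value, conf_int


t_stat, p_value, conf_int = wald_summary(att=2.0, se=0.5)
```

With bootstrap inference (inference_method other than 'analytical') the interval may instead come from the bootstrap distribution, so these identities need not hold exactly.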

Attributes

att

se

t_stat

p_value

conf_int

n_obs

is_significant

Check if the ATT is statistically significant at the alpha level.

significance_stars

Return significance stars based on p-value.

Methods

summary([alpha])

Generate a formatted summary of the estimation results.

to_dict()

Convert results to a dictionary.

to_dataframe()

Convert results to a pandas DataFrame.

att: float
se: float
t_stat: float
p_value: float
conf_int: Tuple[float, float]
n_obs: int
n_treated: int
n_control: int
alpha: float = 0.05
coefficients: Dict[str, float] | None = None
vcov: ndarray | None = None
residuals: ndarray | None = None
fitted_values: ndarray | None = None
r_squared: float | None = None
inference_method: str = 'analytical'
n_bootstrap: int | None = None
n_clusters: int | None = None
bootstrap_distribution: ndarray | None = None
survey_metadata: Any | None = None
vcov_type: str | None = None
cluster_name: str | None = None
conley_lag_cutoff: int | None = None
__repr__()[source]

Concise string representation.

Return type:

str

property coef_var: float

Coefficient of variation: SE / abs(ATT). NaN when ATT is 0 or SE is non-finite.
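As a minimal sketch of the rule stated for coef_var, the standalone function below reproduces the documented behavior (SE divided by the absolute ATT, with NaN for a zero ATT or a non-finite SE). It is an illustration written from the description above, not the library's source.

```python
import math


def coef_var(att, se):
    """SE / abs(ATT); NaN when ATT is 0 or SE is not finite."""
    if att == 0 or not math.isfinite(se):
        return math.nan
    return se / abs(att)


cv_negative_att = coef_var(-2.0, 0.5)   # sign of the ATT does not matter
cv_zero_att = coef_var(0.0, 0.5)        # undefined ratio, reported as NaN
cv_infinite_se = coef_var(1.0, math.inf)
```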

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:

alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.

Returns:

Formatted summary table.

Return type:

str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:

alpha (float | None)

Return type:

None

to_dict()[source]

Convert results to a dictionary.

Returns:

Dictionary containing all estimation results.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:

DataFrame with estimation results.

Return type:

pd.DataFrame

property is_significant: bool

Check if the ATT is statistically significant at the alpha level.

property significance_stars: str

Return significance stars based on p-value.
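The two properties can be pictured as simple functions of p_value and alpha. The sketch below is hedged: the star cutoffs (0.001, 0.01, 0.05, 0.1) are a common convention assumed here for illustration, since this page does not list the thresholds the library applies.

```python
def is_significant(p_value, alpha=0.05):
    """True when the p-value falls below the significance level."""
    return p_value < alpha


def significance_stars(p_value):
    """Map a p-value to a star label using assumed conventional cutoffs."""
    # Assumption: cutoffs are not documented on this page.
    for cutoff, stars in ((0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, ".")):
        if p_value < cutoff:
            return stars
    return ""
```

A NaN p-value compares as not less than any cutoff, so this sketch reports it as not significant with an empty label.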

__init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated, n_control, alpha=0.05, coefficients=None, vcov=None, residuals=None, fitted_values=None, r_squared=None, inference_method='analytical', n_bootstrap=None, n_clusters=None, bootstrap_distribution=None, survey_metadata=None, vcov_type=None, cluster_name=None, conley_lag_cutoff=None)
Return type:

None

MultiPeriodDiDResults

Results from MultiPeriodDiD event study estimation.

class diff_diff.MultiPeriodDiDResults[source]

Bases: object

Results from a Multi-Period Difference-in-Differences estimation.

Provides access to period-specific treatment effects as well as an aggregate average treatment effect.

period_effects

Dictionary mapping period identifiers to their PeriodEffect objects. Contains all estimated period effects (pre and post), excluding the reference period, which is normalized to zero.

Type:

dict[any, PeriodEffect]

avg_att

Average Treatment effect on the Treated across post-periods only.

Type:

float

avg_se

Standard error of the average ATT.

Type:

float

avg_t_stat

T-statistic for the average ATT.

Type:

float

avg_p_value

P-value for the null hypothesis that average ATT = 0.

Type:

float

avg_conf_int

Confidence interval for the average ATT.

Type:

tuple[float, float]

n_obs

Number of observations used in estimation.

Type:

int

n_treated

Number of treated units/observations.

Type:

int

n_control

Number of control units/observations.

Type:

int

pre_periods

List of pre-treatment period identifiers.

Type:

list

post_periods

List of post-treatment period identifiers.

Type:

list

reference_period

The reference (omitted) period. Its coefficient is zero by construction and it is excluded from period_effects.

Type:

any, optional

interaction_indices

Mapping from period identifier to column index in the full variance-covariance matrix. Used internally for sub-VCV extraction (e.g., by HonestDiD and PreTrendsPower).

Type:

dict, optional
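To make the role of interaction_indices concrete, the sketch below extracts the sub-matrix of a variance-covariance matrix for a set of periods and uses it to compute the standard error of an equally weighted average of the post-period coefficients. Two things are assumptions: that avg_att is an equally weighted mean (this page only says it averages over post-periods), and the toy matrix and index values. The function names are hypothetical.

```python
import math


def sub_vcov(vcov, interaction_indices, periods):
    """Select the rows and columns of vcov that belong to the given periods."""
    idx = [interaction_indices[p] for p in periods]
    return [[vcov[i][j] for j in idx] for i in idx]


def average_effect_se(vcov, interaction_indices, post_periods):
    """SE of the equally weighted mean: sqrt(1' V 1) / k."""
    sub = sub_vcov(vcov, interaction_indices, post_periods)
    k = len(post_periods)
    return math.sqrt(sum(sum(row) for row in sub)) / k


# Toy 4x4 matrix: column 0 is an unrelated regressor, columns 1 to 3 are the
# period interactions. Period 2020 is the reference and has no column.
vcov = [
    [1.00, 0.00, 0.00, 0.00],
    [0.00, 0.04, 0.01, 0.00],
    [0.00, 0.01, 0.09, 0.02],
    [0.00, 0.00, 0.02, 0.16],
]
interaction_indices = {2019: 1, 2021: 2, 2022: 3}
post_periods = [2021, 2022]

post_vcov = sub_vcov(vcov, interaction_indices, post_periods)
avg_se = average_effect_se(vcov, interaction_indices, post_periods)
```

The off-diagonal terms matter: ignoring the 0.02 covariance would understate the standard error of the average.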

Attributes

period_effects

att

pre_periods

post_periods

reference_period

interaction_indices

pre_period_effects

Pre-period effects only (for parallel trends assessment).

post_period_effects

Post-period effects only.

period_effects: Dict[Any, PeriodEffect]
avg_att: float
avg_se: float
avg_t_stat: float
avg_p_value: float
avg_conf_int: Tuple[float, float]
n_obs: int
n_treated: int
n_control: int
pre_periods: List[Any]
post_periods: List[Any]
alpha: float = 0.05
coefficients: Dict[str, float] | None = None
vcov: ndarray | None = None
residuals: ndarray | None = None
fitted_values: ndarray | None = None
r_squared: float | None = None
reference_period: Any | None = None
interaction_indices: Dict[Any, int] | None = None
survey_metadata: Any | None = None
inference_method: str = 'analytical'
n_bootstrap: int | None = None
n_clusters: int | None = None
vcov_type: str | None = None
cluster_name: str | None = None
conley_lag_cutoff: int | None = None
property att: float
property se: float
property conf_int: Tuple[float, float]
property p_value: float
property t_stat: float
__repr__()[source]

Concise string representation.

Return type:

str

property pre_period_effects: Dict[Any, PeriodEffect]

Pre-period effects only (for parallel trends assessment).

property post_period_effects: Dict[Any, PeriodEffect]

Post-period effects only.
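The pre/post views can be thought of as filters over period_effects keyed by the pre_periods and post_periods lists. The sketch below uses a stand-in dataclass with the same six fields as PeriodEffect; the stand-in, the helper name, and the toy numbers are all hypothetical. The toy data also assumes the reference period is listed in pre_periods.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class PeriodEffectStub:
    """Stand-in mirroring the documented PeriodEffect fields."""
    period: Any
    effect: float
    se: float
    t_stat: float
    p_value: float
    conf_int: Tuple[float, float]


def split_effects(period_effects: Dict[Any, PeriodEffectStub],
                  pre_periods: List[Any], post_periods: List[Any]):
    """Return (pre, post) dictionaries; the reference period appears in neither."""
    pre = {p: e for p, e in period_effects.items() if p in pre_periods}
    post = {p: e for p, e in period_effects.items() if p in post_periods}
    return pre, post


# 2019 is the reference period, so it has no entry in period_effects.
effects = {
    2018: PeriodEffectStub(2018, 0.1, 0.2, 0.5, 0.62, (-0.29, 0.49)),
    2020: PeriodEffectStub(2020, 1.2, 0.3, 4.0, 0.0001, (0.61, 1.79)),
    2021: PeriodEffectStub(2021, 1.5, 0.3, 5.0, 0.00001, (0.91, 2.09)),
}
pre, post = split_effects(effects, pre_periods=[2018, 2019], post_periods=[2020, 2021])
```

Pre-period effects near zero are what a parallel-trends check looks for; the post-period dictionary holds the effects that feed the average ATT.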

property coef_var: float

Coefficient of variation: SE / abs(overall ATT). NaN when ATT is 0 or SE is non-finite.

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:

alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.

Returns:

Formatted summary table.

Return type:

str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:

alpha (float | None)

Return type:

None

get_effect(period)[source]

Get the treatment effect for a specific period.

Parameters:

period (any) – The period identifier.

Returns:

The treatment effect for the specified period.

Return type:

PeriodEffect

Raises:

KeyError – If the period is not among the estimated periods in period_effects.

to_dict()[source]

Convert results to a dictionary.

Returns:

Dictionary containing all estimation results.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert period-specific effects to a pandas DataFrame.

Returns:

DataFrame with one row per estimated period (pre and post).

Return type:

pd.DataFrame

property is_significant: bool

Check if the average ATT is statistically significant at the alpha level.

property significance_stars: str

Return significance stars for the average ATT based on p-value.

__init__(period_effects, avg_att, avg_se, avg_t_stat, avg_p_value, avg_conf_int, n_obs, n_treated, n_control, pre_periods, post_periods, alpha=0.05, coefficients=None, vcov=None, residuals=None, fitted_values=None, r_squared=None, reference_period=None, interaction_indices=None, survey_metadata=None, inference_method='analytical', n_bootstrap=None, n_clusters=None, vcov_type=None, cluster_name=None, conley_lag_cutoff=None)
Return type:

None

PeriodEffect

Container for a single period’s treatment effect in event studies.

class diff_diff.PeriodEffect[source]

Bases: object

Treatment effect for a single time period.

period

The time period identifier.

Type:

any

effect

The treatment effect estimate for this period.

Type:

float

se

Standard error of the effect estimate.

Type:

float

t_stat

T-statistic for the effect estimate.

Type:

float

p_value

P-value for the null hypothesis that effect = 0.

Type:

float

conf_int

Confidence interval for the effect.

Type:

tuple[float, float]

period: Any
effect: float
se: float
t_stat: float
p_value: float
conf_int: Tuple[float, float]
__repr__()[source]

Concise string representation.

Return type:

str

property is_significant: bool

Check if the effect is statistically significant at the 0.05 level.

property significance_stars: str

Return significance stars based on p-value.

__init__(period, effect, se, t_stat, p_value, conf_int)
Return type:

None

SyntheticDiDResults

Results from SyntheticDiD estimation.

class diff_diff.SyntheticDiDResults[source]

Bases: object

Results from a Synthetic Difference-in-Differences estimation.

Combines DiD with synthetic control by re-weighting control units to match pre-treatment trends of treated units.

att

Average Treatment effect on the Treated (ATT).

Type:

float

se

Standard error of the ATT estimate (bootstrap, jackknife, or placebo-based).

Type:

float

t_stat

T-statistic for the ATT estimate.

Type:

float

p_value

P-value for the null hypothesis that ATT = 0.

Type:

float

conf_int

Confidence interval for the ATT.

Type:

tuple[float, float]

n_obs

Number of observations used in estimation.

Type:

int

n_treated

Number of treated units/observations.

Type:

int

n_control

Number of control units/observations.

Type:

int

unit_weights

Dictionary mapping control unit IDs to their synthetic weights. When survey weights are used, these are the composed effective weights (omega_eff = raw Frank-Wolfe * survey, renormalized) that were applied to produce the ATT, not the raw Frank-Wolfe solution.

Type:

dict
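The composition described for unit_weights (raw Frank-Wolfe weight times survey weight, renormalized to sum to 1) can be sketched as follows. The function name and toy values are hypothetical, and raising on a zero total is an illustrative choice rather than documented library behavior.

```python
def compose_effective_weights(raw_omega, survey_weights):
    """omega_eff[u] = raw_omega[u] * survey_weights[u], renormalized to sum to 1."""
    products = {u: raw_omega[u] * survey_weights[u] for u in raw_omega}
    total = sum(products.values())
    if total == 0:
        # Illustrative guard: nothing sensible to renormalize.
        raise ValueError("composed weights sum to zero")
    return {u: w / total for u, w in products.items()}


raw_omega = {"A": 0.5, "B": 0.3, "C": 0.2}        # raw Frank-Wolfe solution
survey_weights = {"A": 1.0, "B": 2.0, "C": 1.0}   # per-unit survey weights
omega_eff = compose_effective_weights(raw_omega, survey_weights)
```

Here unit B overtakes unit A after composition (0.6 / 1.3 against 0.5 / 1.3), which is why concentration diagnostics should be read from the effective weights rather than the raw ones.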

time_weights

Dictionary mapping pre-treatment periods to their time weights.

Type:

dict

pre_periods

List of pre-treatment period identifiers.

Type:

list

post_periods

List of post-treatment period identifiers.

Type:

list

variance_method

Method used for variance estimation: "bootstrap" (paper-faithful pairs bootstrap re-estimating ω and λ via Frank-Wolfe on each draw; Arkhangelsky et al. 2021 Algorithm 2 step 2, and R’s default synthdid::vcov(method="bootstrap")), "jackknife", or "placebo".

Type:

str

placebo_effects

Method-specific per-iteration estimates: placebo treatment effects (for "placebo"), bootstrap ATT estimates with re-estimated weights per draw (for "bootstrap"), or leave-one-out estimates (for "jackknife"). The variance_method field disambiguates the contents.

Type:

np.ndarray, optional

synthetic_pre_trajectory

Synthetic control trajectory in pre-treatment periods, shape (n_pre,). Equal to Y_pre_control @ omega_eff where omega_eff is the composed effective weight vector.

Type:

np.ndarray, optional
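The product Y_pre_control @ omega_eff is a weighted average of control outcomes in each period. The sketch below spells it out in plain Python with rows as periods and columns as control units, the orientation implied by the (n_pre,) output shape. The function name and the numbers are hypothetical.

```python
def synthetic_trajectory(Y_control, omega_eff):
    """Row-wise dot product: one synthetic outcome per period."""
    return [sum(y * w for y, w in zip(row, omega_eff)) for row in Y_control]


# Three pre-treatment periods (rows) and two control units (columns).
Y_pre_control = [
    [1.0, 3.0],
    [2.0, 4.0],
    [3.0, 5.0],
]
omega_eff = [0.25, 0.75]
synthetic_pre = synthetic_trajectory(Y_pre_control, omega_eff)
```

Comparing this vector with treated_pre_trajectory period by period gives a quick visual check of pre-treatment fit.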

synthetic_post_trajectory

Synthetic control trajectory in post-treatment periods, shape (n_post,).

Type:

np.ndarray, optional

treated_pre_trajectory

Treated-unit mean trajectory in pre-treatment periods, shape (n_pre,). Survey-weighted when the fit used survey weights.

Type:

np.ndarray, optional

treated_post_trajectory

Treated-unit mean trajectory in post-treatment periods, shape (n_post,).

Type:

np.ndarray, optional

time_weights_array

The Frank-Wolfe time weights as a 1-D array (same values as the time_weights dict but order-stable and usable for re-estimation by sensitivity methods). Shape (n_pre,).

Type:

np.ndarray, optional

Attributes

att: float
se: float
t_stat: float
p_value: float
conf_int: Tuple[float, float]
n_obs: int
n_treated: int
n_control: int
unit_weights: Dict[Any, float]
time_weights: Dict[Any, float]
pre_periods: List[Any]
post_periods: List[Any]
alpha: float = 0.05
variance_method: str = 'placebo'
noise_level: float | None = None
zeta_omega: float | None = None
zeta_lambda: float | None = None
pre_treatment_fit: float | None = None
placebo_effects: ndarray | None = None
n_bootstrap: int | None = None
survey_metadata: Any | None = None
synthetic_pre_trajectory: ndarray | None = None
synthetic_post_trajectory: ndarray | None = None
treated_pre_trajectory: ndarray | None = None
treated_post_trajectory: ndarray | None = None
time_weights_array: ndarray | None = None
__repr__()[source]

Concise string representation.

Return type:

str

__getstate__()[source]

Exclude the internal fit snapshot from pickling.

The snapshot retains outcome matrices, unit IDs, and survey weights to support post-hoc diagnostics (in_time_placebo, sensitivity_to_zeta_omega). Serialization would otherwise carry that panel state to wherever the pickle is sent, which is a privacy hazard for survey-weighted or sensitive fits.

Unpickled results keep the public fields (ATT, weights, trajectories, etc.); calling a diagnostic method that needs the snapshot raises a ValueError directing the user to re-fit.

Return type:

Dict[str, Any]
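The pattern described above, dropping private state in __getstate__ so that it never reaches a pickle, can be sketched with a small stand-in class. The class, the attribute name _fit_snapshot, and the error message are hypothetical; only the general mechanism is taken from the description.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ResultsSketch:
    """Hypothetical stand-in; `_fit_snapshot` is an assumed attribute name."""
    att: float
    unit_weights: Dict[Any, float]
    _fit_snapshot: Optional[dict] = field(default=None, repr=False)

    def __getstate__(self):
        # Copy the instance dict and blank out the panel-level snapshot so it
        # is never part of the serialized payload.
        state = dict(self.__dict__)
        state["_fit_snapshot"] = None
        return state

    def in_time_placebo(self):
        if self._fit_snapshot is None:
            raise ValueError("fit snapshot unavailable; re-fit to run diagnostics")
        return "diagnostic would run here"


fitted = ResultsSketch(att=1.5, unit_weights={"A": 1.0},
                       _fit_snapshot={"Y": [[1.0, 2.0]]})

# pickle calls __getstate__ when dumping and, when no __setstate__ is defined,
# restores the returned dict into a fresh instance's __dict__. The two lines
# below replay those steps by hand.
restored = ResultsSketch.__new__(ResultsSketch)
restored.__dict__.update(fitted.__getstate__())
```

The original object keeps its snapshot, so diagnostics still work in the session that produced the fit; only the serialized copy loses it.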

property coef_var: float

SE / abs(ATT). NaN when ATT is 0 or SE non-finite.

Type:

Coefficient of variation

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:

alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.

Returns:

Formatted summary table.

Return type:

str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:

alpha (float | None)

Return type:

None

to_dict()[source]

Convert results to a dictionary.

Returns:

Dictionary containing all estimation results.

Return type:

Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:

DataFrame with estimation results.

Return type:

pd.DataFrame

get_unit_weights_df()[source]

Get unit weights as a pandas DataFrame.

Returns:

DataFrame with unit IDs and their weights.

Return type:

pd.DataFrame

get_time_weights_df()[source]

Get time weights as a pandas DataFrame.

Returns:

DataFrame with time periods and their weights.

Return type:

pd.DataFrame

get_loo_effects_df()[source]

Per-unit leave-one-out ATT from the jackknife variance pass.

Requires variance_method='jackknife' (ValueError otherwise) and unit-level LOO granularity (NotImplementedError for the full-design survey jackknife path, which uses PSU-level LOO).

Available on:

  • non-survey jackknife fits (classical Arkhangelsky Algorithm 3).

  • pweight-only survey jackknife fits (Algorithm 3 with post-hoc ω_eff composition; PSU labels in survey_metadata come from implicit-PSU metadata but the LOO remains unit-level).

Blocked on:

  • full-design survey jackknife fits (strata / PSU / FPC set in SurveyDesign) - the underlying replicates are PSU-level τ̂_{(h,j)} (Rust & Rao 1996), not unit-level. See result.placebo_effects for the raw PSU-level replicate array and REGISTRY §SyntheticDiD “Note (survey + jackknife composition)” for the aggregation formula.

The underlying unit-level values come from the jackknife loops in SyntheticDiD._jackknife_se: control LOO estimates fill the first n_control positions (in the order of the control units seen by fit), then treated LOO estimates fill the next n_treated positions. This method joins those estimates back to user-facing unit identities.

att_loo is NaN when the fit hit the zero-sum weight guard for that unit (survey weights composed to zero once the unit was dropped). delta_from_full propagates NaN in that case.

Returns:

Columns:

  • unit - user’s unit ID

  • role - 'control' or 'treated'

  • att_loo - ATT with this unit dropped

  • delta_from_full - att_loo - self.att

Sorted by |delta_from_full| descending, NaN rows at the end.

Return type:

pd.DataFrame
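The join and ordering described for get_loo_effects_df can be sketched without pandas: leave-one-out estimates arrive with controls first and treated units after, each is labelled with its unit and role, and rows are sorted by absolute delta with NaN rows last. The function name and toy values are hypothetical, and a list of dicts stands in for the DataFrame.

```python
import math


def loo_table(att_full, control_units, treated_units, att_loo):
    """Label positional LOO estimates and sort by |delta_from_full| descending."""
    units = list(control_units) + list(treated_units)
    roles = ["control"] * len(control_units) + ["treated"] * len(treated_units)
    rows = [
        {"unit": u, "role": r, "att_loo": a, "delta_from_full": a - att_full}
        for u, r, a in zip(units, roles, att_loo)
    ]

    def sort_key(row):
        delta = row["delta_from_full"]
        if math.isnan(delta):
            return (1, 0.0)          # NaN rows go to the end
        return (0, -abs(delta))      # larger |delta| first

    return sorted(rows, key=sort_key)


table = loo_table(
    att_full=1.0,
    control_units=["c1", "c2", "c3"],
    treated_units=["t1"],
    att_loo=[1.1, 0.7, math.nan, 1.5],  # c3 hit the zero-sum weight guard
)
ordered_units = [row["unit"] for row in table]
```

Units at the top of the table are the ones whose removal moves the ATT the most, which makes them the natural starting point for an influence check.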

in_time_placebo(fake_treatment_periods=None, zeta_omega_override=None, zeta_lambda_override=None)[source]

Re-estimate the ATT on shifted fake treatment periods within the original pre-treatment window.

A credible placebo should produce near-zero ATTs at every shifted date. Departures from zero signal that whatever the estimator picked up at the real treatment date is also present pre-treatment, weakening the causal interpretation.

The post-treatment data is never used — only the pre-window is re-sliced. Regularization reuses self.zeta_omega and self.zeta_lambda from the original fit (R synthdid convention), unless overrides are supplied.

Parameters:
  • fake_treatment_periods (list, optional) – Explicit pre-period values to test. If None (default), sweeps every feasible pre-period — every P in pre_periods whose position i satisfies i >= 2 (so at least 2 pre-fake periods remain for weight estimation) and i <= n_pre - 1 (so at least 1 post-fake period exists). Values not in pre_periods raise ValueError (a value in post_periods is explicitly not a placebo).

  • zeta_omega_override (float, optional) – Override self.zeta_omega for the refit. Default reuses the original.

  • zeta_lambda_override (float, optional) – Override self.zeta_lambda for the refit.

Returns:

Columns:
  • fake_treatment_period — the shifted date

  • att — placebo ATT (ideally near 0)

  • pre_fit_rmse — RMSE on the fake pre-window

  • n_pre_fake — periods before the fake date

  • n_post_fake — periods from the fake date onward

NaN is emitted only for dimensional infeasibility. Frank-Wolfe does not expose a mid-solver non-convergence signal; inspect pre_fit_rmse for poor refit quality.

Return type:

pd.DataFrame
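The default sweep can be illustrated by enumerating the feasible fake treatment dates. The sketch assumes positions are 0-based, which matches the stated counts (position i leaves i periods before the fake date and n_pre - i periods from it onward). The function name is hypothetical and no estimation is performed.

```python
def feasible_fake_periods(pre_periods):
    """List pre-periods usable as fake treatment dates, with window sizes."""
    n_pre = len(pre_periods)
    rows = []
    for i, period in enumerate(pre_periods):
        # i >= 2 keeps at least two pre-fake periods for weight estimation;
        # i <= n_pre - 1 keeps at least one post-fake period.
        if 2 <= i <= n_pre - 1:
            rows.append({
                "fake_treatment_period": period,
                "n_pre_fake": i,
                "n_post_fake": n_pre - i,
            })
    return rows


sweep = feasible_fake_periods([2015, 2016, 2017, 2018, 2019])
```

With five pre-periods the sweep covers the last three, so a panel with fewer than three pre-treatment periods has no feasible placebo date under this rule.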

sensitivity_to_zeta_omega(zeta_grid=None, multipliers=(0.25, 0.5, 1.0, 2.0, 4.0))[source]

Re-estimate the ATT across a grid of zeta_omega values to show how sensitive the estimate is to the unit-weight regularization.

The Frank-Wolfe time weights computed during the original fit are held fixed here — this method isolates sensitivity to zeta_omega specifically. zeta_lambda and the time weights are not re-fit.

Parameters:
  • zeta_grid (list of float, optional) – Absolute zeta_omega values to evaluate. If None (default), uses multipliers * self.zeta_omega — i.e. a 5-point grid by default, spanning 16x from the smallest to the largest multiplier and symmetric in log space around 1.0.

  • multipliers (tuple of float, default (0.25, 0.5, 1.0, 2.0, 4.0)) – Multipliers on self.zeta_omega. Ignored when zeta_grid is supplied.

Returns:

Columns:
  • zeta_omega — the regularization value evaluated

  • att — resulting ATT

  • pre_fit_rmse — RMSE on the original pre-period

  • max_unit_weight — max element of the composed omega_eff (sensitivity indicator: close to 1 means near-one-hot solutions; close to 1/n_control means near-uniform)

  • effective_n - 1 / sum(omega_eff**2)

Return type:

pd.DataFrame

Notes

Extreme zeta_omega: very small values push weights toward sparse one-hot solutions (few controls dominate); very large values push toward uniform weighting. The pre_fit_rmse column exposes the tradeoff.
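The grid construction is easy to verify by hand. The sketch below builds the default grid from the documented multipliers and checks the two properties claimed for it: a 16x span and symmetry in log space around 1.0. The function name and the baseline value 0.8 are hypothetical.

```python
import math


def zeta_omega_grid(zeta_omega, zeta_grid=None,
                    multipliers=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Absolute grid if supplied, otherwise multipliers times the fitted value."""
    if zeta_grid is not None:
        return [float(z) for z in zeta_grid]  # multipliers are ignored
    return [m * zeta_omega for m in multipliers]


baseline = 0.8
grid = zeta_omega_grid(baseline)
span = grid[-1] / grid[0]                                # largest over smallest
log2_offsets = [math.log2(z / baseline) for z in grid]   # distance from 1.0 in log2 units
explicit = zeta_omega_grid(baseline, zeta_grid=[0.1, 1, 10])
```

The offsets come out as -2, -1, 0, 1, 2 in log2 units, which is what "symmetric in log space" means here.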

get_weight_concentration(top_k=5)[source]

Concentration metrics for the control unit weights.

Operates on self.unit_weights, which for survey-weighted fits stores the composed effective weights (omega_eff = raw_omega * w_control, renormalized to sum to 1) that were applied to produce the ATT. For non-survey fits the values equal the raw Frank-Wolfe solution. Either way, the concentration reflects the distribution actually used by the estimator.

Parameters:

top_k (int, default 5) – Number of largest weights to sum for top_k_share. Must be non-negative. Clamped to the available number of control units.

Returns:

Keys:
  • effective_n - 1 / sum(w**2), inverse Herfindahl

  • herfindahl - sum(w**2)

  • top_k_share — sum of the top_k largest weights

  • top_k — the (possibly clamped) value used

Return type:

dict

Raises:

ValueError – If top_k is negative.
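The returned metrics follow directly from the weight vector. The sketch below computes them from a dictionary shaped like unit_weights, including the clamping of top_k and the error for negative values described above. It is written from this description, assumes a non-empty weight dictionary, and is not the library's source.

```python
def weight_concentration(unit_weights, top_k=5):
    """Herfindahl index, its inverse, and the share held by the top_k weights."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    weights = sorted(unit_weights.values(), reverse=True)
    k = min(top_k, len(weights))  # clamp to the number of control units
    herfindahl = sum(w * w for w in weights)
    return {
        "effective_n": 1.0 / herfindahl,
        "herfindahl": herfindahl,
        "top_k_share": sum(weights[:k]),
        "top_k": k,
    }


unit_weights = {"A": 0.5, "B": 0.25, "C": 0.25}
top2 = weight_concentration(unit_weights, top_k=2)
clamped = weight_concentration(unit_weights, top_k=10)
```

Three controls with these weights behave like roughly 2.67 equally weighted controls; uniform weights would give an effective_n equal to the number of controls.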

property is_significant: bool

Check if the ATT is statistically significant at the alpha level.

property significance_stars: str

Return significance stars based on p-value.

__init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated, n_control, unit_weights, time_weights, pre_periods, post_periods, alpha=0.05, variance_method='placebo', noise_level=None, zeta_omega=None, zeta_lambda=None, pre_treatment_fit=None, placebo_effects=None, n_bootstrap=None, survey_metadata=None, synthetic_pre_trajectory=None, synthetic_post_trajectory=None, treated_pre_trajectory=None, treated_post_trajectory=None, time_weights_array=None)
Return type:

None