Results Classes

Dataclass containers for estimation results from various estimators.

DiDResults

Results from basic DifferenceInDifferences estimation.

class diff_diff.DiDResults[source]

Bases: object

Results from a Difference-in-Differences estimation.

Provides easy access to coefficients, standard errors, confidence intervals, and summary statistics in a Pythonic way.

att

Average Treatment effect on the Treated (ATT).

Type:: float

se

Standard error of the ATT estimate.

Type:: float

t_stat

T-statistic for the ATT estimate.

Type:: float

p_value

P-value for the null hypothesis that ATT = 0.

Type:: float

conf_int

Confidence interval for the ATT.

Type:: tuple[float, float]

n_obs

Number of observations used in estimation.

Type:: int

n_treated

Number of treated units/observations.

Type:: int

n_control

Number of control units/observations.

Type:: int
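The fields above are related in the usual way for a Wald-type test. The following is a minimal sketch of that relationship, assuming a normal approximation; `wald_summary` is a hypothetical helper, not part of diff_diff, and the library may instead use a t-distribution (see `inference_df`) or a bootstrap distribution depending on `inference_method`.

```python
from statistics import NormalDist


def wald_summary(att: float, se: float, alpha: float = 0.05):
    """Return (t_stat, p_value, conf_int) under a normal approximation.

    Assumption: normal critical values. The library may use a
    t-distribution or a bootstrap distribution instead.
    """
    z = NormalDist()
    t_stat = att / se
    # Two-sided p-value for the null hypothesis ATT = 0.
    p_value = 2.0 * (1.0 - z.cdf(abs(t_stat)))
    # Symmetric (1 - alpha) interval around the point estimate.
    crit = z.inv_cdf(1.0 - alpha / 2.0)
    conf_int = (att - crit * se, att + crit * se)
    return t_stat, p_value, conf_int


t_stat, p_value, conf_int = wald_summary(att=2.0, se=0.5)
print(t_stat, p_value, conf_int)
```

With these made-up inputs the interval excludes zero, which is consistent with the small p-value.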

Attributes

`att`
`se`
`t_stat`
`p_value`
`conf_int`
`n_obs`
`is_significant`	Check if the ATT is statistically significant at the alpha level.
`significance_stars`	Return significance stars based on p-value.

Methods

`summary`([alpha])	Generate a formatted summary of the estimation results.
`to_dict`()	Convert results to a dictionary.
`to_dataframe`()	Convert results to a pandas DataFrame.

att: float

se: float

t_stat: float

p_value: float

conf_int: Tuple[float, float]

n_obs: int

n_treated: int

n_control: int

alpha: float = 0.05

coefficients: Dict[str, float] | None = None

vcov: ndarray | None = None

residuals: ndarray | None = None

fitted_values: ndarray | None = None

r_squared: float | None = None

inference_method: str = 'analytical'

n_bootstrap: int | None = None

n_clusters: int | None = None

p_val_type: str | None = None

bootstrap_distribution: ndarray | None = None

survey_metadata: Any | None = None

vcov_type: str | None = None

cluster_name: str | None = None

conley_lag_cutoff: int | None = None

df_convention: str | None = None

inference_df: float | None = None

__repr__()[source]

Concise string representation.

Return type:: str

property coef_var: float

Coefficient of variation: SE / abs(ATT). NaN when ATT is 0 or SE is non-finite.

Type:: float
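As an illustration of the rule stated for coef_var, here is a small standalone sketch. The function name `coef_var` is reused only for readability; this is not the library's source, just the documented formula with its two NaN cases.

```python
import math


def coef_var(att: float, se: float) -> float:
    """SE / abs(ATT), with NaN for a zero ATT or a non-finite SE."""
    if att == 0 or not math.isfinite(se):
        return math.nan
    return se / abs(att)


print(coef_var(2.0, 0.5))       # ratio for a positive ATT
print(coef_var(-4.0, 1.0))      # the sign of the ATT does not matter
print(coef_var(0.0, 0.5))       # zero ATT gives NaN
print(coef_var(1.0, math.inf))  # non-finite SE gives NaN
```

The same definition is documented for the other results classes on this page, with the overall ATT in place of the ATT where applicable.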

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:: alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.
Returns:: Formatted summary table.
Return type:: str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:: alpha (float | None)
Return type:: None

to_dict()[source]

Convert results to a dictionary.

Returns:: Dictionary containing all estimation results.
Return type:: Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:: DataFrame with estimation results.
Return type:: pd.DataFrame

property is_significant: bool: Check if the ATT is statistically significant at the alpha level.

property significance_stars: str: Return significance stars based on p-value.
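The two properties above can be pictured with a small stand-in class. `MiniResult` is hypothetical, and the star cut-offs (0.001, 0.01, 0.05) are an assumed convention; the thresholds diff_diff actually uses are not stated on this page.

```python
from dataclasses import dataclass


@dataclass
class MiniResult:
    """Hypothetical stand-in holding only the fields the checks need."""

    p_value: float
    alpha: float = 0.05

    @property
    def is_significant(self) -> bool:
        # Significant when the p-value falls below the stored alpha.
        return self.p_value < self.alpha

    @property
    def significance_stars(self) -> str:
        # Assumed conventional cut-offs; the library's may differ.
        for cutoff, stars in ((0.001, "***"), (0.01, "**"), (0.05, "*")):
            if self.p_value < cutoff:
                return stars
        return ""


strong = MiniResult(p_value=0.0004)
weak = MiniResult(p_value=0.03, alpha=0.01)
print(strong.is_significant, strong.significance_stars)
print(weak.is_significant, weak.significance_stars)
```

In this sketch the stars depend only on the p-value, while is_significant also depends on alpha, so the two can disagree when a stricter alpha is chosen.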

__init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated, n_control, alpha=0.05, coefficients=None, vcov=None, residuals=None, fitted_values=None, r_squared=None, inference_method='analytical', n_bootstrap=None, n_clusters=None, p_val_type=None, bootstrap_distribution=None, survey_metadata=None, vcov_type=None, cluster_name=None, conley_lag_cutoff=None, df_convention=None, inference_df=None)

Parameters:

att (float)
se (float)
t_stat (float)
p_value (float)
conf_int (Tuple[float, float])
n_obs (int)
n_treated (int)
n_control (int)
alpha (float)
coefficients (Dict[str, float] | None)
vcov (ndarray | None)
residuals (ndarray | None)
fitted_values (ndarray | None)
r_squared (float | None)
inference_method (str)
n_bootstrap (int | None)
n_clusters (int | None)
p_val_type (str | None)
bootstrap_distribution (ndarray | None)
survey_metadata (Any | None)
vcov_type (str | None)
cluster_name (str | None)
conley_lag_cutoff (int | None)
df_convention (str | None)
inference_df (float | None)

Return type:

None

MultiPeriodDiDResults

Results from MultiPeriodDiD event study estimation.

class diff_diff.MultiPeriodDiDResults[source]

Bases: object

Results from a Multi-Period Difference-in-Differences estimation.

Provides access to period-specific treatment effects as well as an aggregate average treatment effect.

period_effects

Dictionary mapping period identifiers to their PeriodEffect objects. Contains all estimated period effects (pre and post, excluding the reference period which is normalized to zero).

Type:: dict[any, PeriodEffect]

avg_att

Average Treatment effect on the Treated across post-periods only.

Type:: float

avg_se

Standard error of the average ATT.

Type:: float

avg_t_stat

T-statistic for the average ATT.

Type:: float

avg_p_value

P-value for the null hypothesis that average ATT = 0.

Type:: float

avg_conf_int

Confidence interval for the average ATT.

Type:: tuple[float, float]

n_obs

Number of observations used in estimation.

Type:: int

n_treated

Number of treated units/observations.

Type:: int

n_control

Number of control units/observations.

Type:: int

pre_periods

List of pre-treatment period identifiers.

Type:: list

post_periods

List of post-treatment period identifiers.

Type:: list

reference_period

The reference (omitted) period. Its coefficient is zero by construction and it is excluded from period_effects.

Type:: any, optional

interaction_indices

Mapping from period identifier to column index in the full variance-covariance matrix. Used internally for sub-VCV extraction (e.g., by HonestDiD and PreTrendsPower).

Type:: dict, optional
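To make the idea of sub-VCV extraction concrete, the sketch below pulls the post-period block out of a full covariance matrix and computes the standard error of an equally weighted average of those coefficients. `average_effect_se` is a hypothetical helper, the numbers are made up, and whether avg_se is computed exactly this way in the library is an assumption.

```python
import math


def average_effect_se(vcov, interaction_indices, post_periods):
    """SE of the equally weighted mean of the post-period coefficients."""
    cols = [interaction_indices[p] for p in post_periods]
    k = len(cols)
    # Sub-VCV extraction: keep only the rows/columns of the post periods.
    sub_vcv = [[vcov[i][j] for j in cols] for i in cols]
    # Var(mean) = (1 / k^2) * sum of every entry of the sub-VCV.
    total = sum(sum(row) for row in sub_vcv)
    return math.sqrt(total) / k


# Made-up 4x4 covariance matrix: columns 0-1 are nuisance terms,
# columns 2-3 hold the interaction terms for post periods 5 and 6.
vcov = [
    [0.50, 0.00, 0.00, 0.00],
    [0.00, 0.20, 0.00, 0.00],
    [0.00, 0.00, 0.04, 0.01],
    [0.00, 0.00, 0.01, 0.09],
]
interaction_indices = {5: 2, 6: 3}
avg_se = average_effect_se(vcov, interaction_indices, post_periods=[5, 6])
print(round(avg_se, 4))
```

The off-diagonal covariance enters the sum twice, which is why the mapping from period to column index is needed rather than the per-period standard errors alone.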

Attributes

`period_effects`
`att`
`pre_periods`
`post_periods`
`reference_period`
`interaction_indices`
`pre_period_effects`	Pre-period effects only (for parallel trends assessment).
`post_period_effects`	Post-period effects only.

period_effects: Dict[Any, PeriodEffect]

avg_att: float

avg_se: float

avg_t_stat: float

avg_p_value: float

avg_conf_int: Tuple[float, float]

n_obs: int

n_treated: int

n_control: int

pre_periods: List[Any]

post_periods: List[Any]

alpha: float = 0.05

coefficients: Dict[str, float] | None = None

vcov: ndarray | None = None

residuals: ndarray | None = None

fitted_values: ndarray | None = None

r_squared: float | None = None

reference_period: Any | None = None

interaction_indices: Dict[Any, int] | None = None

survey_metadata: Any | None = None

inference_method: str = 'analytical'

n_bootstrap: int | None = None

n_clusters: int | None = None

vcov_type: str | None = None

cluster_name: str | None = None

conley_lag_cutoff: int | None = None

df_convention: str | None = None

inference_df: float | None = None

property att: float

property se: float

property conf_int: Tuple[float, float]

property p_value: float

property t_stat: float

__repr__()[source]

Concise string representation.

Return type:: str

property pre_period_effects: Dict[Any, PeriodEffect]: Pre-period effects only (for parallel trends assessment).

property post_period_effects: Dict[Any, PeriodEffect]: Post-period effects only.
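A rough sketch of how the pre/post split can be used for a parallel-trends check follows. `Effect` is a hypothetical stand-in for PeriodEffect, the values are invented, and treating the reference period as a member of pre_periods is an assumption made for this example.

```python
from collections import namedtuple

# Hypothetical stand-in with a subset of the PeriodEffect fields.
Effect = namedtuple("Effect", ["period", "effect", "se"])

period_effects = {
    -2: Effect(-2, 0.05, 0.10),
    0: Effect(0, 1.20, 0.30),
    1: Effect(1, 1.50, 0.35),
}
pre_periods = [-2, -1]  # -1 is the reference period, absent from period_effects
post_periods = [0, 1]

# Split the estimated effects by period membership.
pre = {p: e for p, e in period_effects.items() if p in pre_periods}
post = {p: e for p, e in period_effects.items() if p in post_periods}

# Largest pre-period |effect| / SE: small values are consistent with
# parallel pre-trends, although they do not prove them.
max_pre_ratio = max(abs(e.effect) / e.se for e in pre.values())
print(sorted(pre), sorted(post), max_pre_ratio)
```

The reference period never appears in either dictionary because its coefficient is zero by construction.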

property coef_var: float

Coefficient of variation: SE / abs(overall ATT). NaN when ATT is 0 or SE is non-finite.

Type:: float

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:: alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.
Returns:: Formatted summary table.
Return type:: str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:: alpha (float | None)
Return type:: None

get_effect(period)[source]

Get the treatment effect for a specific period.

Parameters:: period (any) – The period identifier.
Returns:: The treatment effect for the specified period.
Return type:: PeriodEffect
Raises:: KeyError – If the period is not found among the estimated period effects.

to_dict()[source]

Convert results to a dictionary.

Returns:: Dictionary containing all estimation results.
Return type:: Dict[str, Any]

to_dataframe()[source]

Convert period-specific effects to a pandas DataFrame.

Returns:: DataFrame with one row per estimated period (pre and post).
Return type:: pd.DataFrame

property is_significant: bool: Check if the average ATT is statistically significant at the alpha level.

property significance_stars: str: Return significance stars for the average ATT based on p-value.

__init__(period_effects, avg_att, avg_se, avg_t_stat, avg_p_value, avg_conf_int, n_obs, n_treated, n_control, pre_periods, post_periods, alpha=0.05, coefficients=None, vcov=None, residuals=None, fitted_values=None, r_squared=None, reference_period=None, interaction_indices=None, survey_metadata=None, inference_method='analytical', n_bootstrap=None, n_clusters=None, vcov_type=None, cluster_name=None, conley_lag_cutoff=None, df_convention=None, inference_df=None)

Parameters:

period_effects (Dict[Any, PeriodEffect])
avg_att (float)
avg_se (float)
avg_t_stat (float)
avg_p_value (float)
avg_conf_int (Tuple[float, float])
n_obs (int)
n_treated (int)
n_control (int)
pre_periods (List[Any])
post_periods (List[Any])
alpha (float)
coefficients (Dict[str, float] | None)
vcov (ndarray | None)
residuals (ndarray | None)
fitted_values (ndarray | None)
r_squared (float | None)
reference_period (Any | None)
interaction_indices (Dict[Any, int] | None)
survey_metadata (Any | None)
inference_method (str)
n_bootstrap (int | None)
n_clusters (int | None)
vcov_type (str | None)
cluster_name (str | None)
conley_lag_cutoff (int | None)
df_convention (str | None)
inference_df (float | None)

Return type:

None

PeriodEffect

Container for a single period’s treatment effect in event studies.

class diff_diff.PeriodEffect[source]

Bases: object

Treatment effect for a single time period.

period

The time period identifier.

Type:: any

effect

The treatment effect estimate for this period.

Type:: float

se

Standard error of the effect estimate.

Type:: float

t_stat

T-statistic for the effect estimate.

Type:: float

p_value

P-value for the null hypothesis that effect = 0.

Type:: float

conf_int

Confidence interval for the effect.

Type:: tuple[float, float]

period: Any

effect: float

se: float

t_stat: float

p_value: float

conf_int: Tuple[float, float]

__repr__()[source]

Concise string representation.

Return type:: str

property is_significant: bool: Check if the effect is statistically significant at the 0.05 level.

property significance_stars: str: Return significance stars based on p-value.

__init__(period, effect, se, t_stat, p_value, conf_int)

Parameters:

period (Any)
effect (float)
se (float)
t_stat (float)
p_value (float)
conf_int (Tuple[float, float])

Return type:

None

SyntheticDiDResults

Results from SyntheticDiD estimation.

class diff_diff.SyntheticDiDResults[source]

Bases: object

Results from a Synthetic Difference-in-Differences estimation.

Combines DiD with synthetic control by re-weighting control units to match pre-treatment trends of treated units.

att

Average Treatment effect on the Treated (ATT).

Type:: float

se

Standard error of the ATT estimate (bootstrap, jackknife, or placebo-based).

Type:: float

t_stat

T-statistic for the ATT estimate.

Type:: float

p_value

P-value for the null hypothesis that ATT = 0.

Type:: float

conf_int

Confidence interval for the ATT.

Type:: tuple[float, float]

n_obs

Number of observations used in estimation.

Type:: int

n_treated

Number of treated units/observations.

Type:: int

n_control

Number of control units/observations.

Type:: int

unit_weights

Dictionary mapping control unit IDs to their synthetic weights. When survey weights are used, these are the composed effective weights (omega_eff = raw Frank-Wolfe * survey, renormalized) that were applied to produce the ATT, not the raw Frank-Wolfe solution.

Type:: dict
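The composition described for unit_weights can be sketched as follows. `compose_effective_weights` is a hypothetical helper and the weights are made up; the sketch only mirrors the stated formula (raw weight times survey weight, renormalized to sum to 1).

```python
def compose_effective_weights(raw_omega, survey_weights):
    """Multiply raw unit weights by survey weights, then renormalize."""
    products = {u: raw_omega[u] * survey_weights[u] for u in raw_omega}
    total = sum(products.values())
    if total <= 0:
        raise ValueError("composed weights sum to zero; cannot renormalize")
    return {u: v / total for u, v in products.items()}


raw_omega = {"A": 0.5, "B": 0.3, "C": 0.2}       # made-up Frank-Wolfe output
survey_weights = {"A": 1.0, "B": 2.0, "C": 2.0}  # made-up survey weights
omega_eff = compose_effective_weights(raw_omega, survey_weights)
print(omega_eff)
```

Here unit B overtakes unit A after composition, which illustrates why the stored weights can differ noticeably from the raw Frank-Wolfe solution.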

time_weights

Dictionary mapping pre-treatment periods to their time weights.

Type:: dict

pre_periods

List of pre-treatment period identifiers.

Type:: list

post_periods

List of post-treatment period identifiers.

Type:: list

variance_method

Method used for variance estimation: "bootstrap" (paper-faithful pairs bootstrap re-estimating ω and λ via Frank-Wolfe on each draw; Arkhangelsky et al. 2021 Algorithm 2 step 2, and R’s default synthdid::vcov(method="bootstrap")), "jackknife", or "placebo".

Type:: str

variance_effects

Method-specific per-iteration estimates: placebo treatment effects (for "placebo"), bootstrap ATT estimates with re-estimated weights per draw (for "bootstrap"), or leave-one-out estimates (for "jackknife"). The variance_method field disambiguates the contents. (The deprecated read-only alias placebo_effects returns this array and will be removed in v4.0.0.)

Type:: np.ndarray, optional

synthetic_pre_trajectory

Synthetic control trajectory in pre-treatment periods, shape (n_pre,). Equal to Y_pre_control @ omega_eff where omega_eff is the composed effective weight vector.

Type:: np.ndarray, optional
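The product Y_pre_control @ omega_eff is a weighted average of control outcomes in each period. A dependency-free sketch, assuming the outcome matrix has one row per period and one column per control unit (the helper name and data are illustrative only):

```python
def synthetic_trajectory(y_control, omega_eff):
    """Compute Y_control @ omega_eff for a (n_periods, n_control) matrix."""
    for row in y_control:
        if len(row) != len(omega_eff):
            raise ValueError("each row needs one outcome per control unit")
    # One weighted sum per period gives a vector of shape (n_periods,).
    return [sum(y * w for y, w in zip(row, omega_eff)) for row in y_control]


y_pre_control = [
    [1.0, 2.0, 3.0],  # period 1 outcomes for three control units
    [2.0, 3.0, 4.0],  # period 2 outcomes
]
omega_eff = [0.5, 0.25, 0.25]
synthetic_pre = synthetic_trajectory(y_pre_control, omega_eff)
print(synthetic_pre)
```

With NumPy arrays the same computation is a single matrix-vector product.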

synthetic_post_trajectory

Synthetic control trajectory in post-treatment periods, shape (n_post,).

Type:: np.ndarray, optional

treated_pre_trajectory

Treated-unit mean trajectory in pre-treatment periods, shape (n_pre,). Survey-weighted when the fit used survey weights.

Type:: np.ndarray, optional

treated_post_trajectory

Treated-unit mean trajectory in post-treatment periods, shape (n_post,).

Type:: np.ndarray, optional

time_weights_array

The Frank-Wolfe time weights as a 1-D array (same values as the time_weights dict but order-stable and usable for re-estimation by sensitivity methods). Shape (n_pre,).

Type:: np.ndarray, optional

Attributes

`att`
`unit_weights`
`time_weights`

att: float

se: float

t_stat: float

p_value: float

conf_int: Tuple[float, float]

n_obs: int

n_treated: int

n_control: int

unit_weights: Dict[Any, float]

time_weights: Dict[Any, float]

pre_periods: List[Any]

post_periods: List[Any]

alpha: float = 0.05

variance_method: str = 'placebo'

noise_level: float | None = None

zeta_omega: float | None = None

zeta_lambda: float | None = None

pre_treatment_fit: float | None = None

variance_effects: ndarray | None = None

n_bootstrap: int | None = None

survey_metadata: Any | None = None

synthetic_pre_trajectory: ndarray | None = None

synthetic_post_trajectory: ndarray | None = None

treated_pre_trajectory: ndarray | None = None

treated_post_trajectory: ndarray | None = None

time_weights_array: ndarray | None = None

__repr__()[source]

Concise string representation.

Return type:: str

__getstate__()[source]

Exclude the internal fit snapshot from pickling.

The snapshot retains outcome matrices, unit IDs, and survey weights to support post-hoc diagnostics (in_time_placebo, sensitivity_to_zeta_omega). Serialization would otherwise carry that panel state to wherever the pickle is sent, which is a privacy hazard for survey-weighted or sensitive fits.

Unpickled results keep the public fields (ATT, weights, trajectories, etc.); calling a diagnostic method that needs the snapshot raises a ValueError directing the user to re-fit.

Return type:: Dict[str, Any]
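The pattern described above can be sketched with a small stand-in class. `ResultSketch`, the attribute name `_fit_snapshot`, and the method `needs_snapshot` are all hypothetical; the sketch shows one common way to drop private state from the pickled dictionary, not the library's actual code.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in; `_fit_snapshot` is an assumed attribute name."""

    att: float
    _fit_snapshot: Optional[Dict[str, Any]] = field(default=None, repr=False)

    def __getstate__(self) -> Dict[str, Any]:
        # Copy the instance dict and blank out the panel-level snapshot.
        state = dict(self.__dict__)
        state["_fit_snapshot"] = None
        return state

    def needs_snapshot(self) -> int:
        if self._fit_snapshot is None:
            raise ValueError("fit snapshot unavailable; re-fit to run diagnostics")
        return len(self._fit_snapshot["Y"])


fitted = ResultSketch(att=1.5, _fit_snapshot={"Y": [[1.0, 2.0], [3.0, 4.0]]})

# pickle.dumps would serialize this state, and pickle.loads would rebuild
# the object from it; the two steps are spelled out here for clarity.
state = fitted.__getstate__()
restored = ResultSketch.__new__(ResultSketch)
restored.__dict__.update(state)
print(restored.att, restored._fit_snapshot)
```

The original object keeps its snapshot; only the serialized copy loses it, so diagnostics still work in the session that produced the fit.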

__setstate__(state)[source]

Restore from pickle, migrating the legacy field name.

Results pickled before the placebo_effects → variance_effects rename (<= 3.5.x) carry the old key in their state; this method maps it to the new name so the stored variance draws survive and remain reachable through both variance_effects and the deprecated placebo_effects alias. The migration is slated for removal together with the alias in v4.0.0.

Parameters:: state (Dict[str, Any])
Return type:: None
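A legacy-key migration of this kind might look like the sketch below. `ResultSketch` is a hypothetical stand-in and the details (for example, preferring an existing new-name value over the legacy one) are assumptions rather than the library's confirmed behavior.

```python
import warnings


class ResultSketch:
    """Hypothetical stand-in showing only the legacy-key migration."""

    def __init__(self, variance_effects=None):
        self.variance_effects = variance_effects

    def __setstate__(self, state):
        state = dict(state)  # do not mutate the caller's dictionary
        if "placebo_effects" in state:
            legacy = state.pop("placebo_effects")
            # Keep a value already stored under the new name, if any.
            state.setdefault("variance_effects", legacy)
        self.__dict__.update(state)

    @property
    def placebo_effects(self):
        warnings.warn(
            "placebo_effects is deprecated; use variance_effects",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.variance_effects


legacy_state = {"att": 1.5, "placebo_effects": [0.1, -0.2, 0.05]}
restored = ResultSketch.__new__(ResultSketch)
restored.__setstate__(legacy_state)  # the call pickle.loads would make
print(restored.variance_effects)
```

After the migration the draws live only under the new name, and the alias simply forwards to it while emitting a DeprecationWarning.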

property coef_var: float

Coefficient of variation: SE / abs(ATT). NaN when ATT is 0 or SE is non-finite.

Type:: float

property placebo_effects: ndarray | None: Deprecated alias for variance_effects (to be removed in v4.0.0).

Deprecated since version 3.5.2: Renamed to variance_effects because the array’s contents are method-specific (placebo effects, bootstrap ATT draws, or leave-one-out estimates depending on variance_method).

summary(alpha=None)[source]

Generate a formatted summary of the estimation results.

Parameters:: alpha (float, optional) – Significance level for confidence intervals. Defaults to the alpha used during estimation.
Returns:: Formatted summary table.
Return type:: str

print_summary(alpha=None)[source]

Print the summary to stdout.

Parameters:: alpha (float | None)
Return type:: None

to_dict()[source]

Convert results to a dictionary.

Returns:: Dictionary containing all estimation results.
Return type:: Dict[str, Any]

to_dataframe()[source]

Convert results to a pandas DataFrame.

Returns:: DataFrame with estimation results.
Return type:: pd.DataFrame

get_unit_weights_df()[source]

Get unit weights as a pandas DataFrame.

Returns:: DataFrame with unit IDs and their weights.
Return type:: pd.DataFrame

get_time_weights_df()[source]

Get time weights as a pandas DataFrame.

Returns:: DataFrame with time periods and their weights.
Return type:: pd.DataFrame

get_loo_effects_df()[source]

Per-unit leave-one-out ATT from the jackknife variance pass.

Requires variance_method='jackknife' (ValueError otherwise) and unit-level LOO granularity (NotImplementedError for the full-design survey jackknife path, which uses PSU-level LOO).

Available on:

non-survey jackknife fits (classical Arkhangelsky Algorithm 3).
pweight-only survey jackknife fits (Algorithm 3 with post-hoc ω_eff composition; PSU labels in survey_metadata come from implicit-PSU metadata but the LOO remains unit-level).

Blocked on:

full-design survey jackknife fits (strata / PSU / FPC set in SurveyDesign) - the underlying replicates are PSU-level τ̂_{(h,j)} (Rust & Rao 1996), not unit-level. See result.variance_effects for the raw PSU-level replicate array and REGISTRY §SyntheticDiD “Note (survey + jackknife composition)” for the aggregation formula.

The underlying unit-level values come from the jackknife loops in SyntheticDiD._jackknife_se: control LOO estimates fill the first n_control positions (in the order of the control units seen by fit), then treated LOO estimates fill the next n_treated positions. This method joins those estimates back to user-facing unit identities.

att_loo is NaN when the fit hit the zero-sum weight guard for that unit (survey weights composed to zero once the unit was dropped). delta_from_full propagates NaN in that case.

Returns:

Columns:

unit - user’s unit ID
role - 'control' or 'treated'
att_loo - ATT with this unit dropped
delta_from_full - att_loo - self.att

Sorted by |delta_from_full| descending, NaN rows at the end.

Return type:

pd.DataFrame
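The join and sort order described for get_loo_effects_df can be sketched without pandas. `loo_table` is a hypothetical helper that returns a list of row dictionaries instead of a DataFrame; the unit IDs and estimates are invented.

```python
import math


def loo_table(control_ids, treated_ids, loo_estimates, att_full):
    """Attach unit IDs to LOO estimates and sort by |delta|, NaN last."""
    # Controls occupy the first positions, treated units the rest.
    units = [(u, "control") for u in control_ids]
    units += [(u, "treated") for u in treated_ids]
    if len(units) != len(loo_estimates):
        raise ValueError("need exactly one LOO estimate per unit")

    rows = []
    for (unit, role), att_loo in zip(units, loo_estimates):
        rows.append({
            "unit": unit,
            "role": role,
            "att_loo": att_loo,
            "delta_from_full": att_loo - att_full,  # NaN propagates
        })

    def sort_key(row):
        delta = row["delta_from_full"]
        is_nan = math.isnan(delta)
        return (is_nan, 0.0 if is_nan else -abs(delta))

    return sorted(rows, key=sort_key)


rows = loo_table(
    control_ids=["c1", "c2", "c3"],
    treated_ids=["t1"],
    loo_estimates=[1.5, 2.75, math.nan, 2.25],
    att_full=2.0,
)
for row in rows:
    print(row)
```

Units at the top of the table are the ones whose removal moves the ATT the most, which makes them natural candidates for closer inspection.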

in_time_placebo(fake_treatment_periods=None, zeta_omega_override=None, zeta_lambda_override=None)[source]

Re-estimate the ATT on shifted fake treatment periods within the original pre-treatment window.

A credible placebo should produce near-zero ATTs at every shifted date. Departures from zero signal that whatever the estimator picked up at the real treatment date is also present pre-treatment, weakening the causal interpretation.

The post-treatment data is never used — only the pre-window is re-sliced. Regularization reuses self.zeta_omega and self.zeta_lambda from the original fit (R synthdid convention), unless overrides are supplied.

Parameters:

fake_treatment_periods (list, optional) – Explicit pre-period values to test. If None (default), sweeps every feasible pre-period: every P in pre_periods whose zero-based position i satisfies i >= 2 (so at least 2 pre-fake periods remain for weight estimation) and i <= n_pre - 1 (so at least 1 post-fake period exists). Values not in pre_periods raise ValueError (in particular, a value from post_periods is not a valid placebo date).
zeta_omega_override (float, optional) – Override self.zeta_omega for the refit. Default reuses the original.
zeta_lambda_override (float, optional) – Override self.zeta_lambda for the refit.

Returns:

Columns:

fake_treatment_period — the shifted date
att — placebo ATT (ideally near 0)
pre_fit_rmse — RMSE on the fake pre-window
n_pre_fake — periods before the fake date
n_post_fake — periods from the fake date onward

NaN is emitted only for dimensional infeasibility. Frank-Wolfe does not expose a mid-solver non-convergence signal; inspect pre_fit_rmse for poor refit quality.

Return type:

pd.DataFrame
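The default sweep over feasible fake treatment dates can be illustrated with a short sketch. `feasible_fake_periods` is a hypothetical helper, and reading the position i as zero-based is an assumption that matches the stated counts of pre-fake and post-fake periods.

```python
def feasible_fake_periods(pre_periods):
    """List (period, n_pre_fake, n_post_fake) for each feasible fake date."""
    n_pre = len(pre_periods)
    rows = []
    for i, period in enumerate(pre_periods):
        # i periods precede the fake date; n_pre - i start at it.
        if 2 <= i <= n_pre - 1:
            rows.append((period, i, n_pre - i))
    return rows


pre_periods = [2010, 2011, 2012, 2013, 2014]
sweep = feasible_fake_periods(pre_periods)
print(sweep)
```

The first two pre-periods are never used as fake dates, because fewer than two periods would remain for estimating the weights.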

sensitivity_to_zeta_omega(zeta_grid=None, multipliers=(0.25, 0.5, 1.0, 2.0, 4.0))[source]

Re-estimate the ATT across a grid of zeta_omega values to show how sensitive the estimate is to the unit-weight regularization.

The Frank-Wolfe time weights computed during the original fit are held fixed here — this method isolates sensitivity to zeta_omega specifically. zeta_lambda and the time weights are not re-fit.

Parameters:

zeta_grid (list of float, optional) – Absolute zeta_omega values to evaluate. If None (default), uses multipliers * self.zeta_omega — i.e. a 5-point grid by default, spanning 16x from the smallest to the largest multiplier and symmetric in log space around 1.0.
multipliers (tuple of float, default (0.25, 0.5, 1.0, 2.0, 4.0)) – Multipliers on self.zeta_omega. Ignored when zeta_grid is supplied.

Returns:

Columns:

zeta_omega — the regularization value evaluated
att — resulting ATT
pre_fit_rmse — RMSE on the original pre-period
max_unit_weight — max element of the composed omega_eff (sensitivity indicator: close to 1 means near-one-hot solutions; close to 1/n_control means near-uniform)
effective_n — 1 / sum(omega_eff**2)

Return type:

pd.DataFrame

Notes

Extreme zeta_omega: very small values push weights toward sparse one-hot solutions (few controls dominate); very large values push toward uniform weighting. The pre_fit_rmse column exposes the tradeoff.
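How the evaluation grid is resolved from zeta_grid and multipliers can be sketched as below. `resolve_zeta_grid` is a hypothetical helper and the baseline value 0.8 is made up; only the precedence rule and the default multipliers come from the description above.

```python
import math


def resolve_zeta_grid(zeta_omega, zeta_grid=None,
                      multipliers=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Return absolute zeta_omega values; zeta_grid wins if supplied."""
    if zeta_grid is not None:
        return list(zeta_grid)
    return [m * zeta_omega for m in multipliers]


grid = resolve_zeta_grid(zeta_omega=0.8)
span = max(grid) / min(grid)
# Offsets from the baseline in log2 units: symmetric around zero.
log2_offsets = [math.log2(z / 0.8) for z in grid]
print(grid, span, log2_offsets)

explicit = resolve_zeta_grid(zeta_omega=0.8, zeta_grid=[0.1, 1.0])
print(explicit)
```

Because the default multipliers are powers of two, the grid steps are evenly spaced on a log scale, which suits a regularization parameter.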

get_weight_concentration(top_k=5)[source]

Concentration metrics for the control unit weights.

Operates on self.unit_weights, which for survey-weighted fits stores the composed effective weights (omega_eff = raw_omega * w_control, renormalized to sum to 1) that were applied to produce the ATT. For non-survey fits the values equal the raw Frank-Wolfe solution. Either way, the concentration reflects the distribution actually used by the estimator.

Parameters:

top_k (int, default 5) – Number of largest weights to sum for top_k_share. Must be non-negative. Clamped to the available number of control units.

Returns:

Keys:

effective_n — 1 / sum(w**2), inverse Herfindahl
herfindahl — sum(w**2)
top_k_share — sum of the top_k largest weights
top_k — the (possibly clamped) value used

Return type:

dict

Raises:

ValueError – If top_k is negative.
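The returned metrics follow directly from the formulas listed above. A standalone sketch under that reading, with a hypothetical helper name and invented weights (the handling of an all-zero weight vector is an assumption):

```python
import math


def weight_concentration(unit_weights, top_k=5):
    """Inverse-Herfindahl and top-k share for a dict of unit weights."""
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    weights = sorted(unit_weights.values(), reverse=True)
    k = min(top_k, len(weights))  # clamp to the available control units
    herfindahl = sum(w * w for w in weights)
    # Assumption: report NaN rather than divide by zero for empty input.
    effective_n = 1.0 / herfindahl if herfindahl > 0 else math.nan
    return {
        "effective_n": effective_n,
        "herfindahl": herfindahl,
        "top_k_share": sum(weights[:k]),
        "top_k": k,
    }


unit_weights = {"A": 0.5, "B": 0.25, "C": 0.25}
metrics = weight_concentration(unit_weights, top_k=2)
clamped = weight_concentration(unit_weights, top_k=10)
print(metrics)
print(clamped["top_k"], clamped["top_k_share"])
```

An effective_n well below the number of control units indicates that a handful of controls carry most of the synthetic comparison.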

property is_significant: bool: Check if the ATT is statistically significant at the alpha level.

property significance_stars: str: Return significance stars based on p-value.

__init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated, n_control, unit_weights, time_weights, pre_periods, post_periods, alpha=0.05, variance_method='placebo', noise_level=None, zeta_omega=None, zeta_lambda=None, pre_treatment_fit=None, variance_effects=None, n_bootstrap=None, survey_metadata=None, synthetic_pre_trajectory=None, synthetic_post_trajectory=None, treated_pre_trajectory=None, treated_post_trajectory=None, time_weights_array=None)

Parameters:

att (float)
se (float)
t_stat (float)
p_value (float)
conf_int (Tuple[float, float])
n_obs (int)
n_treated (int)
n_control (int)
unit_weights (Dict[Any, float])
time_weights (Dict[Any, float])
pre_periods (List[Any])
post_periods (List[Any])
alpha (float)
variance_method (str)
noise_level (float | None)
zeta_omega (float | None)
zeta_lambda (float | None)
pre_treatment_fit (float | None)
variance_effects (ndarray | None)
n_bootstrap (int | None)
survey_metadata (Any | None)
synthetic_pre_trajectory (ndarray | None)
synthetic_post_trajectory (ndarray | None)
treated_pre_trajectory (ndarray | None)
treated_post_trajectory (ndarray | None)
time_weights_array (ndarray | None)

Return type:

None