Estimators
Core estimator classes for Difference-in-Differences analysis.
The main estimators module (diff_diff.estimators) contains the base classes
DifferenceInDifferences and MultiPeriodDiD. Additional estimators are
organized in separate modules for maintainability:
- diff_diff.twfe - TwoWayFixedEffects estimator
- diff_diff.synthetic_did - SyntheticDiD estimator
All estimators are re-exported from diff_diff.estimators and diff_diff
for backward compatibility, so you can import any of them using:
from diff_diff import DifferenceInDifferences, TwoWayFixedEffects, MultiPeriodDiD, SyntheticDiD
DifferenceInDifferences
Basic 2x2 DiD estimator.
- class diff_diff.DifferenceInDifferences[source]
Bases: object
Difference-in-Differences estimator with sklearn-like interface.
Estimates the Average Treatment effect on the Treated (ATT) using the canonical 2x2 DiD design or panel data with two-way fixed effects.
- Parameters:
formula (str, optional) – R-style formula for the model (e.g., “outcome ~ treated * post”). If provided, overrides column name parameters.
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors (HC1).
cluster (str, optional) – Column name for cluster-robust standard errors.
alpha (float, default=0.05) – Significance level for confidence intervals.
inference (str, default="analytical") – Inference method: “analytical” for standard asymptotic inference, or “wild_bootstrap” for wild cluster bootstrap (recommended when number of clusters is small, <50).
n_bootstrap (int, default=999) – Number of bootstrap replications when inference="wild_bootstrap".
bootstrap_weights (str, default="rademacher") – Type of bootstrap weights: “rademacher” (standard), “webb” (recommended for <10 clusters), or “mammen” (skewness correction).
seed (int, optional) – Random seed for reproducibility when using bootstrap inference. If None (default), results will vary between runs.
rank_deficient_action (str, default="warn") – Action when design matrix is rank-deficient (linearly dependent columns):
- “warn”: Issue warning and drop linearly dependent columns (default)
- “error”: Raise ValueError
- “silent”: Drop columns silently without warning
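The three bootstrap_weights options name standard auxiliary-weight distributions for the wild bootstrap. The sketch below writes out their textbook definitions (two-point Rademacher, six-point Webb, two-point Mammen) and draws one weight per cluster. That diff_diff uses exactly these definitions is an assumption, and WEIGHT_DISTRIBUTIONS and draw_cluster_weights are illustrative names, not part of the library.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# Each distribution as (value, probability) pairs; all have mean 0 and variance 1.
# These are the textbook definitions, assumed (not confirmed) to match the library.
WEIGHT_DISTRIBUTIONS = {
    "rademacher": [(-1.0, 0.5), (1.0, 0.5)],
    "webb": [
        (sign * math.sqrt(k / 2.0), 1.0 / 6.0)
        for sign in (-1.0, 1.0)
        for k in (1, 2, 3)
    ],
    "mammen": [
        (-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / (2.0 * SQRT5)),
        ((SQRT5 + 1.0) / 2.0, (SQRT5 - 1.0) / (2.0 * SQRT5)),
    ],
}


def draw_cluster_weights(kind, n_clusters, rng):
    """Draw one auxiliary weight per cluster (hypothetical helper)."""
    values, probs = zip(*WEIGHT_DISTRIBUTIONS[kind])
    return rng.choices(values, weights=probs, k=n_clusters)


rng = random.Random(0)  # fixed seed so the draws are reproducible
for kind, dist in WEIGHT_DISTRIBUTIONS.items():
    dist_mean = sum(v * p for v, p in dist)
    dist_var = sum(v * v * p for v, p in dist) - dist_mean ** 2
    print(kind, abs(round(dist_mean, 10)), round(dist_var, 10),
          draw_cluster_weights(kind, 4, rng))
```

Rademacher weights take only two values, so with G clusters there are at most 2^G distinct bootstrap samples. This is why a six-point distribution such as Webb's is usually suggested when clusters are very few.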
- results_
Estimation results after calling fit().
- Type:
Examples
Basic usage with a DataFrame:
>>> import pandas as pd
>>> from diff_diff import DifferenceInDifferences
>>>
>>> # Create sample data
>>> data = pd.DataFrame({
...     'outcome': [10, 11, 15, 18, 9, 10, 12, 13],
...     'treated': [1, 1, 1, 1, 0, 0, 0, 0],
...     'post': [0, 0, 1, 1, 0, 0, 1, 1]
... })
>>>
>>> # Fit the model
>>> did = DifferenceInDifferences()
>>> results = did.fit(data, outcome='outcome', treatment='treated', time='post')
>>>
>>> # View results
>>> print(results.att)  # ATT estimate
>>> results.print_summary()  # Full summary table
Using formula interface:
>>> did = DifferenceInDifferences()
>>> results = did.fit(data, formula='outcome ~ treated * post')
Notes
The ATT is computed using the standard DiD formula:
ATT = (E[Y|D=1,T=1] - E[Y|D=1,T=0]) - (E[Y|D=0,T=1] - E[Y|D=0,T=0])
Or equivalently via OLS regression:
Y = α + β₁*D + β₂*T + β₃*(D×T) + ε
Where β₃ is the ATT.
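As a quick check of the formula above, the following sketch computes the ATT directly from the four cell means of the sample data used in the Examples. It relies only on the standard library and does not call diff_diff. The helper name cell_mean is illustrative.

```python
from statistics import mean

# Sample data from the basic usage example (one entry per observation).
outcome = [10, 11, 15, 18, 9, 10, 12, 13]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
post = [0, 0, 1, 1, 0, 0, 1, 1]


def cell_mean(d, t):
    """Mean outcome for treatment status d and period t (hypothetical helper)."""
    return mean(y for y, di, ti in zip(outcome, treated, post) if di == d and ti == t)


# ATT = (E[Y|D=1,T=1] - E[Y|D=1,T=0]) - (E[Y|D=0,T=1] - E[Y|D=0,T=0])
att = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(att)  # 3.0
```

Because the regression with D, T, and D×T is saturated in the four cells, its β₃ coincides with this difference of mean changes when no covariates are included.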
Methods
fit(data[, outcome, treatment, time, ...]) – Fit the Difference-in-Differences model.
get_params() – Get estimator parameters (sklearn-compatible).
set_params(**params) – Set estimator parameters (sklearn-compatible).
- __init__(robust=True, cluster=None, alpha=0.05, inference='analytical', n_bootstrap=999, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')[source]
- fit(data, outcome=None, treatment=None, time=None, formula=None, covariates=None, fixed_effects=None, absorb=None)[source]
Fit the Difference-in-Differences model.
- Parameters:
data (pd.DataFrame) – DataFrame containing the outcome, treatment, and time variables.
outcome (str) – Name of the outcome variable column.
treatment (str) – Name of the treatment group indicator column (0/1).
time (str) – Name of the post-treatment period indicator column (0/1).
formula (str, optional) – R-style formula (e.g., “outcome ~ treated * post”). If provided, overrides outcome, treatment, and time parameters.
covariates (list, optional) – List of covariate column names to include as linear controls.
fixed_effects (list, optional) – List of categorical column names to include as fixed effects. Creates dummy variables for each category (drops first level). Use for low-dimensional fixed effects (e.g., industry, region).
absorb (list, optional) – List of categorical column names for high-dimensional fixed effects. Uses within-transformation (demeaning) instead of dummy variables. More efficient for large numbers of categories (e.g., firm, individual).
- Returns:
Object containing estimation results.
- Return type:
- Raises:
ValueError – If required parameters are missing or data validation fails.
Examples
Using fixed effects (dummy variables):
>>> did.fit(data, outcome='sales', treatment='treated', time='post',
...         fixed_effects=['state', 'industry'])
Using absorbed fixed effects (within-transformation):
>>> did.fit(data, outcome='sales', treatment='treated', time='post',
...         absorb=['firm_id'])
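The within-transformation mentioned for absorb can be pictured as subtracting each category's mean from its observations. The sketch below shows that idea for a single variable and a single absorbed column. It is not the library's implementation, and within_transform and the toy data are hypothetical.

```python
from collections import defaultdict


def within_transform(values, groups):
    """Subtract each group's mean from its members (one-way demeaning)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical toy data: three firms, two observations each.
sales = [10.0, 14.0, 20.0, 26.0, 5.0, 7.0]
firm_id = ["a", "a", "b", "b", "c", "c"]

demeaned = within_transform(sales, firm_id)
print(demeaned)  # [-2.0, 2.0, -3.0, 3.0, -1.0, 1.0]
```

In a regression, the outcome and every regressor would receive the same transformation. This removes the firm-level intercepts without creating one dummy column per firm.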
- predict(data)[source]
Predict outcomes using fitted model.
- Parameters:
data (pd.DataFrame) – DataFrame with same structure as training data.
- Returns:
Predicted values.
- Return type:
np.ndarray
- get_params()[source]
Get estimator parameters (sklearn-compatible).
- Returns:
Estimator parameters.
- Return type:
Dict[str, Any]
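The sklearn-compatible pair get_params/set_params follows a simple convention: get_params returns the constructor arguments as a dictionary, and set_params updates them and returns the estimator itself. The toy class below illustrates that convention only. ToyEstimator is hypothetical, and its error handling is an assumption rather than diff_diff behavior.

```python
class ToyEstimator:
    """Hypothetical estimator illustrating the get_params/set_params convention."""

    def __init__(self, robust=True, alpha=0.05):
        self.robust = robust
        self.alpha = alpha

    def get_params(self):
        # Return the constructor arguments as a plain dict.
        return {"robust": self.robust, "alpha": self.alpha}

    def set_params(self, **params):
        # Reject unknown names, update known ones, and return self for chaining.
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Unknown parameter: {name}")
            setattr(self, name, value)
        return self


est = ToyEstimator().set_params(alpha=0.10)
print(est.get_params())  # {'robust': True, 'alpha': 0.1}
```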
MultiPeriodDiD
Event study estimator with period-specific treatment effects.
- class diff_diff.MultiPeriodDiD[source]
Bases: DifferenceInDifferences
Multi-Period Difference-in-Differences estimator.
Extends the standard DiD to handle multiple pre-treatment and post-treatment time periods, providing period-specific treatment effects as well as an aggregate average treatment effect.
- Parameters:
- results_
Estimation results after calling fit().
- Type:
Examples
Basic usage with multiple time periods:
>>> import pandas as pd
>>> from diff_diff import MultiPeriodDiD
>>>
>>> # Create sample panel data with 6 time periods
>>> # Periods 0-2 are pre-treatment, periods 3-5 are post-treatment
>>> data = create_panel_data()  # Your data
>>>
>>> # Fit the model
>>> did = MultiPeriodDiD()
>>> results = did.fit(
...     data,
...     outcome='sales',
...     treatment='treated',
...     time='period',
...     post_periods=[3, 4, 5]  # Specify which periods are post-treatment
... )
>>>
>>> # View period-specific effects
>>> for period, effect in results.period_effects.items():
...     print(f"Period {period}: {effect.effect:.3f} (SE: {effect.se:.3f})")
>>>
>>> # View average treatment effect
>>> print(f"Average ATT: {results.avg_att:.3f}")
Notes
The model estimates:
Y_it = α + β*D_i + Σ_t γ_t*Period_t + Σ_{t≠ref} δ_t*(D_i × 1{t}) + ε_it
Where:
- D_i is the treatment indicator
- Period_t are time period dummies (all non-reference periods)
- D_i × 1{t} are treatment-by-period interactions (all non-reference)
- δ_t are the period-specific treatment effects
- The reference period (default: last pre-period) has δ_ref = 0 by construction
Pre-treatment δ_t test the parallel trends assumption (should be ≈ 0). Post-treatment δ_t estimate dynamic treatment effects. The average ATT is computed from post-treatment δ_t only.
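Without covariates or additional fixed effects, the model above is saturated in group and period, so each δ_t equals a difference-in-differences of cell means measured against the reference period. The sketch below computes the δ_t that way on hypothetical toy data. Taking the average ATT as the simple mean of the post-period δ_t is an assumption about the aggregation, not a statement of how the library weights periods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format data: (treated, period, outcome).
rows = [
    (1, 0, 10.0), (1, 1, 11.0), (1, 2, 12.0), (1, 3, 16.0), (1, 4, 18.0),
    (0, 0, 8.0), (0, 1, 9.0), (0, 2, 10.0), (0, 3, 11.0), (0, 4, 12.0),
]
post_periods = [3, 4]
reference_period = 2  # last pre-treatment period

# Mean outcome in every (group, period) cell.
cells = defaultdict(list)
for d, t, y in rows:
    cells[(d, t)].append(y)
ybar = {key: mean(vals) for key, vals in cells.items()}

# delta_t = (treated change since reference) - (control change since reference)
periods = sorted({t for _, t, _ in rows})
delta = {
    t: (ybar[(1, t)] - ybar[(1, reference_period)])
    - (ybar[(0, t)] - ybar[(0, reference_period)])
    for t in periods
    if t != reference_period
}
avg_att = mean(delta[t] for t in post_periods)

print(delta)    # {0: 0.0, 1: 0.0, 3: 3.0, 4: 4.0}
print(avg_att)  # 3.5
```

Here the pre-period coefficients are exactly zero because the toy groups move in parallel before treatment. In real data they would be estimated with noise and examined for departures from zero.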
- fit(data, outcome, treatment, time, post_periods=None, covariates=None, fixed_effects=None, absorb=None, reference_period=None, unit=None)[source]
Fit the Multi-Period Difference-in-Differences model.
- Parameters:
data (pd.DataFrame) – DataFrame containing the outcome, treatment, and time variables.
outcome (str) – Name of the outcome variable column.
treatment (str) – Name of the treatment group indicator column (0/1). Should be a time-invariant ever-treated indicator (D_i = 1 for all periods of treated units). If treatment is time-varying (D_it), pre-period interaction coefficients will be unidentified.
time (str) – Name of the time period column (can have multiple values).
post_periods (list) – List of time period values that are post-treatment. All other periods are treated as pre-treatment.
covariates (list, optional) – List of covariate column names to include as linear controls.
fixed_effects (list, optional) – List of categorical column names to include as fixed effects.
absorb (list, optional) – List of categorical column names for high-dimensional fixed effects.
reference_period (any, optional) – The reference (omitted) time period for the period dummies. Defaults to the last pre-treatment period (e=-1 convention).
unit (str, optional) – Name of the unit identifier column. When provided, checks whether treatment timing varies across units and warns if staggered adoption is detected (suggests CallawaySantAnna instead). Does NOT affect standard error computation – use the cluster parameter for cluster-robust SEs.
- Returns:
Object containing period-specific and average treatment effects.
- Return type:
- Raises:
ValueError – If required parameters are missing or data validation fails.
- __init__(robust=True, cluster=None, alpha=0.05, inference='analytical', n_bootstrap=999, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')
- get_params()
Get estimator parameters (sklearn-compatible).
- Returns:
Estimator parameters.
- Return type:
Dict[str, Any]
- predict(data)
Predict outcomes using fitted model.
- Parameters:
data (pd.DataFrame) – DataFrame with same structure as training data.
- Returns:
Predicted values.
- Return type:
np.ndarray
- print_summary()
Print summary to stdout.
- Return type:
None
- set_params(**params)
Set estimator parameters (sklearn-compatible).
- Parameters:
**params – Estimator parameters.
- Return type:
self
TwoWayFixedEffects
Panel DiD with unit and time fixed effects.
- class diff_diff.TwoWayFixedEffects[source]
Bases: DifferenceInDifferences
Two-Way Fixed Effects (TWFE) estimator for panel DiD.
Extends DifferenceInDifferences to handle panel data with unit and time fixed effects.
- Parameters:
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors.
cluster (str, optional) – Column name for cluster-robust standard errors. If None, automatically clusters at the unit level (the unit parameter passed to fit()). This differs from DifferenceInDifferences where cluster=None means no clustering.
alpha (float, default=0.05) – Significance level for confidence intervals.
Notes
This estimator uses the regression:
Y_it = α_i + γ_t + β*(D_i × Post_t) + X_it’δ + ε_it
where α_i are unit fixed effects and γ_t are time fixed effects.
Warning: TWFE can be biased with staggered treatment timing and heterogeneous treatment effects. Consider using more robust estimators (e.g., Callaway-Sant’Anna) for staggered designs.
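To make the regression concrete, the sketch below estimates β on a small balanced panel by removing unit and time means from both the outcome and the D_i × Post_t interaction, then taking the OLS slope. It assumes a balanced panel with no covariates, where this double demeaning is exact. The data and helper names are hypothetical, and this is not presented as the library's algorithm.

```python
from collections import defaultdict

# Hypothetical balanced panel: (unit, period, outcome, treated, post).
panel = [
    ("A", 0, 10.0, 1, 0), ("A", 1, 15.0, 1, 1),
    ("B", 0, 11.0, 1, 0), ("B", 1, 18.0, 1, 1),
    ("C", 0, 9.0, 0, 0), ("C", 1, 12.0, 0, 1),
    ("D", 0, 10.0, 0, 0), ("D", 1, 13.0, 0, 1),
]


def group_means(values, keys):
    """Map each key to the mean of its values."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, k in zip(values, keys):
        sums[k] += v
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


def two_way_demean(values, units, times):
    """x_it - unit mean - time mean + grand mean (exact for balanced panels)."""
    unit_mean = group_means(values, units)
    time_mean = group_means(values, times)
    grand = sum(values) / len(values)
    return [v - unit_mean[u] - time_mean[t] + grand
            for v, u, t in zip(values, units, times)]


units = [r[0] for r in panel]
times = [r[1] for r in panel]
y = [r[2] for r in panel]
d = [float(r[3] * r[4]) for r in panel]  # D_i x Post_t

y_tilde = two_way_demean(y, units, times)
d_tilde = two_way_demean(d, units, times)

# OLS slope of the demeaned outcome on the demeaned interaction.
beta = sum(a * b for a, b in zip(y_tilde, d_tilde)) / sum(b * b for b in d_tilde)
print(round(beta, 6))  # 3.0
```

With two groups and two periods, the result equals the 2x2 difference in mean changes: (16.5 - 10.5) - (12.5 - 9.5) = 3.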
- fit(data, outcome, treatment, time, unit, covariates=None)[source]
Fit Two-Way Fixed Effects model.
- Parameters:
- Returns:
Estimation results.
- Return type:
- decompose(data, outcome, unit, time, first_treat, weights='approximate')[source]
Perform Goodman-Bacon decomposition of TWFE estimate.
Decomposes the TWFE estimate into a weighted average of all possible 2x2 DiD comparisons, revealing which comparisons drive the estimate and whether problematic “forbidden comparisons” are involved.
- Parameters:
data (pd.DataFrame) – Panel data with unit and time identifiers.
outcome (str) – Name of outcome variable column.
unit (str) – Name of unit identifier column.
time (str) – Name of time period column.
first_treat (str) – Name of column indicating when each unit was first treated. Use 0 (or np.inf) for never-treated units.
weights (str, default="approximate") – Weight calculation method:
- “approximate”: Fast simplified formula (default). Good for diagnostic purposes where relative weights are sufficient.
- “exact”: Variance-based weights from Goodman-Bacon (2021) Theorem 1. Use for publication-quality decompositions.
- Returns:
Decomposition results showing:
- TWFE estimate and its weighted-average breakdown
- List of all 2x2 comparisons with estimates and weights
- Total weight by comparison type (clean vs forbidden)
- Return type:
BaconDecompositionResults
Examples
>>> from diff_diff import TwoWayFixedEffects
>>>
>>> twfe = TwoWayFixedEffects()
>>> decomp = twfe.decompose(
...     data, outcome='y', unit='id', time='t', first_treat='treat_year'
... )
>>> decomp.print_summary()
>>> # Check weight on forbidden comparisons
>>> if decomp.total_weight_later_vs_earlier > 0.2:
...     print("Warning: significant forbidden comparison weight")
Notes
This decomposition is essential for understanding potential TWFE bias in staggered adoption designs. The three comparison types are:
Treated vs Never-treated: Clean comparisons using never-treated units as controls. These are always valid.
Earlier vs Later treated: Uses later-treated units as controls before they receive treatment. These are valid.
Later vs Earlier treated: Uses already-treated units as controls. These “forbidden comparisons” can introduce bias when treatment effects are dynamic (changing over time since treatment).
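The three comparison types can be enumerated from the cohorts' first-treatment times alone. The sketch below labels every treated-cohort/control-cohort pairing for a hypothetical design with two treated cohorts and a never-treated group. It illustrates the classification only, not the weights or estimates that decompose() reports. The function and label names are illustrative.

```python
from itertools import permutations

NEVER = 0  # first_treat value marking never-treated units, as in the parameter docs


def classify(treated_cohort, control_cohort):
    """Label a 2x2 comparison by the timing of its two cohorts."""
    if control_cohort == NEVER:
        return "treated_vs_never"
    if treated_cohort < control_cohort:
        return "earlier_vs_later"  # control cohort is not yet treated
    return "later_vs_earlier"      # control cohort is already treated ("forbidden")


# Hypothetical cohorts: treated in 2005, treated in 2010, never treated.
cohorts = [2005, 2010, NEVER]

comparisons = [
    (a, b, classify(a, b))
    for a, b in permutations(cohorts, 2)
    if a != NEVER  # the never-treated cohort only ever serves as a control
]
for row in comparisons:
    print(row)
```

With K treated cohorts and a never-treated group there are K treated-vs-never comparisons plus K(K-1) timing comparisons, half of which use already-treated units as controls.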
See also
bacon_decompose – Standalone decomposition function
BaconDecomposition – Class-based decomposition interface
CallawaySantAnna – Robust estimator that avoids forbidden comparisons
- __init__(robust=True, cluster=None, alpha=0.05, inference='analytical', n_bootstrap=999, bootstrap_weights='rademacher', seed=None, rank_deficient_action='warn')
- get_params()
Get estimator parameters (sklearn-compatible).
- Returns:
Estimator parameters.
- Return type:
Dict[str, Any]
- predict(data)
Predict outcomes using fitted model.
- Parameters:
data (pd.DataFrame) – DataFrame with same structure as training data.
- Returns:
Predicted values.
- Return type:
np.ndarray
- print_summary()
Print summary to stdout.
- Return type:
None
- set_params(**params)
Set estimator parameters (sklearn-compatible).
- Parameters:
**params – Estimator parameters.
- Return type:
self
SyntheticDiD
Synthetic control combined with DiD (Arkhangelsky et al. 2021).
- class diff_diff.SyntheticDiD[source]
Bases: DifferenceInDifferences
Synthetic Difference-in-Differences (SDID) estimator.
Combines the strengths of Difference-in-Differences and Synthetic Control methods by re-weighting control units to better match treated units’ pre-treatment trends.
This method is particularly useful when:
- You have few treated units (possibly just one)
- Parallel trends assumption may be questionable
- Control units are heterogeneous and need reweighting
- You want robustness to pre-treatment differences
- Parameters:
zeta_omega (float, optional) – Regularization for unit weights. If None (default), auto-computed from data as (N1 * T1)^(1/4) * noise_level, matching R’s synthdid.
zeta_lambda (float, optional) – Regularization for time weights. If None (default), auto-computed from data as 1e-6 * noise_level, matching R’s synthdid.
alpha (float, default=0.05) – Significance level for confidence intervals.
variance_method (str, default="placebo") – Method for variance estimation:
- “placebo”: Placebo-based variance matching R’s synthdid::vcov(method="placebo"). Implements Algorithm 4 from Arkhangelsky et al. (2021). This is R’s default.
- “bootstrap”: Bootstrap at unit level with fixed weights matching R’s synthdid::vcov(method="bootstrap").
n_bootstrap (int, default=200) – Number of replications for variance estimation. Used for both:
- Bootstrap: Number of bootstrap samples
- Placebo: Number of random permutations (matches R’s replications argument)
seed (int, optional) – Random seed for reproducibility. If None (default), results will vary between runs.
- results_
Estimation results after calling fit().
- Type:
Examples
Basic usage with panel data:
>>> import pandas as pd
>>> from diff_diff import SyntheticDiD
>>>
>>> # Panel data with units observed over multiple time periods
>>> # Treatment occurs at period 5 for treated units
>>> data = pd.DataFrame({
...     'unit': [...],     # Unit identifier
...     'period': [...],   # Time period
...     'outcome': [...],  # Outcome variable
...     'treated': [...]   # 1 if unit is ever treated, 0 otherwise
... })
>>>
>>> # Fit SDID model
>>> sdid = SyntheticDiD()
>>> results = sdid.fit(
...     data,
...     outcome='outcome',
...     treatment='treated',
...     unit='unit',
...     time='period',
...     post_periods=[5, 6, 7, 8]
... )
>>>
>>> # View results
>>> results.print_summary()
>>> print(f"ATT: {results.att:.3f} (SE: {results.se:.3f})")
>>>
>>> # Examine unit weights
>>> weights_df = results.get_unit_weights_df()
>>> print(weights_df.head(10))
Notes
The SDID estimator (Arkhangelsky et al., 2021) computes:
τ̂ = (Ȳ_treated,post - Σ_t λ_t * Y_treated,t) - Σ_j ω_j * (Ȳ_j,post - Σ_t λ_t * Y_j,t)
Where:
- ω_j are unit weights (sum to 1, non-negative)
- λ_t are time weights (sum to 1, non-negative)
Unit weights ω are chosen to match pre-treatment outcomes:
min ||Σ_j ω_j * Y_j,pre - Y_treated,pre||²
This interpolates between:
- Standard DiD (uniform weights): ω_j = 1/N_control
- Synthetic Control (exact matching): concentrated weights
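The estimator formula can be evaluated by hand once the weights are known. The sketch below does so for hypothetical toy data with uniform unit and time weights, the case in which SDID reduces to standard DiD. It does not solve the weight-fitting problem, and sdid_estimate and its argument names are illustrative rather than part of diff_diff.

```python
def sdid_estimate(y_treated_pre, y_treated_post, y_control_pre, y_control_post,
                  omega, lam):
    """Evaluate the SDID formula for given unit weights omega and time weights lam.

    y_treated_pre:  treated-group mean outcome in each pre-period
    y_treated_post: treated-group mean outcome in each post-period
    y_control_pre:  one list of pre-period outcomes per control unit
    y_control_post: one list of post-period outcomes per control unit
    """
    def gap(pre, post):
        # Post-period average minus the lambda-weighted pre-period average.
        return sum(post) / len(post) - sum(l * v for l, v in zip(lam, pre))

    treated_gap = gap(y_treated_pre, y_treated_post)
    control_gap = sum(
        w * gap(pre, post)
        for w, pre, post in zip(omega, y_control_pre, y_control_post)
    )
    return treated_gap - control_gap


# Hypothetical toy panel: two pre-periods, one post-period, two control units.
tau = sdid_estimate(
    y_treated_pre=[10.0, 11.0], y_treated_post=[16.0],
    y_control_pre=[[9.0, 10.0], [8.0, 9.0]], y_control_post=[[12.0], [11.5]],
    omega=[0.5, 0.5],  # uniform unit weights
    lam=[0.5, 0.5],    # uniform time weights
)
print(tau)  # 2.75
```

Replacing the uniform omega with weights concentrated on the controls whose pre-treatment path tracks the treated group moves the estimate toward the synthetic control end of the range.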
References
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic Difference-in-Differences. American Economic Review, 111(12), 4088-4118.
- __init__(zeta_omega=None, zeta_lambda=None, alpha=0.05, variance_method='placebo', n_bootstrap=200, seed=None, lambda_reg=None, zeta=None)[source]
- fit(data, outcome, treatment, unit, time, post_periods=None, covariates=None)[source]
Fit the Synthetic Difference-in-Differences model.
- Parameters:
data (pd.DataFrame) – Panel data with observations for multiple units over multiple time periods.
outcome (str) – Name of the outcome variable column.
treatment (str) – Name of the treatment group indicator column (0/1). Should be 1 for all observations of treated units (both pre and post treatment).
unit (str) – Name of the unit identifier column.
time (str) – Name of the time period column.
post_periods (list, optional) – List of time period values that are post-treatment. If None, uses the last half of periods.
covariates (list, optional) – List of covariate column names. Covariates are residualized out before computing the SDID estimator.
- Returns:
Object containing the ATT estimate, standard error, unit weights, and time weights.
- Return type:
- Raises:
ValueError – If required parameters are missing or data validation fails.
- predict(data)
Predict outcomes using fitted model.
- Parameters:
data (pd.DataFrame) – DataFrame with same structure as training data.
- Returns:
Predicted values.
- Return type:
np.ndarray
- print_summary()
Print summary to stdout.
- Return type:
None
TripleDifference
Triple Difference (DDD) estimator for settings where treatment requires two criteria (Ortiz-Villavicencio & Sant’Anna, 2025).
- class diff_diff.TripleDifference[source]
Bases: object
Triple Difference (DDD) estimator.
Estimates the Average Treatment effect on the Treated (ATT) when treatment requires satisfying two criteria: belonging to a treated group AND being in an eligible partition of the population.
This implementation follows Ortiz-Villavicencio & Sant’Anna (2025), which shows that naive DDD implementations (difference of two DiDs, three-way fixed effects) are invalid when covariates are needed for identification.
- Parameters:
estimation_method (str, default="dr") – Estimation method to use:
- “dr”: Doubly robust (recommended). Consistent if either the outcome model or propensity score model is correctly specified.
- “reg”: Regression adjustment (outcome regression).
- “ipw”: Inverse probability weighting.
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors. Note: influence function-based SEs are inherently robust to heteroskedasticity, so this parameter has no effect. Retained for API compatibility.
cluster (str, optional) – Column name for cluster-robust standard errors. When provided, SEs are computed using the Liang-Zeger cluster-robust variance estimator on the influence function.
alpha (float, default=0.05) – Significance level for confidence intervals.
pscore_trim (float, default=0.01) – Trimming threshold for propensity scores. Scores below this value or above (1 - pscore_trim) are clipped to avoid extreme weights.
rank_deficient_action (str, default="warn") – Action when design matrix is rank-deficient (linearly dependent columns):
- “warn”: Issue warning and drop linearly dependent columns (default)
- “error”: Raise ValueError
- “silent”: Drop columns silently without warning
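The clipping described for pscore_trim can be pictured as follows. This is a minimal sketch of the stated rule (scores below the threshold or above one minus the threshold are clipped). The function name trim_pscores and the example scores are hypothetical, and the odds-type weight at the end is a generic illustration of why extreme scores matter, not the library's weighting formula.

```python
def trim_pscores(pscores, pscore_trim=0.01):
    """Clip propensity scores into [pscore_trim, 1 - pscore_trim]."""
    lower, upper = pscore_trim, 1.0 - pscore_trim
    return [min(max(p, lower), upper) for p in pscores]


# Hypothetical fitted scores, including two extreme values.
raw = [0.0005, 0.2, 0.5, 0.9, 0.9999]
clipped = trim_pscores(raw)
print(clipped)  # the first and last scores are pulled to the bounds


def largest_odds(scores):
    """Largest odds-type weight p / (1 - p) among the scores."""
    return max(p / (1.0 - p) for p in scores)


# Clipping keeps the largest weight from exploding.
print(largest_odds(raw), largest_odds(clipped))
```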
- results_
Estimation results after calling fit().
- Type:
Examples
Basic usage with a DataFrame:
>>> import pandas as pd
>>> from diff_diff import TripleDifference
>>>
>>> # Data where treatment affects women (partition=1) in states
>>> # that enacted a policy (group=1)
>>> data = pd.DataFrame({
...     'outcome': [...],
...     'group': [1, 1, 0, 0, ...],      # 1=policy state, 0=control state
...     'partition': [1, 0, 1, 0, ...],  # 1=women, 0=men
...     'post': [0, 0, 1, 1, ...],       # 1=post-treatment period
... })
>>>
>>> # Fit using doubly robust estimation
>>> ddd = TripleDifference(estimation_method="dr")
>>> results = ddd.fit(
...     data,
...     outcome='outcome',
...     group='group',
...     partition='partition',
...     time='post'
... )
>>> print(results.att)  # ATT estimate
With covariates (properly handled unlike naive DDD):
>>> results = ddd.fit(
...     data,
...     outcome='outcome',
...     group='group',
...     partition='partition',
...     time='post',
...     covariates=['age', 'income']
... )
Notes
The DDD estimator is appropriate when:
Treatment affects only units satisfying BOTH criteria:
- Belonging to a treated group (G=1), e.g., states with a policy
- Being in an eligible partition (P=1), e.g., women, low-income
The DDD parallel trends assumption holds: the differential trend between eligible and ineligible partitions would have been the same across treated and control groups, absent treatment.
This is weaker than requiring separate parallel trends for two DiDs, as biases can cancel out in the differencing.
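In the simplest case with no covariates, the DDD parallel trends statement translates into a comparison of eight cell means: the eligible-minus-ineligible change in the treated group, minus the same quantity in the control group. The sketch below computes that unconditional textbook quantity on hypothetical data. It does not reproduce the doubly robust, regression, or IPW procedures the estimator offers when covariates are involved.

```python
from statistics import mean

# Hypothetical observations: (group G, partition P, post T, outcome).
rows = [
    (1, 1, 0, 10.0), (1, 1, 1, 17.0),  # policy state, eligible
    (1, 0, 0, 12.0), (1, 0, 1, 15.0),  # policy state, ineligible
    (0, 1, 0, 9.0), (0, 1, 1, 12.0),   # control state, eligible
    (0, 0, 0, 11.0), (0, 0, 1, 13.0),  # control state, ineligible
]


def change(g, p):
    """Post-minus-pre change in mean outcome for group g and partition p."""
    post = mean(y for gi, pi, ti, y in rows if (gi, pi, ti) == (g, p, 1))
    pre = mean(y for gi, pi, ti, y in rows if (gi, pi, ti) == (g, p, 0))
    return post - pre


# Differential trend (eligible minus ineligible) within each group ...
gap_treated = change(1, 1) - change(1, 0)
gap_control = change(0, 1) - change(0, 0)
# ... and the DDD estimate is the difference of those gaps across groups.
ddd = gap_treated - gap_control
print(ddd)  # 3.0
```

Neither partition needs to satisfy parallel trends across states on its own in this example. Only the gap between partitions has to evolve the same way absent treatment.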
References
Methods
fit(data, outcome, group, partition, time[, ...]) – Fit the Triple Difference model.
get_params() – Get estimator parameters (sklearn-compatible).
set_params(**params) – Set estimator parameters (sklearn-compatible).
- __init__(estimation_method='dr', robust=True, cluster=None, alpha=0.05, pscore_trim=0.01, rank_deficient_action='warn')[source]
- results_: TripleDifferenceResults | None
- fit(data, outcome, group, partition, time, covariates=None)[source]
Fit the Triple Difference model.
- Parameters:
data (pd.DataFrame) – DataFrame containing all variables.
outcome (str) – Name of the outcome variable column.
group (str) – Name of the group indicator column (0/1). 1 = treated group (e.g., states that enacted policy). 0 = control group.
partition (str) – Name of the partition/eligibility indicator column (0/1). 1 = eligible partition (e.g., women, targeted demographic). 0 = ineligible partition.
time (str) – Name of the time period indicator column (0/1). 1 = post-treatment period. 0 = pre-treatment period.
covariates (list of str, optional) – List of covariate column names to adjust for. These are properly incorporated using the selected estimation method (unlike naive DDD implementations).
- Returns:
Object containing estimation results.
- Return type:
- Raises:
ValueError – If required columns are missing or data validation fails.
- get_params()[source]
Get estimator parameters (sklearn-compatible).
- Returns:
Estimator parameters.
- Return type:
Dict[str, Any]
TripleDifferenceResults
Results container for Triple Difference estimation.
- class diff_diff.triple_diff.TripleDifferenceResults[source]
Bases: object
Results from Triple Difference (DDD) estimation.
Provides access to the estimated average treatment effect on the treated (ATT), standard errors, confidence intervals, and diagnostic information.
- att
Average Treatment effect on the Treated (ATT). This is the effect on units in the treated group (G=1) and eligible partition (P=1) after treatment (T=1).
- Type:
- estimation_method
Estimation method used: “dr” (doubly robust), “reg” (regression adjustment), or “ipw” (inverse probability weighting).
- Type:
- print_summary(alpha=None)[source]
Print the summary to stdout.
- Parameters:
alpha (float | None)
- Return type:
None
- to_dict()[source]
Convert results to a dictionary.
- Returns:
Dictionary containing all estimation results.
- Return type:
Dict[str, Any]
- to_dataframe()[source]
Convert results to a pandas DataFrame.
- Returns:
DataFrame with estimation results.
- Return type:
pd.DataFrame
- __init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated_eligible, n_treated_ineligible, n_control_eligible, n_control_ineligible, estimation_method, alpha=0.05, group_means=None, pscore_stats=None, r_squared=None, covariate_balance=None, inference_method='analytical', n_bootstrap=None, n_clusters=None)
- Parameters:
att (float)
se (float)
t_stat (float)
p_value (float)
n_obs (int)
n_treated_eligible (int)
n_treated_ineligible (int)
n_control_eligible (int)
n_control_ineligible (int)
estimation_method (str)
alpha (float)
r_squared (float | None)
covariate_balance (DataFrame | None)
inference_method (str)
n_bootstrap (int | None)
n_clusters (int | None)
- Return type:
None
Convenience Function
- diff_diff.triple_difference(data, outcome, group, partition, time, covariates=None, estimation_method='dr', robust=True, cluster=None, alpha=0.05, rank_deficient_action='warn')[source]
Estimate Triple Difference (DDD) treatment effect.
Convenience function that creates a TripleDifference estimator and fits it to the data in one step.
- Parameters:
data (pd.DataFrame) – DataFrame containing all variables.
outcome (str) – Name of the outcome variable column.
group (str) – Name of the group indicator column (0/1). 1 = treated group (e.g., states that enacted policy).
partition (str) – Name of the partition/eligibility indicator column (0/1). 1 = eligible partition (e.g., women, targeted demographic).
time (str) – Name of the time period indicator column (0/1). 1 = post-treatment period.
covariates (list of str, optional) – List of covariate column names to adjust for.
estimation_method (str, default="dr") – Estimation method: “dr” (doubly robust), “reg” (regression), or “ipw” (inverse probability weighting).
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors. Note: influence function-based SEs are inherently robust to heteroskedasticity, so this parameter has no effect. Retained for API compatibility.
cluster (str, optional) – Column name for cluster-robust standard errors.
alpha (float, default=0.05) – Significance level for confidence intervals.
rank_deficient_action (str, default="warn") – Action when design matrix is rank-deficient:
- “warn”: Issue warning and drop linearly dependent columns (default)
- “error”: Raise ValueError
- “silent”: Drop columns silently without warning
- Returns:
Object containing estimation results.
- Return type:
Examples
>>> from diff_diff import triple_difference
>>> results = triple_difference(
...     data,
...     outcome='earnings',
...     group='policy_state',
...     partition='female',
...     time='post_policy',
...     covariates=['age', 'education']
... )
>>> print(f"ATT: {results.att:.3f} (SE: {results.se:.3f})")