Triple Difference (DDD)
Triple Difference estimator for designs where treatment requires two criteria.
This module implements the methodology from Ortiz-Villavicencio & Sant’Anna (2025), which correctly handles covariate adjustment in DDD designs. Unlike naive implementations that difference two DiDs, this approach provides valid estimates when identification requires conditioning on covariates.
When to use DDD instead of DiD:
DDD allows for violations of parallel trends that are:
- Group-specific (e.g., economic shocks affecting treatment states)
- Partition-specific (e.g., trends affecting women everywhere)
As long as these biases are additive, DDD differences them out. The key assumption is that, absent treatment, the differential trend between eligible and ineligible units would be the same in treated and control groups.
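To see how additive biases cancel, the sketch below builds eight hypothetical cell means (group × partition × period) containing a group-specific shock, a partition-specific trend, and a true effect of 2.0, then differences them by hand. All numbers and the helper names `did` and `ddd_from_cell_means` are assumptions made for illustration. This is the unconditional, no-covariate calculation, not the estimator implemented in this module.

```python
import itertools

import pandas as pd

TRUE_EFFECT = 2.0
GROUP_SHOCK = 1.5      # post-period shock hitting the treated group only
PARTITION_TREND = 0.7  # post-period trend hitting the eligible partition everywhere

rows = []
for g, p, t in itertools.product([0, 1], repeat=3):
    mean = 10.0 + 1.0 * g + 0.5 * p + 0.3 * t  # baseline levels plus a common trend
    mean += GROUP_SHOCK * g * t                # group-specific violation
    mean += PARTITION_TREND * p * t            # partition-specific violation
    mean += TRUE_EFFECT * g * p * t            # effect on treated-group eligible units
    rows.append({"group": g, "partition": p, "post": t, "mean": mean})

cells = pd.DataFrame(rows).set_index(["group", "partition", "post"])["mean"]


def did(cells, group):
    """DiD of eligible vs. ineligible units within one group."""
    eligible = cells.loc[(group, 1, 1)] - cells.loc[(group, 1, 0)]
    ineligible = cells.loc[(group, 0, 1)] - cells.loc[(group, 0, 0)]
    return eligible - ineligible


def ddd_from_cell_means(cells):
    """Difference of the two within-group DiDs."""
    return did(cells, 1) - did(cells, 0)


naive_did = did(cells, 1)         # effect plus the partition-specific trend
ddd = ddd_from_cell_means(cells)  # the additive biases cancel

print(round(naive_did, 6), round(ddd, 6))
```

The within-group DiD is contaminated by the partition-specific trend, while the triple difference recovers the effect because both additive biases appear in, and cancel across, the two DiDs.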
Reference: Ortiz-Villavicencio, M., & Sant’Anna, P. H. C. (2025). Better Understanding Triple Differences Estimators. Working Paper. arXiv:2505.09942
TripleDifference
Main estimator class for Triple Difference designs.
- class diff_diff.TripleDifference
Bases: object
Triple Difference (DDD) estimator.
Estimates the Average Treatment effect on the Treated (ATT) when treatment requires satisfying two criteria: belonging to a treated group AND being in an eligible partition of the population.
This implementation follows Ortiz-Villavicencio & Sant’Anna (2025), which shows that naive DDD implementations (difference of two DiDs, three-way fixed effects) are invalid when covariates are needed for identification.
- Parameters:
estimation_method (str, default="dr") – Estimation method to use:
- “dr”: Doubly robust (recommended). Consistent if either the outcome model or the propensity score model is correctly specified.
- “reg”: Regression adjustment (outcome regression).
- “ipw”: Inverse probability weighting.
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors. Note: influence function-based SEs are inherently robust to heteroskedasticity, so this parameter has no effect. Retained for API compatibility.
cluster (str, optional) – Column name for cluster-robust standard errors. When provided, SEs are computed using the Liang-Zeger cluster-robust variance estimator on the influence function.
alpha (float, default=0.05) – Significance level for confidence intervals.
pscore_trim (float, default=0.01) – Trimming threshold for propensity scores. Scores below this value or above (1 - pscore_trim) are clipped to avoid extreme weights.
rank_deficient_action (str, default="warn") – Action when the design matrix is rank-deficient (linearly dependent columns):
- “warn”: Issue a warning and drop linearly dependent columns (default).
- “error”: Raise ValueError.
- “silent”: Drop columns silently without warning.
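Two of the parameters above describe small numerical steps that are easy to picture in code: clipping propensity scores (pscore_trim) and aggregating influence-function values within clusters (cluster). The helpers below are a minimal sketch, not the library's implementation. The function names, the scaling convention for the influence function, and the absence of any finite-sample correction are all assumptions.

```python
import numpy as np


def clip_pscores(pscores, trim=0.01):
    """Clip propensity scores into [trim, 1 - trim] to bound the weights."""
    return np.clip(np.asarray(pscores, dtype=float), trim, 1.0 - trim)


def se_from_influence(psi, clusters=None):
    """Standard error from per-observation influence-function values.

    Assumes psi is scaled so that Var(estimate) ~ sum(psi_i ** 2) / n ** 2.
    With clusters, psi is summed within each cluster first (Liang-Zeger
    style). No finite-sample correction is applied in this sketch.
    """
    psi = np.asarray(psi, dtype=float)
    n = psi.size
    if clusters is not None:
        _, codes = np.unique(np.asarray(clusters), return_inverse=True)
        psi = np.bincount(codes, weights=psi)  # one summed value per cluster
    return float(np.sqrt(np.sum(psi ** 2)) / n)


scores = clip_pscores([0.001, 0.4, 0.999])
psi = np.array([1.0, 1.0, -1.0, -1.0])
se_iid = se_from_influence(psi)
se_clustered = se_from_influence(psi, clusters=["a", "a", "b", "b"])
print(scores, se_iid, se_clustered)
```

In this toy input the influence values are positively correlated within clusters, so the clustered standard error is larger than the unclustered one.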
- results_
Estimation results after calling fit().
- Type:
TripleDifferenceResults | None
Examples
Basic usage with a DataFrame:
>>> import pandas as pd
>>> from diff_diff import TripleDifference
>>>
>>> # Data where treatment affects women (partition=1) in states
>>> # that enacted a policy (group=1)
>>> data = pd.DataFrame({
...     'outcome': [...],
...     'group': [1, 1, 0, 0, ...],      # 1=policy state, 0=control state
...     'partition': [1, 0, 1, 0, ...],  # 1=women, 0=men
...     'post': [0, 0, 1, 1, ...],       # 1=post-treatment period
... })
>>>
>>> # Fit using doubly robust estimation
>>> ddd = TripleDifference(estimation_method="dr")
>>> results = ddd.fit(
...     data,
...     outcome='outcome',
...     group='group',
...     partition='partition',
...     time='post'
... )
>>> print(results.att)  # ATT estimate
With covariates (properly handled unlike naive DDD):
>>> results = ddd.fit(
...     data,
...     outcome='outcome',
...     group='group',
...     partition='partition',
...     time='post',
...     covariates=['age', 'income']
... )
Notes
The DDD estimator is appropriate when:
Treatment affects only units satisfying BOTH criteria:
- Belonging to a treated group (G=1), e.g., states with a policy
- Being in an eligible partition (P=1), e.g., women, low-income
The DDD parallel trends assumption holds: the differential trend between eligible and ineligible partitions would have been the same across treated and control groups, absent treatment.
This is weaker than requiring separate parallel trends for two DiDs, as biases can cancel out in the differencing.
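Written out without covariates, the DDD parallel trends assumption above can be stated as follows. Here Y_1(0) and Y_0(0) denote the untreated potential outcome in the post- and pre-treatment period, G is the group indicator, and P is the partition indicator. The notation is an assumption chosen for this sketch. When covariates are needed for identification, each expectation is additionally conditioned on them.

```latex
\begin{aligned}
&\mathbb{E}[Y_{1}(0) - Y_{0}(0) \mid G=1, P=1] - \mathbb{E}[Y_{1}(0) - Y_{0}(0) \mid G=1, P=0] \\
&\quad = \mathbb{E}[Y_{1}(0) - Y_{0}(0) \mid G=0, P=1] - \mathbb{E}[Y_{1}(0) - Y_{0}(0) \mid G=0, P=0]
\end{aligned}
```

Neither side has to be zero, which is why this condition is weaker than requiring parallel trends separately for each DiD.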
Methods
fit(data, outcome, group, partition, time[, ...]) – Fit the Triple Difference model.
get_params() – Get estimator parameters (sklearn-compatible).
set_params(**params) – Set estimator parameters (sklearn-compatible).
- __init__(estimation_method='dr', robust=True, cluster=None, alpha=0.05, pscore_trim=0.01, rank_deficient_action='warn')
- results_: TripleDifferenceResults | None
- fit(data, outcome, group, partition, time, covariates=None)
Fit the Triple Difference model.
- Parameters:
data (pd.DataFrame) – DataFrame containing all variables.
outcome (str) – Name of the outcome variable column.
group (str) – Name of the group indicator column (0/1). 1 = treated group (e.g., states that enacted policy). 0 = control group.
partition (str) – Name of the partition/eligibility indicator column (0/1). 1 = eligible partition (e.g., women, targeted demographic). 0 = ineligible partition.
time (str) – Name of the time period indicator column (0/1). 1 = post-treatment period. 0 = pre-treatment period.
covariates (list of str, optional) – List of covariate column names to adjust for. These are properly incorporated using the selected estimation method (unlike naive DDD implementations).
- Returns:
Object containing estimation results.
- Return type:
TripleDifferenceResults
- Raises:
ValueError – If required columns are missing or data validation fails.
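Because fit() raises ValueError on missing columns or failed validation, it can help to check the inputs beforehand. The function below is a hypothetical pre-flight check written for this page. It is not the library's own validation, and the specific rules it enforces (0/1 coding, all eight cells populated) are assumptions inferred from the parameter descriptions above.

```python
import itertools

import pandas as pd


def check_ddd_inputs(data, outcome, group, partition, time, covariates=None):
    """Hypothetical pre-flight check; not the library's own validation."""
    required = [outcome, group, partition, time] + list(covariates or [])
    missing = [col for col in required if col not in data.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")

    # The three indicators are documented as 0/1 columns.
    for col in (group, partition, time):
        values = set(data[col].dropna().unique())
        if not values <= {0, 1}:
            raise ValueError(f"Column {col!r} must be coded 0/1")

    # Count observations in each group x partition x time cell.
    counts = data.groupby([group, partition, time]).size()
    if len(counts) < 8:
        raise ValueError("Every group x partition x time cell needs observations")
    return counts


# Tiny demonstration frame with exactly one row per cell.
demo = pd.DataFrame(
    list(itertools.product([0, 1], repeat=3)),
    columns=["group", "partition", "post"],
)
demo["outcome"] = range(len(demo))
cell_counts = check_ddd_inputs(demo, "outcome", "group", "partition", "post")
print(cell_counts)
```

The returned cell counts are also a quick way to spot thin cells before estimation.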
- get_params()
Get estimator parameters (sklearn-compatible).
- Returns:
Estimator parameters.
- Return type:
Dict[str, Any]
TripleDifferenceResults
Results container for Triple Difference estimation.
- class diff_diff.TripleDifferenceResults
Bases: object
Results from Triple Difference (DDD) estimation.
Provides access to the estimated average treatment effect on the treated (ATT), standard errors, confidence intervals, and diagnostic information.
- att
Average Treatment effect on the Treated (ATT). This is the effect on units in the treated group (G=1) and eligible partition (P=1) after treatment (T=1).
- Type:
float
- estimation_method
Estimation method used: “dr” (doubly robust), “reg” (regression adjustment), or “ipw” (inverse probability weighting).
- Type:
str
Methods
summary([alpha]) – Generate a formatted summary of the estimation results.
print_summary([alpha]) – Print the summary to stdout.
to_dict() – Convert results to a dictionary.
to_dataframe() – Convert results to a pandas DataFrame.
- print_summary(alpha=None)
Print the summary to stdout.
- Parameters:
alpha (float | None)
- Return type:
None
- to_dict()
Convert results to a dictionary.
- Returns:
Dictionary containing all estimation results.
- Return type:
Dict[str, Any]
- to_dataframe()
Convert results to a pandas DataFrame.
- Returns:
DataFrame with estimation results.
- Return type:
pd.DataFrame
- __init__(att, se, t_stat, p_value, conf_int, n_obs, n_treated_eligible, n_treated_ineligible, n_control_eligible, n_control_ineligible, estimation_method, alpha=0.05, group_means=None, pscore_stats=None, r_squared=None, covariate_balance=None, inference_method='analytical', n_bootstrap=None, n_clusters=None)
- Parameters:
att (float)
se (float)
t_stat (float)
p_value (float)
n_obs (int)
n_treated_eligible (int)
n_treated_ineligible (int)
n_control_eligible (int)
n_control_ineligible (int)
estimation_method (str)
alpha (float)
r_squared (float | None)
covariate_balance (DataFrame | None)
inference_method (str)
n_bootstrap (int | None)
n_clusters (int | None)
- Return type:
None
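The inference fields listed above (t_stat, p_value, conf_int) relate to att, se, and alpha in the usual way. The sketch below assumes a standard normal reference distribution. The source does not say whether the library uses a normal or a t distribution, so treat `normal_inference` as a hypothetical helper for intuition, not as a description of the library's internals.

```python
from statistics import NormalDist


def normal_inference(att, se, alpha=0.05):
    """Test statistic, two-sided p-value, and (1 - alpha) confidence interval."""
    std_normal = NormalDist()
    t_stat = att / se
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(t_stat)))
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)  # critical value, about 1.96 for alpha=0.05
    conf_int = (att - z * se, att + z * se)
    return t_stat, p_value, conf_int


t_stat, p_value, conf_int = normal_inference(att=2.0, se=1.0, alpha=0.05)
print(t_stat, round(p_value, 4), tuple(round(b, 3) for b in conf_int))
```

A smaller alpha widens the interval, which matches the role of alpha as the significance level for confidence intervals.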
Convenience Function
- diff_diff.triple_difference(data, outcome, group, partition, time, covariates=None, estimation_method='dr', robust=True, cluster=None, alpha=0.05, rank_deficient_action='warn')
Estimate Triple Difference (DDD) treatment effect.
Convenience function that creates a TripleDifference estimator and fits it to the data in one step.
- Parameters:
data (pd.DataFrame) – DataFrame containing all variables.
outcome (str) – Name of the outcome variable column.
group (str) – Name of the group indicator column (0/1). 1 = treated group (e.g., states that enacted policy).
partition (str) – Name of the partition/eligibility indicator column (0/1). 1 = eligible partition (e.g., women, targeted demographic).
time (str) – Name of the time period indicator column (0/1). 1 = post-treatment period.
covariates (list of str, optional) – List of covariate column names to adjust for.
estimation_method (str, default="dr") – Estimation method: “dr” (doubly robust), “reg” (regression), or “ipw” (inverse probability weighting).
robust (bool, default=True) – Whether to use heteroskedasticity-robust standard errors. Note: influence function-based SEs are inherently robust to heteroskedasticity, so this parameter has no effect. Retained for API compatibility.
cluster (str, optional) – Column name for cluster-robust standard errors.
alpha (float, default=0.05) – Significance level for confidence intervals.
rank_deficient_action (str, default="warn") – Action when the design matrix is rank-deficient:
- “warn”: Issue a warning and drop linearly dependent columns (default).
- “error”: Raise ValueError.
- “silent”: Drop columns silently without warning.
- Returns:
Object containing estimation results.
- Return type:
TripleDifferenceResults
Examples
>>> from diff_diff import triple_difference
>>> results = triple_difference(
...     data,
...     outcome='earnings',
...     group='policy_state',
...     partition='female',
...     time='post_policy',
...     covariates=['age', 'education']
... )
>>> print(f"ATT: {results.att:.3f} (SE: {results.se:.3f})")
Estimation Methods
The estimator supports three estimation methods:
| Method | Description | When to use |
|---|---|---|
| "dr" | Doubly robust | Recommended. Consistent if either the outcome model or the propensity score model is correctly specified |
| "reg" | Regression adjustment | Simple outcome regression with full interactions |
| "ipw" | Inverse probability weighting | When the propensity score model is well-specified |
Example Usage
Basic usage:
from diff_diff import TripleDifference

ddd = TripleDifference(estimation_method='dr')
results = ddd.fit(
    data,
    outcome='wages',
    group='policy_state',  # 1=state enacted policy, 0=control state
    partition='female',    # 1=women (affected by policy), 0=men
    time='post'            # 1=post-policy, 0=pre-policy
)
results.print_summary()
With covariates:
results = ddd.fit(
    data,
    outcome='wages',
    group='policy_state',
    partition='female',
    time='post',
    covariates=['age', 'education', 'experience']
)
Using the convenience function:
from diff_diff import triple_difference

results = triple_difference(
    data,
    outcome='wages',
    group='policy_state',
    partition='female',
    time='post',
    estimation_method='dr'
)
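The examples above assume that a DataFrame named data already exists. For experimenting, a synthetic dataset with the same column names can be generated as in the sketch below. The data-generating process, the helper name `make_ddd_data`, the repeated cross-section layout, and all coefficient values are assumptions made for illustration; they are not taken from the library or the paper.

```python
import numpy as np
import pandas as pd


def make_ddd_data(n=2000, effect=2.0, seed=0):
    """Simulate a repeated cross-section with a known DDD effect (hypothetical DGP)."""
    rng = np.random.default_rng(seed)
    data = pd.DataFrame({
        "policy_state": rng.integers(0, 2, size=n),  # 1=state enacted policy
        "female": rng.integers(0, 2, size=n),        # 1=eligible partition
        "post": rng.integers(0, 2, size=n),          # 1=post-policy period
        "age": rng.integers(20, 65, size=n),
        "education": rng.integers(8, 21, size=n),
        "experience": rng.integers(0, 40, size=n),
    })
    treated = data["policy_state"] * data["female"] * data["post"]
    data["wages"] = (
        10.0
        + 0.05 * data["age"] + 0.3 * data["education"] + 0.1 * data["experience"]
        + 1.0 * data["policy_state"] + 0.5 * data["female"] + 0.3 * data["post"]
        + 1.5 * data["policy_state"] * data["post"]  # group-specific shock
        + 0.7 * data["female"] * data["post"]        # partition-specific trend
        + effect * treated                           # true effect on treated, eligible, post
        + rng.normal(0.0, 1.0, size=n)
    )
    return data


data = make_ddd_data()
print(data.groupby(["policy_state", "female", "post"])["wages"].mean())
```

Because the true effect is set by the effect argument, estimates obtained from the calls above can be compared against a known value.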