Interactive notebook
This tutorial is a Jupyter notebook. You can view it on GitHub or download it to run locally.
Triply Robust Panel (TROP) Estimator#
This notebook demonstrates the Triply Robust Panel (TROP) estimator (Athey, Imbens, Qu & Viviano, 2025), which combines three robustness components:
Nuclear Norm Regularized Factor Model: Estimates interactive fixed effects via matrix completion with nuclear norm penalty
Exponential Distance-Based Unit Weights: ω_j = exp(-λ_unit × dist(j,i)) where dist(j,i) is the root mean squared difference in outcomes between units j and i, computed only on periods where both units are untreated and excluding the target period t (Equation 3 in the paper)
Exponential Time Decay Weights: θ_s = exp(-λ_time × |s-t|) weighting by proximity to treatment
Weights: The observation-specific weights ω and θ are importance weights that control the relative contribution of each observation to counterfactual estimation. Higher weights indicate more relevant observations for the target counterfactual.
TROP is particularly useful when:
There may be unobserved time-varying confounders with factor structure
Standard DiD or SDID may be biased due to latent factors
You want robust inference under factor confounding
We’ll cover:
When to use TROP
Basic estimation with LOOCV tuning
Understanding tuning parameters
Examining factor structure
Comparing TROP vs SDID
[ ]:
import numpy as np
import pandas as pd
from diff_diff import TROP, trop, SyntheticDiD
# For nicer plots (optional)
try:
import matplotlib.pyplot as plt
plt.style.use('seaborn-v0_8-whitegrid')
HAS_MATPLOTLIB = True
except ImportError:
HAS_MATPLOTLIB = False
print("matplotlib not installed - visualization examples will be skipped")
1. When to Use TROP#
Consider TROP when:
You suspect factor structure in the data (e.g., economic cycles, regional shocks)
Unobserved confounders affect units differently over time
Standard parallel trends assumption may be violated due to common factors
You have a reasonably long pre-treatment period to estimate factors
The key difference from SDID is that TROP explicitly models and removes interactive fixed effects (factor contributions) before computing treatment effects.
[ ]:
# Generate factor model data using the library function
from diff_diff import generate_factor_data
# True parameters for verification
true_att = 2.0
n_factors = 2
n_pre = 6 # Reduced from 10 for faster execution
n_post = 3 # Reduced from 5
# Generate panel data with factor structure
# This creates a scenario where standard DiD/SDID may be biased,
# but TROP should recover the true treatment effect.
df = generate_factor_data(
n_units=30, # Reduced from 50 for faster execution
n_pre=n_pre,
n_post=n_post,
n_treated=6, # Reduced from 10
n_factors=n_factors,
treatment_effect=true_att,
factor_strength=1.5, # Strong factor confounding
treated_loading_shift=0.5,
unit_fe_sd=1.0,
noise_sd=0.5,
seed=42
)
print(f"Dataset: {len(df)} observations")
print(f"Treated units: 6")
print(f"Control units: 24")
print(f"Pre-treatment periods: {n_pre}")
print(f"Post-treatment periods: {n_post}")
print(f"True treatment effect: {true_att}")
print(f"True number of factors: {n_factors}")
[ ]:
if HAS_MATPLOTLIB:
# Visualize the data
fig, ax = plt.subplots(figsize=(12, 6))
# Identify treated vs control units
treated_units = df.groupby('unit')['treated'].max()
control_unit_ids = treated_units[treated_units == 0].index[:20] # First 20 controls
treated_unit_ids = treated_units[treated_units == 1].index[:5] # First 5 treated
# Plot control units (gray, thin lines)
for unit_id in control_unit_ids:
unit_data = df[df['unit'] == unit_id]
ax.plot(unit_data['period'], unit_data['outcome'],
color='gray', alpha=0.3, linewidth=0.5)
# Plot treated units (colored, thick lines)
colors = plt.cm.Reds(np.linspace(0.4, 0.9, 5))
for i, unit_id in enumerate(treated_unit_ids):
unit_data = df[df['unit'] == unit_id]
ax.plot(unit_data['period'], unit_data['outcome'],
color=colors[i], linewidth=2, label=f'Treated {i+1}')
# Mark treatment time
ax.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')
ax.set_xlabel('Period')
ax.set_ylabel('Outcome')
ax.set_title('Panel Data with Factor Structure')
ax.legend(loc='upper left')
plt.tight_layout()
plt.show()
2. Basic TROP Estimation#
TROP uses leave-one-out cross-validation (LOOCV) to select three tuning parameters:
λ_time: Time weight decay (higher = focus on periods near treatment)
λ_unit: Unit weight decay (higher = focus on similar units)
λ_nn: Nuclear norm regularization (higher = lower rank factor model)
By default, TROP searches over a grid of values for each parameter.
[ ]:
# Fit TROP with automatic tuning via LOOCV
trop_est = TROP(
lambda_time_grid=[0.0, 1.0], # Reduced time decay grid
lambda_unit_grid=[0.0, 1.0], # Reduced unit distance grid
lambda_nn_grid=[0.0, 0.1], # Reduced nuclear norm grid
n_bootstrap=50, # Reduced bootstrap replications for SE
seed=42
)
# Note: TROP infers treatment periods from the treatment indicator column.
# The 'treated' column should be an absorbing state (D=1 for all periods
# during and after treatment starts).
# For SDID comparison later, we keep post_periods for SDID
post_periods = list(range(n_pre, n_pre + n_post))
results = trop_est.fit(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
print(results.summary())
[ ]:
# Check the key results
print(f"True ATT: {true_att:.4f}")
print(f"Estimated ATT: {results.att:.4f}")
print(f"Bias: {results.att - true_att:.4f}")
print()
print(f"Selected tuning parameters:")
print(f" λ_time: {results.lambda_time:.2f}")
print(f" λ_unit: {results.lambda_unit:.2f}")
print(f" λ_nn: {results.lambda_nn:.2f}")
print(f"\nEffective rank of factor matrix: {results.effective_rank:.2f}")
print(f"True rank: {n_factors}")
3. Understanding the Tuning Parameters#
The three tuning parameters control different aspects of the estimation:
λ_time (Time Decay)#
Controls how much weight to place on periods close to treatment:
λ_time = 0: Equal weight to all pre-treatment periods
λ_time > 0: More weight on recent pre-treatment periods
λ_unit (Unit Distance)#
Controls how much weight to place on similar control units:
λ_unit = 0: Equal weight to all control units
λ_unit > 0: More weight on control units with similar pre-treatment trajectories
The distance between units j and i for target observation (i, t) is computed as the root mean squared difference in outcomes, using only periods where:
Both units are untreated (D_js = D_is = 0)
The target period t is excluded (following Equation 3 in the paper: 1{u ≠ t})
This ensures the distance measure is based purely on pre-treatment comparability, not contaminated by the treatment period itself.
λ_nn (Nuclear Norm)#
Controls the rank of the factor model:
λ_nn = 0: No regularization (full rank)
λ_nn > 0: Encourages low-rank factor structure
[ ]:
# Effect of different nuclear norm regularization levels
print("Effect of nuclear norm regularization (λ_nn):")
print("="*65)
print(f"{'λ_nn':>10} {'ATT':>12} {'Bias':>12} {'Eff. Rank':>15}")
print("-"*65)
for lambda_nn in [0.0, 0.1, 1.0]: # Reduced grid
trop_fixed = TROP(
lambda_time_grid=[1.0], # Fixed
lambda_unit_grid=[1.0], # Fixed
lambda_nn_grid=[lambda_nn], # Vary this
n_bootstrap=20, # Reduced for faster execution
seed=42
)
res = trop_fixed.fit(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
bias = res.att - true_att
print(f"{lambda_nn:>10.1f} {res.att:>12.4f} {bias:>12.4f} {res.effective_rank:>15.2f}")
4. Examining the Factor Structure#
TROP estimates a low-rank factor matrix L that captures interactive fixed effects. We can examine this structure.
[ ]:
# Examine the factor matrix
L = results.factor_matrix
print(f"Factor matrix shape: {L.shape} (periods x units)")
print(f"Effective rank: {results.effective_rank:.2f}")
# Compute singular values to see rank structure
U, s, Vt = np.linalg.svd(L, full_matrices=False)
print(f"\nSingular values (top 5): {s[:5].round(2)}")
print(f"Variance explained by top 2: {(s[:2]**2).sum() / (s**2).sum() * 100:.1f}%")
[ ]:
if HAS_MATPLOTLIB:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Scree plot of singular values
ax1 = axes[0]
ax1.bar(range(1, min(11, len(s)+1)), s[:10])
ax1.set_xlabel('Component')
ax1.set_ylabel('Singular Value')
ax1.set_title('Scree Plot of Factor Matrix')
ax1.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)
# Heatmap of factor matrix
ax2 = axes[1]
im = ax2.imshow(L, aspect='auto', cmap='RdBu_r', vmin=-2, vmax=2)
ax2.set_xlabel('Unit')
ax2.set_ylabel('Period')
ax2.set_title('Factor Matrix L (Interactive Fixed Effects)')
ax2.axhline(y=n_pre - 0.5, color='black', linestyle='--', linewidth=2)
plt.colorbar(im, ax=ax2, label='L_it')
plt.tight_layout()
plt.show()
5. Examining Unit and Time Effects#
TROP also estimates traditional unit and time fixed effects (α_i and β_t).
[ ]:
# Unit effects
unit_effects_df = results.get_unit_effects_df()
print("Unit effects (first 10):")
print(unit_effects_df.head(10).to_string(index=False))
[ ]:
# Time effects
time_effects_df = results.get_time_effects_df()
print("Time effects:")
print(time_effects_df.to_string(index=False))
[ ]:
if HAS_MATPLOTLIB:
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Unit effects
ax1 = axes[0]
ax1.bar(range(len(unit_effects_df)), unit_effects_df['effect'])
ax1.axvline(x=9.5, color='red', linestyle='--', label='Treated/Control boundary')
ax1.set_xlabel('Unit')
ax1.set_ylabel('Effect')
ax1.set_title('Unit Fixed Effects (α_i)')
ax1.legend()
# Time effects
ax2 = axes[1]
ax2.plot(time_effects_df['time'], time_effects_df['effect'], 'o-', linewidth=2)
ax2.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')
ax2.set_xlabel('Period')
ax2.set_ylabel('Effect')
ax2.set_title('Time Fixed Effects (β_t)')
ax2.legend()
plt.tight_layout()
plt.show()
6. Comparing TROP vs SDID#
Let’s compare TROP with Synthetic DiD to see the benefit of factor adjustment when the DGP has factor structure.
[ ]:
# SDID (no factor adjustment)
# Note: SDID uses 'treat' (unit-level ever-treated indicator)
sdid = SyntheticDiD(
n_bootstrap=50, # Reduced for faster execution
seed=42
)
# SDID still uses post_periods parameter
sdid_results = sdid.fit(
df,
outcome='outcome',
treatment='treat', # Unit-level ever-treated indicator
unit='unit',
time='period',
post_periods=post_periods
)
# TROP (with factor adjustment)
# Note: TROP uses 'treated' (observation-level treatment indicator)
# and infers treatment periods automatically
trop_est2 = TROP(
lambda_nn_grid=[0.0, 0.1], # Reduced grid for faster execution
n_bootstrap=50, # Reduced for faster execution
seed=42
)
trop_results = trop_est2.fit(
df,
outcome='outcome',
treatment='treated', # Observation-level indicator
unit='unit',
time='period'
)
print("Comparison: SDID vs TROP")
print("="*60)
print(f"True ATT: {true_att:.4f}")
print()
print(f"Synthetic DiD (no factor adjustment):")
print(f" ATT: {sdid_results.att:.4f}")
print(f" SE: {sdid_results.se:.4f}")
print(f" Bias: {sdid_results.att - true_att:.4f}")
print()
print(f"TROP (with factor adjustment):")
print(f" ATT: {trop_results.att:.4f}")
print(f" SE: {trop_results.se:.4f}")
print(f" Bias: {trop_results.att - true_att:.4f}")
print(f" Effective rank: {trop_results.effective_rank:.2f}")
7. Monte Carlo Comparison#
Let’s run a small Monte Carlo simulation to compare TROP and SDID under the factor DGP.
[ ]:
# Monte Carlo comparison (reduced for faster tutorial execution)
n_sims = 5 # Reduced from 20 for faster validation
trop_estimates = []
sdid_estimates = []
print(f"Running {n_sims} simulations...")
for sim in range(n_sims):
# Generate new data using the library function
# (includes both 'treated' and 'treat' columns)
sim_data = generate_factor_data(
n_units=50,
n_pre=10,
n_post=5,
n_treated=10,
n_factors=2,
treatment_effect=2.0,
factor_strength=1.5,
noise_sd=0.5,
seed=100 + sim
)
# TROP (uses observation-level 'treated')
# Note: TROP infers treatment periods from the treatment indicator
try:
trop_m = TROP(
lambda_time_grid=[1.0],
lambda_unit_grid=[1.0],
lambda_nn_grid=[0.1],
n_bootstrap=10,
seed=42 + sim
)
trop_res = trop_m.fit(
sim_data,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
trop_estimates.append(trop_res.att)
except Exception as e:
print(f"TROP failed on sim {sim}: {e}")
# SDID (uses unit-level 'treat')
# Note: SDID still uses post_periods parameter
try:
sdid_m = SyntheticDiD(n_bootstrap=10, seed=42 + sim)
sdid_res = sdid_m.fit(
sim_data,
outcome='outcome',
treatment='treat', # Unit-level ever-treated indicator
unit='unit',
time='period',
post_periods=list(range(10, 15))
)
sdid_estimates.append(sdid_res.att)
except Exception as e:
print(f"SDID failed on sim {sim}: {e}")
print(f"\nMonte Carlo Results (True ATT = {true_att})")
print("="*60)
print(f"{'Estimator':<15} {'Mean':>12} {'Bias':>12} {'RMSE':>12}")
print("-"*60)
if trop_estimates:
trop_mean = np.mean(trop_estimates)
trop_bias = trop_mean - true_att
trop_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in trop_estimates]))
print(f"{'TROP':<15} {trop_mean:>12.4f} {trop_bias:>12.4f} {trop_rmse:>12.4f}")
if sdid_estimates:
sdid_mean = np.mean(sdid_estimates)
sdid_bias = sdid_mean - true_att
sdid_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in sdid_estimates]))
print(f"{'SDID':<15} {sdid_mean:>12.4f} {sdid_bias:>12.4f} {sdid_rmse:>12.4f}")
[ ]:
if HAS_MATPLOTLIB and trop_estimates and sdid_estimates:
# Visualize Monte Carlo results
fig, ax = plt.subplots(figsize=(10, 6))
ax.hist(sdid_estimates, bins=15, alpha=0.6, label='SDID', color='blue')
ax.hist(trop_estimates, bins=15, alpha=0.6, label='TROP', color='red')
ax.axvline(x=true_att, color='black', linewidth=2, linestyle='--', label=f'True ATT = {true_att}')
ax.set_xlabel('Estimated ATT')
ax.set_ylabel('Frequency')
ax.set_title('Monte Carlo Distribution of Estimates')
ax.legend()
plt.tight_layout()
plt.show()
8. Using the Convenience Function#
For quick estimation, you can use the trop() convenience function.
[ ]:
# One-liner estimation with default tuning grid
# Note: TROP infers treatment periods from the treatment indicator
quick_results = trop(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period',
n_bootstrap=20, # Reduced for faster execution
seed=42
)
print(f"Quick estimation:")
print(f" ATT: {quick_results.att:.4f}")
print(f" SE: {quick_results.se:.4f}")
print(f" λ_time: {quick_results.lambda_time:.2f}")
print(f" λ_unit: {quick_results.lambda_unit:.2f}")
print(f" λ_nn: {quick_results.lambda_nn:.2f}")
print(f" Effective rank: {quick_results.effective_rank:.2f}")
9. Variance Estimation#
TROP uses unit-level stratified block bootstrap for variance estimation, as specified in Algorithm 3 of Athey et al. (2025). Control and treated units are sampled separately to preserve the treatment ratio. The number of bootstrap replications is controlled by the n_bootstrap parameter (default: 200).
[ ]:
# Bootstrap variance with different numbers of replications
print("Bootstrap replications comparison:")
print("="*50)
for n_boot in [20, 50, 100]:
trop_var = TROP(
lambda_time_grid=[1.0],
lambda_unit_grid=[1.0],
lambda_nn_grid=[0.1],
n_bootstrap=n_boot,
seed=42
)
res = trop_var.fit(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
print(f"\nn_bootstrap={n_boot}:")
print(f" ATT: {res.att:.4f}")
print(f" SE: {res.se:.4f}")
print(f" 95% CI: [{res.conf_int[0]:.4f}, {res.conf_int[1]:.4f}]")
[ ]:
# Compare estimation methods
print("Estimation method comparison:")
print("="*60)
import time
# Local method (default)
start = time.time()
trop_local = TROP(
method='local',
lambda_time_grid=[0.0, 1.0],
lambda_unit_grid=[0.0, 1.0],
lambda_nn_grid=[0.0, 0.1],
n_bootstrap=20,
seed=42
)
results_local = trop_local.fit(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
local_time = time.time() - start
# Global method
start = time.time()
trop_global = TROP(
method='global',
lambda_time_grid=[0.0, 1.0],
lambda_unit_grid=[0.0, 1.0],
lambda_nn_grid=[0.0, 0.1],
n_bootstrap=20,
seed=42
)
results_global = trop_global.fit(
df,
outcome='outcome',
treatment='treated',
unit='unit',
time='period'
)
global_time = time.time() - start
print(f"\n{'Method':<15} {'ATT':>10} {'SE':>10} {'Time (s)':>12}")
print("-"*60)
print(f"{'Local':<15} {results_local.att:>10.4f} {results_local.se:>10.4f} {local_time:>12.2f}")
print(f"{'Global':<15} {results_global.att:>10.4f} {results_global.se:>10.4f} {global_time:>12.2f}")
print(f"\nTrue ATT: {true_att}")
print(f"Local bias: {results_local.att - true_att:.4f}")
print(f"Global bias: {results_global.att - true_att:.4f}")
10. Estimation Methods: Local vs Global#
TROP supports two estimation methods via the method parameter:
Local Method (method='local', default):
Follows Algorithm 2 from the paper
Computes observation-specific weights for each treated observation
Fits a model per treated observation, then averages the individual effects
More flexible, allows for heterogeneous treatment effects
Computationally intensive (N_treated optimizations)
Global Method (method='global'):
Fits a single model on control data using (1-W) masked weights (per paper Eq. 2)
Extracts per-observation treatment effects as post-hoc residuals: τ_it = Y_it - μ - α_i - β_t - L_it
ATT = mean(τ_it) over treated observations
Faster (single optimization) with global weights
Note: method='twostep' is a deprecated alias for method='local', and method='joint' is a deprecated alias for method='global'. Both will be removed in v3.0.
10. Results Export#
TROP results can be easily exported to different formats.
[ ]:
# Convert to dictionary
results_dict = results.to_dict()
print("Results as dictionary:")
for key, value in results_dict.items():
if isinstance(value, float):
print(f" {key}: {value:.4f}")
else:
print(f" {key}: {value}")
Summary#
Key takeaways for TROP:
Best use cases: Factor confounding, unobserved time-varying confounders with interactive effects
Factor estimation: Nuclear norm regularization with LOOCV for tuning
Three tuning parameters: λ_time, λ_unit, λ_nn selected automatically via LOOCV
Unit weights: Exponential distance-based weighting of control units, where distance is computed as RMS outcome difference on control periods excluding the target period
Time weights: Exponential decay weighting of pre-treatment periods
Weights: Importance weights controlling relative contribution of observations (higher = more relevant)
Estimation methods:
method='local'(default): Per-observation estimation, allows heterogeneous effectsmethod='global': Single model with (1-W) masking, post-hoc heterogeneous effects, faster
When to use TROP vs SDID:
Use SDID when parallel trends is plausible and factors are not a concern
Use TROP when you suspect factor confounding (regional shocks, economic cycles, latent factors)
Running both provides a useful robustness check
When to use local vs global method:
Use local (default) for maximum flexibility with per-observation weights
Use global for faster estimation with global weights
Reference:
Athey, S., Imbens, G. W., Qu, Z., & Viviano, D. (2025). Triply Robust Panel Estimators. Working Paper. https://arxiv.org/abs/2508.21536
[ ]:
# Individual treatment effects
treatment_effects_df = results.get_treatment_effects_df()
print("\nIndividual treatment effects (first 10):")
print(treatment_effects_df.head(10).to_string(index=False))
Summary#
Key takeaways for TROP:
Best use cases: Factor confounding, unobserved time-varying confounders with interactive effects
Factor estimation: Nuclear norm regularization with LOOCV for tuning
Three tuning parameters: λ_time, λ_unit, λ_nn selected automatically via LOOCV
Unit weights: Exponential distance-based weighting of control units, where distance is computed as RMS outcome difference on control periods excluding the target period
Time weights: Exponential decay weighting of pre-treatment periods
Weights: Importance weights controlling relative contribution of observations (higher = more relevant)
When to use TROP vs SDID:
Use SDID when parallel trends is plausible and factors are not a concern
Use TROP when you suspect factor confounding (regional shocks, economic cycles, latent factors)
Running both provides a useful robustness check
Reference:
Athey, S., Imbens, G. W., Qu, Z., & Viviano, D. (2025). Triply Robust Panel Estimators. Working Paper. https://arxiv.org/abs/2508.21536