Interactive notebook

This tutorial is a Jupyter notebook. You can view it on GitHub or download it to run locally.

Triply Robust Panel (TROP) Estimator#

This notebook demonstrates the Triply Robust Panel (TROP) estimator (Athey, Imbens, Qu & Viviano, 2025), which combines three robustness components:

  1. Nuclear Norm Regularized Factor Model: Estimates interactive fixed effects via matrix completion with nuclear norm penalty

  2. Exponential Distance-Based Unit Weights: ω_j = exp(-λ_unit × dist(j,i)) where dist(j,i) is the root mean squared difference in outcomes between units j and i, computed only on periods where both units are untreated and excluding the target period t (Equation 3 in the paper)

  3. Exponential Time Decay Weights: θ_s = exp(-λ_time × |s-t|) weighting by proximity to treatment

Weights: The observation-specific weights ω and θ are importance weights that control the relative contribution of each observation to counterfactual estimation. Higher weights indicate more relevant observations for the target counterfactual.

TROP is particularly useful when:

  • There may be unobserved time-varying confounders with factor structure

  • Standard DiD or SDID may be biased due to latent factors

  • You want robust inference under factor confounding

We’ll cover:

  1. When to use TROP

  2. Basic estimation with LOOCV tuning

  3. Understanding tuning parameters

  4. Examining factor structure

  5. Comparing TROP vs SDID

[ ]:
import numpy as np
import pandas as pd
from diff_diff import TROP, trop, SyntheticDiD

# For nicer plots (optional)
try:
    import matplotlib.pyplot as plt
    plt.style.use('seaborn-v0_8-whitegrid')
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
    print("matplotlib not installed - visualization examples will be skipped")

1. When to Use TROP#

Consider TROP when:

  • You suspect factor structure in the data (e.g., economic cycles, regional shocks)

  • Unobserved confounders affect units differently over time

  • Standard parallel trends assumption may be violated due to common factors

  • You have a reasonably long pre-treatment period to estimate factors

The key difference from SDID is that TROP explicitly models and removes interactive fixed effects (factor contributions) before computing treatment effects.

[ ]:
# Generate factor model data using the library function
from diff_diff import generate_factor_data

# True parameters for verification
true_att = 2.0
n_factors = 2
n_pre = 6   # Reduced from 10 for faster execution
n_post = 3  # Reduced from 5

# Generate panel data with factor structure
# This creates a scenario where standard DiD/SDID may be biased,
# but TROP should recover the true treatment effect.
df = generate_factor_data(
    n_units=30,           # Reduced from 50 for faster execution
    n_pre=n_pre,
    n_post=n_post,
    n_treated=6,          # Reduced from 10
    n_factors=n_factors,
    treatment_effect=true_att,
    factor_strength=1.5,  # Strong factor confounding
    treated_loading_shift=0.5,
    unit_fe_sd=1.0,
    noise_sd=0.5,
    seed=42
)

print(f"Dataset: {len(df)} observations")
print(f"Treated units: 6")
print(f"Control units: 24")
print(f"Pre-treatment periods: {n_pre}")
print(f"Post-treatment periods: {n_post}")
print(f"True treatment effect: {true_att}")
print(f"True number of factors: {n_factors}")
[ ]:
if HAS_MATPLOTLIB:
    # Visualize the data
    fig, ax = plt.subplots(figsize=(12, 6))

    # Identify treated vs control units
    treated_units = df.groupby('unit')['treated'].max()
    control_unit_ids = treated_units[treated_units == 0].index[:20]  # First 20 controls
    treated_unit_ids = treated_units[treated_units == 1].index[:5]   # First 5 treated

    # Plot control units (gray, thin lines)
    for unit_id in control_unit_ids:
        unit_data = df[df['unit'] == unit_id]
        ax.plot(unit_data['period'], unit_data['outcome'],
                color='gray', alpha=0.3, linewidth=0.5)

    # Plot treated units (colored, thick lines)
    colors = plt.cm.Reds(np.linspace(0.4, 0.9, 5))
    for i, unit_id in enumerate(treated_unit_ids):
        unit_data = df[df['unit'] == unit_id]
        ax.plot(unit_data['period'], unit_data['outcome'],
                color=colors[i], linewidth=2, label=f'Treated {i+1}')

    # Mark treatment time
    ax.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')

    ax.set_xlabel('Period')
    ax.set_ylabel('Outcome')
    ax.set_title('Panel Data with Factor Structure')
    ax.legend(loc='upper left')
    plt.tight_layout()
    plt.show()

2. Basic TROP Estimation#

TROP uses leave-one-out cross-validation (LOOCV) to select three tuning parameters:

  • λ_time: Time weight decay (higher = focus on periods near treatment)

  • λ_unit: Unit weight decay (higher = focus on similar units)

  • λ_nn: Nuclear norm regularization (higher = lower rank factor model)

By default, TROP searches over a grid of values for each parameter.

[ ]:
# Fit TROP with automatic tuning via LOOCV
trop_est = TROP(
    lambda_time_grid=[0.0, 1.0],   # Reduced time decay grid
    lambda_unit_grid=[0.0, 1.0],   # Reduced unit distance grid
    lambda_nn_grid=[0.0, 0.1],     # Reduced nuclear norm grid
    n_bootstrap=50,    # Reduced bootstrap replications for SE
    seed=42
)

# Note: TROP infers treatment periods from the treatment indicator column.
# The 'treated' column should be an absorbing state (D=1 for all periods
# during and after treatment starts).

# For SDID comparison later, we keep post_periods for SDID
post_periods = list(range(n_pre, n_pre + n_post))

results = trop_est.fit(
    df,
    outcome='outcome',
    treatment='treated',
    unit='unit',
    time='period'
)

print(results.summary())
[ ]:
# Check the key results
print(f"True ATT: {true_att:.4f}")
print(f"Estimated ATT: {results.att:.4f}")
print(f"Bias: {results.att - true_att:.4f}")
print()
print(f"Selected tuning parameters:")
print(f"  λ_time: {results.lambda_time:.2f}")
print(f"  λ_unit: {results.lambda_unit:.2f}")
print(f"  λ_nn: {results.lambda_nn:.2f}")
print(f"\nEffective rank of factor matrix: {results.effective_rank:.2f}")
print(f"True rank: {n_factors}")

3. Understanding the Tuning Parameters#

The three tuning parameters control different aspects of the estimation:

λ_time (Time Decay)#

Controls how much weight to place on periods close to treatment:

  • λ_time = 0: Equal weight to all pre-treatment periods

  • λ_time > 0: More weight on recent pre-treatment periods

λ_unit (Unit Distance)#

Controls how much weight to place on similar control units:

  • λ_unit = 0: Equal weight to all control units

  • λ_unit > 0: More weight on control units with similar pre-treatment trajectories

The distance between units j and i for target observation (i, t) is computed as the root mean squared difference in outcomes, using only periods where:

  1. Both units are untreated (D_js = D_is = 0)

  2. The target period t is excluded (following Equation 3 in the paper: 1{u ≠ t})

This ensures the distance measure is based purely on pre-treatment comparability, not contaminated by the treatment period itself.

λ_nn (Nuclear Norm)#

Controls the rank of the factor model:

  • λ_nn = 0: No regularization (full rank)

  • λ_nn > 0: Encourages low-rank factor structure

[ ]:
# Effect of different nuclear norm regularization levels
print("Effect of nuclear norm regularization (λ_nn):")
print("="*65)
print(f"{'λ_nn':>10} {'ATT':>12} {'Bias':>12} {'Eff. Rank':>15}")
print("-"*65)

for lambda_nn in [0.0, 0.1, 1.0]:  # Reduced grid
    trop_fixed = TROP(
        lambda_time_grid=[1.0],  # Fixed
        lambda_unit_grid=[1.0],  # Fixed
        lambda_nn_grid=[lambda_nn],  # Vary this
        n_bootstrap=20,  # Reduced for faster execution
        seed=42
    )

    res = trop_fixed.fit(
        df,
        outcome='outcome',
        treatment='treated',
        unit='unit',
        time='period'
    )

    bias = res.att - true_att
    print(f"{lambda_nn:>10.1f} {res.att:>12.4f} {bias:>12.4f} {res.effective_rank:>15.2f}")

4. Examining the Factor Structure#

TROP estimates a low-rank factor matrix L that captures interactive fixed effects. We can examine this structure.

[ ]:
# Examine the factor matrix
L = results.factor_matrix
print(f"Factor matrix shape: {L.shape} (periods x units)")
print(f"Effective rank: {results.effective_rank:.2f}")

# Compute singular values to see rank structure
U, s, Vt = np.linalg.svd(L, full_matrices=False)
print(f"\nSingular values (top 5): {s[:5].round(2)}")
print(f"Variance explained by top 2: {(s[:2]**2).sum() / (s**2).sum() * 100:.1f}%")
[ ]:
if HAS_MATPLOTLIB:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Scree plot of singular values
    ax1 = axes[0]
    ax1.bar(range(1, min(11, len(s)+1)), s[:10])
    ax1.set_xlabel('Component')
    ax1.set_ylabel('Singular Value')
    ax1.set_title('Scree Plot of Factor Matrix')
    ax1.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)

    # Heatmap of factor matrix
    ax2 = axes[1]
    im = ax2.imshow(L, aspect='auto', cmap='RdBu_r', vmin=-2, vmax=2)
    ax2.set_xlabel('Unit')
    ax2.set_ylabel('Period')
    ax2.set_title('Factor Matrix L (Interactive Fixed Effects)')
    ax2.axhline(y=n_pre - 0.5, color='black', linestyle='--', linewidth=2)
    plt.colorbar(im, ax=ax2, label='L_it')

    plt.tight_layout()
    plt.show()

5. Examining Unit and Time Effects#

TROP also estimates traditional unit and time fixed effects (α_i and β_t).

[ ]:
# Unit effects
unit_effects_df = results.get_unit_effects_df()
print("Unit effects (first 10):")
print(unit_effects_df.head(10).to_string(index=False))
[ ]:
# Time effects
time_effects_df = results.get_time_effects_df()
print("Time effects:")
print(time_effects_df.to_string(index=False))
[ ]:
if HAS_MATPLOTLIB:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    # Unit effects
    ax1 = axes[0]
    ax1.bar(range(len(unit_effects_df)), unit_effects_df['effect'])
    ax1.axvline(x=9.5, color='red', linestyle='--', label='Treated/Control boundary')
    ax1.set_xlabel('Unit')
    ax1.set_ylabel('Effect')
    ax1.set_title('Unit Fixed Effects (α_i)')
    ax1.legend()

    # Time effects
    ax2 = axes[1]
    ax2.plot(time_effects_df['time'], time_effects_df['effect'], 'o-', linewidth=2)
    ax2.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')
    ax2.set_xlabel('Period')
    ax2.set_ylabel('Effect')
    ax2.set_title('Time Fixed Effects (β_t)')
    ax2.legend()

    plt.tight_layout()
    plt.show()

6. Comparing TROP vs SDID#

Let’s compare TROP with Synthetic DiD to see the benefit of factor adjustment when the DGP has factor structure.

[ ]:
# SDID (no factor adjustment)
# Note: SDID uses 'treat' (unit-level ever-treated indicator)
sdid = SyntheticDiD(
    n_bootstrap=50,  # Reduced for faster execution
    seed=42
)

# SDID still uses post_periods parameter
sdid_results = sdid.fit(
    df,
    outcome='outcome',
    treatment='treat',  # Unit-level ever-treated indicator
    unit='unit',
    time='period',
    post_periods=post_periods
)

# TROP (with factor adjustment)
# Note: TROP uses 'treated' (observation-level treatment indicator)
# and infers treatment periods automatically
trop_est2 = TROP(
    lambda_nn_grid=[0.0, 0.1],  # Reduced grid for faster execution
    n_bootstrap=50,  # Reduced for faster execution
    seed=42
)

trop_results = trop_est2.fit(
    df,
    outcome='outcome',
    treatment='treated',  # Observation-level indicator
    unit='unit',
    time='period'
)

print("Comparison: SDID vs TROP")
print("="*60)
print(f"True ATT: {true_att:.4f}")
print()
print(f"Synthetic DiD (no factor adjustment):")
print(f"  ATT: {sdid_results.att:.4f}")
print(f"  SE: {sdid_results.se:.4f}")
print(f"  Bias: {sdid_results.att - true_att:.4f}")
print()
print(f"TROP (with factor adjustment):")
print(f"  ATT: {trop_results.att:.4f}")
print(f"  SE: {trop_results.se:.4f}")
print(f"  Bias: {trop_results.att - true_att:.4f}")
print(f"  Effective rank: {trop_results.effective_rank:.2f}")

7. Monte Carlo Comparison#

Let’s run a small Monte Carlo simulation to compare TROP and SDID under the factor DGP.

[ ]:
# Monte Carlo comparison (reduced for faster tutorial execution)
n_sims = 5  # Reduced from 20 for faster validation
trop_estimates = []
sdid_estimates = []

print(f"Running {n_sims} simulations...")

for sim in range(n_sims):
    # Generate new data using the library function
    # (includes both 'treated' and 'treat' columns)
    sim_data = generate_factor_data(
        n_units=50,
        n_pre=10,
        n_post=5,
        n_treated=10,
        n_factors=2,
        treatment_effect=2.0,
        factor_strength=1.5,
        noise_sd=0.5,
        seed=100 + sim
    )

    # TROP (uses observation-level 'treated')
    # Note: TROP infers treatment periods from the treatment indicator
    try:
        trop_m = TROP(
            lambda_time_grid=[1.0],
            lambda_unit_grid=[1.0],
            lambda_nn_grid=[0.1],
            n_bootstrap=10,
            seed=42 + sim
        )
        trop_res = trop_m.fit(
            sim_data,
            outcome='outcome',
            treatment='treated',
            unit='unit',
            time='period'
        )
        trop_estimates.append(trop_res.att)
    except Exception as e:
        print(f"TROP failed on sim {sim}: {e}")

    # SDID (uses unit-level 'treat')
    # Note: SDID still uses post_periods parameter
    try:
        sdid_m = SyntheticDiD(n_bootstrap=10, seed=42 + sim)
        sdid_res = sdid_m.fit(
            sim_data,
            outcome='outcome',
            treatment='treat',  # Unit-level ever-treated indicator
            unit='unit',
            time='period',
            post_periods=list(range(10, 15))
        )
        sdid_estimates.append(sdid_res.att)
    except Exception as e:
        print(f"SDID failed on sim {sim}: {e}")

print(f"\nMonte Carlo Results (True ATT = {true_att})")
print("="*60)
print(f"{'Estimator':<15} {'Mean':>12} {'Bias':>12} {'RMSE':>12}")
print("-"*60)

if trop_estimates:
    trop_mean = np.mean(trop_estimates)
    trop_bias = trop_mean - true_att
    trop_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in trop_estimates]))
    print(f"{'TROP':<15} {trop_mean:>12.4f} {trop_bias:>12.4f} {trop_rmse:>12.4f}")

if sdid_estimates:
    sdid_mean = np.mean(sdid_estimates)
    sdid_bias = sdid_mean - true_att
    sdid_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in sdid_estimates]))
    print(f"{'SDID':<15} {sdid_mean:>12.4f} {sdid_bias:>12.4f} {sdid_rmse:>12.4f}")
[ ]:
if HAS_MATPLOTLIB and trop_estimates and sdid_estimates:
    # Visualize Monte Carlo results
    fig, ax = plt.subplots(figsize=(10, 6))

    ax.hist(sdid_estimates, bins=15, alpha=0.6, label='SDID', color='blue')
    ax.hist(trop_estimates, bins=15, alpha=0.6, label='TROP', color='red')
    ax.axvline(x=true_att, color='black', linewidth=2, linestyle='--', label=f'True ATT = {true_att}')

    ax.set_xlabel('Estimated ATT')
    ax.set_ylabel('Frequency')
    ax.set_title('Monte Carlo Distribution of Estimates')
    ax.legend()
    plt.tight_layout()
    plt.show()

8. Using the Convenience Function#

For quick estimation, you can use the trop() convenience function.

[ ]:
# One-liner estimation with default tuning grid
# Note: TROP infers treatment periods from the treatment indicator
quick_results = trop(
    df,
    outcome='outcome',
    treatment='treated',
    unit='unit',
    time='period',
    n_bootstrap=20,  # Reduced for faster execution
    seed=42
)

print(f"Quick estimation:")
print(f"  ATT: {quick_results.att:.4f}")
print(f"  SE: {quick_results.se:.4f}")
print(f"  λ_time: {quick_results.lambda_time:.2f}")
print(f"  λ_unit: {quick_results.lambda_unit:.2f}")
print(f"  λ_nn: {quick_results.lambda_nn:.2f}")
print(f"  Effective rank: {quick_results.effective_rank:.2f}")

9. Variance Estimation#

TROP uses unit-level stratified block bootstrap for variance estimation, as specified in Algorithm 3 of Athey et al. (2025). Control and treated units are sampled separately to preserve the treatment ratio. The number of bootstrap replications is controlled by the n_bootstrap parameter (default: 200).

[ ]:
# Bootstrap variance with different numbers of replications
print("Bootstrap replications comparison:")
print("="*50)

for n_boot in [20, 50, 100]:
    trop_var = TROP(
        lambda_time_grid=[1.0],
        lambda_unit_grid=[1.0],
        lambda_nn_grid=[0.1],
        n_bootstrap=n_boot,
        seed=42
    )

    res = trop_var.fit(
        df,
        outcome='outcome',
        treatment='treated',
        unit='unit',
        time='period'
    )

    print(f"\nn_bootstrap={n_boot}:")
    print(f"  ATT: {res.att:.4f}")
    print(f"  SE: {res.se:.4f}")
    print(f"  95% CI: [{res.conf_int[0]:.4f}, {res.conf_int[1]:.4f}]")
[ ]:
# Compare estimation methods
print("Estimation method comparison:")
print("="*60)

import time

# Local method (default)
start = time.time()
trop_local = TROP(
    method='local',
    lambda_time_grid=[0.0, 1.0],
    lambda_unit_grid=[0.0, 1.0],
    lambda_nn_grid=[0.0, 0.1],
    n_bootstrap=20,
    seed=42
)
results_local = trop_local.fit(
    df,
    outcome='outcome',
    treatment='treated',
    unit='unit',
    time='period'
)
local_time = time.time() - start

# Global method
start = time.time()
trop_global = TROP(
    method='global',
    lambda_time_grid=[0.0, 1.0],
    lambda_unit_grid=[0.0, 1.0],
    lambda_nn_grid=[0.0, 0.1],
    n_bootstrap=20,
    seed=42
)
results_global = trop_global.fit(
    df,
    outcome='outcome',
    treatment='treated',
    unit='unit',
    time='period'
)
global_time = time.time() - start

print(f"\n{'Method':<15} {'ATT':>10} {'SE':>10} {'Time (s)':>12}")
print("-"*60)
print(f"{'Local':<15} {results_local.att:>10.4f} {results_local.se:>10.4f} {local_time:>12.2f}")
print(f"{'Global':<15} {results_global.att:>10.4f} {results_global.se:>10.4f} {global_time:>12.2f}")
print(f"\nTrue ATT: {true_att}")
print(f"Local bias: {results_local.att - true_att:.4f}")
print(f"Global bias: {results_global.att - true_att:.4f}")

10. Estimation Methods: Local vs Global#

TROP supports two estimation methods via the method parameter:

Local Method (method='local', default):

  • Follows Algorithm 2 from the paper

  • Computes observation-specific weights for each treated observation

  • Fits a model per treated observation, then averages the individual effects

  • More flexible, allows for heterogeneous treatment effects

  • Computationally intensive (N_treated optimizations)

Global Method (method='global'):

  • Fits a single model on control data using (1-W) masked weights (per paper Eq. 2)

  • Extracts per-observation treatment effects as post-hoc residuals: τ_it = Y_it - μ - α_i - β_t - L_it

  • ATT = mean(τ_it) over treated observations

  • Faster (single optimization) with global weights

Note: method='twostep' is a deprecated alias for method='local', and method='joint' is a deprecated alias for method='global'. Both will be removed in v3.0.

10. Results Export#

TROP results can be easily exported to different formats.

[ ]:
# Convert to dictionary
results_dict = results.to_dict()
print("Results as dictionary:")
for key, value in results_dict.items():
    if isinstance(value, float):
        print(f"  {key}: {value:.4f}")
    else:
        print(f"  {key}: {value}")

Summary#

Key takeaways for TROP:

  1. Best use cases: Factor confounding, unobserved time-varying confounders with interactive effects

  2. Factor estimation: Nuclear norm regularization with LOOCV for tuning

  3. Three tuning parameters: λ_time, λ_unit, λ_nn selected automatically via LOOCV

  4. Unit weights: Exponential distance-based weighting of control units, where distance is computed as RMS outcome difference on control periods excluding the target period

  5. Time weights: Exponential decay weighting of pre-treatment periods

  6. Weights: Importance weights controlling relative contribution of observations (higher = more relevant)

  7. Estimation methods:

    • method='local' (default): Per-observation estimation, allows heterogeneous effects

    • method='global': Single model with (1-W) masking, post-hoc heterogeneous effects, faster

When to use TROP vs SDID:

  • Use SDID when parallel trends is plausible and factors are not a concern

  • Use TROP when you suspect factor confounding (regional shocks, economic cycles, latent factors)

  • Running both provides a useful robustness check

When to use local vs global method:

  • Use local (default) for maximum flexibility with per-observation weights

  • Use global for faster estimation with global weights

Reference:

[ ]:
# Individual treatment effects
treatment_effects_df = results.get_treatment_effects_df()
print("\nIndividual treatment effects (first 10):")
print(treatment_effects_df.head(10).to_string(index=False))

Summary#

Key takeaways for TROP:

  1. Best use cases: Factor confounding, unobserved time-varying confounders with interactive effects

  2. Factor estimation: Nuclear norm regularization with LOOCV for tuning

  3. Three tuning parameters: λ_time, λ_unit, λ_nn selected automatically via LOOCV

  4. Unit weights: Exponential distance-based weighting of control units, where distance is computed as RMS outcome difference on control periods excluding the target period

  5. Time weights: Exponential decay weighting of pre-treatment periods

  6. Weights: Importance weights controlling relative contribution of observations (higher = more relevant)

When to use TROP vs SDID:

  • Use SDID when parallel trends is plausible and factors are not a concern

  • Use TROP when you suspect factor confounding (regional shocks, economic cycles, latent factors)

  • Running both provides a useful robustness check

Reference: