{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "# Triply Robust Panel (TROP) Estimator\n\nThis notebook demonstrates the **Triply Robust Panel (TROP)** estimator (Athey, Imbens, Qu & Viviano, 2025), which combines three robustness components:\n\n1. **Nuclear Norm Regularized Factor Model**: Estimates interactive fixed effects via matrix completion with nuclear norm penalty\n2. **Exponential Distance-Based Unit Weights**: \u03c9_j = exp(-\u03bb_unit \u00d7 dist(j,i)) where dist(j,i) is the root mean squared difference in outcomes between units j and i, computed only on periods where both units are untreated and excluding the target period t (Equation 3 in the paper)\n3. **Exponential Time Decay Weights**: \u03b8_s = exp(-\u03bb_time \u00d7 |s-t|) weighting by proximity to treatment\n\n**Weights**: The observation-specific weights \u03c9 and \u03b8 are importance weights that control the relative contribution of each observation to counterfactual estimation. Higher weights indicate more relevant observations for the target counterfactual.\n\nTROP is particularly useful when:\n- There may be unobserved time-varying confounders with factor structure\n- Standard DiD or SDID may be biased due to latent factors\n- You want robust inference under factor confounding\n\nWe'll cover:\n1. When to use TROP\n2. Basic estimation with LOOCV tuning\n3. Understanding tuning parameters\n4. Examining factor structure\n5. Comparing TROP vs SDID"
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from diff_diff import TROP, trop, SyntheticDiD\n",
    "\n",
    "# For nicer plots (optional)\n",
    "try:\n",
    "    import matplotlib.pyplot as plt\n",
    "    plt.style.use('seaborn-v0_8-whitegrid')\n",
    "    HAS_MATPLOTLIB = True\n",
    "except ImportError:\n",
    "    HAS_MATPLOTLIB = False\n",
    "    print(\"matplotlib not installed - visualization examples will be skipped\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. When to Use TROP\n",
    "\n",
    "Consider TROP when:\n",
    "- You suspect **factor structure** in the data (e.g., economic cycles, regional shocks)\n",
    "- **Unobserved confounders** affect units differently over time\n",
    "- Standard parallel trends assumption may be violated due to common factors\n",
    "- You have a **reasonably long pre-treatment period** to estimate factors\n",
    "\n",
    "The key difference from SDID is that TROP explicitly models and removes interactive fixed effects (factor contributions) before computing treatment effects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Generate factor model data using the library function\n",
    "from diff_diff import generate_factor_data\n",
    "\n",
    "# True parameters for verification\n",
    "true_att = 2.0\n",
    "n_factors = 2\n",
    "n_pre = 6   # Reduced from 10 for faster execution\n",
    "n_post = 3  # Reduced from 5\n",
    "\n",
    "# Generate panel data with factor structure\n",
    "# This creates a scenario where standard DiD/SDID may be biased,\n",
    "# but TROP should recover the true treatment effect.\n",
    "df = generate_factor_data(\n",
    "    n_units=30,           # Reduced from 50 for faster execution\n",
    "    n_pre=n_pre,\n",
    "    n_post=n_post,\n",
    "    n_treated=6,          # Reduced from 10\n",
    "    n_factors=n_factors,\n",
    "    treatment_effect=true_att,\n",
    "    factor_strength=1.5,  # Strong factor confounding\n",
    "    treated_loading_shift=0.5,\n",
    "    unit_fe_sd=1.0,\n",
    "    noise_sd=0.5,\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "print(f\"Dataset: {len(df)} observations\")\n",
    "print(f\"Treated units: 6\")\n",
    "print(f\"Control units: 24\")\n",
    "print(f\"Pre-treatment periods: {n_pre}\")\n",
    "print(f\"Post-treatment periods: {n_post}\")\n",
    "print(f\"True treatment effect: {true_att}\")\n",
    "print(f\"True number of factors: {n_factors}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if HAS_MATPLOTLIB:\n",
    "    # Visualize the data\n",
    "    fig, ax = plt.subplots(figsize=(12, 6))\n",
    "    \n",
    "    # Identify treated vs control units\n",
    "    treated_units = df.groupby('unit')['treated'].max()\n",
    "    control_unit_ids = treated_units[treated_units == 0].index[:20]  # First 20 controls\n",
    "    treated_unit_ids = treated_units[treated_units == 1].index[:5]   # First 5 treated\n",
    "    \n",
    "    # Plot control units (gray, thin lines)\n",
    "    for unit_id in control_unit_ids:\n",
    "        unit_data = df[df['unit'] == unit_id]\n",
    "        ax.plot(unit_data['period'], unit_data['outcome'], \n",
    "                color='gray', alpha=0.3, linewidth=0.5)\n",
    "    \n",
    "    # Plot treated units (colored, thick lines)\n",
    "    colors = plt.cm.Reds(np.linspace(0.4, 0.9, 5))\n",
    "    for i, unit_id in enumerate(treated_unit_ids):\n",
    "        unit_data = df[df['unit'] == unit_id]\n",
    "        ax.plot(unit_data['period'], unit_data['outcome'], \n",
    "                color=colors[i], linewidth=2, label=f'Treated {i+1}')\n",
    "    \n",
    "    # Mark treatment time\n",
    "    ax.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')\n",
    "    \n",
    "    ax.set_xlabel('Period')\n",
    "    ax.set_ylabel('Outcome')\n",
    "    ax.set_title('Panel Data with Factor Structure')\n",
    "    ax.legend(loc='upper left')\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Basic TROP Estimation\n",
    "\n",
    "TROP uses leave-one-out cross-validation (LOOCV) to select three tuning parameters:\n",
    "- **\u03bb_time**: Time weight decay (higher = focus on periods near treatment)\n",
    "- **\u03bb_unit**: Unit weight decay (higher = focus on similar units)\n",
    "- **\u03bb_nn**: Nuclear norm regularization (higher = lower rank factor model)\n",
    "\n",
    "By default, TROP searches over a grid of values for each parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Fit TROP with automatic tuning via LOOCV\n",
    "trop_est = TROP(\n",
    "    lambda_time_grid=[0.0, 1.0],   # Reduced time decay grid\n",
    "    lambda_unit_grid=[0.0, 1.0],   # Reduced unit distance grid  \n",
    "    lambda_nn_grid=[0.0, 0.1],     # Reduced nuclear norm grid\n",
    "    n_bootstrap=50,    # Reduced bootstrap replications for SE\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "# Note: TROP infers treatment periods from the treatment indicator column.\n",
    "# The 'treated' column should be an absorbing state (D=1 for all periods\n",
    "# during and after treatment starts).\n",
    "\n",
    "# For SDID comparison later, we keep post_periods for SDID\n",
    "post_periods = list(range(n_pre, n_pre + n_post))\n",
    "\n",
    "results = trop_est.fit(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treated',\n",
    "    unit='unit',\n",
    "    time='period'\n",
    ")\n",
    "\n",
    "print(results.summary())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Check the key results\n",
    "print(f\"True ATT: {true_att:.4f}\")\n",
    "print(f\"Estimated ATT: {results.att:.4f}\")\n",
    "print(f\"Bias: {results.att - true_att:.4f}\")\n",
    "print()\n",
    "print(f\"Selected tuning parameters:\")\n",
    "print(f\"  \u03bb_time: {results.lambda_time:.2f}\")\n",
    "print(f\"  \u03bb_unit: {results.lambda_unit:.2f}\")\n",
    "print(f\"  \u03bb_nn: {results.lambda_nn:.2f}\")\n",
    "print(f\"\\nEffective rank of factor matrix: {results.effective_rank:.2f}\")\n",
    "print(f\"True rank: {n_factors}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Understanding the Tuning Parameters\n",
    "\n",
    "The three tuning parameters control different aspects of the estimation:\n",
    "\n",
    "### \u03bb_time (Time Decay)\n",
    "Controls how much weight to place on periods close to treatment:\n",
    "- **\u03bb_time = 0**: Equal weight to all pre-treatment periods\n",
    "- **\u03bb_time > 0**: More weight on recent pre-treatment periods\n",
    "\n",
    "### \u03bb_unit (Unit Distance)\n",
    "Controls how much weight to place on similar control units:\n",
    "- **\u03bb_unit = 0**: Equal weight to all control units\n",
    "- **\u03bb_unit > 0**: More weight on control units with similar pre-treatment trajectories\n",
    "\n",
    "The distance between units j and i for target observation (i, t) is computed as the root mean squared difference in outcomes, using only periods where:\n",
    "1. Both units are untreated (D_js = D_is = 0)\n",
    "2. The target period t is **excluded** (following Equation 3 in the paper: 1{u \u2260 t})\n",
    "\n",
    "This ensures the distance measure is based purely on pre-treatment comparability, not contaminated by the treatment period itself.\n",
    "\n",
    "### \u03bb_nn (Nuclear Norm)\n",
    "Controls the rank of the factor model:\n",
    "- **\u03bb_nn = 0**: No regularization (full rank)\n",
    "- **\u03bb_nn > 0**: Encourages low-rank factor structure"
   ]
  },
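  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The weight formulas above can be sketched in a few lines of NumPy. This is a minimal illustration on a toy panel, not the library's internal implementation: the helpers `unit_distance`, `unit_weights`, and `time_weights` are hypothetical names, and the arrays `Y_toy` and `D_toy` are made-up data with shape (periods x units). Setting either \u03bb to 0 makes the corresponding weights all equal to 1, which matches the equal-weighting cases described above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only: these helpers are hypothetical, not part of diff_diff\n",
    "import numpy as np\n",
    "\n",
    "def unit_distance(Y, D, j, i, t):\n",
    "    # RMS outcome difference between units j and i over periods where both\n",
    "    # are untreated, excluding the target period t\n",
    "    mask = (D[:, j] == 0) & (D[:, i] == 0)\n",
    "    mask[t] = False\n",
    "    if not mask.any():\n",
    "        return np.inf  # no comparable periods, so the weight is 0 when lambda_unit > 0\n",
    "    diff = Y[mask, j] - Y[mask, i]\n",
    "    return np.sqrt(np.mean(diff ** 2))\n",
    "\n",
    "def unit_weights(Y, D, i, t, lambda_unit):\n",
    "    # omega_j = exp(-lambda_unit * dist(j, i)) for every unit j\n",
    "    dists = np.array([unit_distance(Y, D, j, i, t) for j in range(Y.shape[1])])\n",
    "    return np.exp(-lambda_unit * dists)\n",
    "\n",
    "def time_weights(n_periods, t, lambda_time):\n",
    "    # theta_s = exp(-lambda_time * |s - t|) for every period s\n",
    "    return np.exp(-lambda_time * np.abs(np.arange(n_periods) - t))\n",
    "\n",
    "# Toy panel: 5 periods x 3 units; unit 0 is treated in the last two periods\n",
    "Y_toy = np.array([[1.0, 1.1, 3.0],\n",
    "                  [2.0, 2.1, 1.0],\n",
    "                  [3.0, 3.1, 4.0],\n",
    "                  [6.0, 4.1, 2.0],\n",
    "                  [7.0, 5.1, 5.0]])\n",
    "D_toy = np.zeros_like(Y_toy)\n",
    "D_toy[3:, 0] = 1\n",
    "\n",
    "# Weights for the target observation (unit i=0, period t=3)\n",
    "omega = unit_weights(Y_toy, D_toy, i=0, t=3, lambda_unit=1.0)\n",
    "theta = time_weights(n_periods=5, t=3, lambda_time=1.0)\n",
    "print('Unit weights:', omega.round(3))\n",
    "print('Time weights:', theta.round(3))"
   ]
  },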
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Effect of different nuclear norm regularization levels\n",
    "print(\"Effect of nuclear norm regularization (\u03bb_nn):\")\n",
    "print(\"=\"*65)\n",
    "print(f\"{'\u03bb_nn':>10} {'ATT':>12} {'Bias':>12} {'Eff. Rank':>15}\")\n",
    "print(\"-\"*65)\n",
    "\n",
    "for lambda_nn in [0.0, 0.1, 1.0]:  # Reduced grid\n",
    "    trop_fixed = TROP(\n",
    "        lambda_time_grid=[1.0],  # Fixed\n",
    "        lambda_unit_grid=[1.0],  # Fixed\n",
    "        lambda_nn_grid=[lambda_nn],  # Vary this\n",
    "        n_bootstrap=20,  # Reduced for faster execution\n",
    "        seed=42\n",
    "    )\n",
    "    \n",
    "    res = trop_fixed.fit(\n",
    "        df,\n",
    "        outcome='outcome',\n",
    "        treatment='treated',\n",
    "        unit='unit',\n",
    "        time='period'\n",
    "    )\n",
    "    \n",
    "    bias = res.att - true_att\n",
    "    print(f\"{lambda_nn:>10.1f} {res.att:>12.4f} {bias:>12.4f} {res.effective_rank:>15.2f}\")"
   ]
  },
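  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Why larger \u03bb_nn values lower the effective rank can be illustrated with singular value soft-thresholding, the proximal operator of the nuclear norm. This is a generic sketch on simulated data, and it is an assumption that it resembles what the estimator does internally: `soft_threshold_svd` is a hypothetical helper, not a diff_diff function. Each singular value is reduced by the threshold and clipped at zero, so a larger threshold removes more components."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only: soft_threshold_svd is a hypothetical helper, not a diff_diff function\n",
    "import numpy as np\n",
    "\n",
    "def soft_threshold_svd(M, threshold):\n",
    "    # Shrink every singular value by `threshold` and clip at zero\n",
    "    U_m, s_m, Vt_m = np.linalg.svd(M, full_matrices=False)\n",
    "    s_shrunk = np.maximum(s_m - threshold, 0.0)\n",
    "    return (U_m * s_shrunk) @ Vt_m, s_shrunk\n",
    "\n",
    "rng_demo = np.random.default_rng(0)\n",
    "# Rank-2 signal plus small noise, shaped (periods x units)\n",
    "signal = rng_demo.normal(size=(9, 2)) @ rng_demo.normal(size=(2, 30))\n",
    "noisy = signal + 0.1 * rng_demo.normal(size=(9, 30))\n",
    "\n",
    "for thr in [0.0, 1.0, 5.0]:\n",
    "    L_shrunk, s_kept = soft_threshold_svd(noisy, thr)\n",
    "    print(f'threshold={thr:>4.1f}  nonzero singular values: {int((s_kept > 0).sum())}')"
   ]
  },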
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Examining the Factor Structure\n",
    "\n",
    "TROP estimates a low-rank factor matrix L that captures interactive fixed effects. We can examine this structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Examine the factor matrix\n",
    "L = results.factor_matrix\n",
    "print(f\"Factor matrix shape: {L.shape} (periods x units)\")\n",
    "print(f\"Effective rank: {results.effective_rank:.2f}\")\n",
    "\n",
    "# Compute singular values to see rank structure\n",
    "U, s, Vt = np.linalg.svd(L, full_matrices=False)\n",
    "print(f\"\\nSingular values (top 5): {s[:5].round(2)}\")\n",
    "print(f\"Variance explained by top 2: {(s[:2]**2).sum() / (s**2).sum() * 100:.1f}%\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if HAS_MATPLOTLIB:\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
    "    \n",
    "    # Scree plot of singular values\n",
    "    ax1 = axes[0]\n",
    "    ax1.bar(range(1, min(11, len(s)+1)), s[:10])\n",
    "    ax1.set_xlabel('Component')\n",
    "    ax1.set_ylabel('Singular Value')\n",
    "    ax1.set_title('Scree Plot of Factor Matrix')\n",
    "    ax1.axhline(y=0, color='gray', linestyle='-', linewidth=0.5)\n",
    "    \n",
    "    # Heatmap of factor matrix\n",
    "    ax2 = axes[1]\n",
    "    im = ax2.imshow(L, aspect='auto', cmap='RdBu_r', vmin=-2, vmax=2)\n",
    "    ax2.set_xlabel('Unit')\n",
    "    ax2.set_ylabel('Period')\n",
    "    ax2.set_title('Factor Matrix L (Interactive Fixed Effects)')\n",
    "    ax2.axhline(y=n_pre - 0.5, color='black', linestyle='--', linewidth=2)\n",
    "    plt.colorbar(im, ax=ax2, label='L_it')\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Examining Unit and Time Effects\n",
    "\n",
    "TROP also estimates traditional unit and time fixed effects (\u03b1_i and \u03b2_t)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Unit effects\n",
    "unit_effects_df = results.get_unit_effects_df()\n",
    "print(\"Unit effects (first 10):\")\n",
    "print(unit_effects_df.head(10).to_string(index=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Time effects\n",
    "time_effects_df = results.get_time_effects_df()\n",
    "print(\"Time effects:\")\n",
    "print(time_effects_df.to_string(index=False))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if HAS_MATPLOTLIB:\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
    "    \n",
    "    # Unit effects\n",
    "    ax1 = axes[0]\n",
    "    ax1.bar(range(len(unit_effects_df)), unit_effects_df['effect'])\n",
    "    ax1.axvline(x=9.5, color='red', linestyle='--', label='Treated/Control boundary')\n",
    "    ax1.set_xlabel('Unit')\n",
    "    ax1.set_ylabel('Effect')\n",
    "    ax1.set_title('Unit Fixed Effects (\u03b1_i)')\n",
    "    ax1.legend()\n",
    "    \n",
    "    # Time effects\n",
    "    ax2 = axes[1]\n",
    "    ax2.plot(time_effects_df['time'], time_effects_df['effect'], 'o-', linewidth=2)\n",
    "    ax2.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')\n",
    "    ax2.set_xlabel('Period')\n",
    "    ax2.set_ylabel('Effect')\n",
    "    ax2.set_title('Time Fixed Effects (\u03b2_t)')\n",
    "    ax2.legend()\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Comparing TROP vs SDID\n",
    "\n",
    "Let's compare TROP with Synthetic DiD to see the benefit of factor adjustment when the DGP has factor structure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# SDID (no factor adjustment)\n",
    "# Note: SDID uses 'treat' (unit-level ever-treated indicator)\n",
    "sdid = SyntheticDiD(\n",
    "    n_bootstrap=50,  # Reduced for faster execution\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "# SDID still uses post_periods parameter\n",
    "sdid_results = sdid.fit(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treat',  # Unit-level ever-treated indicator\n",
    "    unit='unit',\n",
    "    time='period',\n",
    "    post_periods=post_periods\n",
    ")\n",
    "\n",
    "# TROP (with factor adjustment)\n",
    "# Note: TROP uses 'treated' (observation-level treatment indicator)\n",
    "# and infers treatment periods automatically\n",
    "trop_est2 = TROP(\n",
    "    lambda_nn_grid=[0.0, 0.1],  # Reduced grid for faster execution\n",
    "    n_bootstrap=50,  # Reduced for faster execution\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "trop_results = trop_est2.fit(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treated',  # Observation-level indicator\n",
    "    unit='unit',\n",
    "    time='period'\n",
    ")\n",
    "\n",
    "print(\"Comparison: SDID vs TROP\")\n",
    "print(\"=\"*60)\n",
    "print(f\"True ATT: {true_att:.4f}\")\n",
    "print()\n",
    "print(f\"Synthetic DiD (no factor adjustment):\")\n",
    "print(f\"  ATT: {sdid_results.att:.4f}\")\n",
    "print(f\"  SE: {sdid_results.se:.4f}\")\n",
    "print(f\"  Bias: {sdid_results.att - true_att:.4f}\")\n",
    "print()\n",
    "print(f\"TROP (with factor adjustment):\")\n",
    "print(f\"  ATT: {trop_results.att:.4f}\")\n",
    "print(f\"  SE: {trop_results.se:.4f}\")\n",
    "print(f\"  Bias: {trop_results.att - true_att:.4f}\")\n",
    "print(f\"  Effective rank: {trop_results.effective_rank:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. Monte Carlo Comparison\n",
    "\n",
    "Let's run a small Monte Carlo simulation to compare TROP and SDID under the factor DGP."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Monte Carlo comparison (reduced for faster tutorial execution)\n",
    "n_sims = 5  # Reduced from 20 for faster validation\n",
    "trop_estimates = []\n",
    "sdid_estimates = []\n",
    "\n",
    "print(f\"Running {n_sims} simulations...\")\n",
    "\n",
    "for sim in range(n_sims):\n",
    "    # Generate new data using the library function\n",
    "    # (includes both 'treated' and 'treat' columns)\n",
    "    sim_data = generate_factor_data(\n",
    "        n_units=50,\n",
    "        n_pre=10,\n",
    "        n_post=5,\n",
    "        n_treated=10,\n",
    "        n_factors=2,\n",
    "        treatment_effect=2.0,\n",
    "        factor_strength=1.5,\n",
    "        noise_sd=0.5,\n",
    "        seed=100 + sim\n",
    "    )\n",
    "    \n",
    "    # TROP (uses observation-level 'treated')\n",
    "    # Note: TROP infers treatment periods from the treatment indicator\n",
    "    try:\n",
    "        trop_m = TROP(\n",
    "            lambda_time_grid=[1.0],\n",
    "            lambda_unit_grid=[1.0],\n",
    "            lambda_nn_grid=[0.1],\n",
    "            n_bootstrap=10, \n",
    "            seed=42 + sim\n",
    "        )\n",
    "        trop_res = trop_m.fit(\n",
    "            sim_data,\n",
    "            outcome='outcome',\n",
    "            treatment='treated',\n",
    "            unit='unit',\n",
    "            time='period'\n",
    "        )\n",
    "        trop_estimates.append(trop_res.att)\n",
    "    except Exception as e:\n",
    "        print(f\"TROP failed on sim {sim}: {e}\")\n",
    "    \n",
    "    # SDID (uses unit-level 'treat')\n",
    "    # Note: SDID still uses post_periods parameter\n",
    "    try:\n",
    "        sdid_m = SyntheticDiD(n_bootstrap=10, seed=42 + sim)\n",
    "        sdid_res = sdid_m.fit(\n",
    "            sim_data,\n",
    "            outcome='outcome',\n",
    "            treatment='treat',  # Unit-level ever-treated indicator\n",
    "            unit='unit',\n",
    "            time='period',\n",
    "            post_periods=list(range(10, 15))\n",
    "        )\n",
    "        sdid_estimates.append(sdid_res.att)\n",
    "    except Exception as e:\n",
    "        print(f\"SDID failed on sim {sim}: {e}\")\n",
    "\n",
    "print(f\"\\nMonte Carlo Results (True ATT = {true_att})\")\n",
    "print(\"=\"*60)\n",
    "print(f\"{'Estimator':<15} {'Mean':>12} {'Bias':>12} {'RMSE':>12}\")\n",
    "print(\"-\"*60)\n",
    "\n",
    "if trop_estimates:\n",
    "    trop_mean = np.mean(trop_estimates)\n",
    "    trop_bias = trop_mean - true_att\n",
    "    trop_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in trop_estimates]))\n",
    "    print(f\"{'TROP':<15} {trop_mean:>12.4f} {trop_bias:>12.4f} {trop_rmse:>12.4f}\")\n",
    "\n",
    "if sdid_estimates:\n",
    "    sdid_mean = np.mean(sdid_estimates)\n",
    "    sdid_bias = sdid_mean - true_att\n",
    "    sdid_rmse = np.sqrt(np.mean([(e - true_att)**2 for e in sdid_estimates]))\n",
    "    print(f\"{'SDID':<15} {sdid_mean:>12.4f} {sdid_bias:>12.4f} {sdid_rmse:>12.4f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "if HAS_MATPLOTLIB and trop_estimates and sdid_estimates:\n",
    "    # Visualize Monte Carlo results\n",
    "    fig, ax = plt.subplots(figsize=(10, 6))\n",
    "    \n",
    "    ax.hist(sdid_estimates, bins=15, alpha=0.6, label='SDID', color='blue')\n",
    "    ax.hist(trop_estimates, bins=15, alpha=0.6, label='TROP', color='red')\n",
    "    ax.axvline(x=true_att, color='black', linewidth=2, linestyle='--', label=f'True ATT = {true_att}')\n",
    "    \n",
    "    ax.set_xlabel('Estimated ATT')\n",
    "    ax.set_ylabel('Frequency')\n",
    "    ax.set_title('Monte Carlo Distribution of Estimates')\n",
    "    ax.legend()\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8. Using the Convenience Function\n",
    "\n",
    "For quick estimation, you can use the `trop()` convenience function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# One-liner estimation with default tuning grid\n",
    "# Note: TROP infers treatment periods from the treatment indicator\n",
    "quick_results = trop(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treated',\n",
    "    unit='unit',\n",
    "    time='period',\n",
    "    n_bootstrap=20,  # Reduced for faster execution\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "print(f\"Quick estimation:\")\n",
    "print(f\"  ATT: {quick_results.att:.4f}\")\n",
    "print(f\"  SE: {quick_results.se:.4f}\")\n",
    "print(f\"  \u03bb_time: {quick_results.lambda_time:.2f}\")\n",
    "print(f\"  \u03bb_unit: {quick_results.lambda_unit:.2f}\")\n",
    "print(f\"  \u03bb_nn: {quick_results.lambda_nn:.2f}\")\n",
    "print(f\"  Effective rank: {quick_results.effective_rank:.2f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "## 9. Variance Estimation\n\nTROP uses **unit-level stratified block bootstrap** for variance estimation, as specified in Algorithm 3 of Athey et al. (2025). Control and treated units are sampled separately to preserve the treatment ratio. The number of bootstrap replications is controlled by the `n_bootstrap` parameter (default: 200)."
  },
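  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The resampling scheme described above can be sketched with pandas. This is a minimal illustration under stated assumptions, not the library's implementation: `stratified_unit_bootstrap` is a hypothetical helper, and the toy frame uses a unit-level `treat` column to define the strata. Whole units are drawn with replacement within each stratum, so every bootstrap sample keeps the original numbers of treated and control units. Re-estimating the ATT on each sample and taking the standard deviation of those estimates would give a bootstrap SE."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative sketch only: stratified_unit_bootstrap is a hypothetical helper, not part of diff_diff\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "def stratified_unit_bootstrap(panel, unit_col, ever_treated_col, rng):\n",
    "    # Resample whole units with replacement, separately for control (0) and treated (1) units\n",
    "    status = panel.groupby(unit_col)[ever_treated_col].max()\n",
    "    pieces = []\n",
    "    new_id = 0\n",
    "    for flag in (0, 1):\n",
    "        ids = status[status == flag].index.to_numpy()\n",
    "        drawn = rng.choice(ids, size=len(ids), replace=True)\n",
    "        for uid in drawn:\n",
    "            block = panel[panel[unit_col] == uid].copy()\n",
    "            block[unit_col] = new_id  # relabel so repeated draws stay distinct units\n",
    "            pieces.append(block)\n",
    "            new_id += 1\n",
    "    return pd.concat(pieces, ignore_index=True)\n",
    "\n",
    "# Toy panel: 6 units x 4 periods, 2 treated units and 4 controls\n",
    "toy = pd.DataFrame({\n",
    "    'unit': np.repeat(np.arange(6), 4),\n",
    "    'period': np.tile(np.arange(4), 6),\n",
    "    'treat': np.repeat([1, 1, 0, 0, 0, 0], 4),\n",
    "})\n",
    "toy['outcome'] = np.random.default_rng(1).normal(size=len(toy))\n",
    "\n",
    "boot = stratified_unit_bootstrap(toy, 'unit', 'treat', np.random.default_rng(0))\n",
    "print(boot.groupby('unit')['treat'].max().value_counts().sort_index())"
   ]
  },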
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "# Bootstrap variance with different numbers of replications\nprint(\"Bootstrap replications comparison:\")\nprint(\"=\"*50)\n\nfor n_boot in [20, 50, 100]:\n    trop_var = TROP(\n        lambda_time_grid=[1.0],\n        lambda_unit_grid=[1.0], \n        lambda_nn_grid=[0.1],\n        n_bootstrap=n_boot,\n        seed=42\n    )\n    \n    res = trop_var.fit(\n        df,\n        outcome='outcome',\n        treatment='treated',\n        unit='unit',\n        time='period'\n    )\n    \n    print(f\"\\nn_bootstrap={n_boot}:\")\n    print(f\"  ATT: {res.att:.4f}\")\n    print(f\"  SE: {res.se:.4f}\")\n    print(f\"  95% CI: [{res.conf_int[0]:.4f}, {res.conf_int[1]:.4f}]\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 10. Estimation Methods: Local vs Global\n",
    "\n",
    "TROP supports two estimation methods via the `method` parameter:\n",
    "\n",
    "**Local Method** (`method='local'`, default):\n",
    "- Follows Algorithm 2 from the paper\n",
    "- Computes observation-specific weights for each treated observation\n",
    "- Fits a model per treated observation, then averages the individual effects\n",
    "- More flexible, allows for heterogeneous treatment effects\n",
    "- Computationally intensive (N_treated optimizations)\n",
    "\n",
    "**Global Method** (`method='global'`):\n",
    "- Fits a single model on control data using (1-W) masked weights (per paper Eq. 2)\n",
    "- Extracts per-observation treatment effects as post-hoc residuals: \u03c4_it = Y_it - \u03bc - \u03b1_i - \u03b2_t - L_it\n",
    "- ATT = mean(\u03c4_it) over treated observations\n",
    "- Faster (single optimization) with global weights\n",
    "\n",
    "Note: `method='twostep'` is a deprecated alias for `method='local'`, and `method='joint'` is a deprecated alias for `method='global'`. Both will be removed in v3.0."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compare estimation methods\n",
    "import time\n",
    "\n",
    "print(\"Estimation method comparison:\")\n",
    "print(\"=\"*60)\n",
    "\n",
    "# Local method (default)\n",
    "start = time.time()\n",
    "trop_local = TROP(\n",
    "    method='local',\n",
    "    lambda_time_grid=[0.0, 1.0],\n",
    "    lambda_unit_grid=[0.0, 1.0],\n",
    "    lambda_nn_grid=[0.0, 0.1],\n",
    "    n_bootstrap=20,\n",
    "    seed=42\n",
    ")\n",
    "results_local = trop_local.fit(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treated',\n",
    "    unit='unit',\n",
    "    time='period'\n",
    ")\n",
    "local_time = time.time() - start\n",
    "\n",
    "# Global method\n",
    "start = time.time()\n",
    "trop_global = TROP(\n",
    "    method='global',\n",
    "    lambda_time_grid=[0.0, 1.0],\n",
    "    lambda_unit_grid=[0.0, 1.0],\n",
    "    lambda_nn_grid=[0.0, 0.1],\n",
    "    n_bootstrap=20,\n",
    "    seed=42\n",
    ")\n",
    "results_global = trop_global.fit(\n",
    "    df,\n",
    "    outcome='outcome',\n",
    "    treatment='treated',\n",
    "    unit='unit',\n",
    "    time='period'\n",
    ")\n",
    "global_time = time.time() - start\n",
    "\n",
    "print(f\"\\n{'Method':<15} {'ATT':>10} {'SE':>10} {'Time (s)':>12}\")\n",
    "print(\"-\"*60)\n",
    "print(f\"{'Local':<15} {results_local.att:>10.4f} {results_local.se:>10.4f} {local_time:>12.2f}\")\n",
    "print(f\"{'Global':<15} {results_global.att:>10.4f} {results_global.se:>10.4f} {global_time:>12.2f}\")\n",
    "print(f\"\\nTrue ATT: {true_att}\")\n",
    "print(f\"Local bias: {results_local.att - true_att:.4f}\")\n",
    "print(f\"Global bias: {results_global.att - true_att:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 10. Results Export\n",
    "\n",
    "TROP results can be exported as a plain dictionary or as a DataFrame of per-observation treatment effects."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Convert to dictionary\n",
    "results_dict = results.to_dict()\n",
    "print(\"Results as dictionary:\")\n",
    "for key, value in results_dict.items():\n",
    "    if isinstance(value, float):\n",
    "        print(f\"  {key}: {value:.4f}\")\n",
    "    else:\n",
    "        print(f\"  {key}: {value}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Individual treatment effects\n",
    "treatment_effects_df = results.get_treatment_effects_df()\n",
    "print(\"\\nIndividual treatment effects (first 10):\")\n",
    "print(treatment_effects_df.head(10).to_string(index=False))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": "## Summary\n\nKey takeaways for TROP:\n\n1. **Best use cases**: Factor confounding, unobserved time-varying confounders with interactive effects\n2. **Factor estimation**: Nuclear norm regularization with LOOCV for tuning\n3. **Three tuning parameters**: \u03bb_time, \u03bb_unit, \u03bb_nn selected automatically via LOOCV\n4. **Unit weights**: Exponential distance-based weighting of control units, where distance is computed as RMS outcome difference on control periods excluding the target period\n5. **Time weights**: Exponential decay weighting of pre-treatment periods\n6. **Weights**: Importance weights controlling relative contribution of observations (higher = more relevant)\n7. **Estimation methods**:\n   - `method='local'` (default): Per-observation estimation, allows heterogeneous effects\n   - `method='global'`: Single model with (1-W) masking, post-hoc heterogeneous effects, faster\n\n**When to use TROP vs SDID**:\n- Use **SDID** when parallel trends is plausible and factors are not a concern\n- Use **TROP** when you suspect factor confounding (regional shocks, economic cycles, latent factors)\n- Running both provides a useful robustness check\n\n**When to use local vs global method**:\n- Use **local** (default) for maximum flexibility with per-observation weights\n- Use **global** for faster estimation with global weights\n\n**Reference**:\n- Athey, S., Imbens, G. W., Qu, Z., & Viviano, D. (2025). Triply Robust Panel Estimators. *Working Paper*. https://arxiv.org/abs/2508.21536"
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}