{ "cells": [ { "cell_type": "markdown", "id": "cell-0", "metadata": {}, "source": [ "# Honest DiD: Sensitivity Analysis for Parallel Trends\n", "\n", "The **parallel trends assumption** is crucial for difference-in-differences (DiD) validity, but it cannot be tested directly because it concerns unobserved counterfactual outcomes. **Honest DiD** (Rambachan & Roth 2023) provides a framework for:\n", "\n", "1. Relaxing the parallel trends assumption\n", "2. Computing bounds on treatment effects under potential violations\n", "3. Constructing robust confidence intervals that remain valid under bounded violations of parallel trends\n", "4. Computing \"breakdown values\" showing how much violation is needed to nullify results\n", "\n", "This notebook covers:\n", "1. Motivation: Why standard event studies can be misleading\n", "2. Basic usage with `HonestDiD`\n", "3. Interpreting bounds and breakdown values\n", "4. Sensitivity analysis over a grid of M values\n", "5. Visualization\n", "6. Advanced: Smoothness restrictions" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-1", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from diff_diff import MultiPeriodDiD\n", "from diff_diff.honest_did import (\n", "    HonestDiD,\n", "    compute_honest_did,\n", "    DeltaSD,\n", "    DeltaRM,\n", ")\n", "\n", "# For plots\n", "try:\n", "    import matplotlib.pyplot as plt\n", "    plt.style.use('seaborn-v0_8-whitegrid')\n", "    HAS_MATPLOTLIB = True\n", "except ImportError:\n", "    HAS_MATPLOTLIB = False\n", "    print(\"matplotlib not installed - visualization examples will be skipped\")" ] }, { "cell_type": "markdown", "id": "cell-2", "metadata": {}, "source": [ "## 1. Motivation: The Problem with Pre-trend Testing\n", "\n", "Researchers often test for parallel trends by checking if pre-treatment coefficients are statistically significant. However, this approach has serious problems:\n", "\n", "1. **Low power**: With typical sample sizes, we may fail to detect real violations\n", "2. 
**Pre-test bias**: Conditioning on passing a pre-trends test biases inference\n", "3. **Post-treatment violations**: Even if pre-trends look good, post-treatment violations can occur\n", "\n", "**Honest DiD addresses these issues by:**\n", "- Not requiring parallel trends to hold exactly\n", "- Allowing for bounded violations related to observed pre-trends\n", "- Providing valid inference under these weaker assumptions" ] }, { "cell_type": "markdown", "id": "cell-3", "metadata": {}, "source": [ "## 2. Generate Example Data\n", "\n", "We'll create panel data with:\n", "- A true treatment effect of 5.0\n", "- Some pre-trend violations (to make results interesting)" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-4", "metadata": {}, "outputs": [], "source": [ "def generate_did_data(n_units=200, n_periods=10, true_att=5.0,\n", "                      pre_trend_violation=0.3, seed=42):\n", "    \"\"\"\n", "    Generate panel data with potential parallel trends violations.\n", "\n", "    Parameters\n", "    ----------\n", "    pre_trend_violation : float\n", "        Magnitude of differential pre-trend between treated and control.\n", "        0 = perfect parallel trends, larger = more violation.\n", "    \"\"\"\n", "    np.random.seed(seed)\n", "    treatment_time = n_periods // 2\n", "\n", "    data = []\n", "    for unit in range(n_units):\n", "        is_treated = unit < n_units // 2\n", "        unit_effect = np.random.normal(0, 2)\n", "\n", "        for period in range(n_periods):\n", "            # Common time trend\n", "            time_effect = period * 1.0\n", "\n", "            # Differential linear trend for treated units; it runs through\n", "            # all periods, so parallel trends is violated pre and post\n", "            if is_treated:\n", "                time_effect += pre_trend_violation * (period - treatment_time)\n", "\n", "            y = 10.0 + unit_effect + time_effect\n", "\n", "            # Treatment effect\n", "            post = period >= treatment_time\n", "            if is_treated and post:\n", "                y += true_att\n", "\n", "            y += np.random.normal(0, 1)\n", "\n", "            data.append({\n", "                'unit': unit,\n", "                'period': period,\n", "                'treated': int(is_treated),\n", "                
'post': int(post),\n", "                'outcome': y\n", "            })\n", "\n", "    return pd.DataFrame(data)\n", "\n", "# Generate data with mild pre-trend violation\n", "df = generate_did_data(pre_trend_violation=0.2)\n", "print(f\"Generated {len(df)} observations\")\n", "print(\"Treatment time: period 5\")\n", "print(\"True ATT: 5.0\")" ] }, { "cell_type": "markdown", "id": "cell-5", "metadata": {}, "source": [ "## 3. Fit Standard Event Study\n", "\n", "First, let's estimate a standard event study using `MultiPeriodDiD`." ] }, { "cell_type": "code", "execution_count": null, "id": "cell-6", "metadata": {}, "outputs": [], "source": [ "# Fit event study\n", "mp_did = MultiPeriodDiD()\n", "event_results = mp_did.fit(\n", "    df,\n", "    outcome='outcome',\n", "    treatment='treated',\n", "    time='period',\n", "    post_periods=[5, 6, 7, 8, 9]\n", ")\n", "\n", "print(event_results.summary())" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-7", "metadata": {}, "outputs": [], "source": [ "from diff_diff.visualization import plot_event_study\n", "\n", "if HAS_MATPLOTLIB:\n", "    fig, ax = plt.subplots(figsize=(10, 6))\n", "    plot_event_study(\n", "        event_results,\n", "        ax=ax,\n", "        title='Standard Event Study',\n", "        show=False\n", "    )\n", "    plt.tight_layout()\n", "    plt.show()" ] }, { "cell_type": "markdown", "id": "cell-8", "metadata": {}, "source": [ "## 4. 
Basic Honest DiD: Relative Magnitudes\n", "\n", "The **relative magnitudes** approach bounds post-treatment violations by $\\bar{M}$ times the maximum observed pre-treatment violation:\n", "\n", "$$|\\delta_{post}| \\leq \\bar{M} \\times \\max(|\\delta_{pre}|)$$\n", "\n", "Where:\n", "- $\\delta_t$ is the violation of parallel trends at time $t$\n", "- $\\bar{M} = 1$ means post-treatment violations can be as bad as the worst pre-treatment violation\n", "- $\\bar{M} = 0$ is equivalent to assuming parallel trends holds exactly" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-9", "metadata": {}, "outputs": [], "source": [ "# Create HonestDiD estimator\n", "honest = HonestDiD(\n", "    method='relative_magnitude',\n", "    M=1.0,  # Post-treatment violations up to 1x max pre-treatment violation\n", "    alpha=0.05\n", ")\n", "\n", "# Compute bounds\n", "honest_results = honest.fit(event_results)\n", "\n", "print(honest_results.summary())" ] }, { "cell_type": "markdown", "id": "cell-10", "metadata": {}, "source": [ "### Interpreting the Results\n", "\n", "The output shows:\n", "\n", "1. **Original Estimate**: The point estimate assuming parallel trends (standard DiD)\n", "\n", "2. **Identified Set**: The range of treatment effects consistent with the data *and* our assumptions about violations. Wider with larger M.\n", "\n", "3. **Robust CI**: A confidence interval that covers the true effect with at least 95% probability as long as the assumed bound on violations holds, *regardless* of which value in the identified set is correct.\n", "\n", "4. **Effect robust to violations**: Whether the robust CI excludes zero. If yes, the effect is significant even under violations of the allowed size." 
] }, { "cell_type": "code", "execution_count": null, "id": "cell-11", "metadata": {}, "outputs": [], "source": [ "# Key results\n", "print(f\"Original estimate: {honest_results.original_estimate:.4f}\")\n", "print(f\"Identified set: [{honest_results.lb:.4f}, {honest_results.ub:.4f}]\")\n", "print(f\"Robust 95% CI: [{honest_results.ci_lb:.4f}, {honest_results.ci_ub:.4f}]\")\n", "print(f\"CI width: {honest_results.ci_width:.4f}\")\n", "print()\n", "print(f\"Effect robust to M={honest_results.M} violations: {honest_results.is_significant}\")" ] }, { "cell_type": "markdown", "id": "cell-12", "metadata": {}, "source": [ "## 5. Sensitivity Analysis\n", "\n", "A key feature of Honest DiD is examining how results change as we allow larger violations. This helps answer: \"How much would parallel trends need to be violated to overturn our conclusions?\"" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-13", "metadata": {}, "outputs": [], "source": [ "# Run sensitivity analysis over a grid of M values\n", "sensitivity = honest.sensitivity_analysis(\n", "    event_results,\n", "    M_grid=[0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 3.0]\n", ")\n", "\n", "print(sensitivity.summary())" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-14", "metadata": {}, "outputs": [], "source": [ "# Key takeaway: the breakdown value\n", "print(f\"Breakdown value: {sensitivity.breakdown_M}\")\n", "print()\n", "if sensitivity.breakdown_M is not None:\n", "    print(f\"The result is robust to violations below M = {sensitivity.breakdown_M:.2f}\")\n", "    print(\"This means post-treatment trend violations could be up to \")\n", "    print(f\"{sensitivity.breakdown_M:.1f}x the worst pre-treatment violation \")\n", "    print(\"and we'd still conclude the effect is positive.\")\n", "else:\n", "    print(\"No breakdown found on this grid - effect is significant at every M tested.\")" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-15", "metadata": {}, "outputs": [], "source": [ "# Visualize the 
sensitivity analysis\n", "if HAS_MATPLOTLIB:\n", "    fig, ax = plt.subplots(figsize=(10, 6))\n", "    sensitivity.plot(ax=ax, show=False)\n", "    plt.tight_layout()\n", "    plt.show()" ] }, { "cell_type": "markdown", "id": "cell-16", "metadata": {}, "source": [ "### Reading the Sensitivity Plot\n", "\n", "- **X-axis (M)**: How much we allow post-treatment violations relative to pre-treatment violations\n", "- **Shaded region**: The identified set (range of possible treatment effects)\n", "- **Blue lines**: Robust confidence interval\n", "- **Red dashed line**: Breakdown value (where CI first includes zero)\n", "- **Black line**: Original estimate (under parallel trends)\n", "\n", "As M increases:\n", "- The identified set widens (more possible violations)\n", "- Once M reaches the breakdown value, the CI includes zero (we can no longer rule out no effect)" ] }, { "cell_type": "markdown", "id": "cell-17", "metadata": {}, "source": [ "## 6. Different Restriction Parameters\n", "\n", "Let's compare results for different values of M:" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-18", "metadata": {}, "outputs": [], "source": [ "# Compare different M values\n", "M_values = [0, 0.5, 1.0, 2.0]\n", "\n", "print(f\"{'M':<8} {'CI Lower':>12} {'CI Upper':>12} {'Significant':>12}\")\n", "print(\"-\" * 48)\n", "\n", "for M in M_values:\n", "    result = honest.fit(event_results, M=M)\n", "    sig = \"Yes\" if result.is_significant else \"No\"\n", "    print(f\"{M:<8.2f} {result.ci_lb:>12.4f} {result.ci_ub:>12.4f} {sig:>12}\")" ] }, { "cell_type": "markdown", "id": "cell-19", "metadata": {}, "source": [ "## 7. Breakdown Value\n", "\n", "The **breakdown value** is the smallest M where the robust CI first includes zero. It tells us how robust our conclusion is to parallel trends violations." 
] }, { "cell_type": "code", "execution_count": null, "id": "cell-20", "metadata": {}, "outputs": [], "source": [ "# Compute breakdown value directly\n", "breakdown = honest.breakdown_value(event_results, tol=0.01)\n", "\n", "if breakdown is not None:\n", "    print(f\"Breakdown value: M = {breakdown:.3f}\")\n", "    print()\n", "    print(\"Interpretation:\")\n", "    print(f\" - For M < {breakdown:.2f}: Effect is statistically significant\")\n", "    print(f\" - For M >= {breakdown:.2f}: Cannot rule out zero effect\")\n", "    print()\n", "    print(\"Is this robust enough?\")\n", "    if breakdown >= 1.0:\n", "        print(\" Yes! Result holds even if post-treatment violations \")\n", "        print(\" are as bad as observed pre-treatment violations.\")\n", "    else:\n", "        print(\" Somewhat. Result requires post-treatment violations \")\n", "        print(\" to be smaller than pre-treatment violations.\")\n", "else:\n", "    print(\"No breakdown found - effect stays significant over the range of M searched.\")" ] }, { "cell_type": "markdown", "id": "cell-21", "metadata": {}, "source": [ "## 8. 
Smoothness Restrictions\n", "\n", "An alternative approach restricts the **second differences** of the trend violations:\n", "\n", "$$|\\delta_{t+1} - 2\\delta_t + \\delta_{t-1}| \\leq M$$\n", "\n", "This says violations must change smoothly over time:\n", "- $M = 0$: Violations must follow a linear trend (linear extrapolation of pre-trends)\n", "- $M > 0$: Allows some non-linearity in how violations evolve" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-22", "metadata": {}, "outputs": [], "source": [ "# Smoothness restriction\n", "honest_smooth = HonestDiD(\n", "    method='smoothness',\n", "    M=0.5,  # Allow some curvature\n", "    alpha=0.05\n", ")\n", "\n", "smooth_results = honest_smooth.fit(event_results)\n", "print(smooth_results.summary())" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-23", "metadata": {}, "outputs": [], "source": [ "# Compare smoothness vs relative magnitudes\n", "# Note: M is a unitless multiplier under relative magnitudes but is measured in\n", "# outcome units under smoothness, so M=1.0 is not the same restriction in both.\n", "print(\"Comparison of Methods (M=1.0)\")\n", "print(\"=\" * 60)\n", "\n", "rm_result = HonestDiD(method='relative_magnitude', M=1.0).fit(event_results)\n", "sd_result = HonestDiD(method='smoothness', M=1.0).fit(event_results)\n", "\n", "print(f\"{'Method':<25} {'CI Lower':>12} {'CI Upper':>12} {'Width':>10}\")\n", "print(\"-\" * 60)\n", "print(f\"{'Relative Magnitudes':<25} {rm_result.ci_lb:>12.4f} {rm_result.ci_ub:>12.4f} {rm_result.ci_width:>10.4f}\")\n", "print(f\"{'Smoothness':<25} {sd_result.ci_lb:>12.4f} {sd_result.ci_ub:>12.4f} {sd_result.ci_width:>10.4f}\")" ] }, { "cell_type": "markdown", "id": "cell-24", "metadata": {}, "source": [ "## 9. 
Using the Convenience Function\n", "\n", "For quick analysis, use `compute_honest_did()`:" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-25", "metadata": {}, "outputs": [], "source": [ "# One-liner for quick bounds\n", "quick_result = compute_honest_did(\n", "    event_results,\n", "    method='relative_magnitude',\n", "    M=1.0\n", ")\n", "\n", "print(f\"Quick bounds: [{quick_result.ci_lb:.3f}, {quick_result.ci_ub:.3f}]\")" ] }, { "cell_type": "markdown", "id": "cell-26", "metadata": {}, "source": [ "## 10. Converting Results to DataFrames\n", "\n", "Results can be exported for further analysis:" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-27", "metadata": {}, "outputs": [], "source": [ "# Single result to DataFrame\n", "print(\"Single result:\")\n", "print(honest_results.to_dataframe())" ] }, { "cell_type": "code", "execution_count": null, "id": "cell-28", "metadata": {}, "outputs": [], "source": [ "# Sensitivity analysis to DataFrame\n", "print(\"\\nSensitivity analysis:\")\n", "sensitivity.to_dataframe()" ] }, { "cell_type": "markdown", "id": "cell-29", "metadata": {}, "source": [ "## Summary\n", "\n", "**Key Takeaways:**\n", "\n", "1. **Honest DiD** provides robust inference without assuming parallel trends holds exactly\n", "\n", "2. **Relative magnitudes** (M̄) bounds post-treatment violations by a multiple of observed pre-treatment violations\n", "   - M̄=0: Standard parallel trends\n", "   - M̄=1: Violations as bad as worst pre-period\n", "   - M̄>1: Even larger violations allowed\n", "\n", "3. **Smoothness** (M) bounds the curvature of violations over time\n", "   - M=0: Linear extrapolation of pre-trends\n", "   - M>0: Allows non-linear changes\n", "\n", "4. **Breakdown value** tells you how robust your conclusion is\n", "\n", "5. 
**Best practices:**\n", "   - Report results for multiple M values\n", "   - Include the sensitivity plot in publications\n", "   - Discuss what violation magnitudes are plausible in your setting\n", "   - Use breakdown value to assess robustness\n", "\n", "**Related Tutorials:**\n", "- `04_parallel_trends.ipynb` - Standard parallel trends testing\n", "- `06_power_analysis.ipynb` - Power analysis for study design\n", "- `07_pretrends_power.ipynb` - Pre-trends power analysis (Roth 2022) - assess what violations your pre-trends test could have detected\n", "\n", "**Reference:**\n", "\n", "Rambachan, A., & Roth, J. (2023). A More Credible Approach to Parallel Trends. \n", "*The Review of Economic Studies*, 90(5), 2555-2591. \n", "https://doi.org/10.1093/restud/rdad018" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 5 }