{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cell-0",
   "metadata": {},
   "source": [
    "# Real-World Data Examples\n",
    "\n",
    "This notebook demonstrates `diff-diff` using real-world datasets from classic econometric studies. We'll cover:\n",
    "\n",
    "1. **Card & Krueger (1994)** - Classic 2x2 DiD: Effect of minimum wage on employment\n",
    "2. **Castle Doctrine Laws** - Staggered adoption: Effect of self-defense laws on homicide rates\n",
    "3. **Unilateral Divorce Laws** - Staggered adoption: Effect of no-fault divorce on divorce rates\n",
    "\n",
    "These examples show how to apply DiD methods to real policy questions and replicate findings from influential studies."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "from diff_diff import (\n",
    "    DifferenceInDifferences,\n",
    "    TwoWayFixedEffects,\n",
    "    CallawaySantAnna,\n",
    "    SunAbraham,\n",
    "    bacon_decompose,\n",
    ")\n",
    "from diff_diff.datasets import (\n",
    "    load_card_krueger,\n",
    "    load_castle_doctrine,\n",
    "    load_divorce_laws,\n",
    "    list_datasets,\n",
    ")\n",
    "from diff_diff.visualization import plot_event_study, plot_bacon, plot_group_effects\n",
    "\n",
    "# For plots\n",
    "try:\n",
    "    import matplotlib.pyplot as plt\n",
    "    plt.style.use('seaborn-v0_8-whitegrid')\n",
    "    HAS_MATPLOTLIB = True\n",
    "except ImportError:\n",
    "    HAS_MATPLOTLIB = False\n",
    "    print(\"matplotlib not installed - visualization examples will be skipped\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# List available datasets\n",
    "print(\"Available real-world datasets in diff-diff:\")\n",
    "print(\"=\" * 60)\n",
    "for name, desc in list_datasets().items():\n",
    "    print(f\"  {name}: {desc}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-3",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## 1. Card & Krueger (1994): Minimum Wage and Employment\n",
    "\n",
    "### Background\n",
    "\n",
    "On April 1, 1992, New Jersey raised its minimum wage from \\$4.25 to \\$5.05 per hour, while neighboring Pennsylvania kept its minimum wage at \\$4.25. Card and Krueger conducted a survey of fast-food restaurants in both states before and after the wage increase.\n",
    "\n",
    "**Research question**: Does raising the minimum wage reduce employment?\n",
    "\n",
    "**Design**: Classic 2x2 DiD\n",
    "- **Treatment group**: New Jersey restaurants\n",
    "- **Control group**: Pennsylvania restaurants  \n",
    "- **Pre-period**: February 1992 (before wage increase)\n",
    "- **Post-period**: November 1992 (after wage increase)\n",
    "\n",
    "**Key finding**: No significant negative effect on employment; point estimate was actually positive (+2.8 FTE employees)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the Card-Krueger dataset\n",
    "ck = load_card_krueger()\n",
    "\n",
    "print(f\"Dataset shape: {ck.shape}\")\n",
    "print(f\"\\nStores by state:\")\n",
    "print(ck.groupby('state').size())\n",
    "print(f\"\\nFirst few rows:\")\n",
    "ck.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Summary statistics by state\n",
    "print(\"Summary Statistics by State\")\n",
    "print(\"=\" * 60)\n",
    "\n",
    "summary = ck.groupby('state').agg({\n",
    "    'emp_pre': ['mean', 'std'],\n",
    "    'emp_post': ['mean', 'std'],\n",
    "    'emp_change': ['mean', 'std'],\n",
    "    'wage_pre': 'mean',\n",
    "    'wage_post': 'mean',\n",
    "}).round(2)\n",
    "\n",
    "summary.columns = ['Emp Pre (mean)', 'Emp Pre (sd)', \n",
    "                   'Emp Post (mean)', 'Emp Post (sd)',\n",
    "                   'Emp Change (mean)', 'Emp Change (sd)',\n",
    "                   'Wage Pre', 'Wage Post']\n",
    "summary"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-6",
   "metadata": {},
   "source": [
    "### Preparing Data for DiD\n",
    "\n",
    "The data is in \"wide\" format (one row per store). We need to convert it to \"long\" format for the DiD estimator."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-7",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Reshape to long format\n",
    "ck_long = ck.melt(\n",
    "    id_vars=['store_id', 'state', 'chain', 'treated'],\n",
    "    value_vars=['emp_pre', 'emp_post'],\n",
    "    var_name='period',\n",
    "    value_name='employment'\n",
    ")\n",
    "\n",
    "# Create post indicator\n",
    "ck_long['post'] = (ck_long['period'] == 'emp_post').astype(int)\n",
    "\n",
    "# Drop missing employment values\n",
    "ck_long = ck_long.dropna(subset=['employment'])\n",
    "\n",
    "print(f\"Long format shape: {ck_long.shape}\")\n",
    "print(f\"\\nSample distribution:\")\n",
    "print(ck_long.groupby(['state', 'post']).size().unstack())\n",
    "ck_long.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-8",
   "metadata": {},
   "source": [
    "### DiD Estimation"
   ]
  },
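  {
   "cell_type": "markdown",
   "id": "cell-8a",
   "metadata": {},
   "source": [
    "As a reference point for the estimate below, the 2x2 DiD quantity can be written in terms of the four group-period means. The notation here is ours and is not taken from the library:\n",
    "\n",
    "$$\n",
    "\\hat{\\delta}_{\\text{DiD}} = \\left(\\bar{Y}_{\\text{NJ, post}} - \\bar{Y}_{\\text{NJ, pre}}\\right) - \\left(\\bar{Y}_{\\text{PA, post}} - \\bar{Y}_{\\text{PA, pre}}\\right)\n",
    "$$\n",
    "\n",
    "Equivalently, assuming the estimator fits the standard interaction regression without covariates, $\\hat{\\delta}_{\\text{DiD}}$ is the OLS coefficient $\\delta$ on the interaction term:\n",
    "\n",
    "$$\n",
    "Y_{it} = \\beta_0 + \\beta_1 \\text{Treated}_i + \\beta_2 \\text{Post}_t + \\delta (\\text{Treated}_i \\times \\text{Post}_t) + \\varepsilon_{it}\n",
    "$$"
   ]
  },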
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-9",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Basic DiD estimation\n",
    "did = DifferenceInDifferences(robust=True)\n",
    "\n",
    "results = did.fit(\n",
    "    ck_long,\n",
    "    outcome='employment',\n",
    "    treatment='treated',\n",
    "    time='post'\n",
    ")\n",
    "\n",
    "print(\"Card & Krueger DiD Results\")\n",
    "print(\"=\" * 60)\n",
    "print(results.summary())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-10",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Manual calculation to verify\n",
    "print(\"\\nManual DiD Calculation:\")\n",
    "print(\"-\" * 40)\n",
    "\n",
    "nj_pre = ck_long[(ck_long['state'] == 'NJ') & (ck_long['post'] == 0)]['employment'].mean()\n",
    "nj_post = ck_long[(ck_long['state'] == 'NJ') & (ck_long['post'] == 1)]['employment'].mean()\n",
    "pa_pre = ck_long[(ck_long['state'] == 'PA') & (ck_long['post'] == 0)]['employment'].mean()\n",
    "pa_post = ck_long[(ck_long['state'] == 'PA') & (ck_long['post'] == 1)]['employment'].mean()\n",
    "\n",
    "print(f\"NJ (pre):  {nj_pre:.2f}\")\n",
    "print(f\"NJ (post): {nj_post:.2f}\")\n",
    "print(f\"NJ change: {nj_post - nj_pre:.2f}\")\n",
    "print()\n",
    "print(f\"PA (pre):  {pa_pre:.2f}\")\n",
    "print(f\"PA (post): {pa_post:.2f}\")\n",
    "print(f\"PA change: {pa_post - pa_pre:.2f}\")\n",
    "print()\n",
    "print(f\"DiD estimate: {(nj_post - nj_pre) - (pa_post - pa_pre):.2f}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-11",
   "metadata": {},
   "outputs": [],
   "source": [
    "# With chain fixed effects for better precision\n",
    "did_fe = DifferenceInDifferences(robust=True)\n",
    "\n",
    "results_fe = did_fe.fit(\n",
    "    ck_long,\n",
    "    outcome='employment',\n",
    "    treatment='treated',\n",
    "    time='post',\n",
    "    fixed_effects=['chain']\n",
    ")\n",
    "\n",
    "print(\"DiD with Chain Fixed Effects\")\n",
    "print(\"=\" * 60)\n",
    "print(results_fe.summary())\n",
    "print(f\"\\nNote: Adding chain FE controls for systematic differences across chains.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-12",
   "metadata": {},
   "source": [
    "### Interpretation\n",
    "\n",
    "The DiD estimate suggests that New Jersey's minimum wage increase did **not** lead to a decrease in employment. If anything, the point estimate is slightly positive, though not statistically significant.\n",
    "\n",
    "This result challenged the traditional economic view that minimum wage increases necessarily reduce employment, and sparked extensive debate and follow-up research."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-13",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualization: Employment trends\n",
    "if HAS_MATPLOTLIB:\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(12, 5))\n",
    "    \n",
    "    # Mean employment by state and period\n",
    "    means = ck_long.groupby(['state', 'post'])['employment'].mean().unstack()\n",
    "    means.columns = ['Feb 1992', 'Nov 1992']\n",
    "    \n",
    "    ax = axes[0]\n",
    "    x = [0, 1]\n",
    "    ax.plot(x, means.loc['NJ'], 'o-', label='NJ (Treated)', color='#2ecc71', linewidth=2, markersize=8)\n",
    "    ax.plot(x, means.loc['PA'], 's--', label='PA (Control)', color='#3498db', linewidth=2, markersize=8)\n",
    "    ax.axvline(x=0.5, color='red', linestyle=':', alpha=0.5, label='Min wage increase')\n",
    "    ax.set_xticks([0, 1])\n",
    "    ax.set_xticklabels(['Feb 1992\\n(Pre)', 'Nov 1992\\n(Post)'])\n",
    "    ax.set_ylabel('Mean FTE Employment')\n",
    "    ax.set_title('Employment Trends: NJ vs PA')\n",
    "    ax.legend()\n",
    "    ax.grid(True, alpha=0.3)\n",
    "    \n",
    "    # Distribution of employment changes\n",
    "    ax = axes[1]\n",
    "    nj_changes = ck[ck['state'] == 'NJ']['emp_change'].dropna()\n",
    "    pa_changes = ck[ck['state'] == 'PA']['emp_change'].dropna()\n",
    "    ax.hist(nj_changes, bins=20, alpha=0.6, label='NJ', color='#2ecc71')\n",
    "    ax.hist(pa_changes, bins=20, alpha=0.6, label='PA', color='#3498db')\n",
    "    ax.axvline(nj_changes.mean(), color='#27ae60', linestyle='--', linewidth=2)\n",
    "    ax.axvline(pa_changes.mean(), color='#2980b9', linestyle='--', linewidth=2)\n",
    "    ax.set_xlabel('Employment Change (FTE)')\n",
    "    ax.set_ylabel('Frequency')\n",
    "    ax.set_title('Distribution of Employment Changes')\n",
    "    ax.legend()\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-14",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## 2. Castle Doctrine Laws: Staggered Adoption\n",
    "\n",
    "### Background\n",
    "\n",
    "Castle Doctrine (or \"Stand Your Ground\") laws expand self-defense rights by removing the duty to retreat before using deadly force. These laws were adopted by different U.S. states at different times, creating a **staggered adoption** design.\n",
    "\n",
    "**Research question**: Do Castle Doctrine laws affect homicide rates?\n",
    "\n",
    "**Design**: Staggered DiD\n",
    "- **Treatment**: Adoption of Castle Doctrine law\n",
    "- **Cohorts**: States adopting in 2005, 2006, 2007, 2008, 2009\n",
    "- **Control**: States that never adopted during the study period\n",
    "\n",
    "**Key finding**: Cheng & Hoekstra (2013) found an approximately 8% increase in homicide rates following adoption."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-15",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the Castle Doctrine dataset\n",
    "castle = load_castle_doctrine()\n",
    "\n",
    "print(f\"Dataset shape: {castle.shape}\")\n",
    "print(f\"Years: {castle['year'].min()} to {castle['year'].max()}\")\n",
    "print(f\"States: {castle['state'].nunique()}\")\n",
    "castle.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-16",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Treatment timing\n",
    "cohort_summary = castle.drop_duplicates('state')[['state', 'first_treat']].sort_values('first_treat')\n",
    "\n",
    "print(\"Treatment Cohorts\")\n",
    "print(\"=\" * 40)\n",
    "cohort_counts = cohort_summary.groupby('first_treat').size()\n",
    "for cohort, n in cohort_counts.items():\n",
    "    if cohort == 0:\n",
    "        print(f\"Never treated: {n} states\")\n",
    "    else:\n",
    "        print(f\"Adopted in {cohort}: {n} states\")\n",
    "\n",
    "print(f\"\\nTotal: {len(cohort_summary)} states\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-17",
   "metadata": {},
   "source": [
    "### Why Standard TWFE Fails Here\n",
    "\n",
    "With staggered adoption and potentially heterogeneous treatment effects, traditional TWFE can give biased estimates. Let's see why using the Goodman-Bacon decomposition."
   ]
  },
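  {
   "cell_type": "markdown",
   "id": "cell-17a",
   "metadata": {},
   "source": [
    "To make the concern concrete, here is the textbook static TWFE specification, with $D_{it}$ equal to 1 once state $i$ has adopted the law. The notation is ours, and we assume the `TwoWayFixedEffects` estimator used below targets this regression:\n",
    "\n",
    "$$\n",
    "Y_{it} = \\alpha_i + \\lambda_t + \\beta^{\\text{TWFE}} D_{it} + \\varepsilon_{it}\n",
    "$$\n",
    "\n",
    "Goodman-Bacon (2021) shows that, in a balanced panel without covariates, the OLS estimate is a weighted average of all possible 2x2 DiD comparisons between timing groups:\n",
    "\n",
    "$$\n",
    "\\hat{\\beta}^{\\text{TWFE}} = \\sum_{k} s_k \\hat{\\beta}_k^{2 \\times 2}, \\qquad s_k \\ge 0, \\quad \\sum_{k} s_k = 1\n",
    "$$\n",
    "\n",
    "Some of these comparisons use earlier adopters as controls for later adopters. If effects change with time since adoption, those comparisons can bias $\\hat{\\beta}^{\\text{TWFE}}$."
   ]
  },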
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-18",
   "metadata": {},
   "outputs": [],
   "source": [
    "# TWFE estimation (potentially biased)\n",
    "twfe = TwoWayFixedEffects()\n",
    "\n",
    "# Need to create numeric state IDs for TWFE\n",
    "castle['state_id'] = castle['state'].astype('category').cat.codes\n",
    "\n",
    "results_twfe = twfe.fit(\n",
    "    castle,\n",
    "    outcome='homicide_rate',\n",
    "    treatment='treated',\n",
    "    unit='state_id',\n",
    "    time='year'\n",
    ")\n",
    "\n",
    "print(\"TWFE Results (potentially biased)\")\n",
    "print(\"=\" * 60)\n",
    "print(f\"ATT: {results_twfe.att:.4f}\")\n",
    "print(f\"SE:  {results_twfe.se:.4f}\")\n",
    "print(f\"\\nNote: TWFE may be biased with staggered adoption.\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-19",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Goodman-Bacon decomposition reveals the problem\n",
    "bacon_results = bacon_decompose(\n",
    "    castle,\n",
    "    outcome='homicide_rate',\n",
    "    unit='state',\n",
    "    time='year',\n",
    "    first_treat='first_treat'\n",
    ")\n",
    "\n",
    "bacon_results.print_summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-20",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Visualize the decomposition\n",
    "if HAS_MATPLOTLIB:\n",
    "    fig, axes = plt.subplots(1, 2, figsize=(14, 5))\n",
    "    \n",
    "    plot_bacon(bacon_results, ax=axes[0], plot_type='scatter', show=False)\n",
    "    plot_bacon(bacon_results, ax=axes[1], plot_type='bar', show=False)\n",
    "    \n",
    "    plt.tight_layout()\n",
    "    plt.show()\n",
    "    \n",
    "    forbidden_weight = bacon_results.total_weight_later_vs_earlier\n",
    "    print(f\"\\n{forbidden_weight:.1%} of TWFE weight comes from 'forbidden comparisons'\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-21",
   "metadata": {},
   "source": [
    "### Callaway-Sant'Anna Estimator\n",
    "\n",
    "The CS estimator properly handles staggered adoption by:\n",
    "1. Computing group-time effects ATT(g,t) for each cohort and time period\n",
    "2. Only using not-yet-treated or never-treated units as controls\n",
    "3. Properly aggregating effects"
   ]
  },
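  {
   "cell_type": "markdown",
   "id": "cell-21a",
   "metadata": {},
   "source": [
    "For readers who want the estimand spelled out, the group-time effect from Callaway & Sant'Anna (2021) for cohort $g$ (units first treated in period $g$) at time $t$ is shown below. This is a sketch of the unconditional case with never-treated controls, under parallel trends and no anticipation; the library's internal implementation may differ in details:\n",
    "\n",
    "$$\n",
    "ATT(g,t) = E\\left[Y_t(g) - Y_t(0) \\mid G = g\\right] = E\\left[Y_t - Y_{g-1} \\mid G = g\\right] - E\\left[Y_t - Y_{g-1} \\mid C = 1\\right], \\qquad t \\ge g\n",
    "$$\n",
    "\n",
    "Here $C = 1$ marks never-treated units. An event-study aggregation then averages these effects at a fixed exposure length $e = t - g$, where $w_g(e)$ (our notation) is the share of cohort $g$ among the cohorts observed $e$ periods after adoption:\n",
    "\n",
    "$$\n",
    "\\theta(e) = \\sum_{g} w_g(e) \\cdot ATT(g, g + e)\n",
    "$$"
   ]
  },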
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-22",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Callaway-Sant'Anna estimation\n",
    "cs = CallawaySantAnna(\n",
    "    control_group='never_treated',\n",
    "    n_bootstrap=199,\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "results_cs = cs.fit(\n",
    "    castle,\n",
    "    outcome='homicide_rate',\n",
    "    unit='state',\n",
    "    time='year',\n",
    "    first_treat='first_treat',\n",
    "    aggregate='all'  # Compute all aggregations (simple, event_study, group)\n",
    ")\n",
    "\n",
    "print(results_cs.summary())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-23",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Aggregated Results\n",
    "print(\"Aggregated Results\")\n",
    "print(\"=\" * 60)\n",
    "\n",
    "# Overall ATT (simple aggregation is computed automatically)\n",
    "print(f\"\\nOverall ATT: {results_cs.overall_att:.4f} (SE: {results_cs.overall_se:.4f})\")\n",
    "print(f\"95% CI: [{results_cs.overall_conf_int[0]:.4f}, {results_cs.overall_conf_int[1]:.4f}]\")\n",
    "\n",
    "# By cohort (group_effects is populated when aggregate='group' or 'all')\n",
    "print(\"\\nEffects by Adoption Cohort:\")\n",
    "for cohort in sorted(results_cs.group_effects.keys()):\n",
    "    eff = results_cs.group_effects[cohort]\n",
    "    print(f\"  Cohort {cohort}: {eff['effect']:>7.4f} (SE: {eff['se']:.4f})\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-24",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Event study aggregation (event_study_effects is populated when aggregate='event_study' or 'all')\n",
    "print(\"Event Study Results (Effect by Years Since Adoption)\")\n",
    "print(\"=\" * 60)\n",
    "print(f\"{'Event Time':>12} {'ATT':>10} {'SE':>10} {'95% CI':>25}\")\n",
    "print(\"-\" * 60)\n",
    "\n",
    "for e in sorted(results_cs.event_study_effects.keys()):\n",
    "    eff = results_cs.event_study_effects[e]\n",
    "    ci = eff['conf_int']\n",
    "    sig = '*' if eff['p_value'] < 0.05 else ''\n",
    "    print(f\"{e:>12} {eff['effect']:>10.4f} {eff['se']:>10.4f} [{ci[0]:>8.4f}, {ci[1]:>8.4f}] {sig}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-25",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Event study visualization\n",
    "if HAS_MATPLOTLIB:\n",
    "    fig, ax = plt.subplots(figsize=(10, 6))\n",
    "    plot_event_study(\n",
    "        results=results_cs,\n",
    "        ax=ax,\n",
    "        title='Castle Doctrine Laws: Effect on Homicide Rates',\n",
    "        xlabel='Years Since Law Adoption',\n",
    "        ylabel='Effect on Homicide Rate (per 100k)'\n",
    "    )\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-26",
   "metadata": {},
   "source": [
    "### Robustness Check: Sun-Abraham Estimator\n",
    "\n",
    "Running both CS and Sun-Abraham provides a useful robustness check."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-27",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sun-Abraham estimation\n",
    "sa = SunAbraham(control_group='never_treated')\n",
    "\n",
    "results_sa = sa.fit(\n",
    "    castle,\n",
    "    outcome='homicide_rate',\n",
    "    unit='state',\n",
    "    time='year',\n",
    "    first_treat='first_treat'\n",
    ")\n",
    "\n",
    "results_sa.print_summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-28",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Compare CS and SA\n",
    "cs_name = \"Callaway-Sant'Anna\"\n",
    "sa_name = \"Sun-Abraham\"\n",
    "twfe_name = \"TWFE (potentially biased)\"\n",
    "\n",
    "print(\"Robustness Check: CS vs Sun-Abraham\")\n",
    "print(\"=\" * 60)\n",
    "print(f\"{'Estimator':<25} {'Overall ATT':>15} {'SE':>10}\")\n",
    "print(\"-\" * 60)\n",
    "print(f\"{cs_name:<25} {results_cs.overall_att:>15.4f} {results_cs.overall_se:>10.4f}\")\n",
    "print(f\"{sa_name:<25} {results_sa.overall_att:>15.4f} {results_sa.overall_se:>10.4f}\")\n",
    "print(f\"{twfe_name:<25} {results_twfe.att:>15.4f} {results_twfe.se:>10.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-29",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## 3. Unilateral Divorce Laws: Long Panel with Staggered Adoption\n",
    "\n",
    "### Background\n",
    "\n",
    "Unilateral (no-fault) divorce laws allow one spouse to obtain a divorce without the other's consent. These laws were adopted at different times across U.S. states, primarily between 1969 and 1985.\n",
    "\n",
    "**Research question**: How did unilateral divorce laws affect divorce rates?\n",
    "\n",
    "**Design**: Staggered DiD with long panel\n",
    "- **Treatment**: Adoption of unilateral divorce law\n",
    "- **Time period**: 1968-1988\n",
    "- **Cohorts**: States adopting in different years\n",
    "\n",
    "**Key finding**: Wolfers (2006) found an initial spike in divorce rates that faded over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-30",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load divorce laws dataset\n",
    "divorce = load_divorce_laws()\n",
    "\n",
    "print(f\"Dataset shape: {divorce.shape}\")\n",
    "print(f\"Years: {divorce['year'].min()} to {divorce['year'].max()}\")\n",
    "print(f\"States: {divorce['state'].nunique()}\")\n",
    "divorce.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-31",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Treatment timing distribution\n",
    "cohort_summary = divorce.drop_duplicates('state')[['state', 'first_treat']].sort_values('first_treat')\n",
    "\n",
    "print(\"Adoption Timeline\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "cohort_counts = cohort_summary[cohort_summary['first_treat'] > 0].groupby('first_treat').size()\n",
    "never_treated = (cohort_summary['first_treat'] == 0).sum()\n",
    "\n",
    "for year, n in cohort_counts.items():\n",
    "    print(f\"{year}: {n} state(s)\")\n",
    "print(f\"\\nNever adopted: {never_treated} states\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-32",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Callaway-Sant'Anna estimation\n",
    "cs_divorce = CallawaySantAnna(\n",
    "    control_group='never_treated',\n",
    "    n_bootstrap=199,\n",
    "    seed=42\n",
    ")\n",
    "\n",
    "results_divorce = cs_divorce.fit(\n",
    "    divorce,\n",
    "    outcome='divorce_rate',\n",
    "    unit='state',\n",
    "    time='year',\n",
    "    first_treat='first_treat',\n",
    "    aggregate='all'  # Compute all aggregations (simple, event_study, group)\n",
    ")\n",
    "\n",
    "print(results_divorce.summary())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-33",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Event study results (event_study_effects is populated when aggregate='event_study' or 'all')\n",
    "print(\"Event Study: Effect of Unilateral Divorce on Divorce Rates\")\n",
    "print(\"=\" * 65)\n",
    "print(f\"{'Years Since':>12} {'Effect':>10} {'SE':>10} {'Significant':>12}\")\n",
    "print(\"-\" * 65)\n",
    "\n",
    "for e in sorted(results_divorce.event_study_effects.keys()):\n",
    "    eff = results_divorce.event_study_effects[e]\n",
    "    sig = 'Yes' if eff['p_value'] < 0.05 else 'No'\n",
    "    print(f\"{e:>12} {eff['effect']:>10.4f} {eff['se']:>10.4f} {sig:>12}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-34",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Event study visualization\n",
    "if HAS_MATPLOTLIB:\n",
    "    fig, ax = plt.subplots(figsize=(12, 6))\n",
    "    plot_event_study(\n",
    "        results=results_divorce,\n",
    "        ax=ax,\n",
    "        title='Unilateral Divorce Laws: Effect on Divorce Rates',\n",
    "        xlabel='Years Since Law Adoption',\n",
    "        ylabel='Effect on Divorce Rate (per 1,000)'\n",
    "    )\n",
    "    plt.tight_layout()\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-35",
   "metadata": {},
   "source": [
    "### Dynamic Effects Pattern\n",
    "\n",
    "Notice the pattern in the event study:\n",
    "1. **Pre-treatment**: Effects near zero (validating parallel trends)\n",
    "2. **Short-run**: Spike in divorce rates immediately after adoption\n",
    "3. **Medium-run**: Effects diminish over time\n",
    "4. **Long-run**: Effects may return close to zero\n",
    "\n",
    "This \"spike and fade\" pattern was documented by Wolfers (2006) and suggests that unilateral divorce primarily moved forward divorces that would have happened anyway (\"harvesting effect\")."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cell-36",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Effects by cohort (group_effects is populated when aggregate='group' or 'all')\n",
    "print(\"Effects by Adoption Cohort\")\n",
    "print(\"=\" * 50)\n",
    "\n",
    "for cohort in sorted(results_divorce.group_effects.keys()):\n",
    "    eff = results_divorce.group_effects[cohort]\n",
    "    sig = '*' if eff['p_value'] < 0.05 else ''\n",
    "    print(f\"Cohort {cohort}: {eff['effect']:>7.4f} (SE: {eff['se']:.4f}) {sig}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cell-37",
   "metadata": {},
   "source": [
    "---\n",
    "\n",
    "## Summary\n",
    "\n",
    "### Key Takeaways\n",
    "\n",
    "1. **Card-Krueger (1994)**\n",
    "   - Classic 2x2 DiD design\n",
    "   - Simple before/after, treatment/control comparison\n",
    "   - Key insight: Minimum wage increases don't necessarily reduce employment\n",
    "\n",
    "2. **Castle Doctrine Laws**\n",
    "   - Staggered adoption across states\n",
    "   - TWFE can be biased; use CS or Sun-Abraham\n",
    "   - Bacon decomposition reveals the problem with TWFE\n",
    "   - Finding: Laws associated with increased homicide rates\n",
    "\n",
    "3. **Unilateral Divorce Laws**\n",
    "   - Long panel with many cohorts\n",
    "   - Dynamic treatment effects (spike and fade)\n",
    "   - Event study reveals time-varying patterns\n",
    "\n",
    "### When to Use Which Estimator\n",
    "\n",
    "| Design | Recommended Estimator |\n",
    "|--------|----------------------|\n",
    "| Classic 2x2 | `DifferenceInDifferences` |\n",
    "| Panel with 2 periods | `DifferenceInDifferences` or `TwoWayFixedEffects` |\n",
    "| Staggered adoption | `CallawaySantAnna` or `SunAbraham` |\n",
    "| Heterogeneous timing | Always use `CallawaySantAnna` / `SunAbraham` |\n",
    "| Few never-treated | `CallawaySantAnna(control_group='not_yet_treated')` |\n",
    "\n",
    "### References\n",
    "\n",
    "- Card, D., & Krueger, A. B. (1994). Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. *American Economic Review*, 84(4), 772-793.\n",
    "\n",
    "- Cheng, C., & Hoekstra, M. (2013). Does Strengthening Self-Defense Law Deter Crime or Escalate Violence? Evidence from Expansions to Castle Doctrine. *Journal of Human Resources*, 48(3), 821-854.\n",
    "\n",
    "- Stevenson, B., & Wolfers, J. (2006). Bargaining in the Shadow of the Law: Divorce Laws and Family Distress. *Quarterly Journal of Economics*, 121(1), 267-288.\n",
    "\n",
    "- Wolfers, J. (2006). Did Unilateral Divorce Laws Raise Divorce Rates? A Reconciliation and New Results. *American Economic Review*, 96(5), 1802-1820.\n",
    "\n",
    "- Callaway, B., & Sant'Anna, P. H. (2021). Difference-in-differences with multiple time periods. *Journal of Econometrics*, 225(2), 200-230.\n",
    "\n",
    "- Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. *Journal of Econometrics*, 225(2), 254-277."
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}