Basic Difference-in-Differences with diff-diff

This notebook demonstrates how to use the diff-diff library for basic 2x2 Difference-in-Differences (DiD) analysis. We’ll cover:

  1. Setting up a basic DiD estimation

  2. Using both column-name and formula interfaces

  3. Interpreting results

  4. Adding covariates

  5. Using fixed effects

  6. Two-Way Fixed Effects for panel data

  7. Cluster-robust and wild bootstrap inference

  8. Exporting results

[ ]:
import numpy as np
import pandas as pd
from diff_diff import DifferenceInDifferences, TwoWayFixedEffects
from diff_diff.prep import generate_did_data

1. Generate Sample Data

The generate_did_data function creates synthetic panel data with a known treatment effect, which is useful for learning and testing.

[ ]:
# Generate synthetic DiD data with known ATT of 5.0
data = generate_did_data(
    n_units=100,
    n_periods=2,
    treatment_effect=5.0,
    treatment_fraction=0.5,
    treatment_period=1,  # Period 1 is post-treatment (periods are 0 and 1)
    noise_sd=1.0,
    seed=42
)

print(f"Dataset shape: {data.shape}")
data.head(10)

[ ]:
# Examine the data structure
print("Treatment and time distribution:")
print(data.groupby(['treated', 'post']).size().unstack(fill_value=0))

2. Basic DiD Estimation

The DifferenceInDifferences estimator provides an sklearn-like interface with a fit() method.

[ ]:
# Create the estimator
did = DifferenceInDifferences()

# Fit using column names
results = did.fit(
    data,
    outcome="outcome",
    treatment="treated",
    time="post"
)

# Print the summary
print(results.summary())
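To make the estimand concrete, the 2x2 DiD estimate can be reproduced by hand from four group means: the before/after change in the treated group minus the before/after change in the control group. The sketch below is a minimal illustration on a tiny made-up dataset. It does not use diff-diff, and the helper name `did_from_means` is hypothetical.

```python
from statistics import mean

# Tiny made-up dataset: (treated, post, outcome). Values are illustrative only.
rows = [
    (0, 0, 10.0), (0, 0, 12.0),   # control, pre
    (0, 1, 11.0), (0, 1, 13.0),   # control, post
    (1, 0, 14.0), (1, 0, 16.0),   # treated, pre
    (1, 1, 20.0), (1, 1, 22.0),   # treated, post
]

def did_from_means(rows):
    """Hypothetical helper: 2x2 DiD as a difference of before/after changes."""
    def cell_mean(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)

    change_treated = cell_mean(1, 1) - cell_mean(1, 0)   # 21 - 15 = 6
    change_control = cell_mean(0, 1) - cell_mean(0, 0)   # 12 - 11 = 1
    return change_treated - change_control

print(did_from_means(rows))  # 5.0
```

With no covariates or fixed effects, a regression-based 2x2 DiD estimate should match this difference of mean changes.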

Understanding the Results

The key results are:

  • ATT (Average Treatment Effect on the Treated): The estimated causal effect of the treatment

  • SE: Standard error of the estimate

  • t-stat: T-statistic for testing H0: ATT = 0

  • p-value: Two-sided p-value

  • 95% CI: Confidence interval for the ATT
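These quantities are tied together by the estimate and its standard error. The sketch below shows the relationship under a normal approximation, which is an assumption made here for simplicity: diff-diff may well use a t distribution with residual degrees of freedom, so its reported p-values and intervals can differ slightly. The helper name `wald_summary` is hypothetical.

```python
from statistics import NormalDist

def wald_summary(att, se, alpha=0.05):
    """Hypothetical helper: normal-approximation test statistic, p-value and CI."""
    stat = att / se                                      # test statistic for H0: ATT = 0
    p_value = 2 * (1 - NormalDist().cdf(abs(stat)))      # two-sided p-value
    crit = NormalDist().inv_cdf(1 - alpha / 2)           # about 1.96 for alpha = 0.05
    return stat, p_value, (att - crit * se, att + crit * se)

# Illustrative numbers, not output from the estimator above.
stat, p, ci = wald_summary(att=5.0, se=2.0)
print(f"stat={stat:.2f}, p={p:.4f}, CI=[{ci[0]:.2f}, {ci[1]:.2f}]")
```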

[ ]:
# Access individual components
print(f"Estimated ATT: {results.att:.4f}")
print("True ATT: 5.0")
print(f"Standard Error: {results.se:.4f}")
print(f"95% CI: [{results.conf_int[0]:.4f}, {results.conf_int[1]:.4f}]")
print(f"P-value: {results.p_value:.4f}")
print(f"Is significant at 5% level: {results.is_significant}")
print(f"Significance stars: {results.significance_stars}")

3. Using the Formula Interface

If you are familiar with R, you can instead specify the model with diff-diff’s R-style formula interface.

[ ]:
# Using formula interface (R-style)
did_formula = DifferenceInDifferences()
results_formula = did_formula.fit(
    data,
    formula="outcome ~ treated * post"
)

print(results_formula.summary())

[ ]:
# Verify both methods give the same result
print(f"Column-name ATT: {results.att:.6f}")
print(f"Formula ATT: {results_formula.att:.6f}")
print(f"Difference: {abs(results.att - results_formula.att):.2e}")
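In R-style formulas, `treated * post` is shorthand for `treated + post + treated:post`, and the coefficient on the interaction term is the DiD estimate. The sketch below builds that design matrix by hand with NumPy on a made-up dataset and solves it by least squares. It illustrates the notation only and does not show how diff-diff parses formulas internally.

```python
import numpy as np

# Made-up 2x2 data: two observations per cell.
treated = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
post = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
outcome = np.array([10, 12, 11, 13, 14, 16, 20, 22], dtype=float)

# Columns: intercept, treated, post, treated:post
X = np.column_stack([np.ones_like(treated), treated, post, treated * post])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

names = ["Intercept", "treated", "post", "treated:post"]
for name, value in zip(names, coef):
    print(f"{name:>12}: {value:.4f}")
```

Here the intercept is the control group's pre-period mean (11), `treated` is the pre-period gap between groups (4), `post` is the control group's change (1), and `treated:post` is the DiD estimate (5).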

4. Adding Covariates

You can include additional control variables to improve precision and reduce bias from observed confounders.

[ ]:
# Add two illustrative covariates (pure noise here, unrelated to the outcome)
np.random.seed(42)
data['size'] = np.random.normal(100, 20, len(data))
data['age'] = np.random.normal(10, 3, len(data))

# Fit with covariates
did_cov = DifferenceInDifferences()
results_cov = did_cov.fit(
    data,
    outcome="outcome",
    treatment="treated",
    time="post",
    covariates=["size", "age"]
)

print(results_cov.summary())

[ ]:
# All coefficient estimates are available
print("All coefficients:")
for name, coef in results_cov.coefficients.items():
    print(f"  {name}: {coef:.4f}")
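The precision gain from covariates only appears when they actually explain variation in the outcome. The sketch below simulates such a case with plain NumPy and classical (homoskedastic) OLS standard errors. The data-generating process, the coefficient values, and the helper name `ols_se` are all assumptions made for illustration, and none of this uses diff-diff.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
treated = rng.integers(0, 2, n).astype(float)
post = rng.integers(0, 2, n).astype(float)
x = rng.normal(size=n)                      # covariate that drives the outcome
y = 1.0 + 2.0 * treated * post + 3.0 * x + rng.normal(size=n)

def ols_se(X, y):
    """Hypothetical helper: OLS coefficients and classical standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta, np.sqrt(sigma2 * np.diag(XtX_inv))

# Design without and with the covariate; column 3 is the interaction term.
base = np.column_stack([np.ones(n), treated, post, treated * post])
_, se_without = ols_se(base, y)
_, se_with = ols_se(np.column_stack([base, x]), y)

print(f"SE of interaction without covariate: {se_without[3]:.3f}")
print(f"SE of interaction with covariate:    {se_with[3]:.3f}")
```

Because the covariate absorbs most of the residual variance, the standard error of the interaction term shrinks. Noise covariates like `size` and `age` above would not be expected to help in the same way.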

5. Fixed Effects

Fixed effects control for time-invariant unobserved heterogeneity. diff-diff supports two approaches:

  1. Dummy variables (fixed_effects): Creates indicator variables for each level

  2. Within-transformation (absorb): Demeans data by group (more efficient for high-dimensional FE)
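The two approaches give the same slope estimates; they differ in how much work the solver has to do. The sketch below checks this equivalence (the Frisch-Waugh-Lovell result) on simulated data with plain NumPy. It is an illustration of the idea, not diff-diff's implementation, and the variable names are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per_group = 5, 20
group = np.repeat(np.arange(n_groups), n_per_group)
x = rng.normal(size=group.size) + group          # regressor correlated with group
y = 2.0 * x + 3.0 * group + rng.normal(size=group.size)

# Approach 1: one indicator column per group (no separate intercept).
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
beta_dummy, *_ = np.linalg.lstsq(np.column_stack([x, dummies]), y, rcond=None)

# Approach 2: subtract group means from x and y, then regress without dummies.
def demean(v):
    group_means = np.bincount(group, weights=v) / np.bincount(group)
    return v - group_means[group]

x_w, y_w = demean(x), demean(y)
beta_within = (x_w @ y_w) / (x_w @ x_w)

print(f"dummy-variable slope: {beta_dummy[0]:.6f}")
print(f"within slope:         {beta_within:.6f}")
```

The dummy approach estimates one extra coefficient per group, while the within approach never forms those columns, which is why it scales better when there are many groups.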

[ ]:
# Generate data with more structure
np.random.seed(42)
n_units = 50
n_periods = 4

panel_data = []
for unit in range(n_units):
    is_treated = unit < n_units // 2
    state = unit % 5  # 5 states
    unit_effect = np.random.normal(0, 2)

    for period in range(n_periods):
        post = 1 if period >= 2 else 0
        y = 10.0 + unit_effect + period * 0.5 + state * 1.5
        if is_treated and post:
            y += 4.0  # True ATT = 4.0
        y += np.random.normal(0, 0.5)

        panel_data.append({
            'unit': unit,
            'state': f'state_{state}',
            'period': period,
            'treated': int(is_treated),
            'post': post,
            'outcome': y
        })

panel_df = pd.DataFrame(panel_data)
print(f"Panel data: {panel_df.shape[0]} observations")
panel_df.head()

[ ]:
# Using fixed effects with dummy variables
did_fe = DifferenceInDifferences()
results_fe = did_fe.fit(
    panel_df,
    outcome="outcome",
    treatment="treated",
    time="post",
    fixed_effects=["state"]
)

print(results_fe.summary())

[ ]:
# Using absorbed fixed effects (within-transformation)
# This is more efficient for high-dimensional fixed effects
did_absorb = DifferenceInDifferences()
results_absorb = did_absorb.fit(
    panel_df,
    outcome="outcome",
    treatment="treated",
    time="post",
    absorb=["unit"]  # Absorb unit fixed effects
)

print(results_absorb.summary())

6. Two-Way Fixed Effects (TWFE)

For panel data, the TwoWayFixedEffects estimator automatically includes both unit and time fixed effects using within-transformation.

[ ]:
# Two-Way Fixed Effects estimator
twfe = TwoWayFixedEffects()
results_twfe = twfe.fit(
    panel_df,
    outcome="outcome",
    treatment="treated",
    time="period",  # Use actual time periods
    unit="unit"
)

print(results_twfe.summary())
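For a balanced panel, the two-way within-transformation subtracts each unit's mean and each period's mean and adds back the grand mean. The sketch below applies it with NumPy to simulated data shaped like `panel_df` (50 units, 4 periods, treatment from period 2) and checks that, with a single common treatment date, the resulting coefficient equals the DiD of group-by-period means. This is a hand-rolled illustration under those assumptions, not the library's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 4
treated = (np.arange(n_units) < n_units // 2).astype(float)   # first half treated
post = (np.arange(n_periods) >= 2).astype(float)              # periods 2 and 3 are post
d = np.outer(treated, post)                                   # treatment indicator D_it

unit_fe = rng.normal(0, 2, size=(n_units, 1))
time_fe = 0.5 * np.arange(n_periods)
y = 10.0 + unit_fe + time_fe + 4.0 * d + rng.normal(0, 0.5, size=d.shape)

def two_way_demean(a):
    """Subtract unit and period means, add back the grand mean (balanced panel)."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

d_w, y_w = two_way_demean(d), two_way_demean(y)
beta_twfe = (d_w * y_w).sum() / (d_w ** 2).sum()

# Same quantity from the four group-by-period means.
is_t, is_p = treated == 1, post == 1
did_means = (
    (y[is_t][:, is_p].mean() - y[is_t][:, ~is_p].mean())
    - (y[~is_t][:, is_p].mean() - y[~is_t][:, ~is_p].mean())
)

print(f"TWFE coefficient: {beta_twfe:.6f}")
print(f"DiD of means:     {did_means:.6f}")
```

The equality relies on the panel being balanced and on all treated units starting treatment in the same period.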

7. Robust Inference

Cluster-Robust Standard Errors

When observations are correlated within clusters (e.g., units over time), use cluster-robust standard errors.

[ ]:
# Create clustered data
np.random.seed(42)
n_clusters = 20
obs_per_cluster = 10

clustered_data = []
for cluster in range(n_clusters):
    is_treated = cluster < n_clusters // 2
    cluster_effect = np.random.normal(0, 2)

    for obs in range(obs_per_cluster):
        for period in [0, 1]:
            y = 10.0 + cluster_effect
            if period == 1:
                y += 3.0
            if is_treated and period == 1:
                y += 2.5  # True ATT = 2.5
            y += np.random.normal(0, 0.5)

            clustered_data.append({
                'cluster': cluster,
                'obs': obs,
                'period': period,
                'treated': int(is_treated),
                'post': period,
                'outcome': y
            })

clustered_df = pd.DataFrame(clustered_data)
print(f"Clustered data: {clustered_df.shape[0]} observations in {n_clusters} clusters")

[ ]:
# Compare standard errors: robust vs cluster-robust
did_robust = DifferenceInDifferences(robust=True)
did_cluster = DifferenceInDifferences(cluster="cluster")

results_robust = did_robust.fit(
    clustered_df,
    outcome="outcome",
    treatment="treated",
    time="post"
)

results_cluster = did_cluster.fit(
    clustered_df,
    outcome="outcome",
    treatment="treated",
    time="post"
)

print(f"ATT (both methods): {results_robust.att:.4f}")
print(f"Robust SE (HC1): {results_robust.se:.4f}")
print(f"Cluster-robust SE: {results_cluster.se:.4f}")
print(f"\nRatio of cluster-robust to robust SE: {results_cluster.se / results_robust.se:.2f}")
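A cluster-robust variance estimate has the usual sandwich form, with the "meat" built from within-cluster sums of regressors times residuals. The sketch below is a minimal NumPy version on data simulated to mirror the clustered design above. It omits the finite-sample corrections that many implementations apply, so its numbers need not match diff-diff's, and the function name `cluster_robust_se` is hypothetical.

```python
import numpy as np

def cluster_robust_se(X, y, clusters):
    """Hypothetical helper: sandwich SEs clustered on `clusters`, no correction."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]        # sum of x_i * u_i within cluster g
        meat += np.outer(score, score)
    cov = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Simulated data: 20 clusters, 10 observations each, observed in periods 0 and 1.
rng = np.random.default_rng(42)
n_clusters, obs_per_cluster = 20, 10
cluster = np.repeat(np.arange(n_clusters), obs_per_cluster * 2)
post = np.tile([0.0, 1.0], n_clusters * obs_per_cluster)
treated = (cluster < n_clusters // 2).astype(float)
y = (10.0 + rng.normal(0, 2, n_clusters)[cluster] + 3.0 * post
     + 2.5 * treated * post + rng.normal(0, 0.5, cluster.size))

X = np.column_stack([np.ones_like(post), treated, post, treated * post])
beta, se = cluster_robust_se(X, y, cluster)
print(f"ATT estimate: {beta[3]:.4f}, cluster-robust SE: {se[3]:.4f}")
```

Clustering can move the standard error in either direction: in a design like this one, the shared cluster effect cancels out of the before/after comparison, so the cluster-robust SE of the interaction need not exceed the heteroskedasticity-robust one.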

Wild Cluster Bootstrap

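With few clusters (fewer than about 50, as a rule of thumb), analytical cluster-robust standard errors tend to be unreliable. The wild cluster bootstrap is a common alternative in that setting.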

[ ]:
# Wild cluster bootstrap inference
did_bootstrap = DifferenceInDifferences(
    cluster="cluster",
    inference="wild_bootstrap",
    n_bootstrap=999,
    bootstrap_weights="rademacher",
    seed=42
)

results_bootstrap = did_bootstrap.fit(
    clustered_df,
    outcome="outcome",
    treatment="treated",
    time="post"
)

print(results_bootstrap.summary())

[ ]:
# Compare inference methods
print("Comparison of inference methods:")
# The method column is 28 wide so the longest label (27 characters) stays aligned
print(f"{'Method':<28} {'SE':>10} {'p-value':>10} {'95% CI':>25}")
print("-" * 76)
print(f"{'Cluster-robust (analytical)':<28} {results_cluster.se:>10.4f} {results_cluster.p_value:>10.4f} [{results_cluster.conf_int[0]:>8.4f}, {results_cluster.conf_int[1]:>8.4f}]")
print(f"{'Wild cluster bootstrap':<28} {results_bootstrap.se:>10.4f} {results_bootstrap.p_value:>10.4f} [{results_bootstrap.conf_int[0]:>8.4f}, {results_bootstrap.conf_int[1]:>8.4f}]")
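To show what a wild cluster bootstrap does, here is a sketch of one common variant: a bootstrap-t test with the null hypothesis imposed and Rademacher weights (one random sign flip per cluster). Whether diff-diff uses exactly this variant is not stated above, so treat this as an assumption. The function names, the simple p-value formula, and the simulated data are all illustrative.

```python
import numpy as np

def fit_cluster(X, y, cluster_ids):
    """OLS coefficients and cluster-robust SEs (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cluster_ids):
        score = X[cluster_ids == g].T @ resid[cluster_ids == g]
        meat += np.outer(score, score)
    return beta, np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

def wild_cluster_bootstrap_p(X, y, cluster_ids, coef_index, n_bootstrap=999, seed=0):
    """Bootstrap-t p-value for H0: beta[coef_index] = 0 with Rademacher weights."""
    rng = np.random.default_rng(seed)
    beta, se = fit_cluster(X, y, cluster_ids)
    t_obs = beta[coef_index] / se[coef_index]

    # Impose the null: refit without the tested column and keep those residuals.
    X_null = np.delete(X, coef_index, axis=1)
    beta_null, *_ = np.linalg.lstsq(X_null, y, rcond=None)
    fitted_null = X_null @ beta_null
    resid_null = y - fitted_null

    # Map each observation to a 0..G-1 cluster position for weight lookup.
    _, cluster_pos = np.unique(cluster_ids, return_inverse=True)
    n_clusters = cluster_pos.max() + 1

    t_boot = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        weights = rng.choice([-1.0, 1.0], size=n_clusters)   # one sign per cluster
        y_star = fitted_null + weights[cluster_pos] * resid_null
        beta_b, se_b = fit_cluster(X, y_star, cluster_ids)
        t_boot[b] = beta_b[coef_index] / se_b[coef_index]
    return float(np.mean(np.abs(t_boot) >= abs(t_obs)))

# Simulated data: 20 clusters, 10 observations each, observed in periods 0 and 1.
rng = np.random.default_rng(42)
n_clusters, obs_per_cluster = 20, 10
cluster = np.repeat(np.arange(n_clusters), obs_per_cluster * 2)
post = np.tile([0.0, 1.0], n_clusters * obs_per_cluster)
treated = (cluster < n_clusters // 2).astype(float)
y = (10.0 + rng.normal(0, 2, n_clusters)[cluster] + 3.0 * post
     + 2.5 * treated * post + rng.normal(0, 0.5, cluster.size))
X = np.column_stack([np.ones_like(post), treated, post, treated * post])

p_value = wild_cluster_bootstrap_p(X, y, cluster, coef_index=3, n_bootstrap=199)
print(f"Wild cluster bootstrap p-value: {p_value:.4f}")
```

Resampling whole clusters' residual signs keeps the within-cluster dependence intact, which is what makes the procedure suitable when there are too few clusters for the analytical formula to be trusted.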

8. Exporting Results

Results can be exported to various formats for reporting.

[ ]:
# Export to dictionary
result_dict = results.to_dict()
print("As dictionary:")
for key, value in result_dict.items():
    if isinstance(value, float):
        print(f"  {key}: {value:.4f}")
    else:
        print(f"  {key}: {value}")

[ ]:
# Export to DataFrame (useful for combining multiple estimates)
result_df = results.to_dataframe()
print("\nAs DataFrame:")
result_df

Summary

In this notebook, we covered:

  • Basic DiD estimation with both column-name and formula interfaces

  • Adding covariates to control for observed confounders

  • Fixed effects using dummy variables or within-transformation

  • Two-Way Fixed Effects for panel data

  • Cluster-robust standard errors for correlated observations

  • Wild cluster bootstrap for robust inference with few clusters

For more advanced topics, see the other example notebooks:

  • 02_staggered_did.ipynb - Staggered adoption with Callaway-Sant’Anna

  • 03_synthetic_did.ipynb - Synthetic Difference-in-Differences

  • 04_parallel_trends.ipynb - Testing and diagnostics