Interactive notebook

This tutorial is a Jupyter notebook. You can view it on GitHub or download it to run locally.

Synthetic Difference-in-Differences (SDID)#

This notebook demonstrates Synthetic Difference-in-Differences (Arkhangelsky et al., 2021), which combines:

Synthetic Control: Reweight control units to match treated units in pre-treatment periods
Difference-in-Differences: Use time variation to control for unobserved confounders

SDID is particularly useful when:

You have few treated units (even just one)
You want to construct a better counterfactual than simple averaging
You’re concerned about parallel trends violations

We’ll cover:

When to use Synthetic DiD
Basic estimation
Understanding unit and time weights
Inference (bootstrap and placebo)
Tuning regularization

[ ]:

import numpy as np
import pandas as pd
from diff_diff import SyntheticDiD, DifferenceInDifferences

# For nicer plots (optional)
try:
    import matplotlib.pyplot as plt
    plt.style.use('seaborn-v0_8-whitegrid')
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
    print("matplotlib not installed - visualization examples will be skipped")

1. When to Use Synthetic DiD#

Consider SDID when:

You have few treated units (1-10 is common)
You have a reasonably long pre-treatment period (5+ periods ideal)
You have many potential control units
Standard DiD assumptions (parallel trends) may be questionable

[ ]:

# Construct panel data with explicit unit characteristics
# Only observation-level noise is random — structural properties are deterministic
np.random.seed(42)

n_treated = 3
n_control = 40
n_pre = 10
n_post = 4
n_periods = n_pre + n_post
true_att = 5.0

# Control unit design:
# - "Similar" controls (10 units): intercepts near 50, trends near 2.0
#   These will be identified by SDID and receive high weights
# - "Diverse" controls (30 units): wide intercept range, lower trends
#   These bias standard DiD but SDID correctly down-weights them
similar_intercepts = np.linspace(45, 55, 10)
similar_trends = np.linspace(1.7, 2.3, 10)

diverse_intercepts = np.linspace(30, 70, 30)
diverse_trends = np.linspace(0.3, 1.0, 30)

control_intercepts = np.concatenate([similar_intercepts, diverse_intercepts])
control_trends = np.concatenate([similar_trends, diverse_trends])

# Treated units: intercepts near 50, trends around 2.0
treated_intercepts = [48, 50, 52]
treated_trends = [1.8, 2.0, 2.2]

data = []

# Control units
for i in range(n_control):
    for period in range(n_periods):
        y = control_intercepts[i] + control_trends[i] * period + np.random.normal(0, 0.5)
        data.append({
            'unit': i,
            'period': period,
            'treated': 0,
            'outcome': y
        })

# Treated units
for i in range(n_treated):
    unit_id = n_control + i
    for period in range(n_periods):
        y = treated_intercepts[i] + treated_trends[i] * period
        if period >= n_pre:
            y += true_att
        y += np.random.normal(0, 0.5)
        data.append({
            'unit': unit_id,
            'period': period,
            'treated': 1,
            'outcome': y
        })

df = pd.DataFrame(data)
print(f"Dataset: {len(df)} observations")
print(f"Treated units: {n_treated}")
print(f"Control units: {n_control}")
print(f"Pre-treatment periods: {n_pre}")
print(f"Post-treatment periods: {n_post}")

[ ]:

if HAS_MATPLOTLIB:
    # Visualize the data
    fig, ax = plt.subplots(figsize=(12, 6))

    # Plot control units (gray, thin lines)
    for unit in df[df['treated'] == 0]['unit'].unique():
        unit_data = df[df['unit'] == unit]
        ax.plot(unit_data['period'], unit_data['outcome'],
                color='gray', alpha=0.3, linewidth=0.5)

    # Plot treated units (colored, thick lines)
    colors = ['red', 'orange', 'darkred']
    for i, unit in enumerate(df[df['treated'] == 1]['unit'].unique()):
        unit_data = df[df['unit'] == unit]
        ax.plot(unit_data['period'], unit_data['outcome'],
                color=colors[i], linewidth=2, label=f'Treated {i+1}')

    # Mark treatment time
    ax.axvline(x=n_pre - 0.5, color='black', linestyle='--', label='Treatment')

    ax.set_xlabel('Period')
    ax.set_ylabel('Outcome')
    ax.set_title('Panel Data: Treated Units vs Control Units')
    ax.legend(loc='upper left')
    plt.tight_layout()
    plt.show()

2. Basic Synthetic DiD Estimation#

[ ]:

# Fit Synthetic DiD
sdid = SyntheticDiD(
    n_bootstrap=999,  # Number of bootstrap replications
    seed=42
)

results = sdid.fit(
    df,
    outcome="outcome",
    treatment="treated",
    unit="unit",
    time="period",
    post_periods=list(range(n_pre, n_periods))  # Post-treatment periods
)

print(results.summary())

3. Comparing SDID with Standard DiD#

Let’s compare the Synthetic DiD estimate with standard DiD to see the difference.

[ ]:

# Create post indicator for standard DiD
df['post'] = (df['period'] >= n_pre).astype(int)

# Standard DiD
did = DifferenceInDifferences()
results_did = did.fit(
    df,
    outcome="outcome",
    treatment="treated",
    time="post"
)

print("Comparison of Estimators")
print("=" * 50)
print(f"True ATT: {true_att}")
print(f"")
print(f"Standard DiD:")
print(f"  ATT: {results_did.att:.4f}")
print(f"  SE:  {results_did.se:.4f}")
print(f"  Bias: {results_did.att - true_att:.4f}")
print(f"")
print(f"Synthetic DiD:")
print(f"  ATT: {results.att:.4f}")
print(f"  SE:  {results.se:.4f}")
print(f"  Bias: {results.att - true_att:.4f}")

4. Understanding Unit Weights#

SDID assigns weights to control units based on how well they match the treated units in the pre-treatment period. Units with higher weights are more similar to the treated units.

[ ]:

# View unit weights
weights_df = results.get_unit_weights_df()
print("Top 10 control units by weight:")
print(weights_df.sort_values('weight', ascending=False).head(10))

[ ]:

# Check weight properties
print(f"\nWeight statistics:")
print(f"  Sum of weights: {weights_df['weight'].sum():.6f}")
print(f"  Number of non-zero weights: {(weights_df['weight'] > 0.01).sum()}")
print(f"  Max weight: {weights_df['weight'].max():.4f}")
print(f"  Effective number of controls: {1 / (weights_df['weight'] ** 2).sum():.1f}")

[ ]:

if HAS_MATPLOTLIB:
    # Visualize unit weights
    fig, ax = plt.subplots(figsize=(12, 5))

    sorted_weights = weights_df.sort_values('weight', ascending=True)
    ax.barh(range(len(sorted_weights)), sorted_weights['weight'])
    ax.set_yticks(range(len(sorted_weights)))
    ax.set_yticklabels(sorted_weights['unit'])
    ax.set_xlabel('Weight')
    ax.set_ylabel('Control Unit')
    ax.set_title('Synthetic Control Unit Weights')
    plt.tight_layout()
    plt.show()

5. Understanding Time Weights#

SDID also computes time weights that determine how much each pre-treatment period contributes to the baseline. Periods where treated and control outcomes are more similar get higher weight.

[ ]:

# View time weights
time_weights_df = results.get_time_weights_df()
print("Time weights:")
print(time_weights_df)

[ ]:

if HAS_MATPLOTLIB:
    # Visualize time weights
    fig, ax = plt.subplots(figsize=(10, 5))

    ax.bar(time_weights_df['period'], time_weights_df['weight'])
    ax.set_xlabel('Pre-treatment Period')
    ax.set_ylabel('Weight')
    ax.set_title('Time Weights for Pre-treatment Periods')
    ax.set_xticks(time_weights_df['period'])
    plt.tight_layout()
    plt.show()

6. Pre-treatment Fit#

A key diagnostic is how well the synthetic control matches the treated units in the pre-treatment period.

[ ]:

print(f"Pre-treatment fit (RMSE): {results.pre_treatment_fit:.4f}")
print(f"\nLower values indicate better fit.")

[ ]:

if HAS_MATPLOTLIB:
    # Compare treated vs synthetic control trajectories
    fig, ax = plt.subplots(figsize=(12, 6))

    # Compute weighted control outcome
    weights_dict = dict(zip(weights_df['unit'], weights_df['weight']))

    # Get treated mean
    treated_mean = df[df['treated'] == 1].groupby('period')['outcome'].mean()

    # Get synthetic control (weighted average of controls)
    control_data = df[df['treated'] == 0].copy()
    control_data['weight'] = control_data['unit'].map(weights_dict)
    synthetic = control_data.groupby('period').apply(
        lambda x: np.average(x['outcome'], weights=x['weight'])
    )

    # Simple average control
    simple_control = df[df['treated'] == 0].groupby('period')['outcome'].mean()

    ax.plot(treated_mean.index, treated_mean.values, 'o-',
            linewidth=2, markersize=8, color='red', label='Treated')
    ax.plot(synthetic.index, synthetic.values, 's--',
            linewidth=2, markersize=8, color='blue', label='Synthetic Control')
    ax.plot(simple_control.index, simple_control.values, '^:',
            linewidth=1, markersize=6, color='gray', alpha=0.7, label='Simple Average Control')

    ax.axvline(x=n_pre - 0.5, color='black', linestyle='--', alpha=0.5)
    ax.fill_between([n_pre - 0.5, n_periods - 0.5], ax.get_ylim()[0], ax.get_ylim()[1],
                    alpha=0.1, color='green')

    ax.set_xlabel('Period')
    ax.set_ylabel('Outcome')
    ax.set_title('Treated vs Synthetic Control')
    ax.legend()
    plt.tight_layout()
    plt.show()

7. Inference Methods#

SDID supports three inference methods:

Placebo (variance_method="placebo", default): Placebo-based variance using Algorithm 4 from Arkhangelsky et al. (2021). Library default (R’s default is bootstrap — we deviate because placebo is unconditionally available on pweight-only survey designs and sidesteps the refit bootstrap slowdown).
Bootstrap (variance_method="bootstrap"): Paper-faithful pairs bootstrap (Algorithm 2 step 2) — re-estimates ω and λ via Frank-Wolfe on each draw. Matches R’s default synthdid::vcov(method="bootstrap") behavior. Expect ~5–30× slower per fit than placebo (panel-size dependent).
Jackknife (variance_method="jackknife"): Algorithm 3 — fixed-weight leave-one-out. Deterministic; no bootstrap replications.

[ ]:

# Placebo-based inference
sdid_placebo = SyntheticDiD(
    variance_method="placebo",  # Use placebo inference
    n_bootstrap=200,  # Number of placebo replications
    seed=42
)

results_placebo = sdid_placebo.fit(
    df,
    outcome="outcome",
    treatment="treated",
    unit="unit",
    time="period",
    post_periods=list(range(n_pre, n_periods))
)

print("Placebo-based inference:")
print(f"ATT: {results_placebo.att:.4f}")
print(f"SE: {results_placebo.se:.4f}")
print(f"Number of placebo effects: {len(results_placebo.variance_effects)}")

[ ]:

if HAS_MATPLOTLIB:
    # Visualize placebo distribution
    fig, ax = plt.subplots(figsize=(10, 6))

    ax.hist(results_placebo.variance_effects, bins=20, alpha=0.7,
            edgecolor='black', label='Placebo effects')
    ax.axvline(x=results_placebo.att, color='red', linewidth=2,
               linestyle='--', label=f'Actual ATT = {results_placebo.att:.2f}')
    ax.axvline(x=0, color='gray', linewidth=1, linestyle=':')

    ax.set_xlabel('Effect')
    ax.set_ylabel('Frequency')
    ax.set_title('Distribution of Placebo Effects')
    ax.legend()
    plt.tight_layout()
    plt.show()

8. Tuning Regularization#

By default, SDID auto-computes regularization from the data noise level, matching R’s synthdid package:

zeta_omega: Unit weight regularization = (N1 * T1)^0.25 * noise_level
zeta_lambda: Time weight regularization = 1e-6 * noise_level

You can override these with explicit values:

[ ]:

# Compare different unit weight regularization levels
results_list = []

for zeta_omega in [0.1, 1.0, 10.0]:
    sdid_reg = SyntheticDiD(
        zeta_omega=zeta_omega,
        variance_method="placebo",
        n_bootstrap=200,
        seed=42
    )

    res = sdid_reg.fit(
        df,
        outcome="outcome",
        treatment="treated",
        unit="unit",
        time="period",
        post_periods=list(range(n_pre, n_periods))
    )

    weights = list(res.unit_weights.values())
    eff_n = 1 / sum(w**2 for w in weights) if sum(w**2 for w in weights) > 0 else 0

    results_list.append({
        'zeta_omega': zeta_omega,
        'ATT': res.att,
        'SE': res.se,
        'Eff. N controls': eff_n,
        'Pre-fit RMSE': res.pre_treatment_fit
    })

reg_df = pd.DataFrame(results_list)
print("Effect of unit weight regularization:")
print(reg_df.to_string(index=False))

9. Single Treated Unit Case#

SDID is particularly useful when you have only one treated unit (like the classic synthetic control case).

[ ]:

# Filter to single treated unit
single_treated = df[(df['treated'] == 0) | (df['unit'] == n_control)].copy()

print(f"Single treated unit analysis:")
print(f"  Treated units: 1")
print(f"  Control units: {n_control}")

[ ]:

# Fit SDID with single treated unit
sdid_single = SyntheticDiD(
    variance_method="placebo",
    n_bootstrap=200,
    seed=42
)

results_single = sdid_single.fit(
    single_treated,
    outcome="outcome",
    treatment="treated",
    unit="unit",
    time="period",
    post_periods=list(range(n_pre, n_periods))
)

print(results_single.summary())

10. Including Covariates#

You can include covariates to improve the synthetic control match.

[ ]:

# Add covariates
df['size'] = np.random.normal(100, 20, len(df))
df['age'] = np.random.normal(10, 3, len(df))

# Fit with covariates
sdid_cov = SyntheticDiD(
    n_bootstrap=199,
    seed=42
)

results_cov = sdid_cov.fit(
    df,
    outcome="outcome",
    treatment="treated",
    unit="unit",
    time="period",
    post_periods=list(range(n_pre, n_periods)),
    covariates=["size", "age"]
)

print(f"With covariates:")
print(f"  ATT: {results_cov.att:.4f}")
print(f"  SE: {results_cov.se:.4f}")

Summary#

Key takeaways for Synthetic DiD:

Best use cases: Few treated units, many controls, long pre-period
Unit weights: Identify which controls are most similar to treated (Frank-Wolfe with sparsification)
Time weights: Determine which pre-periods are most informative (Frank-Wolfe on collapsed form)
Pre-treatment fit: Lower RMSE indicates better synthetic match
Inference options:
- Placebo (variance_method="placebo", default): Placebo-based variance from controls. Library default (R’s default is bootstrap; we deviate for survey availability + perf).
- Bootstrap (variance_method="bootstrap"): Paper-faithful pairs bootstrap re-estimating ω and λ via Frank-Wolfe per draw (Algorithm 2 step 2; matches R’s default vcov). ~5–30× slower than placebo.
- Jackknife (variance_method="jackknife"): Algorithm 3 — fixed-weight leave-one-out.
Regularization: Auto-computed from data noise level by default. Override with zeta_omega/zeta_lambda.

Reference:

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12), 4088-4118.