Diagnostics#

Placebo tests and diagnostic tools for validating DiD assumptions.

run_placebo_test#

Main dispatcher for running different types of placebo tests.

diff_diff.run_placebo_test(data, outcome, treatment, time, unit=None, test_type='fake_timing', fake_treatment_period=None, fake_treatment_group=None, post_periods=None, n_permutations=1000, alpha=0.05, seed=None, **estimator_kwargs)

Run a placebo test to validate DiD assumptions.

Placebo tests provide evidence on the validity of the parallel trends assumption by testing whether “fake” treatments produce significant effects. A significant placebo effect suggests the parallel trends assumption may be violated.

Parameters:

data (pd.DataFrame) – Panel data for DiD analysis.
outcome (str) – Name of outcome variable column.
treatment (str) – Name of treatment indicator column (0/1).
time (str) – Name of time period column.
unit (str, optional) – Name of unit identifier column. Required for the “fake_group”, “permutation”, and “leave_one_out” test types.
test_type (str, default="fake_timing") – Type of placebo test:
- “fake_timing”: Assign treatment at a fake (earlier) time period
- “fake_group”: Run DiD designating some control units as “fake treated”
- “permutation”: Randomly reassign treatment and compute distribution
- “leave_one_out”: Drop each treated unit and re-estimate
fake_treatment_period (any, optional) – For “fake_timing”: The fake treatment period to test. Should be a pre-treatment period.
fake_treatment_group (list, optional) – For “fake_group”: List of control unit IDs to designate as fake treated.
post_periods (list, optional) – List of post-treatment periods. Required for fake_timing test.
n_permutations (int, default=1000) – For “permutation”: Number of random treatment assignments.
alpha (float, default=0.05) – Significance level.
seed (int, optional) – Random seed for reproducibility.
**estimator_kwargs – Additional arguments passed to the DiD estimator.

Returns:

Object containing placebo effect estimates, p-values, and diagnostics.

Return type:

PlaceboTestResults

Examples

Fake timing test:

>>> results = run_placebo_test(
...     data, outcome='sales', treatment='treated', time='period',
...     test_type='fake_timing',
...     fake_treatment_period=1,  # Pre-treatment period
...     post_periods=[2, 3, 4]
... )
>>> if results.is_significant:
...     print("Warning: Pre-treatment differential trends detected!")

Permutation test:

>>> results = run_placebo_test(
...     data, outcome='sales', treatment='treated', time='period',
...     unit='unit_id',
...     test_type='permutation',
...     n_permutations=1000,
...     seed=42
... )
>>> print(f"Permutation p-value: {results.p_value:.4f}")

References

Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How Much Should We Trust Differences-in-Differences Estimates? The Quarterly Journal of Economics, 119(1), 249-275.

placebo_timing_test#

Test using fake treatment timing.

diff_diff.placebo_timing_test(data, outcome, treatment, time, fake_treatment_period, post_periods=None, alpha=0.05, **estimator_kwargs)

Test for pre-treatment effects by moving treatment timing earlier.

Creates a fake “post” indicator using pre-treatment data only, then estimates a DiD model. A significant effect suggests pre-existing differential trends.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column.
treatment (str) – Treatment indicator column.
time (str) – Time period column.
fake_treatment_period (any) – Period to use as fake treatment timing (should be a pre-treatment period).
post_periods (list, optional) – List of actual post-treatment periods. If None, infers from data.
alpha (float, default=0.05) – Significance level.
**estimator_kwargs – Arguments passed to DifferenceInDifferences.

Returns:

Results of the fake timing placebo test.

Return type:

PlaceboTestResults

Example#

from diff_diff import placebo_timing_test

# Test if effect exists at a fake treatment time
result = placebo_timing_test(
    data,
    outcome='y',
    treatment='treated',
    time='period',
    fake_treatment_period=3  # Test earlier period
)

print(f"Placebo effect: {result.placebo_effect:.3f}")
print(f"p-value: {result.p_value:.3f}")
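The mechanics of the fake timing test can be illustrated without the library. The following is a minimal standard-library sketch under stated assumptions, not the package's implementation: the helpers `did_2x2` and `fake_timing_effect` and the toy rows are hypothetical, and the estimate is a plain difference of cell means rather than a regression.

```python
from statistics import mean


def did_2x2(rows, post_flag):
    """Plain 2x2 DiD: (treated post - treated pre) - (control post - control pre).

    rows: list of dicts with keys 'y', 'treated' (0/1), and 'period'.
    post_flag: function mapping a period to True (post) or False (pre).
    """
    def cell(treated, post):
        return mean(
            r["y"] for r in rows
            if r["treated"] == treated and post_flag(r["period"]) == post
        )

    return (cell(1, True) - cell(1, False)) - (cell(0, True) - cell(0, False))


def fake_timing_effect(rows, fake_treatment_period, post_periods):
    """Drop the real post-treatment rows, then pretend treatment began earlier."""
    pre_rows = [r for r in rows if r["period"] not in post_periods]
    return did_2x2(pre_rows, lambda p: p >= fake_treatment_period)


# Hypothetical toy panel: both groups gain 1.0 per period, and the treated
# group gains an extra 2.0 from period 4 onward (the real treatment).
rows = []
for period in range(1, 7):
    for treated in (0, 1):
        y = 10.0 + 1.0 * period + 3.0 * treated
        if treated and period >= 4:
            y += 2.0
        rows.append({"y": y, "treated": treated, "period": period})

placebo = fake_timing_effect(rows, fake_treatment_period=3, post_periods=[4, 5, 6])
actual = did_2x2(rows, lambda p: p >= 4)
print(f"Placebo effect: {placebo:.3f}, actual effect: {actual:.3f}")
```

Because the toy groups share the same pre-treatment trend, the placebo estimate is zero while the real estimate recovers the built-in effect. Each of the four cells must contain at least one row, otherwise `mean` raises an error.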

placebo_group_test#

Test using fake treatment groups (DiD on never-treated).

diff_diff.placebo_group_test(data, outcome, time, unit, fake_treated_units, post_periods=None, alpha=0.05, **estimator_kwargs)

Test for differential trends among never-treated units.

Assigns some never-treated units as “fake treated” and estimates a DiD model using only never-treated data. A significant effect suggests heterogeneous trends in the control group.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column.
time (str) – Time period column.
unit (str) – Unit identifier column.
fake_treated_units (list) – List of control unit IDs to designate as “fake treated”.
post_periods (list, optional) – List of post-treatment period values.
alpha (float, default=0.05) – Significance level.
**estimator_kwargs – Arguments passed to DifferenceInDifferences.

Returns:

Results of the fake group placebo test.

Return type:

PlaceboTestResults

Example#

from diff_diff import placebo_group_test

# Run DiD among never-treated units
result = placebo_group_test(
    data,
    outcome='y',
    time='period',
    unit='unit_id',
    fake_treated_units=[10, 11, 12]  # Assign some control units as fake-treated
)

# Should find no effect if parallel trends holds
print(f"Placebo effect: {result.placebo_effect:.3f}")
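As a rough illustration of the idea, the sketch below relabels some never-treated units as treated and computes a difference of cell means. It is a standard-library approximation under stated assumptions, not the library's code: `fake_group_effect`, `make_rows`, and the toy data are hypothetical names introduced for this example.

```python
from statistics import mean


def fake_group_effect(rows, fake_treated_units, post_periods):
    """2x2 DiD among never-treated rows, with some units relabelled as treated.

    rows: list of dicts with keys 'unit', 'period', and 'y'.
    """
    fake = set(fake_treated_units)
    post = set(post_periods)

    def cell(is_fake, is_post):
        return mean(
            r["y"] for r in rows
            if (r["unit"] in fake) == is_fake and (r["period"] in post) == is_post
        )

    return (cell(True, True) - cell(True, False)) - (cell(False, True) - cell(False, False))


def make_rows(extra_slope):
    """Hypothetical never-treated panel; units 10-12 get an extra trend slope."""
    rows = []
    for unit in range(10, 16):
        slope = 0.5 + (extra_slope if unit in (10, 11, 12) else 0.0)
        for period in range(1, 5):
            rows.append({"unit": unit, "period": period, "y": float(unit) + slope * period})
    return rows


parallel = fake_group_effect(make_rows(0.0), [10, 11, 12], post_periods=[3, 4])
diverging = fake_group_effect(make_rows(1.0), [10, 11, 12], post_periods=[3, 4])
print(f"Common trend: {parallel:.3f}, diverging trend: {diverging:.3f}")
```

With a common trend the fake effect is zero. When the relabelled units follow a steeper trend, the fake effect is nonzero, which is the pattern the test is designed to expose.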

permutation_test#

Permutation-based inference for treatment effects.

diff_diff.permutation_test(data, outcome, treatment, time, unit, n_permutations=1000, alpha=0.05, seed=None, **estimator_kwargs)

Compute permutation-based p-value for DiD estimate.

Randomly reassigns treatment status at the unit level and computes the DiD estimate for each permutation. The p-value is the proportion of permuted estimates at least as extreme as the original.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column.
treatment (str) – Treatment indicator column.
time (str) – Time period column.
unit (str) – Unit identifier column.
n_permutations (int, default=1000) – Number of random permutations.
alpha (float, default=0.05) – Significance level.
seed (int, optional) – Random seed for reproducibility.
**estimator_kwargs – Arguments passed to DifferenceInDifferences.

Returns:

Results with permutation distribution and p-value.

Return type:

PlaceboTestResults

Notes

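The permutation test does not rely on asymptotic approximations, so it remains usable with small samples. Because only n_permutations random reassignments are drawn rather than every possible assignment, the reported p-value is a Monte Carlo approximation of the exact permutation p-value, and its resolution is limited to 1/n_permutations.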

Example#

from diff_diff import permutation_test, generate_did_data

panel = generate_did_data(n_units=100, n_periods=10, treatment_effect=2.0)
result = permutation_test(
    panel,
    outcome='outcome',
    treatment='treated',
    time='post',
    unit='unit',
    n_permutations=1000
)

print(f"Permutation p-value: {result.p_value:.3f}")
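The procedure described above (reassign treatment at the unit level, re-estimate, and count estimates at least as extreme as the original) can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the library's implementation: the helpers `did_from_changes` and `permutation_p_value` are hypothetical, each unit is summarised by its post-minus-pre change in the outcome, and the comparison is two-sided.

```python
import random
from statistics import mean


def did_from_changes(change_by_unit, treated_units):
    """DiD as the mean change of treated units minus the mean change of the rest."""
    treated = [c for u, c in change_by_unit.items() if u in treated_units]
    control = [c for u, c in change_by_unit.items() if u not in treated_units]
    return mean(treated) - mean(control)


def permutation_p_value(change_by_unit, treated_units, n_permutations=1000, seed=None):
    """Share of random unit-level reassignments at least as extreme as the original."""
    rng = random.Random(seed)
    units = sorted(change_by_unit)
    observed = did_from_changes(change_by_unit, set(treated_units))
    permuted = []
    for _ in range(n_permutations):
        # Keep the number of treated units fixed; only shuffle which units they are.
        fake_treated = set(rng.sample(units, len(treated_units)))
        permuted.append(did_from_changes(change_by_unit, fake_treated))
    extreme = sum(abs(e) >= abs(observed) for e in permuted)
    return observed, extreme / n_permutations, permuted


# Hypothetical toy data: units 0-9 are treated and change by about 3.0,
# units 10-19 are controls and change by about 1.0.
data_rng = random.Random(0)
change_by_unit = {
    unit: (3.0 if unit < 10 else 1.0) + data_rng.uniform(-0.3, 0.3)
    for unit in range(20)
}
treated_units = list(range(10))

observed, p_value, permuted = permutation_p_value(
    change_by_unit, treated_units, n_permutations=1000, seed=42
)
print(f"Observed effect: {observed:.3f}, permutation p-value: {p_value:.4f}")
```

Passing a seed makes the sequence of reassignments reproducible, which mirrors the role of the seed argument documented above.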

leave_one_out_test#

Sensitivity analysis removing individual treated units.

diff_diff.leave_one_out_test(data, outcome, treatment, time, unit, alpha=0.05, **estimator_kwargs)

Assess sensitivity by dropping each treated unit in turn.

For each treated unit, drops that unit and re-estimates the DiD model on the remaining data. Large variation across the re-estimated effects suggests the result is sensitive to individual units, for example a single influential treated unit.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column.
treatment (str) – Treatment indicator column.
time (str) – Time period column.
unit (str) – Unit identifier column.
alpha (float, default=0.05) – Significance level.
**estimator_kwargs – Arguments passed to DifferenceInDifferences.

Returns:

Results with leave_one_out_effects dict mapping unit -> ATT estimate.

Return type:

PlaceboTestResults

Example#

from diff_diff import leave_one_out_test, generate_did_data

panel = generate_did_data(n_units=100, n_periods=10, treatment_effect=2.0)
result = leave_one_out_test(
    panel,
    outcome='outcome',
    treatment='treated',
    time='post',
    unit='unit'
)

# Check if results are driven by single units
loo = result.leave_one_out_effects
print(f"Effect range: [{min(loo.values()):.3f}, {max(loo.values()):.3f}]")
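To show what the leave-one-out estimates look like when one unit dominates, here is a small standard-library sketch. It assumes each unit is summarised by its post-minus-pre change and uses a difference of means. The helper `leave_one_out_effects` and the toy numbers are hypothetical and do not reproduce the library's estimator.

```python
from statistics import mean


def leave_one_out_effects(change_by_unit, treated_units):
    """Map each treated unit to the DiD estimate obtained without that unit."""
    control_mean = mean(
        c for u, c in change_by_unit.items() if u not in treated_units
    )
    effects = {}
    for dropped in treated_units:
        kept = [change_by_unit[u] for u in treated_units if u != dropped]
        effects[dropped] = mean(kept) - control_mean
    return effects


# Hypothetical changes in the outcome: treated unit "C" is an outlier.
change_by_unit = {"A": 3.0, "B": 3.0, "C": 9.0, "D": 1.0, "E": 1.0, "F": 1.0}
treated_units = ["A", "B", "C"]

loo = leave_one_out_effects(change_by_unit, treated_units)
print(f"Effect range: [{min(loo.values()):.3f}, {max(loo.values()):.3f}]")
```

Dropping "A" or "B" leaves an estimate of 5.0, while dropping "C" lowers it to 2.0, so the spread points at "C" as the influential unit. The sketch needs at least two treated units, since dropping the only treated unit leaves nothing to average.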

run_all_placebo_tests#

Run comprehensive suite of diagnostic tests.

diff_diff.run_all_placebo_tests(data, outcome, treatment, time, unit, pre_periods, post_periods, n_permutations=500, alpha=0.05, seed=None, **estimator_kwargs)

Run a comprehensive suite of placebo tests.

Runs fake timing tests for each pre-period, a permutation test, and a leave-one-out sensitivity analysis. If an individual test raises an error, its entry in the returned dictionary is a dict with an “error” key containing the error message, instead of a PlaceboTestResults object.

Parameters:

data (pd.DataFrame) – Panel data.
outcome (str) – Outcome variable column.
treatment (str) – Treatment indicator column.
time (str) – Time period column.
unit (str) – Unit identifier column.
pre_periods (list) – List of pre-treatment periods.
post_periods (list) – List of post-treatment periods.
n_permutations (int, default=500) – Permutations for permutation test.
alpha (float, default=0.05) – Significance level.
seed (int, optional) – Random seed.
**estimator_kwargs – Arguments passed to estimators.

Returns:

Dictionary mapping test names to PlaceboTestResults. Keys: “fake_timing_{period}”, “permutation”, “leave_one_out”

Return type:

dict
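Since an entry in the returned dictionary can be either a results object or an error dict, downstream code has to check which one it received. The sketch below shows one way to do that. The helper `summarize_placebo_suite`, the error message, and the `SimpleNamespace` stand-ins for PlaceboTestResults are hypothetical, chosen so the example runs without the library. It only flags fake timing entries, because the permutation entry reports a p-value for the actual DiD estimate.

```python
from types import SimpleNamespace


def summarize_placebo_suite(results):
    """Collect errored tests and significant fake timing tests by name."""
    errors, timing_warnings = {}, []
    for name, res in results.items():
        if isinstance(res, dict) and "error" in res:
            # The test could not be run; keep the message for reporting.
            errors[name] = res["error"]
        elif name.startswith("fake_timing_") and res.is_significant:
            # A significant placebo effect before treatment is a warning sign.
            timing_warnings.append(name)
    return errors, timing_warnings


# Stand-in objects that mimic the documented key names and attributes.
results = {
    "fake_timing_1": SimpleNamespace(is_significant=False, p_value=0.62),
    "fake_timing_2": SimpleNamespace(is_significant=True, p_value=0.01),
    "permutation": SimpleNamespace(is_significant=True, p_value=0.004),
    "leave_one_out": {"error": "hypothetical error message"},
}

errors, timing_warnings = summarize_placebo_suite(results)
print("Errored tests:", errors)
print("Fake timing warnings:", timing_warnings)
```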

PlaceboTestResults#

Container for placebo test results.

class diff_diff.PlaceboTestResults

Bases: object

Results from a placebo test for DiD assumption validation.

test_type

Type of placebo test performed.

Type: str

placebo_effect

Estimated placebo treatment effect.

Type: float

se

Standard error of the placebo effect.

Type: float

t_stat

T-statistic for the placebo effect.

Type: float

p_value

P-value for testing placebo_effect = 0.

Type: float

conf_int

Confidence interval for the placebo effect.

Type: tuple

n_obs

Number of observations used in the test.

Type: int

is_significant

Whether the placebo effect is significant at the chosen significance level (alpha, default 0.05).

Type: bool

original_effect

Original ATT estimate for comparison.

Type: float, optional

original_se

Original SE for comparison.

Type: float, optional

permutation_distribution

Distribution of permuted effects (for permutation test).

Type: np.ndarray, optional

leave_one_out_effects

Unit-specific effects (for leave-one-out test).

Type: dict, optional

fake_period

The fake treatment period used (for timing test).

Type: any, optional

fake_group

The fake treatment group used (for group test).

Type: list, optional

test_type: str

placebo_effect: float

se: float

t_stat: float

p_value: float

conf_int: Tuple[float, float]

n_obs: int

is_significant: bool

alpha: float = 0.05

original_effect: float | None = None

original_se: float | None = None

permutation_distribution: ndarray | None = None

leave_one_out_effects: Dict[Any, float] | None = None

fake_period: Any | None = None

fake_group: List[Any] | None = None

n_permutations: int | None = None

property significance_stars: str

Return significance stars based on p-value.

summary()

Generate formatted summary of placebo test results.

Return type: str

print_summary()

Print summary to stdout.

Return type: None

to_dict()

Convert results to a dictionary.

Return type: Dict[str, Any]

to_dataframe()

Convert results to a DataFrame.

Return type: DataFrame

__init__(test_type, placebo_effect, se, t_stat, p_value, conf_int, n_obs, is_significant, alpha=0.05, original_effect=None, original_se=None, permutation_distribution=None, leave_one_out_effects=None, fake_period=None, fake_group=None, n_permutations=None)

Parameters:

test_type (str)
placebo_effect (float)
se (float)
t_stat (float)
p_value (float)
conf_int (Tuple[float, float])
n_obs (int)
is_significant (bool)
alpha (float)
original_effect (float | None)
original_se (float | None)
permutation_distribution (ndarray | None)
leave_one_out_effects (Dict[Any, float] | None)
fake_period (Any | None)
fake_group (List[Any] | None)
n_permutations (int | None)

Return type:

None
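The fields placebo_effect, se, t_stat, p_value, conf_int, and is_significant are related in the usual way for a point estimate with a standard error. The sketch below shows that relationship under a normal approximation, which is an assumption made for this example: the library may use a t-distribution or another reference distribution, and the helper `normal_approx_inference` is hypothetical.

```python
from statistics import NormalDist


def normal_approx_inference(effect, se, alpha=0.05):
    """Derive test statistic, two-sided p-value, and CI from an estimate and its SE.

    Uses the standard normal distribution as the reference distribution
    (an assumption of this sketch).
    """
    z = NormalDist()
    t_stat = effect / se
    p_value = 2.0 * (1.0 - z.cdf(abs(t_stat)))
    half_width = z.inv_cdf(1.0 - alpha / 2.0) * se
    return {
        "t_stat": t_stat,
        "p_value": p_value,
        "conf_int": (effect - half_width, effect + half_width),
        "is_significant": p_value < alpha,
    }


out = normal_approx_inference(effect=0.5, se=0.25)
print(f"t = {out['t_stat']:.2f}, p = {out['p_value']:.4f}, CI = {out['conf_int']}")
```

Under this approximation, a confidence interval that excludes zero and a p-value below alpha are two views of the same decision.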