Utilities

Statistical utilities for parallel trends testing, robust standard errors, and bootstrap inference.

Wild Cluster Bootstrap

wild_bootstrap_se

Compute wild cluster bootstrap standard errors.

diff_diff.wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index, n_bootstrap=999, weight_type='rademacher', null_hypothesis=0.0, alpha=0.05, seed=None, return_distribution=False)

Compute wild cluster bootstrap standard errors and p-values.

Implements the wild cluster restricted (WCR) bootstrap, following Cameron, Gelbach, and Miller (2008). Bootstrap samples are built from the residuals of the restricted model (imposing H0: coefficient = null_hypothesis), which yields more accurate p-values.
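
The procedure described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the helper names cluster_robust_se and wcr_bootstrap_sketch are hypothetical, the variance estimator is a basic cluster-robust sandwich without small-sample correction, only Rademacher weights are drawn, and the p-value is the symmetric two-sided version.

```python
import numpy as np


def cluster_robust_se(X, resid, cluster_ids, idx):
    """Cluster-robust (sandwich) SE for coefficient `idx`, no small-sample correction."""
    XtX_inv = np.linalg.inv(X.T @ X)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]  # sum of x_i * e_i within cluster g
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return np.sqrt(V[idx, idx])


def wcr_bootstrap_sketch(X, y, cluster_ids, idx, null=0.0, n_bootstrap=999, seed=None):
    """Hypothetical sketch of a restricted wild cluster bootstrap-t test."""
    rng = np.random.default_rng(seed)
    cluster_ids = np.asarray(cluster_ids)
    clusters, inverse = np.unique(cluster_ids, return_inverse=True)

    # Unrestricted fit gives the original t-statistic.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    t_orig = (beta[idx] - null) / cluster_robust_se(X, resid, cluster_ids, idx)

    # Restricted fit: impose beta[idx] = null by moving that term to the left-hand side.
    X_r = np.delete(X, idx, axis=1)
    beta_r = np.linalg.lstsq(X_r, y - null * X[:, idx], rcond=None)[0]
    fitted_r = X_r @ beta_r + null * X[:, idx]
    resid_r = y - fitted_r

    t_boot = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        # One Rademacher weight per cluster, shared by all observations in it.
        w = rng.choice(np.array([-1.0, 1.0]), size=len(clusters))
        y_star = fitted_r + resid_r * w[inverse]
        beta_star = np.linalg.lstsq(X, y_star, rcond=None)[0]
        resid_star = y_star - X @ beta_star
        se_star = cluster_robust_se(X, resid_star, cluster_ids, idx)
        t_boot[b] = (beta_star[idx] - null) / se_star

    # Share of bootstrap t-statistics at least as extreme as the original one.
    p_value = np.mean(np.abs(t_boot) >= np.abs(t_orig))
    return t_orig, p_value
```

Because each weight is drawn per cluster rather than per observation, the within-cluster dependence of the residuals is preserved in every bootstrap sample.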

Parameters:
  • X (np.ndarray) – Design matrix of shape (n, k).

  • y (np.ndarray) – Outcome vector of shape (n,).

  • residuals (np.ndarray) – OLS residuals from unrestricted regression, shape (n,).

  • cluster_ids (np.ndarray) – Cluster identifiers of shape (n,).

  • coefficient_index (int) – Index of the coefficient for which to compute bootstrap inference. For DiD, this is typically 3 (the treatment*post interaction term).

  • n_bootstrap (int, default=999) – Number of bootstrap replications. Odd numbers are recommended for exact p-value computation.

  • weight_type (str, default="rademacher") – Type of bootstrap weights:

      - “rademacher”: +1 or -1 with equal probability (standard choice)
      - “webb”: 6-point distribution (recommended for <10 clusters)
      - “mammen”: Two-point distribution with skewness correction

  • null_hypothesis (float, default=0.0) – Value of the null hypothesis for p-value computation.

  • alpha (float, default=0.05) – Significance level for confidence interval.

  • seed (int, optional) – Random seed for reproducibility. If None (default), results will vary between runs.

  • return_distribution (bool, default=False) – If True, include full bootstrap distribution in results.

Returns:

Dataclass containing bootstrap SE, p-value, confidence interval, and other inference results.

Return type:

WildBootstrapResults

Raises:

ValueError – If weight_type is not recognized or if there are fewer than 2 clusters.

Warns:

UserWarning – If the number of clusters is less than 5, as bootstrap inference may be unreliable.
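
The documented error and warning conditions can be reproduced with a small pre-flight check. The function check_clusters below is a hypothetical helper written for illustration; it mirrors the thresholds stated above but is not part of diff_diff, and the message wording is an assumption.

```python
import warnings

import numpy as np

VALID_WEIGHT_TYPES = ("rademacher", "webb", "mammen")


def check_clusters(cluster_ids, weight_type="rademacher"):
    """Hypothetical check mirroring the documented ValueError / UserWarning rules."""
    if weight_type not in VALID_WEIGHT_TYPES:
        raise ValueError(
            f"weight_type must be one of {VALID_WEIGHT_TYPES}, got {weight_type!r}"
        )
    n_clusters = len(np.unique(cluster_ids))
    if n_clusters < 2:
        raise ValueError(f"need at least 2 clusters, got {n_clusters}")
    if n_clusters < 5:
        warnings.warn(
            f"only {n_clusters} clusters; bootstrap inference may be unreliable",
            UserWarning,
        )
    return n_clusters
```

Running such a check before the bootstrap makes it possible to fail early, or to switch weight_type, without waiting for the replications to finish.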

Examples

>>> from diff_diff.utils import wild_bootstrap_se
>>> results = wild_bootstrap_se(
...     X, y, residuals, cluster_ids,
...     coefficient_index=3,  # ATT coefficient
...     n_bootstrap=999,
...     weight_type="rademacher",
...     seed=42
... )
>>> print(f"Bootstrap SE: {results.se:.4f}")
>>> print(f"Bootstrap p-value: {results.p_value:.4f}")

References

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-Based Improvements for Inference with Clustered Errors. The Review of Economics and Statistics, 90(3), 414-427.

MacKinnon, J. G., & Webb, M. D. (2018). The wild bootstrap for few (treated) clusters. The Econometrics Journal, 21(2), 114-135.

Example

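import numpy as np
from diff_diff import DifferenceInDifferences, wild_bootstrap_se

# Fit model
did = DifferenceInDifferences()
results = did.fit(data, outcome='y', treated='treated', post='post')

# Build the arrays wild_bootstrap_se expects (assumes data is a pandas DataFrame)
treated = data['treated'].to_numpy(dtype=float)
post = data['post'].to_numpy(dtype=float)
X = np.column_stack([np.ones(len(data)), treated, post, treated * post])
y = data['y'].to_numpy(dtype=float)

# OLS residuals from the unrestricted regression
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Bootstrap standard errors for the treated*post coefficient (column 3)
boot_results = wild_bootstrap_se(
    X, y, residuals,
    cluster_ids=data['unit_id'].to_numpy(),
    coefficient_index=3,
    n_bootstrap=999,
    weight_type='rademacher'
)

print(f"Bootstrap SE: {boot_results.se:.3f}")
print(f"Bootstrap 95% CI: [{boot_results.ci_lower:.3f}, {boot_results.ci_upper:.3f}]")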

WildBootstrapResults

Container for wild bootstrap results.

class diff_diff.WildBootstrapResults

Bases: object

Results from wild cluster bootstrap inference.

se

Bootstrap standard error of the coefficient.

Type:

float

p_value

Bootstrap p-value (two-sided).

Type:

float

t_stat_original

Original t-statistic from the data.

Type:

float

ci_lower

Lower bound of the confidence interval.

Type:

float

ci_upper

Upper bound of the confidence interval.

Type:

float

n_clusters

Number of clusters in the data.

Type:

int

n_bootstrap

Number of bootstrap replications.

Type:

int

weight_type

Type of bootstrap weights used (“rademacher”, “webb”, or “mammen”).

Type:

str

alpha

Significance level used for confidence interval.

Type:

float

bootstrap_distribution

Full bootstrap distribution of coefficients (if requested).

Type:

np.ndarray, optional
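
As a sketch of how these attributes fit together, the hypothetical helper below turns any object exposing them into a one-line verdict. The name interpret and the numbers in demo are made up for illustration; a SimpleNamespace stands in for a real WildBootstrapResults instance.

```python
from types import SimpleNamespace


def interpret(results):
    """Hypothetical helper: one-line reading of the documented attributes."""
    level = 100 * (1 - results.alpha)
    decision = "reject" if results.p_value < results.alpha else "fail to reject"
    return (
        f"SE={results.se:.3f}, p={results.p_value:.3f}, "
        f"{level:.0f}% CI=[{results.ci_lower:.3f}, {results.ci_upper:.3f}], "
        f"{decision} H0 at alpha={results.alpha} "
        f"({results.n_clusters} clusters, {results.n_bootstrap} draws, "
        f"{results.weight_type} weights)"
    )


# Made-up values standing in for a real result object.
demo = SimpleNamespace(
    se=0.12, p_value=0.03, t_stat_original=2.4,
    ci_lower=0.05, ci_upper=0.52,
    n_clusters=24, n_bootstrap=999, weight_type="rademacher", alpha=0.05,
)
print(interpret(demo))
```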

se: float
p_value: float
t_stat_original: float
ci_lower: float
ci_upper: float
n_clusters: int
n_bootstrap: int
weight_type: str
alpha: float = 0.05
bootstrap_distribution: ndarray | None = None

summary()

Generate formatted summary of bootstrap results.

Return type:

str

print_summary()

Print formatted summary to stdout.

Return type:

None

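__init__(se, p_value, t_stat_original, ci_lower, ci_upper, n_clusters, n_bootstrap, weight_type, alpha=0.05, bootstrap_distribution=None)

Parameters:
  • se (float)

  • p_value (float)

  • t_stat_original (float)

  • ci_lower (float)

  • ci_upper (float)

  • n_clusters (int)

  • n_bootstrap (int)

  • weight_type (str)

  • alpha (float)

  • bootstrap_distribution (ndarray | None)

Return type:

None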

Weight Types

The wild bootstrap supports several weight distributions:

  • 'rademacher': ±1 with equal probability (default, good general choice)

  • 'mammen': Two-point distribution matching higher moments

  • 'webb': Six-point distribution, better for few clusters

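# Using different weight types (X, y, residuals, cluster_ids as in the signature above)
boot_rad = wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index=3, weight_type='rademacher')
boot_webb = wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index=3, weight_type='webb')
boot_mammen = wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index=3, weight_type='mammen')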
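
For reference, the three distributions can be written out explicitly. The sketch below uses the standard definitions from the literature (Webb's six-point and Mammen's two-point distributions); whether diff_diff draws its weights in exactly this way is an assumption, and draw_weights is a hypothetical name.

```python
import numpy as np


def draw_weights(weight_type, n_clusters, rng):
    """Hypothetical sketch: draw one bootstrap weight per cluster."""
    if weight_type == "rademacher":
        # -1 or +1, each with probability 1/2
        return rng.choice(np.array([-1.0, 1.0]), size=n_clusters)
    if weight_type == "webb":
        # Six equally likely points: +/- sqrt(3/2), +/- 1, +/- sqrt(1/2)
        points = np.array(
            [-np.sqrt(1.5), -1.0, -np.sqrt(0.5), np.sqrt(0.5), 1.0, np.sqrt(1.5)]
        )
        return rng.choice(points, size=n_clusters)
    if weight_type == "mammen":
        # Two points chosen so that mean = 0, variance = 1, third moment = 1
        s5 = np.sqrt(5.0)
        points = np.array([-(s5 - 1) / 2, (s5 + 1) / 2])
        probs = np.array([(s5 + 1) / (2 * s5), (s5 - 1) / (2 * s5)])
        return rng.choice(points, size=n_clusters, p=probs)
    raise ValueError(f"unknown weight_type: {weight_type!r}")
```

All three have mean zero and unit variance, so multiplying residuals by them leaves the first two moments of the error term unchanged; they differ in how many distinct values they take and in their higher moments.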

Recommendation

  • Use 'rademacher' (default) for most cases

  • Use 'webb' when you have fewer than 10 clusters

  • Set n_bootstrap to at least 999 for reliable inference
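
One way to see why the six-point distribution helps with few clusters is to count the distinct weight vectors each scheme can produce: a two-point scheme yields at most 2^G sign patterns for G clusters, which caps the number of distinct bootstrap samples. The helpers below are hypothetical and simply encode the recommendations above.

```python
# Number of support points per weight distribution.
SUPPORT_SIZE = {"rademacher": 2, "mammen": 2, "webb": 6}


def n_distinct_draws(weight_type, n_clusters):
    """Upper bound on distinct bootstrap weight vectors for a given scheme."""
    return SUPPORT_SIZE[weight_type] ** n_clusters


def recommend_weight_type(n_clusters):
    """Hypothetical rule of thumb: 'webb' below 10 clusters, else 'rademacher'."""
    return "webb" if n_clusters < 10 else "rademacher"


for g in (6, 10, 20):
    print(
        g,
        recommend_weight_type(g),
        n_distinct_draws("rademacher", g),
        n_distinct_draws("webb", g),
    )
```

With 6 clusters, Rademacher weights give only 64 possible weight vectors, far fewer than 999 replications, whereas the six-point scheme gives 46656.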