diff_diff.wild_bootstrap_se

diff_diff.wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index, n_bootstrap=999, weight_type='rademacher', null_hypothesis=0.0, alpha=0.05, seed=None, return_distribution=False)

Compute wild cluster bootstrap standard errors and p-values.

Implements the wild cluster restricted (WCR) bootstrap, following the wild cluster bootstrap-t procedure of Cameron, Gelbach, and Miller (2008). Bootstrap samples are generated from restricted residuals (imposing H0: coefficient = null_hypothesis), which generally gives more accurate p-values than resampling unrestricted residuals.
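The procedure described above can be sketched in NumPy roughly as follows. This is an illustrative sketch under stated assumptions, not the library's implementation: the helper names `cluster_robust_se` and `wcr_bootstrap_sketch` are hypothetical, the sandwich variance omits small-sample corrections, only Rademacher weights are drawn, and the symmetric p-value formula is one common convention.

```python
import numpy as np


def cluster_robust_se(X, resid, cluster_ids, j):
    """Cluster-robust sandwich SE for coefficient j (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        mask = cluster_ids == g
        score = X[mask].T @ resid[mask]  # k-vector of summed scores for cluster g
        meat += np.outer(score, score)
    V = XtX_inv @ meat @ XtX_inv
    return np.sqrt(V[j, j])


def wcr_bootstrap_sketch(X, y, cluster_ids, j, null=0.0, n_bootstrap=999, seed=None):
    """Hypothetical sketch of a restricted wild cluster bootstrap-t."""
    rng = np.random.default_rng(seed)

    # Unrestricted fit and the original t-statistic for H0: beta_j = null.
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    resid_hat = y - X @ beta_hat
    t_orig = (beta_hat[j] - null) / cluster_robust_se(X, resid_hat, cluster_ids, j)

    # Restricted fit: impose beta_j = null by moving null * X[:, j] to the left side.
    X_r = np.delete(X, j, axis=1)
    beta_r = np.linalg.lstsq(X_r, y - null * X[:, j], rcond=None)[0]
    fitted_r = X_r @ beta_r + null * X[:, j]
    resid_r = y - fitted_r

    clusters, cluster_pos = np.unique(cluster_ids, return_inverse=True)
    t_boot = np.empty(n_bootstrap)
    for b in range(n_bootstrap):
        # One Rademacher weight per cluster, shared by all observations in it.
        w = rng.choice([-1.0, 1.0], size=clusters.size)
        y_star = fitted_r + resid_r * w[cluster_pos]

        # Re-estimate the unrestricted model on the bootstrap sample.
        beta_star = np.linalg.lstsq(X, y_star, rcond=None)[0]
        resid_star = y_star - X @ beta_star
        se_star = cluster_robust_se(X, resid_star, cluster_ids, j)
        t_boot[b] = (beta_star[j] - null) / se_star

    # Symmetric two-sided p-value: share of bootstrap |t*| at least as large as |t|.
    p_value = np.mean(np.abs(t_boot) >= np.abs(t_orig))
    return t_orig, t_boot, p_value
```

Because the bootstrap samples are built from the restricted fit, the bootstrap t-statistics are centered on the null value, which is why they can serve as a reference distribution for the original statistic.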

Parameters:
  • X (np.ndarray) – Design matrix of shape (n, k).

  • y (np.ndarray) – Outcome vector of shape (n,).

  • residuals (np.ndarray) – OLS residuals from unrestricted regression, shape (n,).

  • cluster_ids (np.ndarray) – Cluster identifiers of shape (n,).

  • coefficient_index (int) – Index of the coefficient for which to compute bootstrap inference. For DiD, this is typically 3 (the treatment*post interaction term).

  • n_bootstrap (int, default=999) – Number of bootstrap replications. Odd numbers are recommended for exact p-value computation.

  • weight_type (str, default="rademacher") – Type of bootstrap weights:
      • "rademacher": +1 or -1 with equal probability (standard choice)
      • "webb": 6-point distribution (recommended for <10 clusters)
      • "mammen": Two-point distribution with skewness correction

  • null_hypothesis (float, default=0.0) – Value of the null hypothesis for p-value computation.

  • alpha (float, default=0.05) – Significance level for confidence interval.

  • seed (int, optional) – Random seed for reproducibility. If None (default), results will vary between runs.

  • return_distribution (bool, default=False) – If True, include full bootstrap distribution in results.
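The three weight_type options in the parameter list can be illustrated with the standard textbook definitions of these distributions. The function `draw_cluster_weights` below is a hypothetical helper written for this sketch, and the library's internal weight generation may differ in detail. All three distributions have mean 0 and variance 1.

```python
import numpy as np


def draw_cluster_weights(n_clusters, weight_type="rademacher", rng=None):
    """Draw one bootstrap weight per cluster (hypothetical helper, not the library API)."""
    rng = np.random.default_rng() if rng is None else rng

    if weight_type == "rademacher":
        # +1 or -1, each with probability 1/2.
        return rng.choice([-1.0, 1.0], size=n_clusters)

    if weight_type == "webb":
        # Six equally likely points: +/- sqrt(3/2), +/- 1, +/- sqrt(1/2).
        values = np.array([-np.sqrt(1.5), -1.0, -np.sqrt(0.5),
                           np.sqrt(0.5), 1.0, np.sqrt(1.5)])
        return rng.choice(values, size=n_clusters)

    if weight_type == "mammen":
        # Two asymmetric points chosen so the third moment equals 1.
        s5 = np.sqrt(5.0)
        values = np.array([-(s5 - 1.0) / 2.0, (s5 + 1.0) / 2.0])
        probs = np.array([(s5 + 1.0) / (2.0 * s5), (s5 - 1.0) / (2.0 * s5)])
        return rng.choice(values, size=n_clusters, p=probs)

    raise ValueError(f"Unknown weight_type: {weight_type!r}")
```

With G clusters, Rademacher weights give only 2^G distinct bootstrap samples, which is the usual motivation for the 6-point Webb distribution when G is small.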

Returns:

Dataclass containing bootstrap SE, p-value, confidence interval, and other inference results.

Return type:

WildBootstrapResults

Raises:

ValueError – If weight_type is not recognized or if there are fewer than 2 clusters.

Warns:

UserWarning – If there are fewer than 5 clusters, as bootstrap inference may be unreliable.
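The checks described under Raises and Warns could look something like the sketch below. The function name `check_bootstrap_inputs`, the constant `VALID_WEIGHT_TYPES`, and the message texts are assumptions made for illustration; only the conditions themselves come from the documentation above.

```python
import warnings

import numpy as np

# Weight types named in the parameter documentation.
VALID_WEIGHT_TYPES = ("rademacher", "webb", "mammen")


def check_bootstrap_inputs(cluster_ids, weight_type):
    """Hypothetical validation mirroring the documented errors and warnings."""
    if weight_type not in VALID_WEIGHT_TYPES:
        raise ValueError(f"weight_type must be one of {VALID_WEIGHT_TYPES}")

    n_clusters = np.unique(cluster_ids).size
    if n_clusters < 2:
        raise ValueError("At least 2 clusters are required")
    if n_clusters < 5:
        # Inference is still computed, but the caller is alerted.
        warnings.warn(
            f"Only {n_clusters} clusters: bootstrap inference may be unreliable",
            UserWarning,
        )
    return n_clusters
```

Callers who want to treat the small-cluster case as an error can promote the warning with `warnings.simplefilter("error", UserWarning)`.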

Examples

>>> from diff_diff.utils import wild_bootstrap_se
>>> results = wild_bootstrap_se(
...     X, y, residuals, cluster_ids,
...     coefficient_index=3,  # ATT coefficient
...     n_bootstrap=999,
...     weight_type="rademacher",
...     seed=42
... )
>>> print(f"Bootstrap SE: {results.se:.4f}")
>>> print(f"Bootstrap p-value: {results.p_value:.4f}")
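The example above assumes that X, y, residuals, and cluster_ids already exist. One way to build them for a simple 2x2 DiD on simulated data is sketched below. It assumes a design matrix with columns ordered as intercept, treated, post, treated*post, so that the interaction sits at index 3. The simulated data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clusters, n_per_cluster = 20, 50

# Cluster membership: 20 clusters of 50 observations each.
cluster_ids = np.repeat(np.arange(n_clusters), n_per_cluster)

# Treatment is assigned at the cluster level (first half of the clusters).
treated = (cluster_ids < n_clusters // 2).astype(float)

# Within each cluster, the first 25 observations are pre-period, the rest post-period.
post = np.tile(np.repeat([0.0, 1.0], n_per_cluster // 2), n_clusters)

# Outcome with a cluster-level shock and a true treatment effect of 2.0.
cluster_effect = rng.normal(size=n_clusters)[cluster_ids]
noise = rng.normal(size=cluster_ids.size)
y = 1.0 + 0.5 * treated + 0.3 * post + 2.0 * treated * post + cluster_effect + noise

# Design matrix: intercept, treated, post, treated*post (interaction at index 3).
X = np.column_stack([np.ones_like(y), treated, post, treated * post])

# Unrestricted OLS fit and its residuals.
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
residuals = y - X @ beta_hat
```

These arrays can then be passed to wild_bootstrap_se as in the example, with coefficient_index=3.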

References

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-Based Improvements for Inference with Clustered Errors. The Review of Economics and Statistics, 90(3), 414-427.

MacKinnon, J. G., & Webb, M. D. (2018). The wild bootstrap for few (treated) clusters. The Econometrics Journal, 21(2), 114-135.