diff_diff.wild_bootstrap_se
- diff_diff.wild_bootstrap_se(X, y, residuals, cluster_ids, coefficient_index, n_bootstrap=999, weight_type='rademacher', null_hypothesis=0.0, alpha=0.05, seed=None, return_distribution=False)
Compute wild cluster bootstrap standard errors and p-values.
Implements the wild cluster restricted (WCR) bootstrap procedure from Cameron, Gelbach, and Miller (2008). The residuals are taken from a restricted regression that imposes H0: coefficient = null_hypothesis, which gives more accurate p-values.
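For orientation, the restricted wild cluster bootstrap is usually described along the following lines. This is a sketch of the textbook procedure, not a statement of this function's internals. The symbols ($\tilde{\beta}$, $\tilde{u}_{ig}$, $v^{*b}_{g}$, $t^{*}_{b}$, $\hat{t}$) are notation introduced here, with $j$ standing for coefficient_index, $\beta_0$ for null_hypothesis, and $B$ for n_bootstrap. How the function derives the SE and confidence interval it reports is not specified in this entry, so they are left out of the sketch.

```latex
\begin{aligned}
% 1. Restricted fit: estimate the model by OLS subject to beta_j = beta_0
\tilde{u}_{ig} &= y_{ig} - x_{ig}'\tilde{\beta} \\
% 2. For each replication b = 1, ..., B draw one weight per cluster g
y^{*b}_{ig} &= x_{ig}'\tilde{\beta} + v^{*b}_{g}\,\tilde{u}_{ig} \\
% 3. Re-estimate by OLS on y^{*b} and form the cluster-robust t-statistic
t^{*}_{b} &= \frac{\hat{\beta}^{*b}_{j} - \beta_0}{\widehat{\mathrm{se}}\bigl(\hat{\beta}^{*b}_{j}\bigr)} \\
% 4. Symmetric bootstrap p-value, compared with the original statistic \hat{t}
\hat{p} &= \frac{1}{B}\sum_{b=1}^{B} \mathbf{1}\bigl\{\,|t^{*}_{b}| > |\hat{t}|\,\bigr\}
\end{aligned}
```

Because every observation in a cluster shares the same weight $v^{*b}_{g}$, the within-cluster dependence of the residuals is carried into each bootstrap sample.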
- Parameters:
X (np.ndarray) – Design matrix of shape (n, k).
y (np.ndarray) – Outcome vector of shape (n,).
residuals (np.ndarray) – OLS residuals from unrestricted regression, shape (n,).
cluster_ids (np.ndarray) – Cluster identifiers of shape (n,).
coefficient_index (int) – Index of the coefficient for which to compute bootstrap inference. For DiD, this is typically 3 (the treatment*post interaction term).
n_bootstrap (int, default=999) – Number of bootstrap replications. Odd numbers are recommended for exact p-value computation.
weight_type (str, default="rademacher") – Type of bootstrap weights:
  - "rademacher": +1 or -1 with equal probability (standard choice)
  - "webb": 6-point distribution (recommended for <10 clusters)
  - "mammen": Two-point distribution with skewness correction
null_hypothesis (float, default=0.0) – Value of the null hypothesis for p-value computation.
alpha (float, default=0.05) – Significance level for confidence interval.
seed (int, optional) – Random seed for reproducibility. If None (default), results will vary between runs.
return_distribution (bool, default=False) – If True, include full bootstrap distribution in results.
- Returns:
Dataclass containing bootstrap SE, p-value, confidence interval, and other inference results.
- Return type:
- Raises:
ValueError – If weight_type is not recognized or if there are fewer than 2 clusters.
- Warns:
UserWarning – If the number of clusters is less than 5, as bootstrap inference may be unreliable.
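The three weight_type options can be illustrated with a short standard-library sketch. The support points and probabilities below are the usual textbook definitions of the Rademacher, Webb six-point, and Mammen two-point distributions; they are assumed here, not read from the library's source. The helpers `draw_cluster_weights` and `raw_moment` are hypothetical names used only for this illustration.

```python
import math
import random

SQRT5 = math.sqrt(5.0)

# Assumed textbook definitions: (support points, probabilities).
# These are not taken from diff_diff's implementation.
WEIGHTS = {
    "rademacher": ([-1.0, 1.0], [0.5, 0.5]),
    "webb": (
        [-math.sqrt(1.5), -1.0, -math.sqrt(0.5),
         math.sqrt(0.5), 1.0, math.sqrt(1.5)],
        [1.0 / 6.0] * 6,
    ),
    "mammen": (
        [-(SQRT5 - 1.0) / 2.0, (SQRT5 + 1.0) / 2.0],
        [(SQRT5 + 1.0) / (2.0 * SQRT5), (SQRT5 - 1.0) / (2.0 * SQRT5)],
    ),
}


def draw_cluster_weights(weight_type, n_clusters, rng):
    """Draw one bootstrap weight per cluster (hypothetical helper)."""
    if weight_type not in WEIGHTS:
        raise ValueError(f"unknown weight_type: {weight_type!r}")
    values, probs = WEIGHTS[weight_type]
    return rng.choices(values, weights=probs, k=n_clusters)


def raw_moment(weight_type, order):
    """Exact raw moment E[v**order] of the chosen weight distribution."""
    values, probs = WEIGHTS[weight_type]
    return sum(p * v ** order for v, p in zip(values, probs))


rng = random.Random(42)
example_draw = draw_cluster_weights("webb", n_clusters=8, rng=rng)
```

All three distributions have mean 0 and variance 1. The Mammen weights additionally have a third moment of 1, which is the skewness correction mentioned in the parameter description. With very few clusters, Rademacher weights can produce only a small number of distinct bootstrap samples, which is the usual motivation for the six-point alternative.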
Examples
>>> from diff_diff.utils import wild_bootstrap_se
>>> results = wild_bootstrap_se(
...     X, y, residuals, cluster_ids,
...     coefficient_index=3,  # ATT coefficient
...     n_bootstrap=999,
...     weight_type="rademacher",
...     seed=42
... )
>>> print(f"Bootstrap SE: {results.se:.4f}")
>>> print(f"Bootstrap p-value: {results.p_value:.4f}")
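The example above assumes that X, y, residuals, and cluster_ids already exist. One way to prepare them for a simple two-group, two-period DiD is sketched below with NumPy. The simulated data, the sample sizes, and the column ordering are assumptions made for illustration; the ordering is chosen so that the treatment*post interaction sits at index 3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated panel: 20 clusters of 30 observations (illustrative sizes only).
n_clusters, n_per_cluster = 20, 30
n = n_clusters * n_per_cluster

cluster_ids = np.repeat(np.arange(n_clusters), n_per_cluster)
treated = (cluster_ids < n_clusters // 2).astype(float)  # assigned by cluster
post = rng.integers(0, 2, size=n).astype(float)

# Column order determines coefficient_index:
# 0 = intercept, 1 = treated, 2 = post, 3 = treated * post
X = np.column_stack([np.ones(n), treated, post, treated * post])

# Outcome with a shared cluster-level shock, so errors correlate within cluster.
cluster_shock = rng.normal(size=n_clusters)[cluster_ids]
y = (1.0 + 0.5 * treated + 0.3 * post + 2.0 * treated * post
     + cluster_shock + rng.normal(size=n))

# Unrestricted OLS fit and its residuals, as the residuals parameter expects.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
```

These arrays have the shapes listed under Parameters, (n, k) for X and (n,) for the others, and can be passed to wild_bootstrap_se with coefficient_index=3.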
References
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2008). Bootstrap-Based Improvements for Inference with Clustered Errors. The Review of Economics and Statistics, 90(3), 414-427.
MacKinnon, J. G., & Webb, M. D. (2018). The wild bootstrap for few (treated) clusters. The Econometrics Journal, 21(2), 114-135.