diff_diff.load_mpdta
- diff_diff.load_mpdta(force_download=False)
Load the Minimum Wage Panel Dataset for DiD Analysis (mpdta).
This is a simulated dataset from the R did package that mimics county-level employment data under staggered minimum wage increases. It’s designed specifically for teaching the Callaway-Sant’Anna estimator.
- Parameters:
force_download (bool, default=False) – If True, re-download the dataset even if a cached copy exists.
- Returns:
Panel dataset with columns:
  - countyreal : int - County identifier
  - year : int - Year (2003-2007)
  - lpop : float - Log population
  - lemp : float - Log employment (outcome)
  - first_treat : int - Year of minimum wage increase (0 = never)
  - treat : int - 1 if ever treated, 0 otherwise
- Return type:
pd.DataFrame
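The documented schema lends itself to a quick sanity check in plain pandas. The sketch below does not call `load_mpdta`; it builds a tiny stand-in frame with the same column names (the values are made up, not taken from mpdta) and uses a hypothetical helper, `summarize_cohorts`, to confirm that `treat` agrees with `first_treat` and to count counties per treatment cohort.

```python
import pandas as pd

# Stand-in panel with the documented mpdta column names; values are illustrative only.
toy = pd.DataFrame({
    "countyreal": [1, 1, 2, 2, 3, 3],
    "year": [2003, 2004, 2003, 2004, 2003, 2004],
    "lpop": [5.1, 5.1, 4.2, 4.2, 3.9, 3.9],
    "lemp": [8.0, 8.1, 7.5, 7.4, 6.9, 7.0],
    "first_treat": [2004, 2004, 0, 0, 0, 0],
    "treat": [1, 1, 0, 0, 0, 0],
})


def summarize_cohorts(df):
    """Hypothetical helper: validate treat against first_treat, then count counties per cohort."""
    # first_treat == 0 marks never-treated counties, so treat should equal (first_treat > 0).
    expected_treat = (df["first_treat"] > 0).astype(int)
    if not (df["treat"] == expected_treat).all():
        raise ValueError("treat is inconsistent with first_treat")
    # Number of distinct counties in each cohort (0 = never treated).
    return df.groupby("first_treat")["countyreal"].nunique()


cohorts = summarize_cohorts(toy)
print(cohorts)
```

Assuming the real frame matches the column list above, the same helper should work unchanged on the result of `load_mpdta()`.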
Notes
This dataset is included in the R did package and is commonly used in tutorials demonstrating the Callaway-Sant’Anna estimator.
References
Callaway, B., & Sant’Anna, P. H. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.
Examples
>>> from diff_diff.datasets import load_mpdta
>>> from diff_diff import CallawaySantAnna
>>>
>>> mpdta = load_mpdta()
>>> cs = CallawaySantAnna()
>>> results = cs.fit(
...     mpdta,
...     outcome="lemp",
...     unit="countyreal",
...     time="year",
...     first_treat="first_treat"
... )
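For intuition about what the estimator targets, the following sketch computes a single unconditional group-time effect, ATT(g, t) = E[Y_t - Y_{g-1} | G = g] - E[Y_t - Y_{g-1} | never treated], by hand. It is not the library's implementation: the function `att_gt` is a hypothetical name, the toy data are invented, and covariates, inference, and aggregation are all omitted.

```python
import pandas as pd

# Toy balanced panel: counties 1-2 are first treated in 2004, counties 3-4 are never treated.
toy = pd.DataFrame({
    "countyreal": [1, 1, 2, 2, 3, 3, 4, 4],
    "year": [2003, 2004] * 4,
    "lemp": [8.0, 7.9, 7.0, 6.8, 6.0, 6.1, 5.0, 5.1],
    "first_treat": [2004, 2004, 2004, 2004, 0, 0, 0, 0],
})


def att_gt(df, g, t):
    """Illustrative unconditional ATT(g, t) with never-treated units as the comparison group."""
    # One row per county, one column per year.
    wide = df.pivot(index="countyreal", columns="year", values="lemp")
    # Outcome change from the last pre-treatment period (g - 1) to period t.
    change = wide[t] - wide[g - 1]
    # Cohort label for each county (0 = never treated).
    cohort = df.groupby("countyreal")["first_treat"].first()
    # Difference between the mean change of cohort g and that of the never-treated.
    return change[cohort == g].mean() - change[cohort == 0].mean()


estimate = att_gt(toy, g=2004, t=2004)
print(round(estimate, 3))
```

With a real panel, one such quantity exists for every cohort and period pair, which is why estimators of this kind usually report aggregated summaries as well.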