diff_diff.load_mpdta

diff_diff.load_mpdta(force_download=False)

Load the Minimum Wage Panel Dataset for DiD Analysis (mpdta).

This is a simulated dataset from the R did package that mimics county-level employment data under staggered minimum wage increases. It’s designed specifically for teaching the Callaway-Sant’Anna estimator.

Parameters:

force_download (bool, default=False) – If True, re-download the dataset even if cached.

Returns:

Panel dataset with columns:

- countyreal : int - County identifier
- year : int - Year (2003-2007)
- lpop : float - Log population
- lemp : float - Log employment (outcome)
- first_treat : int - Year of minimum wage increase (0 = never treated)
- treat : int - 1 if ever treated, 0 otherwise

Return type:

pd.DataFrame
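The column layout described above can be sanity-checked with plain pandas before fitting anything. The sketch below does not call load_mpdta; it builds a tiny hypothetical frame (the name toy and all of its values are made up for illustration) that only mimics the documented schema. It then checks that treat agrees with first_treat, counts counties per treatment cohort, and derives a post-treatment indicator. The post column is an illustrative addition, not part of the dataset.

```python
import pandas as pd

# Hypothetical rows that only mimic the documented schema; the values are
# invented and are NOT taken from the real mpdta data.
toy = pd.DataFrame({
    "countyreal":  [1, 1, 2, 2, 3, 3],
    "year":        [2003, 2004, 2003, 2004, 2003, 2004],
    "lpop":        [5.1, 5.1, 3.2, 3.2, 4.0, 4.0],
    "lemp":        [8.4, 8.3, 6.0, 6.1, 7.2, 7.1],
    "first_treat": [2004, 2004, 0, 0, 2004, 2004],
    "treat":       [1, 1, 0, 0, 1, 1],
})

# treat should be 1 exactly when first_treat is nonzero (0 = never treated).
ever_treated = (toy["first_treat"] != 0).astype(int)
flags_consistent = bool((ever_treated == toy["treat"]).all())

# Number of distinct counties in each treatment cohort.
cohort_sizes = toy.groupby("first_treat")["countyreal"].nunique().to_dict()

# Illustrative post-treatment indicator: the county is ever treated and the
# current year is at or after its first treatment year.
toy["post"] = (
    (toy["first_treat"] != 0) & (toy["year"] >= toy["first_treat"])
).astype(int)

print(flags_consistent)
print(cohort_sizes)
print(toy[["countyreal", "year", "first_treat", "post"]])
```

The same three checks should apply unchanged to the frame returned by load_mpdta(), assuming its columns match the description above.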

Notes

This dataset is included in the R did package and is commonly used in tutorials demonstrating the Callaway-Sant’Anna estimator.

References

Callaway, B., & Sant’Anna, P. H. C. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.

Examples

>>> from diff_diff.datasets import load_mpdta
>>> from diff_diff import CallawaySantAnna
>>>
>>> mpdta = load_mpdta()
>>> cs = CallawaySantAnna()
>>> results = cs.fit(
...     mpdta,
...     outcome="lemp",
...     unit="countyreal",
...     time="year",
...     first_treat="first_treat"
... )