diff_diff.make_post_indicator

diff_diff.make_post_indicator(data, time_column, post_periods=None, treatment_start=None, new_column='post')

Create a binary post-treatment indicator column.

Parameters:
  • data (pd.DataFrame) – Input DataFrame.

  • time_column (str) – Name of the time/period column.

  • post_periods (Any or list, optional) – Specific period value(s) that are post-treatment. Periods matching these values get post=1, others get post=0.

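  • treatment_start (Any, optional) – The first post-treatment period. All periods >= this value get post=1. Works with numeric periods, strings, or dates. Strings are compared alphabetically (lexicographically), so '10' sorts before '9'; zero-pad string periods if they encode numbers.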

  • new_column (str, default="post") – Name of the new post indicator column.

Returns:

DataFrame with the new post indicator column added.

Return type:

pd.DataFrame

Examples

Using specific post periods:

>>> df = pd.DataFrame({'year': [2018, 2019, 2020, 2021], 'y': [1, 2, 3, 4]})
>>> df = make_post_indicator(df, 'year', post_periods=[2020, 2021])
>>> df['post'].tolist()
[0, 0, 1, 1]

Using treatment start:

>>> df = make_post_indicator(df, 'year', treatment_start=2020)
>>> df['post'].tolist()
[0, 0, 1, 1]
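
The two modes above can be approximated in plain pandas. The following is a minimal sketch of the documented behavior, not the library's implementation: the name post_indicator_sketch is hypothetical, and copying the input, giving post_periods precedence when both arguments are passed, and raising ValueError when neither is passed are all assumptions made for illustration.

```python
import pandas as pd


def post_indicator_sketch(data, time_column, post_periods=None,
                          treatment_start=None, new_column="post"):
    """Hypothetical stand-in that mirrors the documented semantics."""
    out = data.copy()  # assumption: the input frame is not modified in place
    if post_periods is not None:
        # Accept a single period value or a collection of values.
        if not isinstance(post_periods, (list, tuple, set)):
            post_periods = [post_periods]
        mask = out[time_column].isin(list(post_periods))
    elif treatment_start is not None:
        # Every period at or after the start counts as post-treatment.
        mask = out[time_column] >= treatment_start
    else:
        # Assumption: at least one of the two arguments is required.
        raise ValueError("pass post_periods or treatment_start")
    out[new_column] = mask.astype(int)
    return out


df = pd.DataFrame({"year": [2018, 2019, 2020, 2021], "y": [1, 2, 3, 4]})
by_periods = post_indicator_sketch(df, "year", post_periods=[2020, 2021])
by_start = post_indicator_sketch(df, "year", treatment_start=2020)
print(by_periods["post"].tolist())  # [0, 0, 1, 1]
print(by_start["post"].tolist())    # [0, 0, 1, 1]
```

Both calls yield the same indicator here because the listed post periods are exactly the periods at or after 2020. The two modes differ when post-treatment periods are not contiguous, in which case only post_periods can express them.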

Works with date columns:

>>> df = pd.DataFrame({'date': pd.to_datetime(['2020-01-01', '2020-06-01', '2021-01-01'])})
>>> df = make_post_indicator(df, 'date', treatment_start='2020-06-01')
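
To see what a >= comparison does for dates and for strings, the snippet below uses plain pandas only. It is a sketch under the assumption that the indicator reduces to an element-wise >= comparison, as the treatment_start description states. It does not call make_post_indicator, and the variable names are illustrative.

```python
import pandas as pd

# Dates: comparison is chronological once both sides are timestamps.
dates = pd.Series(pd.to_datetime(["2020-01-01", "2020-06-01", "2021-01-01"]))
date_mask = (dates >= pd.Timestamp("2020-06-01")).astype(int)
print(date_mask.tolist())  # [0, 1, 1]

# Strings: comparison is lexicographic, so '9' counts as >= '10'.
labels = pd.Series(["9", "10", "11"])
string_mask = (labels >= "10").astype(int)
print(string_mask.tolist())  # [1, 1, 1]

# Zero-padding makes lexicographic order agree with numeric order.
padded = pd.Series(["09", "10", "11"])
padded_mask = (padded >= "10").astype(int)
print(padded_mask.tolist())  # [0, 1, 1]
```

If string periods encode numbers, zero-padding them or converting the column to a numeric or datetime dtype first avoids the ordering surprise shown in the second case.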