diff_diff.make_treatment_indicator

diff_diff.make_treatment_indicator(data, column, treated_values=None, threshold=None, above_threshold=True, new_column='treated')

Create a binary treatment indicator column from various input types.

Parameters:
  • data (pd.DataFrame) – Input DataFrame.

  • column (str) – Name of the column to use for creating the treatment indicator.

  • treated_values (Any or list, optional) – Value or list of values in column that indicate treatment. Rows with these values get 1 in the new indicator column; all other rows get 0.

  • threshold (float, optional) – Numeric threshold for creating treatment. Used when the treatment is based on a continuous variable (e.g., treat firms above median size).

  • above_threshold (bool, default=True) – If True, values >= threshold are treated. If False, values <= threshold are treated. Only used when threshold is specified.

  • new_column (str, default="treated") – Name of the new treatment indicator column.

Returns:

DataFrame with the new treatment indicator column added.

Return type:

pd.DataFrame
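
The parameter descriptions above can be read as a small piece of pandas logic. The following is a minimal sketch under stated assumptions, not the library's implementation: sketch_treatment_indicator is a hypothetical name, the threshold comparisons follow the above_threshold description as written, and copying the input and raising ValueError when neither option is given are assumptions made for illustration.

```python
import pandas as pd


def sketch_treatment_indicator(data, column, treated_values=None,
                               threshold=None, above_threshold=True,
                               new_column="treated"):
    """Hypothetical re-implementation, for illustration only."""
    out = data.copy()  # assumption: the input frame is left unmodified
    if treated_values is not None:
        # Accept either a single value or a collection of values.
        if not isinstance(treated_values, (list, tuple, set)):
            treated_values = [treated_values]
        out[new_column] = out[column].isin(treated_values).astype(int)
    elif threshold is not None:
        # Comparisons mirror the above_threshold description as documented.
        if above_threshold:
            mask = out[column] >= threshold
        else:
            mask = out[column] <= threshold
        out[new_column] = mask.astype(int)
    else:
        # Assumption: one of the two options must be supplied.
        raise ValueError("provide treated_values or threshold")
    return out


df = pd.DataFrame({"group": ["A", "B", "C", "C"], "size": [10, 50, 100, 200]})

# Several treated values and a custom name for the indicator column.
by_group = sketch_treatment_indicator(
    df, "group", treated_values=["A", "C"], new_column="in_program"
)
# Threshold on a numeric column, default column name.
by_size = sketch_treatment_indicator(df, "size", threshold=75)
```

Here by_group gains an in_program column and by_size gains a treated column, while df itself is unchanged in this sketch.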

Examples

Create treatment from categorical variable:

>>> import pandas as pd
>>> from diff_diff import make_treatment_indicator
>>> df = pd.DataFrame({'group': ['A', 'A', 'B', 'B'], 'y': [1, 2, 3, 4]})
>>> df = make_treatment_indicator(df, 'group', treated_values='A')
>>> df['treated'].tolist()
[1, 1, 0, 0]

Create treatment from numeric threshold:

>>> df = pd.DataFrame({'size': [10, 50, 100, 200], 'y': [1, 2, 3, 4]})
>>> df = make_treatment_indicator(df, 'size', threshold=75)
>>> df['treated'].tolist()
[0, 0, 1, 1]

Treat units at or below a threshold (reusing df from the previous example, so the existing treated column is overwritten):

>>> df = make_treatment_indicator(df, 'size', threshold=75, above_threshold=False)
>>> df['treated'].tolist()
[1, 1, 0, 0]
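
The threshold parameter is described with the example of treating firms above median size. A minimal sketch of deriving such a cutoff from the data is shown below. The firms frame and the large column name are made up for illustration, and a plain pandas comparison stands in for the library call so the snippet runs without diff_diff installed.

```python
import pandas as pd

# Hypothetical data: four firms with a numeric size measure.
firms = pd.DataFrame({"firm": ["a", "b", "c", "d"], "size": [12, 40, 60, 90]})

# Data-driven cutoff: the median of the size column.
cutoff = firms["size"].median()

# Plain-pandas stand-in for a call such as
#   make_treatment_indicator(firms, "size", threshold=cutoff, new_column="large")
# No value equals the cutoff here, so boundary handling does not matter.
firms["large"] = (firms["size"] > cutoff).astype(int)
```

Choosing data where no value sits exactly on the cutoff keeps the example independent of how ties at the threshold are handled.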