diff_diff.make_treatment_indicator
- diff_diff.make_treatment_indicator(data, column, treated_values=None, threshold=None, above_threshold=True, new_column='treated')
Create a binary treatment indicator column from various input types.
- Parameters:
data (pd.DataFrame) – Input DataFrame.
column (str) – Name of the column to use for creating the treatment indicator.
treated_values (Any or list, optional) – Value(s) of column that indicate treatment. Rows whose value matches get 1 in the new column; all other rows get 0.
threshold (float, optional) – Numeric threshold for creating treatment. Used when the treatment is based on a continuous variable (e.g., treat firms above median size).
above_threshold (bool, default=True) – If True, values >= threshold are treated. If False, values <= threshold are treated. Only used when threshold is specified.
new_column (str, default="treated") – Name of the new treatment indicator column.
- Returns:
DataFrame with the new treatment indicator column added.
- Return type:
pd.DataFrame
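The parameters above describe two ways to build the indicator: membership in treated_values, or comparison against threshold. The following is a minimal sketch of that documented behaviour in plain pandas, not the library's source. The helper name sketch_treatment_indicator is hypothetical, and two details are assumptions that the documentation does not state: that exactly one of treated_values and threshold must be given, and that the input DataFrame is copied rather than modified in place.

```python
import pandas as pd


def sketch_treatment_indicator(data, column, treated_values=None,
                               threshold=None, above_threshold=True,
                               new_column="treated"):
    """Hypothetical re-creation of the documented rules, not the library code."""
    # Assumption: exactly one of treated_values / threshold is expected.
    if (treated_values is None) == (threshold is None):
        raise ValueError("specify exactly one of treated_values or threshold")

    out = data.copy()  # assumption: the caller's frame is left unmodified

    if treated_values is not None:
        # Accept either a single value or a collection of values.
        if not isinstance(treated_values, (list, tuple, set)):
            treated_values = [treated_values]
        mask = out[column].isin(list(treated_values))
    elif above_threshold:
        mask = out[column] >= threshold  # documented rule: >= threshold
    else:
        mask = out[column] <= threshold  # documented rule: <= threshold

    # Convert the boolean mask to a 0/1 integer column.
    out[new_column] = mask.astype(int)
    return out


df = pd.DataFrame({"group": ["A", "A", "B", "B"], "size": [10, 50, 100, 200]})
by_group = sketch_treatment_indicator(df, "group", treated_values="A")
by_size = sketch_treatment_indicator(df, "size", threshold=75, new_column="large")
```

Here by_group gains a treated column driven by the group labels, while by_size gains a large column driven by the numeric cutoff, which shows how new_column keeps several indicators side by side.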
Examples
Create treatment from categorical variable:
>>> import pandas as pd
>>> from diff_diff import make_treatment_indicator
>>> df = pd.DataFrame({'group': ['A', 'A', 'B', 'B'], 'y': [1, 2, 3, 4]})
>>> df = make_treatment_indicator(df, 'group', treated_values='A')
>>> df['treated'].tolist()
[1, 1, 0, 0]
Create treatment from numeric threshold:
>>> df = pd.DataFrame({'size': [10, 50, 100, 200], 'y': [1, 2, 3, 4]})
>>> df = make_treatment_indicator(df, 'size', threshold=75)
>>> df['treated'].tolist()
[0, 0, 1, 1]
Treat units below a threshold:
>>> df = make_treatment_indicator(df, 'size', threshold=75, above_threshold=False)
>>> df['treated'].tolist()
[1, 1, 0, 0]
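None of the examples above contains a value exactly equal to the threshold. Both documented rules are inclusive (>= and <=), so such a value would count as treated in either direction. The sketch below applies those two inequalities directly in pandas to show the consequence. It does not call the library, so it illustrates the documented rule rather than confirming the implementation.

```python
import pandas as pd

sizes = pd.Series([10, 75, 200], name="size")
threshold = 75

# Documented rule for above_threshold=True: values >= threshold are treated.
above = (sizes >= threshold).astype(int)

# Documented rule for above_threshold=False: values <= threshold are treated.
below = (sizes <= threshold).astype(int)

# A strict complement of `above`, if mutually exclusive groups are needed.
strictly_below = 1 - above

print(above.tolist())           # [0, 1, 1]
print(below.tolist())           # [1, 1, 0]
print(strictly_below.tolist())  # [1, 0, 0]
```

Under these rules the two indicators are not complements of each other when a row sits exactly on the threshold, so it may be worth choosing a cutoff that no observation equals, or deriving the second group as one minus the first.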