diff_diff.wide_to_long
- diff_diff.wide_to_long(data, value_columns, id_column, time_name='period', value_name='value', time_values=None)
Convert wide-format panel data to long format for DiD analysis.
Wide format has one row per unit, with a separate column for each time period. Long format has one row per unit-period combination.
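To make the two layouts concrete, the reshape can be sketched with plain pandas. This is only an illustration of the wide-to-long idea using `DataFrame.melt`; it is not taken from the `diff_diff` source, and the column names and data are made up for the example.

```python
import pandas as pd

# Wide format: one row per firm, one column per year (illustrative data).
wide_df = pd.DataFrame({
    "firm_id": [1, 2],
    "sales_2019": [100, 150],
    "sales_2020": [110, 160],
})

# Long format: one row per firm-year combination.
long_df = wide_df.melt(
    id_vars="firm_id",
    value_vars=["sales_2019", "sales_2020"],
    var_name="year",
    value_name="sales",
)

# Replace the original column names with the time values they stand for.
long_df["year"] = long_df["year"].map({"sales_2019": 2019, "sales_2020": 2020})

# Order rows by unit, then period, as is conventional for panel data.
long_df = long_df.sort_values(["firm_id", "year"]).reset_index(drop=True)
print(long_df)
```

Two units observed over two periods give four rows in the long layout; in general the row count is the number of units times the number of periods.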
- Parameters:
data (pd.DataFrame) – Wide-format DataFrame with one row per unit.
value_columns (list of str) – Column names containing the outcome values for each period. These should be in chronological order.
id_column (str) – Column name for the unit identifier.
time_name (str, default="period") – Name for the new time period column.
value_name (str, default="value") – Name for the new value/outcome column.
time_values (list, optional) – Values to use for time periods. If None, uses 0, 1, 2, … Must have the same length as value_columns.
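The relationship between `value_columns` and `time_values` described above can be sketched as a small helper. The function name `resolve_time_values` is hypothetical and is not part of `diff_diff`; raising `ValueError` on a length mismatch is also an assumption, since this page does not say how the library reports that case.

```python
def resolve_time_values(value_columns, time_values=None):
    """Hypothetical helper: pick the period label for each value column."""
    if time_values is None:
        # Default described in the parameter list: 0, 1, 2, ... by position.
        return list(range(len(value_columns)))
    if len(time_values) != len(value_columns):
        # Assumed error type; the library's actual behavior may differ.
        raise ValueError(
            "time_values must have the same length as value_columns"
        )
    return list(time_values)


columns = ["sales_2019", "sales_2020", "sales_2021"]
default_labels = resolve_time_values(columns)
explicit_labels = resolve_time_values(columns, [2019, 2020, 2021])
```

Because labels are paired with columns by position, the order of `value_columns` determines which period each column is assigned to.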
- Returns:
Long-format DataFrame with one row per unit-period.
- Return type:
pd.DataFrame
Examples
>>> import pandas as pd
>>> from diff_diff import wide_to_long
>>> wide_df = pd.DataFrame({
...     'firm_id': [1, 2, 3],
...     'sales_2019': [100, 150, 200],
...     'sales_2020': [110, 160, 210],
...     'sales_2021': [120, 170, 220]
... })
>>> long_df = wide_to_long(
...     wide_df,
...     value_columns=['sales_2019', 'sales_2020', 'sales_2021'],
...     id_column='firm_id',
...     time_name='year',
...     value_name='sales',
...     time_values=[2019, 2020, 2021]
... )
>>> len(long_df)
9
>>> long_df.columns.tolist()
['firm_id', 'year', 'sales']