diff_diff.summarize_did_data

diff_diff.summarize_did_data(data, outcome, treatment, time, unit=None)

Generate summary statistics by treatment group and time period.

Parameters:
  • data (pd.DataFrame) – Input data.

  • outcome (str) – Name of outcome variable column.

  • treatment (str) – Name of treatment indicator column.

  • time (str) – Name of time/period column.

  • unit (str, optional) – Name of unit identifier column.

Returns:

Summary statistics with columns for each treatment-time combination.

Return type:

pd.DataFrame

Examples

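>>> import pandas as pd
>>> from diff_diff import summarize_did_data
>>> # Eight observations covering all four treatment-by-period cells
>>> df = pd.DataFrame({
...     'y': [10, 11, 12, 13, 20, 21, 22, 23],
...     'treated': [0, 0, 1, 1, 0, 0, 1, 1],
...     'post': [0, 1, 0, 1, 0, 1, 0, 1]
... })
>>> summary = summarize_did_data(df, 'y', 'treated', 'post')
>>> print(summary)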
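
For intuition about what a summary by treatment group and time period contains, the sketch below computes comparable per-cell statistics with plain pandas on the same data. It is an illustration only, not the library's implementation: the choice of statistics (count, mean, standard deviation) and the names cells, means, and did are assumptions, and the layout returned by summarize_did_data may differ.

```python
import pandas as pd

df = pd.DataFrame({
    "y": [10, 11, 12, 13, 20, 21, 22, 23],
    "treated": [0, 0, 1, 1, 0, 0, 1, 1],
    "post": [0, 1, 0, 1, 0, 1, 0, 1],
})

# One row per (treated, post) cell with the count, mean, and
# sample standard deviation of the outcome.
# Assumption: these statistics are chosen for illustration only.
cells = df.groupby(["treated", "post"])["y"].agg(["count", "mean", "std"])

# Raw 2x2 difference-in-differences computed from the four cell means:
# (treated post - treated pre) - (control post - control pre).
means = cells["mean"]
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])

print(cells)
print("Raw DiD:", did)
```

In this toy data both groups rise by one unit between periods, so the raw difference-in-differences is zero.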