Comparison with R Packages ========================== This guide compares diff-diff with popular R packages for DiD analysis, helping users familiar with R transition to Python. Overview -------- .. list-table:: :header-rows: 1 :widths: 25 25 25 25 * - Feature - diff-diff (Python) - did (R) - Other R * - Basic DiD - ✅ ``DifferenceInDifferences`` - ✅ ``att_gt`` - ✅ ``fixest::feols`` * - Staggered DiD - ✅ ``CallawaySantAnna`` - ✅ ``att_gt`` - ``did2s``, ``DRDID`` * - Covariate adjustment - ✅ DR, IPW, Reg - ✅ DR, IPW, Reg - ✅ Varies * - Honest DiD - ✅ ``HonestDiD`` - ``HonestDiD`` package - N/A * - Synthetic DiD - ✅ ``SyntheticDiD`` - ``synthdid`` package - N/A * - Wild bootstrap - ✅ ``wild_bootstrap_se`` - ``fwildclusterboot`` - N/A Package Correspondence ---------------------- R ``did`` Package → diff-diff ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The R ``did`` package by Callaway and Sant'Anna is the gold standard for staggered DiD. Here's how to translate common operations: **Basic estimation:** .. code-block:: r # R (did package) library(did) out <- att_gt( yname = "Y", tname = "period", idname = "id", gname = "G", data = data ) .. code-block:: python # Python (diff-diff) from diff_diff import CallawaySantAnna cs = CallawaySantAnna() results = cs.fit( data, outcome='Y', time='period', unit='id', first_treat='G' ) **With covariates (doubly robust):** .. code-block:: r # R out <- att_gt( yname = "Y", tname = "period", idname = "id", gname = "G", xformla = ~ X1 + X2, est_method = "dr", data = data ) .. code-block:: python # Python cs = CallawaySantAnna(estimation_method='dr') results = cs.fit( data, outcome='Y', time='period', unit='id', first_treat='G', covariates=['X1', 'X2'] ) **Aggregations:** .. code-block:: r # R agg_simple <- aggte(out, type = "simple") agg_dynamic <- aggte(out, type = "dynamic") agg_group <- aggte(out, type = "group") .. code-block:: python # Python overall_att = results.att # Simple aggregation event_study = results.aggregate('event_time') # Dynamic by_group = results.aggregate('group') # By cohort R ``HonestDiD`` Package → diff-diff ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The HonestDiD package implements Rambachan & Roth (2023) sensitivity analysis: **Relative magnitudes (ΔRM):** .. code-block:: r # R library(HonestDiD) delta_rm_results <- createSensitivityResults_relativeMagnitudes( betahat = beta_hat, sigma = sigma, numPrePeriods = 4, numPostPeriods = 3, Mbarvec = seq(0, 2, by = 0.5) ) .. code-block:: python # Python from diff_diff import HonestDiD, DeltaRM honest = HonestDiD(delta=DeltaRM(M_bar=1.0)) results = honest.fit(event_study_results) # Sensitivity analysis over M grid sensitivity = honest.sensitivity_analysis( event_study_results, M_grid=[0, 0.5, 1.0, 1.5, 2.0] ) **Smoothness restrictions (ΔSD):** .. code-block:: r # R delta_sd_results <- createSensitivityResults( betahat = beta_hat, sigma = sigma, numPrePeriods = 4, numPostPeriods = 3, Mvec = seq(0, 0.1, by = 0.02) ) .. code-block:: python # Python from diff_diff import HonestDiD, DeltaSD honest = HonestDiD(delta=DeltaSD(M=0.05)) results = honest.fit(event_study_results) R ``synthdid`` Package → diff-diff ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The synthdid package implements Arkhangelsky et al. (2021): .. code-block:: r # R library(synthdid) setup <- panel.matrices(data, unit = "unit", time = "time", outcome = "Y", treatment = "treatment") tau.hat <- synthdid_estimate(setup$Y, setup$N0, setup$T0) .. code-block:: python # Python from diff_diff import SyntheticDiD sdid = SyntheticDiD() results = sdid.fit( data, outcome='Y', unit='unit', time='time', treated='treatment', treatment_start=T0 ) Key Differences --------------- Design Philosophy ~~~~~~~~~~~~~~~~~ - **diff-diff**: sklearn-style API with ``fit()`` method, returning rich result objects - **R packages**: Function-based, returning lists or S3/S4 objects Inference ~~~~~~~~~ - **diff-diff**: Analytical SEs by default, wild bootstrap available - **R did**: Multiplier bootstrap by default Fixed Effects ~~~~~~~~~~~~~ - **diff-diff**: ``absorb`` parameter for high-dimensional FE (within transformation) - **R fixest**: ``feols`` with ``|`` notation for absorbed FE Output Format ~~~~~~~~~~~~~ diff-diff results have convenience methods: .. code-block:: python results.summary() # Print formatted table results.to_dict() # Dictionary representation results.to_dataframe() # pandas DataFrame Feature Comparison Table ------------------------ .. list-table:: :header-rows: 1 :widths: 40 15 15 15 15 * - Feature - diff-diff - R did - R HonestDiD - R synthdid * - Basic 2x2 DiD - ✅ - ✅ - ❌ - ❌ * - TWFE - ✅ - ❌ - ❌ - ❌ * - Staggered DiD (CS) - ✅ - ✅ - ❌ - ❌ * - Covariate adjustment - ✅ - ✅ - ❌ - ❌ * - Doubly robust - ✅ - ✅ - ❌ - ❌ * - Group-time effects - ✅ - ✅ - ❌ - ❌ * - Event study - ✅ - ✅ - ✅ - ❌ * - Synthetic DiD - ✅ - ❌ - ❌ - ✅ * - Honest DiD (ΔRM) - ✅ - ❌ - ✅ - ❌ * - Honest DiD (ΔSD) - ✅ - ❌ - ✅ - ❌ * - Wild bootstrap - ✅ - ❌ - ❌ - ❌ * - Cluster-robust SE - ✅ - ✅ - ❌ - ✅ * - Placebo tests - ✅ - ❌ - ❌ - ✅ * - Parallel trends tests - ✅ - ✅ - ❌ - ❌ * - Bacon decomposition - ✅ - ❌ - ❌ - ❌ Migration Tips -------------- 1. **Column names**: diff-diff uses string column names, similar to R packages 2. **Formula interface**: diff-diff supports R-style formulas for basic DiD: ``formula='y ~ treated * post'`` 3. **Results access**: Use ``.att``, ``.se``, ``.ci`` instead of ``$att``, ``$se`` 4. **Visualization**: ``plot_event_study()`` produces matplotlib figures similar to ``ggdid()`` output 5. **Missing data**: diff-diff requires complete data; use ``balance_panel()`` or ``dropna()`` first