DiagnosticReport

DiagnosticReport orchestrates the library’s existing diagnostic functions (parallel trends, pre-trends power, HonestDiD sensitivity, Goodman-Bacon, design-effect, EPV, heterogeneity, and estimator-native checks for SyntheticDiD and TROP) into a single report with a stable AI-legible schema.

Construction is cheap and triggers no computation. run_all() runs the diagnostics and caches the result; later calls to to_dict() or summary() reuse that cached result instead of recomputing.
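The compute-on-demand behaviour described above can be pictured with a small stand-in class. This is only a sketch of the general pattern: LazyReport, its compute_calls counter, and the n_inputs key are hypothetical names made up for illustration and are not part of diff_diff.

```python
from typing import Any, Dict, Optional


class LazyReport:
    """Hypothetical stand-in showing compute-on-demand with caching."""

    def __init__(self, payload: Dict[str, Any]) -> None:
        # Construction only stores inputs; nothing is computed yet.
        self._payload = payload
        self._cached: Optional[Dict[str, Any]] = None
        self.compute_calls = 0  # counts how often the real work ran

    def run_all(self) -> Dict[str, Any]:
        # The first call does the work; later calls return the cached result.
        if self._cached is None:
            self.compute_calls += 1
            self._cached = {"n_inputs": len(self._payload)}
        return self._cached

    def to_dict(self) -> Dict[str, Any]:
        # Accessors go through run_all(), so they share one cache.
        return dict(self.run_all())

    def summary(self) -> str:
        return f"{self.run_all()['n_inputs']} input(s) checked"


report = LazyReport({"a": 1, "b": 2})
calls_after_init = report.compute_calls  # still 0: construction did no work
first = report.to_dict()                 # triggers the single compute
text = report.summary()                  # served from the cache
```

Routing every accessor through one caching method is what makes repeated calls cheap and keeps their outputs consistent with each other.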

Methodology deviations (no traffic-light gates, opt-in placebo battery, estimator-native diagnostic routing, power-aware phrasing threshold) are documented in docs/methodology/REPORTING.md.

The schema carries a top-level target_parameter block (experimental) naming what the headline scalar represents per estimator. See the “Target parameter” section of docs/methodology/REPORTING.md for the per-estimator dispatch and schema shape.

Data-dependent checks (2x2 parallel trends on simple DiD, Goodman-Bacon decomposition on staggered estimators, the EfficientDiD Hausman PT-All vs PT-Post pretest) require the raw panel and its column names. Pass data together with whichever of outcome, treatment, unit, time, and first_treat apply; they are forwarded to the relevant runners. Without these kwargs, those specific checks are skipped with an explicit reason while the rest of the battery still runs.

For survey-weighted fits (any result carrying survey_metadata) pass the original SurveyDesign via survey_design=<design>. It is threaded through to bacon_decompose for a fit-faithful Goodman-Bacon replay. When survey_metadata is set but survey_design is not supplied, Bacon is skipped with an explicit reason so the report never emits an unweighted decomposition for a design that differs from the estimate; alternatively supply precomputed={'bacon': <BaconDecompositionResults>} with a survey-aware result.

The simple 2x2 parallel-trends helper has no survey-aware variant and is skipped unconditionally on a survey-backed DiDResults regardless of survey_design — the helper cannot consume the design even when it is available. Supply precomputed={'parallel_trends': <dict>} with a survey-aware pretest result to opt in.
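The skip rules in the preceding paragraphs can be summarised as a small decision function. The sketch below is an illustration only: plan_data_checks, its arguments, and the reason strings are hypothetical, it ignores estimator-type applicability, and both the order in which reasons are tested and the idea that a precomputed result bypasses the data requirement are assumptions rather than documented behaviour.

```python
from typing import Dict, FrozenSet, Optional


def plan_data_checks(
    has_data: bool,
    survey_backed: bool,
    has_survey_design: bool,
    precomputed_keys: FrozenSet[str] = frozenset(),
) -> Dict[str, Optional[str]]:
    """Map each check to None (it runs) or a skip reason (hypothetical helper)."""
    plan: Dict[str, Optional[str]] = {}

    # Goodman-Bacon: a precomputed result is used directly; otherwise the
    # panel is needed, and a survey-backed fit also needs its SurveyDesign.
    if "bacon" in precomputed_keys:
        plan["bacon"] = None
    elif not has_data:
        plan["bacon"] = "raw panel and column names not supplied"
    elif survey_backed and not has_survey_design:
        plan["bacon"] = "survey-weighted fit but survey_design not supplied"
    else:
        plan["bacon"] = None

    # 2x2 parallel trends: skipped on any survey-backed fit, with or without
    # a design, unless a precomputed survey-aware result is supplied.
    if "parallel_trends" in precomputed_keys:
        plan["parallel_trends"] = None
    elif survey_backed:
        plan["parallel_trends"] = "2x2 helper has no survey-aware variant"
    elif not has_data:
        plan["parallel_trends"] = "raw panel and column names not supplied"
    else:
        plan["parallel_trends"] = None

    return plan


weighted_no_design = plan_data_checks(True, True, False)
weighted_with_design = plan_data_checks(True, True, True)
```

Here weighted_with_design still reports a skip reason for parallel_trends, which mirrors the point that supplying survey_design helps Bacon but not the 2x2 helper.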

Example

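from diff_diff import CallawaySantAnna, DiagnosticReport

# df is assumed to be a long-format panel (one row per unit-period) that
# already contains the columns named below.
cs = CallawaySantAnna(base_period="universal").fit(
    df, outcome="outcome", unit="unit", time="period",
    first_treat="first_treat", aggregate="event_study",
)
# Passing the panel and column names enables the data-dependent checks.
dr = DiagnosticReport(
    cs,
    data=df,
    outcome="outcome",
    unit="unit",
    time="period",
    first_treat="first_treat",
)
print(dr.summary())  # short plain-English paragraph
dr.to_dataframe()  # one row per check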

API

class diff_diff.DiagnosticReport

Bases: object

Run the standard diff-diff diagnostic battery on a fitted result.

Parameters:
  • results (Any) – A fitted diff-diff results object (e.g. CallawaySantAnnaResults, DiDResults, SyntheticDiDResults). Any of the 16 result types in the library is accepted.

  • data (pandas.DataFrame, optional) – The underlying panel. Required for checks that need raw data (2x2 parallel-trends check on DiDResults; Bacon-from-scratch when results is not itself a Bacon fit; the opt-in placebo battery).

  • outcome, treatment, time, unit, first_treat (str, optional) – Column names identifying the panel structure.

  • pre_periods, post_periods (list, optional) – Explicit pre- and post-treatment period labels.

  • run_parallel_trends, run_sensitivity, run_placebo, run_bacon, run_design_effect, run_heterogeneity, run_epv, run_pretrends_power (bool) – Per-check opt-in flags. run_placebo defaults to False: the generic placebo battery is not implemented in MVP, so the placebo key remains reserved as skipped in the schema. (Exception: SyntheticControl’s in-space placebo permutation test is implemented; run it via results.in_space_placebo(). Its result is surfaced under estimator_native_diagnostics.in_space_placebo, not this generic section.) All other checks default to True and are further gated by estimator-type and instance-level applicability (see docs/methodology/REPORTING.md).

  • sensitivity_M_grid (tuple of float, default (0.5, 1.0, 1.5, 2.0)) – Grid of M values passed to HonestDiD.sensitivity_analysis. Yields a SensitivityResults object with breakdown_M populated.

  • sensitivity_method (str, default "relative_magnitude") – HonestDiD restriction type.

  • alpha (float, default 0.05) – Significance level used across checks.

  • survey_design (SurveyDesign, optional) – The SurveyDesign object used to fit a survey-weighted estimator. Required for fit-faithful replay of Goodman-Bacon on a survey-backed fit; threaded to bacon_decompose(survey_design=...). When the fit carries survey_metadata but survey_design is not supplied, Bacon is skipped with an explicit reason rather than replaying an unweighted decomposition for a design that does not match the estimate. The simple 2x2 parallel-trends helper (utils.check_parallel_trends) has no survey-aware variant; on a survey-backed DiDResults it is skipped unconditionally regardless of survey_design. Supply precomputed={'parallel_trends': ...} with a survey-aware pretest to opt in. See docs/methodology/REPORTING.md.

  • precomputed (dict, optional) –

    Map of check name to a pre-computed result object. Accepted keys (this is the full implemented list; unsupported keys raise ValueError):

    • "parallel_trends" — a dict returned by utils.check_parallel_trends (adapted into the schema shape).

    • "sensitivity" — a SensitivityResults (grid) or HonestDiDResults (single-M) object; used verbatim and no HonestDiD.sensitivity_analysis call is made.

    • "pretrends_power" — a PreTrendsPowerResults object.

    • "bacon" — a BaconDecompositionResults object.

    Other sections (design_effect, heterogeneity, epv) are read directly from the fitted result object and do not currently accept precomputed values — there is no expensive call to bypass. placebo is reserved in the schema but opt-in / deferred in MVP for the generic battery; SyntheticControl surfaces its in-space placebo under estimator_native_diagnostics (run results.in_space_placebo()).

  • outcome_label, treatment_label (str, optional) – Plain-English labels used in prose rendering.
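The precomputed parameter above accepts four keys and rejects anything else with ValueError. A minimal sketch of that kind of validation follows; validate_precomputed and the error message are hypothetical and only illustrate the documented contract, not the library's own code.

```python
from typing import Any, Dict, Optional

# The four keys listed as accepted in the parameter description.
ACCEPTED_PRECOMPUTED_KEYS = frozenset(
    {"parallel_trends", "sensitivity", "pretrends_power", "bacon"}
)


def validate_precomputed(precomputed: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of the mapping, or raise ValueError on unsupported keys."""
    if precomputed is None:
        return {}
    unknown = sorted(set(precomputed) - ACCEPTED_PRECOMPUTED_KEYS)
    if unknown:
        raise ValueError(
            f"Unsupported precomputed key(s): {unknown}; "
            f"accepted keys are {sorted(ACCEPTED_PRECOMPUTED_KEYS)}"
        )
    return dict(precomputed)


ok = validate_precomputed({"bacon": object()})

try:
    # "epv" is read from the fitted result, so it is not an accepted key.
    validate_precomputed({"epv": object()})
    rejected = False
except ValueError:
    rejected = True
```

Failing fast on unknown keys means a typo such as "bacon_decomp" surfaces as an error rather than as a silently recomputed check.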

__init__(results, *, data=None, outcome=None, treatment=None, time=None, unit=None, first_treat=None, pre_periods=None, post_periods=None, run_parallel_trends=True, run_sensitivity=True, run_placebo=False, run_bacon=True, run_design_effect=True, run_heterogeneity=True, run_epv=True, run_pretrends_power=True, sensitivity_M_grid=(0.5, 1.0, 1.5, 2.0), sensitivity_method='relative_magnitude', alpha=0.05, survey_design=None, precomputed=None, outcome_label=None, treatment_label=None)
Parameters:
  • results (Any)

  • data (DataFrame | None)

  • outcome (str | None)

  • treatment (str | None)

  • time (str | None)

  • unit (str | None)

  • first_treat (str | None)

  • pre_periods (List[Any] | None)

  • post_periods (List[Any] | None)

  • run_parallel_trends (bool)

  • run_sensitivity (bool)

  • run_placebo (bool)

  • run_bacon (bool)

  • run_design_effect (bool)

  • run_heterogeneity (bool)

  • run_epv (bool)

  • run_pretrends_power (bool)

  • sensitivity_M_grid (Tuple[float, ...])

  • sensitivity_method (str)

  • alpha (float)

  • survey_design (Any | None)

  • precomputed (Dict[str, Any] | None)

  • outcome_label (str | None)

  • treatment_label (str | None)

run_all()

Run all applicable diagnostics. Idempotent; caches on first call.

Return type:

DiagnosticReportResults

to_dict()

Return the AI-legible structured schema.

Return type:

Dict[str, Any]

summary()

Return a short plain-English paragraph.

Return type:

str

full_report()

Return the multi-section markdown report.

Return type:

str

export_markdown()

Alias for full_report().

Return type:

str

to_dataframe()

Return one row per check with status and headline metric.

Return type:

DataFrame

property applicable_checks: Tuple[str, ...]

Names of checks that will run, given estimator + instance + options.

No compute is triggered; this reflects only the applicability matrix filtered by instance state (survey_metadata, epv_diagnostics, vcov) and the user’s run_* flags.

property skipped_checks: Dict[str, str]

Mapping of skipped check -> plain-English reason. Requires run_all().
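The distinction between the two properties can be sketched as follows: applicability is a pure filter over known inputs, so it needs no compute, whereas skip reasons only exist after a run. The function below is a hypothetical illustration; the candidate list, the instance_ok mapping, and the function name are assumptions and do not reproduce the library's applicability matrix.

```python
from typing import Dict, Sequence, Tuple


def filter_applicable(
    candidates: Sequence[str],
    run_flags: Dict[str, bool],
    instance_ok: Dict[str, bool],
) -> Tuple[str, ...]:
    """Keep checks that are both enabled by run_<name> and supported by the fit."""
    return tuple(
        name
        for name in candidates
        # A check survives only if the user flag and the instance state agree.
        if run_flags.get(f"run_{name}", False) and instance_ok.get(name, True)
    )


candidates = ("parallel_trends", "sensitivity", "placebo", "bacon")
run_flags = {
    "run_parallel_trends": True,
    "run_sensitivity": True,
    "run_placebo": False,  # matches the documented default
    "run_bacon": True,
}
# Illustrative instance state: pretend this fit cannot support sensitivity.
instance_ok = {"sensitivity": False}

applicable = filter_applicable(candidates, run_flags, instance_ok)
```

Because the filter only reads flags and attributes that already exist, it can be evaluated before any diagnostic has been run.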

class diff_diff.DiagnosticReportResults

Bases: object

Frozen container holding the outcome of a DiagnosticReport.run_all() call.

schema

The AI-legible structured schema (also returned by to_dict()).

Type:

dict

interpretation

The overall_interpretation paragraph synthesizing findings across checks.

Type:

str

applicable_checks

The names of checks that applied to this estimator + options.

Type:

tuple of str

skipped_checks

Mapping from skipped-check name to plain-English reason.

Type:

dict of str -> str

warnings

Warnings captured while running the underlying diagnostic functions.

Type:

tuple of str

schema: Dict[str, Any]
interpretation: str
applicable_checks: Tuple[str, ...]
skipped_checks: Dict[str, str]
warnings: Tuple[str, ...] = ()
__init__(schema, interpretation, applicable_checks, skipped_checks=<factory>, warnings=())
Return type:

None
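The field list and the `skipped_checks=<factory>` default in the signature above are consistent with a frozen dataclass. The sketch below assumes that shape; ReportResultsSketch is a hypothetical stand-in, not the library class.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ReportResultsSketch:
    """Hypothetical stand-in with the same five fields as the container above."""

    schema: Dict[str, Any]
    interpretation: str
    applicable_checks: Tuple[str, ...]
    # default_factory gives each instance its own empty dict.
    skipped_checks: Dict[str, str] = field(default_factory=dict)
    warnings: Tuple[str, ...] = ()


res = ReportResultsSketch(
    schema={},
    interpretation="No concerns flagged.",
    applicable_checks=("bacon",),
)

try:
    # frozen=True blocks attribute reassignment after construction.
    res.interpretation = "changed"
    reassigned = True
except FrozenInstanceError:
    reassigned = False
```

Note that freezing a dataclass only prevents rebinding attributes; the dict stored in schema or skipped_checks can still be mutated in place, so callers who need isolation may want to copy it first.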

diff_diff.DIAGNOSTIC_REPORT_SCHEMA_VERSION = '2.0'