
Q072 — What must be captured to make a RAG run audit-ready?


A RAG run is audit-ready only when retrieval itself is logged as part of the configured use event.

Answer

In Provenance-first RAG, a run becomes audit-ready only when every output can be reconstructed as a governed event rather than merely accompanied by decorative references. Across the RAIDT papers, this means capturing prompt lineage (including version), model lineage, adapter lineage where applicable, retriever settings, corpus or index snapshot identifiers, passage IDs and offsets returned at retrieval time, inline citation links from claims to retrieved passages, timestamps, and SHA-256 hashes for inputs, prompts, indexes, and outputs. The RAG paper is explicit that citations, hashes, and run logs are primary outputs, because they close the evidence-explanation gap that otherwise leaves reviewers guessing where a clinical summary or risk flag came from.
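The capture list above can be sketched as a single run record. This is a minimal illustration in Python, not the schema used in the RAIDT papers: the class name `RunRecord`, the helper `build_run_record`, every field name, and the sample values (such as `note-17#p3`) are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Dict, List, Optional


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class RunRecord:
    # All field names are hypothetical; they mirror the capture list in the text.
    prompt_id: str
    prompt_version: str
    prompt_sha256: str
    model_version: str
    adapter_sha: Optional[str]        # None when no adapter is used
    retriever_settings: Dict[str, object]
    index_snapshot_id: str
    index_sha256: str
    input_sha256: str
    passages: List[Dict[str, object]]  # passage ID, offsets, text hash
    citations: Dict[str, List[str]]    # claim ID -> cited passage IDs
    output_sha256: str
    timestamp_utc: str


def build_run_record(input_text, prompt_id, prompt_version, prompt_text,
                     model_version, adapter_sha, retriever_settings,
                     index_snapshot_id, index_bytes, passages, citations,
                     output_text) -> RunRecord:
    """Hash every artefact and bind the results into one record."""
    return RunRecord(
        prompt_id=prompt_id,
        prompt_version=prompt_version,
        prompt_sha256=sha256_hex(prompt_text),
        model_version=model_version,
        adapter_sha=adapter_sha,
        retriever_settings=retriever_settings,
        index_snapshot_id=index_snapshot_id,
        index_sha256=hashlib.sha256(index_bytes).hexdigest(),
        input_sha256=sha256_hex(input_text),
        passages=[
            {"passage_id": pid, "start": start, "end": end,
             "text_sha256": sha256_hex(text)}
            for pid, start, end, text in passages
        ],
        citations=citations,
        output_sha256=sha256_hex(output_text),
        timestamp_utc=datetime.now(timezone.utc).isoformat(),
    )


# Placeholder values for illustration only.
record = build_run_record(
    input_text="placeholder de-identified note text",
    prompt_id="summarise-admission",
    prompt_version="v1",
    prompt_text="Summarise the note and cite passages.",
    model_version="model-x",
    adapter_sha=None,
    retriever_settings={"top_k": 5},
    index_snapshot_id="snapshot-001",
    index_bytes=b"placeholder index bytes",
    passages=[("note-17#p3", 0, 42, "placeholder passage text")],
    citations={"claim-1": ["note-17#p3"]},
    output_text="placeholder summary [note-17#p3]",
)
record_as_dict = asdict(record)  # ready to serialise into a run log
```

Storing hashes of passage text rather than the text itself is one possible design choice; a deployment that must show reviewers the retrieved spans would also need to keep the text, or a resolvable pointer to it, in the frozen snapshot.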

To satisfy the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability), the record must also preserve reviewer forms, adjudication notes, task metrics, and the run’s score profile, assessed against anchors 1=missing / 3=partial / 5=audit-ready. In RAIDT terms, influence methods as governance interventions are only defensible when the run-level evidence pack shows what was retrieved, what was generated, and under which frozen configuration. Because RAIDT treats the run as the unit of governance, an audit-ready RAG record should bind together technical lineage (for example Git commit, environment snapshot, configuration fingerprint), retrieval evidence (document IDs, spans, top-k results, and snapshot freshness), and oversight evidence (human review, risk notes, and rationale). Without that bundle, a system may appear grounded, but it is not contestable, replayable, or post-hoc auditable.
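The three-part bundle described above (technical lineage, retrieval evidence, oversight evidence) lends itself to a mechanical completeness pre-check before human review. The sketch below is an assumption-laden illustration: the field names, the `precheck` function, and the mapping from field coverage to the 1 / 3 / 5 anchors are hypothetical and are not the RAIDT scoring rubric, which relies on reviewer judgement.

```python
from typing import Dict, List

# Hypothetical field names, taken from the examples given in the text.
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "technical_lineage": ["git_commit", "environment_snapshot",
                          "config_fingerprint"],
    "retrieval_evidence": ["document_ids", "spans", "top_k_results",
                           "snapshot_freshness"],
    "oversight_evidence": ["human_review", "risk_notes", "rationale"],
}


def coverage_anchor(group: dict, required: List[str]) -> int:
    """Map field coverage to an anchor: 1 = missing, 3 = partial, 5 = complete.

    Assumption: this only checks presence, not quality of the evidence.
    """
    present = [k for k in required if group.get(k) not in (None, "", [], {})]
    if not present:
        return 1
    if len(present) < len(required):
        return 3
    return 5


def precheck(pack: dict) -> Dict[str, int]:
    """Return one coverage anchor per evidence group in the pack."""
    return {
        name: coverage_anchor(pack.get(name, {}), fields)
        for name, fields in REQUIRED_FIELDS.items()
    }


# Placeholder evidence pack for illustration only.
pack = {
    "technical_lineage": {
        "git_commit": "abc123",
        "environment_snapshot": "env-01",
        "config_fingerprint": "cfg-01",
    },
    "retrieval_evidence": {
        "document_ids": ["note-17"],
        "spans": [[0, 120]],
        "top_k_results": [],          # empty, so counted as absent
        "snapshot_freshness": "fresh",
    },
    "oversight_evidence": {},
}

result = precheck(pack)
# result == {"technical_lineage": 5, "retrieval_evidence": 3,
#            "oversight_evidence": 1}
```

A pre-check like this can only show that evidence is absent; a group that scores 5 here still needs a reviewer to judge whether the evidence is adequate.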

Practical example

In a healthcare deployment, a clinical decision support tool uses provenance-first RAG to summarise a de-identified admission note and surface red flags. For one summary, the run-level evidence pack would contain the input-note hash, prompt ID and prompt hash, model version, adapter SHA where used, FAISS index snapshot ID and index hash, the retrieved passages with note IDs and paragraph anchors, the final answer with inline citations, the output hash, and the reviewer’s RAIDT score profile with comments.

If an auditor later asks why the system flagged renal impairment but omitted an anticoagulant interaction, the team can replay the exact run against the frozen snapshot, inspect the retrieved spans, and determine whether the failure arose in retrieval, prompting, generation, or reviewer adjudication. That is the practical meaning of audit-ready RAG in the papers: provenance is not an appendix to the answer; it is the evidence that makes the answer governable.
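The replay step can be supported by comparing the hashes logged at run time with those produced by the replay, stage by stage. The following Python sketch is illustrative only: the stage names and the `first_divergence` helper are hypothetical, and a matching hash shows that a stage was reproduced, not that its content was clinically correct.

```python
import hashlib
from typing import Dict, Optional


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical stage order: each entry is a hash logged during the run.
STAGES = [
    "input_sha256",
    "index_sha256",
    "retrieved_passages_sha256",
    "prompt_sha256",
    "output_sha256",
]


def first_divergence(recorded: Dict[str, str],
                     replayed: Dict[str, str]) -> Optional[str]:
    """Return the first stage whose replayed hash differs, or None if all match."""
    for stage in STAGES:
        if recorded.get(stage) != replayed.get(stage):
            return stage
    return None


# Placeholder hashes for illustration only.
recorded = {
    "input_sha256": sha256_hex("note"),
    "index_sha256": sha256_hex("index-snapshot"),
    "retrieved_passages_sha256": sha256_hex("passage-a|passage-b"),
    "prompt_sha256": sha256_hex("prompt"),
    "output_sha256": sha256_hex("summary"),
}

# A replay in which retrieval returned a different passage set.
replayed = dict(recorded)
replayed["retrieved_passages_sha256"] = sha256_hex("passage-a|passage-c")

diverged_at = first_divergence(recorded, replayed)   # "retrieved_passages_sha256"
clean_replay = first_divergence(recorded, recorded)  # None
```

When every stage matches, the run has been reproduced and the team can move on to inspecting the retrieved spans and the reviewer record. Note that non-deterministic generation may change the output hash even when all upstream stages match.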
