S6.08 - RAG
flowchart LR
    A[Traditional RAG promise<br/>grounded answers from approved sources] --> B[RAIDT<br/>run-level evidence framework]
    H[Practical retrieval fields<br/>query, corpus version, chunk IDs, source hashes,<br/>timestamps, reviewer notes] --> C[[RAG<br/>retrieval-grounded influence on one run]]
    B --> C
    C --> D[Evidence pack<br/>retrieval snapshot preserved]
    C --> E[RAIDT score profile<br/>especially Auditability and Traceability]
    D --> F[Reviewer reconstruction<br/>and contestability]
    E --> G[Governance readiness<br/>and organisational learning]
Star S6 - Influence Methods as Governance Interventions
Star context: Positions prompting, RAG, PEFT/LoRA, RLHF/DPO, and stacked influence as methods that shape the evidential conditions of governance inside RAIDT rather than replacing RAIDT's run-level core.
Academic picture
Definition / background
Retrieval-Augmented Generation (RAG) combines generation with a retrieval step that supplies external material to the model at inference time. Conceptually, it emerged from the combination of information retrieval and neural text generation: instead of relying only on model parameters or prompt wording, the system brings selected documents, chunks, or passages into the run so that the answer can be informed by a defined source base. In ordinary technical discussion, RAG is often presented as a method for grounding outputs, reducing hallucination, or improving domain relevance.
In governance terms, however, RAG matters for a different reason. It can create a more reviewable relationship between an output and the materials that shaped it, but only if the retrieval process is itself captured. If an organisation cannot show what was retrieved, from which corpus version, under which query and retriever settings, then RAG remains only a technical claim about how the system was supposed to behave. RAIDT therefore treats RAG as an influence method that can strengthen governance evidence when retrieval becomes part of the run-level record.
RAG differs from prompting alone because it introduces external source material into the run rather than relying only on instructions. It differs from fine-tuning or adapter-based methods because the knowledge is not primarily embedded into model weights; instead, it is selected dynamically at run time. That distinction is important in RAIDT because dynamic selection creates a specific evidential requirement: the retrieved context must be preserved if the run is to be reconstructable.
This item belongs inside RAIDT because the framework treats the run as the unit of governance. RAG changes the evidential structure of that run. When retrieval traces, source identifiers, and retrieved content snapshots are preserved, the run-level evidence pack becomes richer and the five-pillar score profile can be justified with more confidence. When those traces are absent, claims of grounding or traceability remain weak. In that sense, RAG is not automatically governance-ready; it becomes governance-relevant when it is made reviewable within RAIDT.
Why this concept matters
RAG matters because many organisations want GenAI systems to answer with reference to approved knowledge rather than free-form model memory. That is a legitimate practical goal, but governance needs more than the promise of better answers. It needs evidence of how the answer was shaped in one actual run. RAG helps when it turns the source relationship of the output into something that can be inspected later by supervisors, auditors, or reviewers.
The concept also avoids a common confusion between grounded output and traceable output. An output may appear well grounded because it resembles internal policy or cites a document title, yet still be poorly traceable if the retrieval event was not recorded properly. RAIDT clarifies that traceability is not solved by the existence of retrieval alone; it depends on preserving the evidence of retrieval.
If this concept is missing, organisations may overstate the governance value of RAG. They may assume that using a knowledge base automatically creates accountability, when in fact a later reviewer may still be unable to determine which documents were retrieved, whether they were current, whether reranking favoured some passages over others, or whether the final answer departed materially from the retrieved material. RAIDT uses this concept to move from broad claims of grounded AI to operational proof.
Key idea: RAG matters in RAIDT because retrieval improves governance only when the retrieved evidence itself is preserved at run level.
What this item enables
- The linking of generated outputs to specific retrieved sources rather than to vague claims of grounding.
- Reconstruction of which documents, passages, versions, and retrieval settings influenced one run.
- Stronger evidence packs that show not only the answer but also the knowledge path behind it.
- Better-justified scoring for Auditability and Traceability, with secondary benefits for Dependability and Interpretability.
- More contestable use of GenAI, because reviewers can inspect whether the retrieved material really supported the output.
- Organisational learning about corpus quality, document freshness, retrieval drift, and domain coverage.
- A clearer boundary between acceptable evidence-backed use and unsupported model improvisation.
Practical example / likely audience question
Audience question
Does RAG automatically solve traceability in RAIDT?
Answer
No. The misconception behind the question is that retrieval and traceability are the same thing. They are not. RAG can improve traceability, but only if the organisation preserves evidence of the retrieval event itself. If the system retrieves passages from an internal knowledge base but does not log the query, the corpus or index version, the retrieved chunk identifiers, timestamps, and the material presented to the model, a reviewer may still be unable to reconstruct how the answer was produced.
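The fields named in this answer can be gathered into one record per retrieval event. The following is a minimal sketch in Python, not a RAIDT-defined schema: the class name `RetrievalEvent`, its field names, and the sample values are all illustrative assumptions.

```python
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class RetrievalEvent:
    """One retrieval event inside a run (hypothetical schema, for illustration)."""
    run_id: str
    query: str                    # the query sent to the retriever
    index_version: str            # corpus or index version that was searched
    chunk_ids: tuple[str, ...]    # identifiers of the chunks returned
    context_presented: str        # the material actually shown to the model
    retrieved_at: str = field(    # UTC timestamp, filled in automatically
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialise the event so it can be stored with the run record."""
        return json.dumps(asdict(self), sort_keys=True)


# Sample values are invented for illustration only.
event = RetrievalEvent(
    run_id="run-001",
    query="What is the limit for hotel expenses?",
    index_version="hr-policies-v3",
    chunk_ids=("doc12#4", "doc12#5"),
    context_presented="(text of the two retrieved chunks)",
)
print(event.to_json())
```

Storing `context_presented` alongside the identifiers is the design choice that matters here: identifiers alone do not let a later reviewer see what the model actually received if the corpus has since changed.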
Consider an enterprise policy assistant that drafts an answer about expenses claims. The assistant may have used RAG over HR policy documents, but a later reviewer needs more than the final answer. They need to know which policy version was retrieved, whether the assistant surfaced the correct paragraph, whether obsolete guidance was included, and whether the user or reviewer modified the answer before it was sent onward. Without that record, the organisation has only a plausible story about grounding, not robust evidence.
RAIDT handles this better than a generic AI governance approach because it does not treat RAG as a box-ticking feature. It asks whether the run can be reconstructed. If the retrieval artefacts are preserved, RAG contributes to an evidence pack and supports a justified five-pillar score profile. If they are not preserved, RAIDT makes that weakness visible instead of masking it behind the label of grounded generation.
Practical example in RAIDT terms
Consider a public-service setting in which a caseworker uses a GenAI assistant to draft an explanation of a housing-benefit decision. The assistant uses RAG over current legislation, local authority guidance, and internal policy notes. The GenAI use case is legitimate because the caseworker needs a quick, plain-language draft, but the run-level issue is whether the explanation was grounded in the right rules and whether that grounding can be demonstrated after the event.
The evidence needed is not just the final draft. RAIDT would require the run purpose, the user query, the prompt template, the retrieval configuration, the knowledge-base or index version, the document identifiers and chunk identifiers returned, timestamps, any relevance thresholds or reranking steps, the draft output, the caseworker's edits, and the final approved explanation. Responsibility is affected because a named role must remain accountable for the final communication. Auditability is affected because a reviewer must be able to reconstruct the retrieval path. Interpretability is affected because the relationship between the retrieved rules and the generated explanation must be understandable. Dependability is affected because repeated runs should retrieve relevant and current material reliably. Traceability is affected because the run must be linked to the specific sources and versions that shaped the answer.
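The evidence list above lends itself to a simple completeness check. The sketch below is one possible way to make gaps visible, assuming the evidence pack is held as a dictionary; the field names are hypothetical labels for the items in the paragraph, and optional items (relevance thresholds, reranking steps) are left out of the required set.

```python
# Hypothetical field names standing in for the evidence items listed in the text.
REQUIRED_FIELDS = (
    "run_purpose",
    "user_query",
    "prompt_template",
    "retrieval_config",
    "index_version",
    "document_ids",
    "chunk_ids",
    "timestamps",
    "draft_output",
    "reviewer_edits",
    "final_output",
)


def is_missing(value) -> bool:
    """Treat only None and blank strings as missing.

    An empty list is kept as valid evidence: "nothing was retrieved" or
    "no edits were made" is still a recorded fact about the run.
    """
    return value is None or (isinstance(value, str) and not value.strip())


def missing_evidence(pack: dict) -> list[str]:
    """Return the required fields that are absent or blank in an evidence pack."""
    return [name for name in REQUIRED_FIELDS if is_missing(pack.get(name))]


# Invented example: a pack that kept the drafts but not the retrieval record.
pack = {
    "run_purpose": "Draft explanation of a housing-benefit decision",
    "user_query": "Explain why the claim was reduced",
    "prompt_template": "plain-language-explanation-v2",
    "draft_output": "(draft text)",
    "final_output": "(approved text)",
}
gaps = missing_evidence(pack)
print(gaps)
```

In this example the reported gaps are exactly the retrieval artefacts and the reviewer edits, which is the situation the text describes as a plausible story about grounding rather than evidence of it.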
RAG improves governance readiness here because it allows the authority to show more than that the system uses policy documents. It can show which policy documents were retrieved in this case, whether they were current, how the output related to them, and where human review intervened. That is much stronger for supervision, appeals, internal quality assurance, and external scrutiny than a generic assurance claim about AI-enabled drafting.
Detailed link to RAIDT
RAG links to RAIDT in four ways.
First, it supports the RAIDT core idea that governance should rest on evidence from actual organisational use rather than on abstract claims about model capability.
Second, it changes the evidential shape of the run by adding a retrieval layer that must itself be captured if the run is to be reconstructable.
Third, it can materially strengthen the evidence pack and the justification of the RAIDT score profile when source snapshots, retrieval metadata, and review actions are preserved.
Fourth, it improves reviewability, contestability, audit readiness, and organisational learning because reviewers can inspect not only what the model said but what source material was supplied and how that influenced the outcome.
RAG → Run-level retrieval evidence → Evidence pack → RAIDT score profile → Governance readiness
Link to the five RAIDT pillars
Responsibility
RAG supports Responsibility when it clarifies who selected the knowledge base, who approved its use, and who remained accountable for relying on retrieved material in a decision or communication.
Example evidence / implication:
- A record of the user role, reviewer role, and approval step for a RAG-assisted run.
- Documentation of which corpus or policy source was authorised for use in that workflow.
Auditability
This item has a particularly strong effect on Auditability because a reviewer can only audit a RAG-assisted run properly if the retrieval event is preserved alongside the output.
Example evidence / implication:
- Logged query text, retriever settings, corpus version, retrieved documents, and timestamps.
- A retrievable snapshot showing what context the model actually received before generation.
Interpretability
RAG can improve Interpretability by making the output explainable in relation to retrieved passages, even when the internal model remains opaque.
Example evidence / implication:
- Stored passages or citations showing which content informed the answer.
- Reviewer notes explaining whether the output faithfully reflected the retrieved material or drifted beyond it.
Dependability
RAG supports Dependability when retrieval is stable, current, and fit for purpose across repeated runs, rather than erratic or dependent on stale corpora.
Example evidence / implication:
- Records showing corpus update dates and whether outdated documents were excluded.
- Comparative review of similar runs to assess retrieval consistency and output reliability.
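Both Dependability evidence items can be approximated with small checks over stored run records. The sketch below is illustrative only: using Jaccard overlap of chunk identifiers as the consistency measure, and a fixed cutoff date as the staleness rule, are assumptions made here rather than RAIDT requirements, and the sample identifiers and dates are invented.

```python
from datetime import date


def retrieval_overlap(chunks_a: set[str], chunks_b: set[str]) -> float:
    """Jaccard overlap between the chunk-ID sets of two runs (1.0 = identical)."""
    if not chunks_a and not chunks_b:
        return 1.0  # two empty retrievals are trivially consistent
    return len(chunks_a & chunks_b) / len(chunks_a | chunks_b)


def stale_documents(updated_on: dict[str, date], cutoff: date) -> list[str]:
    """Return the documents last updated before the cutoff date, sorted by ID."""
    return sorted(doc for doc, updated in updated_on.items() if updated < cutoff)


# Two similar runs that share two of their three retrieved chunks.
run_a = {"doc12#4", "doc12#5", "doc30#1"}
run_b = {"doc12#4", "doc12#5", "doc07#2"}
overlap = retrieval_overlap(run_a, run_b)

stale = stale_documents(
    {"doc12": date(2024, 3, 1), "doc07": date(2021, 6, 15)},
    cutoff=date(2023, 1, 1),
)
print(overlap, stale)  # prints: 0.5 ['doc07']
```

A low overlap between runs with near-identical queries does not prove a fault, but it gives a reviewer a concrete place to start looking.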
Traceability
RAG has an especially strong effect on Traceability because it can connect the output to specific sources, versions, and retrieval decisions rather than to a vague knowledge claim.
Example evidence / implication:
- Source identifiers, chunk identifiers, hashes, or version markers attached to the run record.
- A clear chain from query to retrieved material to generated draft to reviewed final output.
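One way to make the chain from query to final output checkable is to hash each stage and link it to the stage before. The following is a sketch under stated assumptions: SHA-256 from Python's standard `hashlib` is chosen here for illustration, and the function names, stage labels, and sample texts are hypothetical.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    """Hex digest of the UTF-8 encoding of `text`."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_trace(query: str, chunks: dict[str, str], draft: str, final: str) -> list[dict]:
    """Link query -> retrieved material -> draft -> final output.

    Each step stores a hash of its own content and a link hash that also
    covers the previous step, so altering any earlier stage changes every
    later link.
    """
    stages = [
        ("query", query),
        ("retrieved", json.dumps(chunks, sort_keys=True)),  # chunk ID -> chunk text
        ("draft", draft),
        ("final", final),
    ]
    trace, previous_link = [], ""
    for stage, content in stages:
        content_hash = sha256_hex(content)
        link_hash = sha256_hex(previous_link + content_hash)
        trace.append({"stage": stage, "content_hash": content_hash, "link_hash": link_hash})
        previous_link = link_hash
    return trace


def verify_trace(trace: list[dict], query: str, chunks: dict[str, str], draft: str, final: str) -> bool:
    """Recompute the chain from the stored content and compare it with the trace."""
    return trace == build_trace(query, chunks, draft, final)


# Invented sample content.
chunks = {"doc12#4": "Hotel costs are reimbursed up to the stated limit."}
trace = build_trace("hotel limit?", chunks, "draft answer", "approved answer")
print(verify_trace(trace, "hotel limit?", chunks, "draft answer", "approved answer"))
```

Hashes only show that stored content has not changed; they do not replace storing the content itself, which a reviewer still needs in order to read what was retrieved.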
RAG affects all five pillars, but its strongest direct contributions in RAIDT are to Auditability and Traceability. Its value for the other pillars depends on whether the retrieval process is governed as evidence rather than treated as a black box.
Why this item is more than a generic concept
In general AI governance, RAG may simply mean a technical architecture for grounding responses in external sources. In RAIDT, it means something more operational: a governance-relevant intervention that can improve a run only when retrieval evidence becomes part of the run-level record.
The RAIDT meaning is more practical because it ties RAG to reconstructability, evidence packs, five-pillar scoring, and governance readiness. The question is not merely whether a system uses retrieval. The question is whether the organisation can show, for one actual run, what was retrieved, how it influenced the answer, and whether a reviewer can assess that influence after the fact.
Common misunderstanding
Misunderstanding
If a system uses RAG, its answers are automatically auditable and traceable.
Correction
RAG creates the possibility of stronger auditability and traceability, but it does not guarantee them. A system may retrieve useful passages and still fail governance review if it does not preserve the retrieval record. For example, a legal drafting assistant might consult a document store and produce a plausible answer, yet if the specific retrieved paragraphs, document version, and retrieval timestamp are not stored, a reviewer cannot determine whether the answer was grounded in the right source at the right time. In RAIDT, RAG only counts as a meaningful governance intervention when the retrieval pathway is preserved as evidence.
Boundary and limitation
RAG does not prove that an output is correct, fair, lawful, or safe. Retrieved material may itself be outdated, incomplete, conflicting, or poorly curated. Retrieval can also create a false sense of assurance if the system surfaces authoritative-looking text that is not actually applicable to the task at hand. In addition, some implementations preserve only citations or document titles rather than the retrieved passages and settings needed for real reconstruction.
RAG also does not replace broader governance measures such as corpus governance, human review, legal interpretation, policy approval, model evaluation, or workflow design. Its value depends on source quality, update discipline, and proportionate evidence capture. RAIDT handles this limitation by asking not simply whether retrieval exists, but whether the run contains sufficient retrieval evidence to support review, challenge, and learning.
Implementation levels
Manual implementation
A researcher or small team can implement RAG governance manually by recording the user query, prompt template, retrieved documents, source versions, key passages, output, and reviewer decision in a structured note or evidence form for important runs.
Semi-automated implementation
Semi-automated implementation can capture retrieval metadata through templates, wrappers, or workflow forms. For example, a tool can automatically attach document identifiers, timestamps, and corpus versions while requiring a human reviewer to record whether the retrieved material was appropriate and whether the answer stayed within scope.
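The split between automated capture and required human judgement can be sketched as two functions. This is a hypothetical illustration, not a prescribed RAIDT tool: the function names, the `ReviewIncomplete` exception, and the record fields are assumptions.

```python
from datetime import datetime, timezone
from typing import Optional


class ReviewIncomplete(ValueError):
    """Raised when the human part of the record has not been filled in."""


def attach_metadata(run: dict, document_ids: list[str], corpus_version: str) -> dict:
    """Automated part: add document identifiers, corpus version, and a UTC timestamp."""
    return {
        **run,
        "document_ids": list(document_ids),
        "corpus_version": corpus_version,
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


def finalise(
    run: dict,
    material_appropriate: Optional[bool],
    within_scope: Optional[bool],
    reviewer: str,
) -> dict:
    """Human part: refuse to close the record until both judgements are recorded."""
    if material_appropriate is None or within_scope is None or not reviewer.strip():
        raise ReviewIncomplete("reviewer judgements are required before sign-off")
    return {
        **run,
        "reviewer": reviewer,
        "material_appropriate": material_appropriate,
        "within_scope": within_scope,
    }


# Invented example values.
record = attach_metadata({"run_id": "run-002"}, ["doc12"], "hr-policies-v3")
signed_off = finalise(record, material_appropriate=True, within_scope=True, reviewer="reviewer-a")
print(signed_off["reviewer"], signed_off["corpus_version"])
```

Using `None` as "not yet answered" keeps a negative judgement (`False`) distinct from a missing one, so a reviewer can sign off a run while recording that the retrieved material was not appropriate.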
Fully automated implementation
At scale, a platform, orchestration layer, or governance pipeline can log the query, retriever configuration, index version, retrieved chunks, hashes, citations, generated output, human interventions, and final decision state automatically. These records can feed an evidence-pack builder, support RAIDT scoring, and enable dashboards for retrieval quality, audit queries, and organisational learning.
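At the fully automated level, logging can be attached to the retriever itself so that no call goes unrecorded. The sketch below assumes a retriever is any function from a query string to a list of chunk dictionaries; that interface, the `logged` wrapper, the in-memory `sink`, and `toy_retriever` are all hypothetical stand-ins for whatever a real platform provides.

```python
import hashlib
import json
from typing import Callable

# Assumed interface: query -> [{"chunk_id": ..., "text": ...}, ...]
Retriever = Callable[[str], list[dict]]


def logged(retriever: Retriever, index_version: str, sink: list[str]) -> Retriever:
    """Wrap a retriever so that every call appends one JSON line to `sink`."""
    def wrapper(query: str) -> list[dict]:
        chunks = retriever(query)
        sink.append(json.dumps({
            "query": query,
            "index_version": index_version,
            "chunk_ids": [c["chunk_id"] for c in chunks],
            "chunk_hashes": [
                hashlib.sha256(c["text"].encode("utf-8")).hexdigest() for c in chunks
            ],
        }, sort_keys=True))
        return chunks  # results pass through unchanged
    return wrapper


def toy_retriever(query: str) -> list[dict]:
    """Stand-in retriever that always returns the same chunk."""
    return [{"chunk_id": "doc12#4", "text": "Hotel costs are reimbursed up to the stated limit."}]


log_lines: list[str] = []
retrieve = logged(toy_retriever, index_version="hr-policies-v3", sink=log_lines)
results = retrieve("hotel limit?")
print(log_lines[0])
```

In a real deployment the sink would be durable storage rather than a list, and the log line would normally carry a run identifier and timestamp so it can be joined to the rest of the evidence pack.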
Practical use in the RAIDT project
Within the RAIDT project, this item is particularly useful for explaining why some technical methods matter to governance only under evidential conditions. In Paper 08 Foundations, RAG helps show that influence methods are not outside governance; they shape what kind of run-level evidence is available and how reviewable a run becomes. In Paper 09 Empirical Validation, RAG is valuable because it offers a testable case: does preserved retrieval evidence measurably improve reconstruction, scoring confidence, and reviewer judgement?
For Paper 10 Policy Pathways, RAG provides a bridge between technical system design and policy-facing governance language. It is also relevant to sector playbooks because many real deployments in healthcare, public services, finance, education, and enterprise productivity rely on retrieval over local knowledge sources. In the evidence pack and scoring rubric, this item helps specify what retrieval artefacts should be captured. In viva defence or supervisor explanation, it helps answer a common question: why treat RAG as a governance intervention rather than merely as a performance enhancement? The answer is that, inside RAIDT, RAG alters the available evidence for governing one run.
Key audience questions to prepare for
Q1. Why is RAG a governance intervention in RAIDT rather than just a retrieval feature?
Because RAIDT focuses on what makes one run reviewable. RAG changes the evidential conditions of a run by introducing retrievable source material, and that matters for governance when the retrieval event is preserved as evidence rather than hidden inside the system.
Q2. Does RAG remove the need for human review?
No. RAG may supply relevant documents, but a human still needs to judge whether the retrieved material was current, applicable, and accurately represented in the final output. RAIDT treats human review as part of the run record, not as an optional extra.
Q3. What is the main governance risk if RAG is present but poorly documented?
The main risk is overclaiming traceability. An organisation may believe it can defend an output because a knowledge base was used, but without query, source, version, and retrieval records, a reviewer cannot reliably reconstruct or contest the run.
Q4. How is RAG different from provenance-first RAG in RAIDT terms?
RAG is the broader retrieval-augmented method. Provenance-first RAG is the stricter design orientation in which provenance capture, source identity, and retrieval evidence are deliberately prioritised as governance requirements. The latter is a more demanding operational form of the former.
Q5. Why does RAG matter for the RAIDT score profile?
Because the presence or absence of retrieval evidence affects how confidently a run can be scored, especially on Auditability and Traceability. It also influences Interpretability and Dependability when reviewers need to understand whether the output was supported by appropriate, current material.
Suggested citation concepts to support this item
- retrieval-augmented generation governance
- provenance in retrieval-augmented generation
- AI auditability and retrieval logging
- grounding and traceability in generative AI
- enterprise knowledge retrieval for large language models
- evidence-based governance of generative AI
- source attribution and citation quality in RAG systems
- organisational accountability for AI-assisted document drafting
- corpus governance and document versioning in AI systems
- human review of retrieval-augmented outputs
Short explanation for presentation
RAG, or Retrieval-Augmented Generation, matters in RAIDT because it can turn a generated answer into a more evidentially grounded run, but only if the retrieval step is preserved. In ordinary AI discussion, RAG is often treated as a performance method for reducing hallucination or improving relevance. RAIDT reframes it as a governance intervention. The key question is not simply whether documents were retrieved, but whether a reviewer can later see which documents, which versions, which passages, and which retrieval settings shaped the output. If that evidence is captured, RAG strengthens the evidence pack and supports better scoring, especially for Auditability and Traceability. If it is not captured, the organisation may only have an assertion that the system was grounded. RAIDT therefore makes RAG operational by tying it to run-level evidence and governance readiness.
One-line takeaway
In RAIDT, RAG counts as a governance intervention only when preserved retrieval evidence turns a run into reviewable run-level proof.
Related items in Influence Methods as Governance Interventions
- S6.01 · Governance interventions
- S6.02 · Baseline prompting
- S6.03 · Prompting
- S6.04 · Structured prompting
- S6.05 · Role-based prompting
- S6.06 · Zero-shot prompting
- S6.07 · Chain-of-thought controlled use
- S6.09 · Provenance-first RAG
- S6.10 · PEFT / LoRA
- S6.11 · Adapter lineage
- S6.12 · RLHF-type / DPO controls
- S6.13 · Stacked influence