S6.08 - RAG

flowchart LR
    A["Traditional RAG promise<br/>grounded answers from approved sources"] --> B["RAIDT<br/>run-level evidence framework"]
    H["Practical retrieval fields<br/>query, corpus version, chunk IDs, source hashes,<br/>timestamps, reviewer notes"] --> C[["RAG<br/>retrieval-grounded influence on one run"]]
    B --> C
    C --> D["Evidence pack<br/>retrieval snapshot preserved"]
    C --> E["RAIDT score profile<br/>especially Auditability and Traceability"]
    D --> F["Reviewer reconstruction<br/>and contestability"]
    E --> G["Governance readiness<br/>and organisational learning"]

Star context (S6 - Influence Methods as Governance Interventions): Positions prompting, RAG, PEFT/LoRA, RLHF/DPO, and stacked influence as methods that shape the evidential conditions of governance inside RAIDT rather than replacing RAIDT's run-level core.

Academic picture

Definition / background

Retrieval-Augmented Generation (RAG) combines generation with a retrieval step that supplies external material to the model at inference time. Conceptually, it emerged from the combination of information retrieval and neural text generation: instead of relying only on model parameters or prompt wording, the system brings selected documents, chunks, or passages into the run so that the answer can be informed by a defined source base. In ordinary technical discussion, RAG is often presented as a method for grounding outputs, reducing hallucination, or improving domain relevance.

In governance terms, however, RAG matters for a different reason. It can create a more reviewable relationship between an output and the materials that shaped it, but only if the retrieval process is itself captured. If an organisation cannot show what was retrieved, from which corpus version, under which query and retriever settings, then RAG remains only a technical claim about how the system was supposed to behave. RAIDT therefore treats RAG as an influence method that can strengthen governance evidence when retrieval becomes part of the run-level record.

RAG differs from prompting alone because it introduces external source material into the run rather than relying only on instructions. It differs from fine-tuning or adapter-based methods because the knowledge is not primarily embedded into model weights; instead, it is selected dynamically at run time. That distinction is important in RAIDT because dynamic selection creates a specific evidential requirement: the retrieved context must be preserved if the run is to be reconstructable.
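The preservation requirement described above can be illustrated with a minimal sketch. It assumes a retriever that returns `(doc_id, chunk_id, text)` tuples; the class names, field names, and sample values are hypothetical and are not part of RAIDT itself. The sketch freezes one retrieval event, hashes the exact text supplied to the model, and lets a reviewer check later that the stored snapshot has not changed.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RetrievedChunk:
    doc_id: str
    chunk_id: str
    text: str            # exact text supplied to the model
    content_sha256: str  # hash of that text, taken at retrieval time


@dataclass(frozen=True)
class RetrievalRecord:
    run_id: str
    query: str
    corpus_version: str
    retriever_settings: dict
    retrieved_at: str    # ISO 8601 timestamp in UTC
    chunks: tuple


def _sha256(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def snapshot_retrieval(run_id, query, corpus_version, retriever_settings, hits):
    """Freeze one retrieval event; `hits` is a list of (doc_id, chunk_id, text)."""
    chunks = tuple(RetrievedChunk(d, c, t, _sha256(t)) for d, c, t in hits)
    return RetrievalRecord(
        run_id=run_id,
        query=query,
        corpus_version=corpus_version,
        retriever_settings=dict(retriever_settings),
        retrieved_at=datetime.now(timezone.utc).isoformat(),
        chunks=chunks,
    )


def verify_snapshot(record):
    """Return chunk IDs whose stored text no longer matches its stored hash."""
    return [c.chunk_id for c in record.chunks
            if _sha256(c.text) != c.content_sha256]


# Hypothetical run: all identifiers and values below are illustrative only.
record = snapshot_retrieval(
    run_id="run-0001",
    query="What is the limit for hotel expenses?",
    corpus_version="hr-policies-2024-05",
    retriever_settings={"top_k": 2, "reranker": "none"},
    hits=[
        ("hr-expenses-policy", "c-12", "Hotel costs are reimbursed up to the stated limit."),
        ("hr-travel-guidance", "c-04", "Bookings should be made through the approved portal."),
    ],
)
serialised = json.dumps(asdict(record), indent=2)  # ready to store with the run
```

Storing the text alongside its hash is a design choice for this sketch: the text makes the run reconstructable, while the hash gives a reviewer a cheap integrity check on the stored snapshot.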

This item belongs inside RAIDT because the framework treats the run as the unit of governance. RAG changes the evidential structure of that run. When retrieval traces, source identifiers, and retrieved content snapshots are preserved, the run-level evidence pack becomes richer and the five-pillar score profile can be justified with more confidence. When those traces are absent, claims of grounding or traceability remain weak. In that sense, RAG is not automatically governance-ready; it becomes governance-relevant when it is made reviewable within RAIDT.

Why this concept matters

RAG matters because many organisations want GenAI systems to answer with reference to approved knowledge rather than free-form model memory. That is a legitimate practical goal, but governance needs more than the promise of better answers. It needs evidence of how the answer was shaped in one actual run. RAG helps when it turns the source relationship of the output into something that can be inspected later by supervisors, auditors, or reviewers.

The concept also avoids a common confusion between grounded output and traceable output. An output may appear well grounded because it resembles internal policy or cites a document title, yet still be poorly traceable if the retrieval event was not recorded properly. RAIDT clarifies that traceability is not solved by the existence of retrieval alone; it depends on preserving the evidence of retrieval.

If this concept is missing, organisations may overstate the governance value of RAG. They may assume that using a knowledge base automatically creates accountability, when in fact a later reviewer may still be unable to determine which documents were retrieved, whether they were current, whether reranking favoured some passages over others, or whether the final answer departed materially from the retrieved material. RAIDT uses this concept to move from broad claims of grounded AI to operational proof.

Key idea: RAG matters in RAIDT because retrieval improves governance only when the retrieved evidence itself is preserved at run level.

What this item enables

The linking of generated outputs to specific retrieved sources rather than to vague claims of grounding.
Reconstruction of which documents, passages, versions, and retrieval settings influenced one run.
Stronger evidence packs that show not only the answer but also the knowledge path behind it.
Better-justified scoring for Auditability and Traceability, with secondary benefits for Dependability and Interpretability.
More contestable use of GenAI, because reviewers can inspect whether the retrieved material really supported the output.
Organisational learning about corpus quality, document freshness, retrieval drift, and domain coverage.
A clearer boundary between acceptable evidence-backed use and unsupported model improvisation.

Practical example / likely audience question

Audience question

Does RAG automatically solve traceability in RAIDT?

Answer

No. The misconception behind the question is that retrieval and traceability are the same thing. They are not. RAG can improve traceability, but only if the organisation preserves evidence of the retrieval event itself. If the system retrieves passages from an internal knowledge base but does not log the query, the corpus or index version, the retrieved chunk identifiers, timestamps, and the material presented to the model, a reviewer may still be unable to reconstruct how the answer was produced.

Consider an enterprise policy assistant that drafts an answer about expenses claims. The assistant may have used RAG over HR policy documents, but a later reviewer needs more than the final answer. They need to know which policy version was retrieved, whether the assistant surfaced the correct paragraph, whether obsolete guidance was included, and whether the user or reviewer modified the answer before it was sent onward. Without that record, the organisation has only a plausible story about grounding, not robust evidence.
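The reviewer's question about obsolete guidance can be sketched as a simple comparison, assuming the organisation keeps a register of the currently approved version of each document. The register, the document identifiers, and the version labels below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedSource:
    doc_id: str
    version: str
    chunk_id: str


def find_obsolete_sources(retrieved, current_versions):
    """Return (source, approved_version) pairs that need reviewer attention.

    A source is flagged when its version differs from the approved one, or
    when the document is absent from the register (approved_version is None),
    because its status cannot then be confirmed.
    """
    flagged = []
    for src in retrieved:
        approved = current_versions.get(src.doc_id)
        if approved is None or approved != src.version:
            flagged.append((src, approved))
    return flagged


# Hypothetical retrieval result for the expenses question.
retrieved = [
    RetrievedSource("hr-expenses-policy", "v7", "c-12"),
    RetrievedSource("hr-travel-guidance", "v3", "c-04"),
    RetrievedSource("legacy-memo", "v1", "c-01"),
]
# Hypothetical register of approved versions.
current_versions = {"hr-expenses-policy": "v7", "hr-travel-guidance": "v4"}

for src, approved in find_obsolete_sources(retrieved, current_versions):
    print(f"{src.doc_id} {src.version} (approved: {approved}) via {src.chunk_id}")
```

A check of this kind is only possible if the document version was logged for the run, which is the point of the example.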

RAIDT handles this better than a generic AI governance approach because it does not treat RAG as a box-ticking feature. It asks whether the run can be reconstructed. If the retrieval artefacts are preserved, RAG contributes to an evidence pack and supports a justified five-pillar score profile. If they are not preserved, RAIDT makes that weakness visible instead of masking it behind the label of grounded generation.

Practical example in RAIDT terms

Consider a public-service setting in which a caseworker uses a GenAI assistant to draft an explanation of a housing-benefit decision. The assistant uses RAG over current legislation, local authority guidance, and internal policy notes. The GenAI use case is legitimate because the caseworker needs a quick, plain-language draft, but the run-level issue is whether the explanation was grounded in the right rules and whether that grounding can be demonstrated after the event.

The evidence needed is not just the final draft. RAIDT would require the run purpose, the user query, the prompt template, the retrieval configuration, the knowledge-base or index version, the document identifiers and chunk identifiers returned, timestamps, any relevance thresholds or reranking steps, the draft output, the caseworker's edits, and the final approved explanation. Responsibility is affected because a named role must remain accountable for the final communication. Auditability is affected because a reviewer must be able to reconstruct the retrieval path. Interpretability is affected because the relationship between the retrieved rules and the generated explanation must be understandable. Dependability is affected because repeated runs should retrieve relevant and current material reliably. Traceability is affected because the run must be linked to the specific sources and versions that shaped the answer.
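One way to make this list operational is a completeness check over the evidence pack. The following is a minimal sketch; the artefact keys are hypothetical names chosen to mirror the items listed above, and RAIDT does not prescribe this schema.

```python
# Hypothetical keys mirroring the artefacts listed in the paragraph above.
REQUIRED_ARTEFACTS = (
    "run_purpose",
    "user_query",
    "prompt_template",
    "retrieval_config",
    "index_version",
    "document_ids",
    "chunk_ids",
    "timestamps",
    "reranking_steps",
    "draft_output",
    "human_edits",
    "final_output",
    "accountable_role",
)


def missing_artefacts(pack):
    """Return the required artefacts that are absent or explicitly None.

    An empty list such as human_edits=[] counts as present: it records that
    no edits were made, which is different from not recording edits at all.
    """
    return [key for key in REQUIRED_ARTEFACTS if pack.get(key) is None]


# Hypothetical evidence pack for the housing-benefit run.
pack = {
    "run_purpose": "Draft plain-language explanation of a housing-benefit decision",
    "user_query": "Explain why the claim was reduced",
    "prompt_template": "explain-decision-v2",
    "retrieval_config": {"top_k": 5, "relevance_threshold": 0.6},
    "document_ids": ["local-guidance-2024"],
    "timestamps": {"retrieved_at": "2024-05-01T10:00:00+00:00"},
    "reranking_steps": [],
    "draft_output": "Draft text",
    "human_edits": [],
    "final_output": "Approved text",
    "accountable_role": "Caseworker",
}

print(missing_artefacts(pack))
```

In this illustration the pack lacks the index version and the chunk identifiers, so a reviewer could not fully reconstruct the retrieval path even though the draft and final outputs are present.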

RAG improves governance readiness here because it allows the authority to show more than that the system uses policy documents. It can show which policy documents were retrieved in this case, whether they were current, how the output related to them, and where human review intervened. That is much stronger for supervision, appeals, internal quality assurance, and external scrutiny than a generic assurance claim about AI-enabled drafting.

Detailed link to RAIDT

RAG links to RAIDT in four ways.

First, it supports the RAIDT core idea that governance should rest on evidence from actual organisational use rather than on abstract claims about model capability.
Second, it changes the evidential shape of the run by adding a retrieval layer that must itself be captured if the run is to be reconstructable.
Third, it can materially strengthen the evidence pack and the justification of the RAIDT score profile when source snapshots, retrieval metadata, and review actions are preserved.
Fourth, it improves reviewability, contestability, audit readiness, and organisational learning because reviewers can inspect not only what the model said but what source material was supplied and how that influenced the outcome.

RAG → Run-level retrieval evidence → Evidence pack → RAIDT score profile → Governance readiness

Link to the five RAIDT pillars

Responsibility

RAG supports Responsibility when it clarifies who selected the knowledge base, who approved its use, and who remained accountable for relying on retrieved material in a decision or communication.