Q172 — What does operationalising RAIDT mean, how should evidence readiness be understood, and how does this differ from correctness?

Answer

Operationalising RAIDT means turning responsible GenAI governance from a principles-only discussion into an inspectable run-level artefact and scoring practice. Across the papers, this is done by treating the run as the unit of governance and by specifying a run-level evidence pack that records the configured use event in context: prompt or instruction, model and tool configuration, retrieved context where used, output, checks, hashes, timestamps, and review notes. The run-level evidence pack is then assessed through a score profile across the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability). In that sense, RAIDT operationalises governance by defining what evidence must exist for one material run, how that evidence should be bounded and linked, and how governance quality becomes reviewable rather than merely asserted. The papers also stress influence methods as governance interventions: structured prompting, retrieval augmentation, PEFT/LoRA, and preference-based alignment are not just engineering choices, because they alter both behaviour and what can later be evidenced and reviewed.
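As an illustration of what a run-level evidence pack might look like in practice, the following is a minimal sketch, not the papers' schema. The class name `EvidencePack`, its field names, and the choice of SHA-256 over a JSON serialisation are all assumptions made for this example; the source only lists the kinds of items to record (prompt, configuration, retrieved context, output, checks, hashes, timestamps, review notes).

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class EvidencePack:
    """Hypothetical record of one material GenAI run (field names are assumed)."""

    run_id: str
    prompt: str                      # prompt or instruction as sent
    model_config: Dict[str, str]     # model and tool configuration
    retrieved_context: List[str]     # retrieved passages, empty if none used
    output: str                      # generated output
    checks: Dict[str, str]           # named checks and their results
    review_notes: str                # reviewer comments or decision
    created_at: str = field(default_factory=_utc_now)

    def output_hash(self) -> str:
        """SHA-256 of the output text, so later tampering is detectable."""
        return hashlib.sha256(self.output.encode("utf-8")).hexdigest()

    def pack_hash(self) -> str:
        """SHA-256 over the whole pack, serialised with sorted keys."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()
```

Hashing the full pack, not only the output, is one way to bound and link the evidence: any later change to the prompt, configuration, or review notes produces a different `pack_hash()`.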

Evidence readiness should therefore be understood as preparedness at run level to be reconstructed, contested, and reviewed, not as a claim that the system is good in the abstract. The score profile makes that preparedness visible through the anchors 1=missing / 3=partial / 5=audit-ready, so reviewers can judge whether the evidence is sufficient for oversight, sampling, incident analysis, and organisational learning. Readiness is strictly distinct from correctness. Correctness concerns whether an output is true, appropriate, or substantively acceptable for the task, which often requires domain expertise and external verification. RAIDT explicitly does not guarantee correctness or legal compliance. A run may be correct yet poorly governed if it cannot be reconstructed, and it may be governance-ready yet still contain an error that must be clinically, legally, or professionally checked.
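The separation between readiness and correctness can be sketched in code by keeping them as independent fields. This is an illustrative sketch under stated assumptions: the class `RunAssessment`, the integer 1 to 5 scale between the three named anchors, and the rule that "audit-ready" means a score of 5 on every pillar are hypothetical choices for this example, not rules stated in the papers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

PILLARS = (
    "Responsibility",
    "Auditability",
    "Interpretability",
    "Dependability",
    "Traceability",
)

# Anchors named in the source; intermediate values 2 and 4 are an assumption.
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


@dataclass(frozen=True)
class RunAssessment:
    """Score profile for one run, held apart from any correctness verdict."""

    scores: Dict[str, int]
    # None means correctness has not been checked by a domain expert yet.
    output_verified_correct: Optional[bool] = None

    def __post_init__(self) -> None:
        if set(self.scores) != set(PILLARS):
            raise ValueError("scores must cover exactly the five pillars")
        for pillar, score in self.scores.items():
            if score not in range(1, 6):
                raise ValueError(f"{pillar}: score must be an integer 1..5")

    def weakest_pillars(self) -> List[str]:
        """Pillars sharing the lowest score, in canonical order."""
        low = min(self.scores.values())
        return [p for p in PILLARS if self.scores[p] == low]

    def is_audit_ready(self) -> bool:
        """Assumed rule: every pillar sits at the 5=audit-ready anchor."""
        return all(self.scores[p] == 5 for p in PILLARS)
```

Because `output_verified_correct` never feeds into `is_audit_ready()`, a run can be verified correct while scoring 1 on every pillar, or be audit-ready while its correctness is still unchecked.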

Practical example

In the healthcare discharge-summary example, a hospital uses a GenAI assistant to draft a note for clinician review. A governance-ready run would preserve the structured prompt, model deployment ID, retrieval snapshot from the internal guideline corpus, output hash, timestamps, and reviewer decision in a run-level evidence pack. The resulting score profile can then show, pillar by pillar, whether the run is auditable, traceable, interpretable, dependable, and responsibly overseen.

The contrast with correctness is important. A free-form summary might look clinically plausible and even be accepted by a busy reviewer, yet if the organisation cannot show which prompt version was used, what guideline text was retrieved, or whether uncertainty was disclosed, the run is weak on governance readiness and may sit nearer the 1=missing or 3=partial anchor than 5=audit-ready on several pillars. Conversely, a well-instrumented run can be audit-ready while still requiring the clinician to verify whether the medication list or follow-up advice is actually correct for the patient.
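A simple completeness check illustrates how the gap in the free-form case could be surfaced. The sketch below assumes a run is stored as a plain dictionary; the key names in `REQUIRED_EVIDENCE` are hypothetical labels for the items named in the discharge-summary example, and the function says nothing about whether the summary itself is clinically correct.

```python
from typing import Any, Dict, List

# Hypothetical keys for the evidence items named in the example above.
REQUIRED_EVIDENCE = (
    "prompt_version",
    "model_deployment_id",
    "retrieval_snapshot",
    "output_hash",
    "timestamp",
    "reviewer_decision",
)


def missing_evidence(record: Dict[str, Any]) -> List[str]:
    """Return required evidence keys that are absent or empty in the record."""
    empty_values = (None, "", [], {})
    return [key for key in REQUIRED_EVIDENCE if record.get(key) in empty_values]


# A free-form run that a reviewer accepted but that was barely instrumented.
free_form_run = {
    "output_hash": "ab12",          # placeholder value for illustration
    "reviewer_decision": "accepted",
}
gaps = missing_evidence(free_form_run)
```

Here `gaps` lists the prompt version, deployment ID, retrieval snapshot, and timestamp: the reviewer's acceptance is recorded, but the run cannot be reconstructed.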