
Q045 — Why must every RAIDT run record a prompt ID and version?

RAIDT · Star S4 - Evidence Architecture and Artefacts · primary item: S4.06 · Prompt ID and version

A registered prompt version turns an instruction into evidence that can be reconstructed, compared, and challenged.

Appears in sources

qa_deck_100#slide 47 · Outputs, review decisions, and retention

Answer

RAIDT treats the run as the unit of governance: each material use must be reconstructable as a governed event, not merely described through general model documentation. The papers are explicit that generative AI behaviour is shaped at run time by prompts, configuration parameters, retrieved context, toolchains, oversight actions, and other influence methods, and that these act as governance interventions. For that reason, every run-level evidence pack must record a prompt ID and version: without them, reviewers cannot show which exact instruction set, structure, and control logic shaped the output of that run. The Evidence Review treats this as configuration provenance and specifies prompt template ID and version as core evidence elements.
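A run-level record of this kind can be sketched as a small data structure with a completeness check. This is a minimal illustration only: the class name `RunRecord`, the helper `missing_provenance`, and every field other than prompt ID and version are assumptions made for the example, not a schema taken from the RAIDT papers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical run-level evidence record (field names are assumptions)."""
    run_id: str
    prompt_id: Optional[str]
    prompt_version: Optional[str]
    model_id: Optional[str] = None               # illustrative extra provenance
    retrieval_snapshot_id: Optional[str] = None  # illustrative extra provenance


def missing_provenance(record: RunRecord) -> List[str]:
    """Return the required prompt provenance fields that are empty or absent."""
    required = ("prompt_id", "prompt_version")
    return [name for name in required if not getattr(record, name)]


complete = RunRecord("run-001", "PROMPT-CLIN-SUMM", "3.2")
incomplete = RunRecord("run-002", None, "")

print(missing_provenance(complete))    # no gaps reported
print(missing_provenance(incomplete))  # both prompt fields reported
```

A check like this could run when the evidence pack is written, so that a run with no prompt provenance is flagged at the time rather than discovered during a later review.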

Prompt IDs and versions support all five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability), although the strongest immediate effect is on Auditability and Traceability. They allow a reviewer to link an output back to the approved template, compare repeat runs under fixed conditions, explain why a structured rationale or uncertainty disclosure appeared, and distinguish prompt changes from model or retrieval changes during post-incident review. In RAIDT terms, they turn governance from memory into inspectable evidence. They also affect the score profile directly: if a run cannot show which prompt revision was active, the evidence is incomplete, and on the 1=missing / 3=partial / 5=audit-ready anchor scale the run is unlikely to reach the audit-ready anchor for high-stakes use. Recording prompt ID and version is therefore not clerical overhead; it is a prerequisite for reconstruction, contestability, and organisational learning.
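The step of distinguishing prompt changes from model or retrieval changes can be sketched as a comparison of two run records. The dictionary keys and the function name `changed_dimensions` are hypothetical, chosen for illustration; the RAIDT papers are not quoted as defining them.

```python
from typing import Dict, List, Optional

# Hypothetical provenance dimensions a reviewer might compare across runs.
DIMENSIONS = ("prompt_id", "prompt_version", "model_id", "retrieval_snapshot_id")


def changed_dimensions(
    earlier: Dict[str, Optional[str]],
    later: Dict[str, Optional[str]],
) -> List[str]:
    """List the recorded dimensions whose values differ between two runs."""
    return [d for d in DIMENSIONS if earlier.get(d) != later.get(d)]


baseline = {
    "prompt_id": "PROMPT-CLIN-SUMM",
    "prompt_version": "3.1",
    "model_id": "model-a",
    "retrieval_snapshot_id": "snap-01",
}
disputed = dict(baseline, prompt_version="3.2")

print(changed_dimensions(baseline, disputed))
```

If the prompt fields were never recorded, both runs would show empty values for those fields and the comparison could not tell a prompt revision apart from an unchanged prompt, which is the gap the paragraph above describes.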

Practical example

In a healthcare note-summarisation workflow, a hospital uses a constrained discharge-summary prompt that requires headings, a statement of missing information, and an uncertainty section. One run produces a safe summary, but a later disputed run omits the uncertainty section and overstates a medication plan. If the run record shows PROMPT-CLIN-SUMM v3.2, reviewers can inspect the exact template, determine whether the omission came from the model, the retrieval context, or a prompt revision, and then assess whether oversight failed.

If the organisation recorded only the output and not the prompt ID and version, the incident review would rely on recollection. The hospital could not reliably compare the disputed run with earlier runs, justify its score profile, or show that the run-level evidence pack is complete enough for internal audit and corrective action.
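The review step in this scenario can be sketched as a lookup in a prompt registry keyed by ID and version, followed by a check of the disputed output against what that template version required. The registry contents, the section names, and the function `omitted_sections` are all assumptions made for illustration; only the identifier PROMPT-CLIN-SUMM v3.2 comes from the example above.

```python
from typing import Dict, List, Tuple

# Hypothetical registry: (prompt ID, version) -> sections the template requires.
PROMPT_REGISTRY: Dict[Tuple[str, str], List[str]] = {
    ("PROMPT-CLIN-SUMM", "3.2"): ["Summary", "Missing information", "Uncertainty"],
}


def omitted_sections(prompt_id: str, version: str, output_text: str) -> List[str]:
    """Return required section names that do not appear in the output.

    Raises KeyError if the recorded prompt ID and version are not registered,
    which is itself a finding for the reviewer.
    """
    required = PROMPT_REGISTRY[(prompt_id, version)]
    lowered = output_text.lower()
    return [name for name in required if name.lower() not in lowered]


disputed_output = (
    "Summary: patient stable.\n"
    "Missing information: allergy status not recorded.\n"
    "Medication plan: continue current dose.\n"
)

print(omitted_sections("PROMPT-CLIN-SUMM", "3.2", disputed_output))
```

Under these assumptions, finding that the registered template did require the omitted section would point the review away from a prompt revision and towards the model, the retrieval context, or oversight.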

Sources in RAIDT papers

08-RAIDT_Foundations_M_V50
13-RAIDT-Evidence-Review_M_v10