S4.06 — Prompt ID and version

flowchart LR
    A[Informal prompt use, prompt drift, weak reconstruction] --> B[RAIDT<br/>run-level evidence framework]
    H[Prompt registry, version label, prompt hash,<br/>approval note, task label, model pairing] --> C[[Prompt ID and version<br/>exact governed prompt state used in a run]]
    B --> C
    C --> D[Evidence pack]
    C --> E[RAIDT score profile]
    D --> F[Reviewer reconstruction and contestability]
    E --> G[Governance readiness and organisational learning]

Star context: Star S4 (Evidence Architecture and Artefacts) specifies the concrete fields and artefacts that make a run record inspectable. Within this star, prompt ID and version identify which controlled instruction artefact governed a run, so a reviewer can connect one output to one approved prompt state rather than to a vague account of prompting practice.

Academic picture

Definition / background

Prompt ID and version identify the exact prompt artefact used in a run. The prompt ID is the stable identifier for a prompt template, instruction set, or controlled prompt object within a registry. The version specifies which released state of that prompt was active when the run occurred. Together, they answer a simple but crucial governance question: which approved instruction artefact shaped this output?

This concept originates in ordinary disciplines of version control, document control, and configuration management, but it has a distinct role in generative AI governance. Prompts are not merely user text; in many organisational settings they are operational instructions that influence output quality, safety, consistency, and compliance. When prompts are revised, even small changes in wording, role framing, constraints, examples, or formatting logic can materially alter downstream outputs.

Within RAIDT, prompt ID and version are not interchangeable with the prompt text, the prompt registry, or the prompt hash. The prompt text is the instruction content itself. The registry records what prompt artefacts exist and how they are governed. The ID and version record which of those artefacts, in which released state, was used for this run. The prompt hash can later support integrity checking of the prompt content. The distinction matters because RAIDT is concerned with run-level evidence: a reviewer must be able to connect a specific output to a specific prompt state, not merely to a general prompt family.
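The separation between registry, run-level reference, and hash can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a RAIDT-defined schema: the names PromptRef, REGISTRY, prompt_hash, and verify_run_prompt are hypothetical, the prompt texts are placeholders, and SHA-256 is simply one reasonable choice of hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRef:
    """What a run record stores: which prompt state was used (hypothetical shape)."""
    prompt_id: str
    version: str


def prompt_hash(text: str) -> str:
    """SHA-256 hex digest of the prompt text; the algorithm is an assumption."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical registry: which prompt artefacts exist, keyed by (ID, version).
# The prompt texts are placeholders, not real approved wording.
REGISTRY = {
    ("admissions_triage_prompt", "v2.4"): "Draft a reply to the applicant.",
    ("admissions_triage_prompt", "v3.0"): "Draft a reply. Escalate unclear cases to staff.",
}


def verify_run_prompt(ref: PromptRef, recorded_hash: str) -> bool:
    """True if the hash stored with the run matches the registry text
    for the ID and version that the run record names."""
    text = REGISTRY.get((ref.prompt_id, ref.version))
    return text is not None and prompt_hash(text) == recorded_hash
```

In this sketch the run record carries only the reference and the hash. The registry remains the single place where prompt content and its governance history live, and the hash lets a reviewer confirm that the content found under that reference is the content the run actually used.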

This item therefore belongs inside Evidence Architecture and Artefacts because it is one of the fields that make a run record inspectable. It supports the assembly of a run-level evidence pack and strengthens the defensibility of the RAIDT five-pillar score profile. Without prompt ID and version, governance claims about prompting are harder to verify, compare, or contest.

Why this concept matters

Prompt ID and version solve the problem of prompt ambiguity. In many real deployments, teams say that they used "the standard prompt", "the approved prompt", or "the latest prompt", but such descriptions are too vague for rigorous review. If several similar prompt variants exist, or if a prompt has evolved over time, reviewers cannot know which instructions actually shaped the run unless the record names the prompt artefact and its version.

The concept also prevents confusion between prompt design and prompt governance. A well-written prompt is not automatically a controlled prompt. Governance requires change history, release discipline, and the ability to link one run to one prompt state. This is especially important when organisations compare output quality across time, investigate incidents, or need to explain why two apparently similar runs produced different results.

If prompt ID and version are missing, several risks follow: prompt drift becomes hard to detect, audit reconstruction weakens, reviewers may compare the wrong prompt to the wrong output, and organisational learning about prompt improvements becomes anecdotal rather than evidential. RAIDT uses this field to move prompting from tacit practice towards accountable operational governance.

Key idea: Prompt ID and version matter because RAIDT can only review a run properly if the exact governed prompt state used in that run is identifiable.

What this item captures

The stable identifier of the prompt artefact or prompt template used in the run.
The specific released version of that prompt artefact at the time of execution.
The controlled prompt state that shaped the run, even if later prompt revisions occurred.
The link between prompt governance and run-level reconstruction.
The basis for comparing outputs produced under different prompt versions.
The evidential reference needed to connect the run record to registry entries, approval notes, change logs, and related hashes.
The prompt-level change-control context that supports scoring, review, and organisational learning.

Practical example / likely audience question

Audience question

If the prompt text is stored somewhere, why does every RAIDT run still need a prompt ID and version?

Answer

The question assumes that storing the full prompt text is sufficient. However, text alone does not provide dependable governance context. A reviewer still needs to know whether that text was the approved version, whether it had been superseded, whether multiple variants existed for different workflows, and whether the run can be linked back to a governed change history.

A direct answer is that prompt ID and version turn prompt content into a controlled artefact reference. For example, a university may keep many admissions-support prompts in shared documents. If one applicant receives an inconsistent or overly assertive draft response, investigators need more than a copied prompt paragraph. They need to know whether the run used admissions_triage_prompt v2.4 or admissions_triage_prompt v3.0, whether v3.0 had already introduced stricter escalation wording by the time of the run, and whether staff followed the approved release.
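The investigators' question, whether the version named in the run record was the approved release at the time of the run, can be expressed as a simple date check. The sketch below is illustrative only: the Release type, the was_current function, and all dates are assumptions made for the example and are not taken from any real registry.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Release:
    """One released state of a prompt in a registry (hypothetical shape)."""
    prompt_id: str
    version: str
    released_at: datetime
    superseded_at: Optional[datetime] = None  # None means still current


def was_current(release: Release, run_time: datetime) -> bool:
    """True if this release was the active one at the moment of the run."""
    if run_time < release.released_at:
        return False
    return release.superseded_at is None or run_time < release.superseded_at


# Illustrative dates only.
v2_4 = Release("admissions_triage_prompt", "v2.4",
               datetime(2024, 1, 10), datetime(2024, 3, 1))
v3_0 = Release("admissions_triage_prompt", "v3.0", datetime(2024, 3, 1))

run_time = datetime(2024, 3, 15)
print(was_current(v2_4, run_time))  # False: v2.4 had already been superseded
print(was_current(v3_0, run_time))  # True
```

A check of this kind is only possible when the run record names both the prompt ID and the version. A copied paragraph of prompt text cannot be placed on the release timeline.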

RAIDT handles this better than a generic AI governance approach because it links the prompt reference directly to the run as the unit of governance. Rather than asking only whether prompts are documented somewhere, RAIDT asks whether the exact prompt artefact used in the reviewed event is identifiable, reconstructable, and contestable.

Practical example in RAIDT terms

Consider a public-service setting in which a local authority uses GenAI to draft first-pass housing-support letters for citizens. The use case is administratively useful, but the run-level issue is whether the draft used the correct approved prompt for vulnerable applicants, including the required safeguarding and escalation language.

The evidence needed for one disputed run includes the task label, timestamp, operator role, prompt ID, prompt version, prompt hash, model/provider/version, retrieval inputs if any, generated draft, human edits, and reviewer decision. If the authority cannot show whether the run used housing_support_letter_prompt v1.6 or an older v1.3, it cannot reliably explain why the draft omitted a safeguarding paragraph that newer prompt versions were designed to include.
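The evidence fields listed above can be sketched as a typed run record with a completeness check. This is a minimal sketch, assuming a flat record with string fields; the RunRecord class, its field names, and the missing_fields helper are hypothetical and do not represent a prescribed RAIDT schema.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RunRecord:
    """Evidence fields for one run, mirroring the list in the text (hypothetical shape)."""
    task_label: Optional[str] = None
    timestamp: Optional[str] = None
    operator_role: Optional[str] = None
    prompt_id: Optional[str] = None
    prompt_version: Optional[str] = None
    prompt_hash: Optional[str] = None
    model: Optional[str] = None  # model/provider/version
    generated_draft: Optional[str] = None
    human_edits: Optional[str] = None
    reviewer_decision: Optional[str] = None
    retrieval_inputs: Optional[List[str]] = None  # recorded only "if any"


# Fields that may legitimately be absent from a complete record.
OPTIONAL_FIELDS = {"retrieval_inputs"}


def missing_fields(record: RunRecord) -> List[str]:
    """Names of required fields that are empty, in declaration order."""
    return [
        f.name
        for f in fields(record)
        if f.name not in OPTIONAL_FIELDS and getattr(record, f.name) in (None, "")
    ]
```

Applied to the disputed housing-support run, a record that names housing_support_letter_prompt but leaves the version empty would be reported as missing prompt_version, which is exactly the gap that prevents the authority from distinguishing v1.6 from v1.3.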

In RAIDT terms, Responsibility is affected because prompt release and use should be governed by named roles. Auditability is affected because the reviewer must reconstruct the exact instruction state. Interpretability is affected because the explanation of the output depends partly on which instructions were active. Dependability is affected because prompt version changes may improve or degrade consistency. Traceability is strongly affected because the prompt reference is part of the chain linking configuration to outcome. Recording prompt ID and version therefore improves governance readiness by making prompt change visible at the level of one real run.

Detailed link to RAIDT

Prompt ID and version link to RAIDT in four ways.

First, they operationalise the RAIDT core idea that governance should attach to what happened in one real GenAI use event, not only to broad statements that prompts are "managed" or "approved".

Second, they strengthen the run as the unit of governance by recording which governed instruction artefact shaped that run.

Third, they improve the evidence pack and the RAIDT score profile because reviewers can connect outputs, review notes, and performance judgements to a precise prompt state rather than to a vague prompting practice.

Fourth, they support reviewability, contestability, audit readiness, and organisational learning by making prompt changes visible across time and across comparable runs.

Prompt ID and version -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

Link to the five RAIDT pillars

Responsibility

Prompt ID and version support Responsibility by clarifying which prompt artefact was authorised for use and which governance roles were responsible for approving or maintaining it.