S4.07 - Prompt hash

flowchart LR
    A[Editable or weakly governed prompt records] --> B[RAIDT<br/>Run-level evidence framework]
    A2[Retrospective reconstruction is unreliable] --> B
    B --> C[[Prompt hash<br/>Integrity marker for exact prompt text]]
    H[Canonical prompt text] --> C
    I[Hashing method e.g. SHA-256] --> C
    J[Prompt registry] --> C
    K[Prompt ID and version] --> C
    L[Run record] --> C
    M[Reviewer verification step] --> C
    C --> D[Evidence pack]
    C --> E[RAIDT score profile]
    C --> F[Reviewer reconstruction]
    D --> G[Audit readiness and contestability]
    E --> N[Stronger Auditability and Traceability]
    F --> O[Organisational learning]

Star S4 - Evidence Architecture and Artefacts

Star context: Specifies the concrete fields and artefacts that make a run record inspectable, reconstructable, and defensible within RAIDT's run-level evidence framework.

Academic picture

Definition / background

A prompt hash is a digital fingerprint computed from the exact prompt text associated with a specific run. In practice, it is usually produced by applying a cryptographic hash function such as SHA-256 to a canonicalised prompt string (the text normalised to one agreed encoding and formatting), so that any change to the text, however small, produces a different digest with overwhelming probability. Within RAIDT, the purpose of the prompt hash is not to interpret the prompt semantically, but to preserve evidential integrity around what was actually submitted to the model.
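The definition above can be illustrated with a minimal Python sketch. It assumes one possible canonical form (Unicode NFC, LF line endings, no trailing whitespace, UTF-8 encoding); RAIDT as described here does not prescribe these rules, and the function names `canonicalise_prompt` and `prompt_hash` are hypothetical.

```python
import hashlib
import unicodedata


def canonicalise_prompt(text: str) -> str:
    """Apply an assumed canonical form: NFC, LF line endings, no trailing whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    return "\n".join(lines).strip()


def prompt_hash(text: str) -> str:
    """Return the SHA-256 hex digest of the canonicalised prompt, encoded as UTF-8."""
    canonical = canonicalise_prompt(text)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative prompt texts (not taken from any real system).
original = "Summarise the discharge instructions.\r\nUse plain language."
reformatted = "Summarise the discharge instructions.\nUse plain language.  "
edited = "Summarise the discharge instructions.\nUse plain language!"

# Formatting-only differences disappear after canonicalisation.
same_after_canonicalisation = prompt_hash(original) == prompt_hash(reformatted)
# A single changed character yields a different digest.
differs_after_edit = prompt_hash(original) != prompt_hash(edited)
print(same_after_canonicalisation, differs_after_edit)  # True True
```

The canonicalisation rules are a governance decision in their own right: whatever is normalised away before hashing can no longer be detected as a change, so the rules should be documented alongside the hashing method.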

Conceptually, this item sits at the intersection of records management, software integrity, and AI governance. Organisations increasingly rely on prompts as operational instructions, yet prompts are often managed informally: copied between documents, revised during use, or retrospectively rewritten for reporting. That informality weakens governance because it becomes difficult to determine whether the prompt under review is truly the prompt that shaped the model output. Prompt hashing addresses that integrity gap.

This item differs from nearby concepts in Star S4. A prompt registry stores prompt assets and their metadata; prompt ID and version identify which prompt template or revision is intended; a prompt hash verifies the exact prompt text artefact tied to a run record. In other words, the registry answers where the prompt lives, the ID/version answers which controlled prompt is claimed, and the hash answers whether the recorded text has remained unchanged.

Inside RAIDT, prompt hash belongs to run-level evidence because a run is the unit of governance. If the run is the object being reviewed, then the prompt used in that run is a core evidential artefact. The hash helps secure that artefact for inclusion in the evidence pack and supports more reliable judgement across the five-pillar score profile, especially for auditability and traceability.

Why this concept matters

Prompt hash matters because prompts are often decisive but poorly protected as evidence. In many organisational settings, later review depends on records assembled after the fact. Without a stable integrity marker, prompt text can be edited intentionally or accidentally, with no obvious sign that the evidence has changed. This creates confusion in incident review, weakens accountability, and makes governance claims harder to defend.

The concept also avoids a common confusion: transparency is not the same as integrity. An organisation may disclose a prompt in a report or repository, but if it cannot show that the disclosed prompt is exactly the one used in the run under scrutiny, the disclosure remains evidentially fragile. Prompt hashing therefore moves governance from descriptive assertion toward verifiable artefact control.

For organisations using GenAI in consequential work, the risk of missing this item is practical rather than abstract. Reviewers may be unable to reconstruct how an output was produced; investigators may struggle to determine whether a prompt was altered after an adverse event; managers may not know whether a claimed prompt version matches the actual deployed wording. RAIDT uses prompt hash to reduce these failure modes and to make run records more defensible.

Key idea: A prompt hash matters because it turns prompt integrity from a claim into a verifiable run-level evidential property.

What this item captures

The integrity state of the exact prompt text associated with a run.
A stable digital fingerprint that can be checked during review, audit, or dispute resolution.
Evidence that the prompt artefact included in the evidence pack has not been silently altered.
A bridge between prompt content, prompt governance controls, and run reconstruction.
A basis for comparing prompt records across logging systems, repositories, and review documents.
A practical control point linking prompt provenance to broader RAIDT evidence architecture.
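As one illustration of comparing prompt records across systems, the sketch below hashes the copy of a prompt held by each source and reports which copies no longer match the hash recorded for the run. The source names, prompt texts, and helper functions are hypothetical, and the texts are assumed to be already canonicalised.

```python
import hashlib
from collections import defaultdict


def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def group_by_hash(copies: dict) -> dict:
    """Group source names by the hash of the prompt copy each one holds."""
    groups = defaultdict(list)
    for source, text in copies.items():
        groups[sha256_hex(text)].append(source)
    return dict(groups)


def divergent_sources(copies: dict, recorded_hash: str) -> list:
    """Return the sources whose copy does not match the hash recorded for the run."""
    return sorted(
        source for source, text in copies.items()
        if sha256_hex(text) != recorded_hash
    )


# Hypothetical copies of the same prompt held in three places.
copies = {
    "run_log": "Draft a reply to the customer complaint.",
    "prompt_repository": "Draft a reply to the customer complaint.",
    "review_document": "Draft a brief reply to the customer complaint.",
}
# Stands in for the hash that would have been stored at run time.
recorded_hash = sha256_hex("Draft a reply to the customer complaint.")

groups = group_by_hash(copies)
divergent = divergent_sources(copies, recorded_hash)
print(divergent)  # ['review_document']
```

The hash shows that a copy has diverged, not what changed or why; locating the difference still requires the texts themselves.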

Practical example / likely audience question

Audience question

Why use a prompt hash if the organisation already stores the prompt text and prompt version number?

Answer

The concern behind this question is usually that hashing appears redundant. If the prompt is already documented, why add another field? The direct answer is that documentation and identification do not by themselves establish integrity. A stored prompt can still be edited, reformatted, truncated, or copied incorrectly after the run. A prompt version number can also be correct at the template level while the actual submitted prompt differs because a user added instructions, removed clauses, or changed the order of text.

A practical example makes the distinction clearer. Suppose a hospital uses a drafting assistant to summarise discharge instructions. The official prompt template is version 3.2, and that version is recorded correctly. However, during a particular run, an operator appends an extra line asking the model to "keep the text very brief and omit low-priority detail". If only the prompt ID/version is stored, later reviewers may believe the standard template was used unchanged. If a prompt hash is recorded for the exact submitted text, the organisation can compare it with the hash of the baseline template, or of any later copy, and detect that the run-level prompt artefact differs.
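The hospital scenario can be sketched as follows. This is an illustrative Python example, not a prescribed implementation: the registry layout, the identifier `discharge-summary`, and the template wording are invented, and it assumes the template is meant to be submitted verbatim, with no variable substitution.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical registry: (prompt ID, version) -> hash of the controlled template.
TEMPLATE_V3_2 = "Summarise the discharge instructions for the patient in plain language."
registry = {("discharge-summary", "3.2"): sha256_hex(TEMPLATE_V3_2)}

# The operator appends an extra instruction before submitting.
submitted = TEMPLATE_V3_2 + "\nKeep the text very brief and omit low-priority detail."

# The run record stores the claimed version and the hash of what was actually sent.
run_record = {
    "prompt_id": "discharge-summary",
    "prompt_version": "3.2",
    "prompt_hash": sha256_hex(submitted),
}


def matches_registered_template(record: dict, template_hashes: dict) -> bool:
    """True only if the run's prompt hash equals the registered template hash."""
    key = (record["prompt_id"], record["prompt_version"])
    return template_hashes.get(key) == record["prompt_hash"]


deviation_detected = not matches_registered_template(run_record, registry)
print(deviation_detected)  # True: version 3.2 is claimed, but the submitted text differs
```

Where templates contain placeholders that are filled in at run time, the submitted text will legitimately differ from the template, so a comparison of this kind would need to be designed around the rendered prompt rather than the raw template.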

RAIDT handles this issue better than generic AI governance approaches because it treats the run, not the policy statement, as the unit of review. Generic governance may say that prompts should be documented. RAIDT asks whether the exact prompt for this run can be evidenced, checked, and tied into an evidence pack that supports reviewability, contestability, and score-based governance.

Practical example in RAIDT terms

Consider an enterprise productivity setting in which a GenAI assistant drafts responses to customer complaints for a regulated financial services firm. One run produces an overly dismissive reply that fails to acknowledge a vulnerable customer's circumstances. The run-level issue is whether the problematic tone arose from the model alone, from the base prompt design, or from a last-minute prompt modification made by an operator under time pressure.

The evidence needed includes the run ID, timestamp, user or operator role, prompt ID/version, exact prompt text if permissible, prompt hash, model/provider/version identifier, decoding parameters, output hash, and reviewer notes. The prompt hash allows the reviewer to verify that the prompt artefact assembled during the investigation is the same artefact linked to the run at the time of generation.
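A run record carrying the fields listed above might look like the following Python sketch. The class name, field names, and example values are assumptions made for illustration; the source names the evidence items but does not define a schema.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


def sha256_hex(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class RunRecord:
    """Hypothetical run record; field names are illustrative, not a RAIDT schema."""
    run_id: str
    timestamp: str
    operator_role: str
    prompt_id: str
    prompt_version: str
    prompt_hash: str
    model_identifier: str
    output_hash: str
    decoding_parameters: dict = field(default_factory=dict)
    prompt_text: Optional[str] = None  # stored only if permissible
    reviewer_notes: str = ""


def verify_prompt(record: RunRecord, candidate_text: str) -> bool:
    """Check that a prompt assembled during review matches the hash stored at run time."""
    return sha256_hex(candidate_text) == record.prompt_hash


# Illustrative values only.
submitted_prompt = "Draft a reply to the complaint. Acknowledge the customer's circumstances."
record = RunRecord(
    run_id="run-0001",
    timestamp="2024-01-01T09:00:00Z",
    operator_role="complaints-handler",
    prompt_id="complaint-reply",
    prompt_version="1.0",
    prompt_hash=sha256_hex(submitted_prompt),
    model_identifier="example-provider/example-model",
    output_hash=sha256_hex("example output text"),
    decoding_parameters={"temperature": 0.2},
)

print(verify_prompt(record, submitted_prompt))                # True
print(verify_prompt(record, "Draft a reply to the complaint."))  # False
```

Because the hash is stored even when the prompt text is not, a reviewer who later obtains the text from another source can still test it against the run record.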

The most affected RAIDT pillars are Auditability and Traceability, with Responsibility and Dependability also implicated. Auditability is strengthened because the reviewer can test the integrity of the prompt record. Traceability is improved because the chain from template to submitted prompt to output becomes more precise. Responsibility is supported because accountability for prompt changes is easier to assign. Dependability is supported because repeated runs can be compared on a more reliable evidential basis. In governance readiness terms, prompt hash helps the organisation move from a narrative explanation of the event to a reviewable evidential account.

Detailed link to RAIDT

Prompt hash links to RAIDT in four ways.

First, it supports RAIDT's core idea that governance should rest on inspectable run-level evidence rather than broad organisational claims.
Second, it links directly to the run because the hash is meaningful only when attached to the exact prompt artefact submitted in a specific run, that is, a specific configured use of a GenAI system.
Third, it strengthens the evidence pack by making one of the most important run inputs verifiable and by improving the defensibility of score judgements, particularly for auditability and traceability.
Fourth, it improves reviewability, contestability, audit readiness, and organisational learning because disputed cases can be examined against a more stable evidential record.

Prompt hash → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

This chain matters because RAIDT operationalises governance through evidence architecture. Prompt hash is one of the artefacts that helps convert prompt management from informal practice into reviewable institutional infrastructure.

Link to the five RAIDT pillars

Responsibility

Prompt hash supports Responsibility indirectly but meaningfully. It helps clarify whether a prompt submitted in practice matches the prompt that a team, manager, or policy owner says should have been used.