S11.08 - Component drift

flowchart LR
    A[Changing prompts, retrieval indices, adapters and policy layers] --> B[RAIDT - run-level evidence framework]
    A2[Traditional limitation: weak reconstruction after updates] --> B
    B --> C[[Component drift]]
    C --> D[Evidence pack with component versions]
    C --> E[Score profile interpreted in context]
    D --> F[Reviewer reconstruction and contestability]
    E --> G[Governance readiness and organisational learning]
    H[Healthcare, public services, enterprise productivity] --> C
    I[Prompt ID, retrieval snapshot, policy-layer version, wrapper release] --> C

Star S11 - Boundaries, Limitations and Future Questions

Star context: This item identifies a major boundary condition for RAIDT: even well-designed governance can weaken if the technical components behind a run change over time without being recorded, compared, and reviewed.

Academic picture

Definition / background

Component drift refers to the change over time in one or more technical or procedural elements that shape a generative AI run. In practice, these elements can include prompt templates, model versions, retrieval corpora or indices, fine-tuned adapters, policy rules, safety settings, routing logic, external tools, and human review steps. The concept matters because a run is never produced by a model alone; it is produced by a configured system operating in a particular organisational context.

In GenAI governance, component drift is closely related to but distinct from model drift. Model drift usually refers to changes in model behaviour or performance. Component drift is broader: it includes changes in the surrounding stack that can alter outputs, risk exposure, or governance status even when the underlying model remains unchanged. A prompt revision, a refreshed knowledge base, or a modified access policy can all change the meaning of a run-level result.

This belongs inside RAIDT because RAIDT treats the run as the unit of governance and asks whether that run can be reconstructed, reviewed, contested, and compared. If the components that shaped a run are not versioned, then the evidence pack becomes incomplete and the score profile becomes harder to interpret across time. Component drift therefore sits at the boundary between technical maintenance and governance evidence: it is about preserving the integrity of what a run-level claim actually refers to.
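
One way to make "versioned components" concrete is to record a small manifest for each run and derive a stable fingerprint from it, so that two runs can be checked quickly for having the same configuration. The sketch below is illustrative only: the field names, the version strings, and the `manifest_fingerprint` helper are assumptions made for this example and are not defined by RAIDT.

```python
import hashlib
import json


def manifest_fingerprint(manifest: dict) -> str:
    """Return a stable SHA-256 digest of a run's component manifest."""
    # Sorting keys makes the digest independent of dictionary ordering.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical component versions for a single run.
run_manifest = {
    "model_endpoint": "model-x/2024-01",
    "prompt_template": "discharge-summary-v3",
    "retrieval_index": "local-guidance-2024-02",
    "policy_layer": "redaction-rules-v1",
}

print(manifest_fingerprint(run_manifest))
```

Two runs with identical fingerprints used the same recorded configuration; a differing fingerprint signals that at least one recorded component changed, although it does not say which one.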

Within RAIDT, component drift directly affects run-level evidence, the structure of the evidence pack, and the credibility of the five-pillar score profile. A score is meaningful only if the assessed configuration is known. Without that, organisations risk comparing like with unlike and mistaking silent system change for improvement, degradation, or inconsistency in human practice.

Why this concept matters

Component drift matters because governance claims are only as reliable as the stability and traceability of the system being governed. When a team says that a use case is dependable, interpretable, or audit-ready, that claim implicitly assumes a particular system configuration. If the configuration changes and the change is not recorded, then the governance claim can become detached from the current reality of use.

This concept solves a practical problem that appears in long-running deployments: stakeholders often notice that outputs feel different, but cannot determine whether the cause lies in the model, the prompt, the retrieved evidence base, the safety settings, or the task context. RAIDT avoids that confusion by requiring run-level evidence about the components involved in each run and by supporting comparison across runs over time.

For organisations, the absence of this concept creates familiar risks: failed reconstruction during audit, disputes about why outcomes changed, weak incident analysis, poor comparability of evaluations, and overconfident claims of consistency. By making component drift explicit, RAIDT shifts governance from principle-level assurance to operational reviewability.

Key idea: Component drift matters because governance judgments are only trustworthy when the components that shaped each run can be identified, versioned, and compared over time.

What this item captures

Changes in prompts, templates, guardrails, model endpoints, adapters, retrieval sources, indices, or workflow logic that alter how a run is produced.
The gap between a nominally "same" AI service and the actual configured system used at different points in time.
The evidential requirement to record component versions so that runs can be reconstructed and compared.
The risk that score profiles become misleading when underlying components change without documentation.
The organisational need to distinguish intentional updates from silent drift.
The connection between technical change management and governance readiness.
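
The distinction between intentional updates and silent drift in the list above can be sketched as a check of observed component changes against an approved change log. This is a minimal illustration under stated assumptions: the `classify_changes` function, the tuple format of the change log, and the version labels are all hypothetical.

```python
def classify_changes(previous: dict, observed: dict, change_log: set) -> dict:
    """Label each changed component as 'intentional' or 'silent'."""
    labels = {}
    for component in sorted(set(previous) | set(observed)):
        old, new = previous.get(component), observed.get(component)
        if old == new:
            continue  # unchanged components are not reported
        # A change counts as intentional only if it appears in the change log.
        logged = (component, old, new) in change_log
        labels[component] = "intentional" if logged else "silent"
    return labels


# Hypothetical versions recorded for an earlier and a later run.
previous = {"prompt_template": "v3", "retrieval_index": "idx-01", "policy_layer": "p1"}
observed = {"prompt_template": "v4", "retrieval_index": "idx-02", "policy_layer": "p1"}

# Only the prompt update went through change management in this example.
change_log = {("prompt_template", "v3", "v4")}

print(classify_changes(previous, observed, change_log))
# {'prompt_template': 'intentional', 'retrieval_index': 'silent'}
```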

Practical example / likely audience question

Audience question

Why keep versions of prompts, retrieval indices, and policy layers if the use case and the model name stay the same?

Answer

This question rests on a common misconception: that the model name is the main determinant of system behaviour. In reality, many governance-relevant changes happen outside the base model. A revised prompt can narrow or expand the scope of an answer. A retrieval index refresh can introduce new source material or remove old guidance. A policy layer can block responses that were previously allowed. These are not superficial implementation details; they shape what the run actually was.

The direct answer is that versioning is necessary because otherwise a run cannot be reliably reconstructed months later. Suppose an organisation evaluates a drafting assistant in January and again in June. If the June system uses a refined prompt, an updated retrieval corpus, and stricter moderation rules, then improved or degraded outcomes cannot be interpreted properly unless those differences are visible in the evidence.
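
The January and June comparison can be sketched as follows. The example assumes each evaluation stores a single aggregate score alongside its component versions; the `Evaluation` class, the `compare_evaluations` function, the scores, and the version labels are hypothetical and are used only to show how a score difference becomes interpretable once the changed components are visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    label: str
    score: float      # hypothetical aggregate score, for illustration only
    components: dict  # component name -> version identifier


def compare_evaluations(earlier: Evaluation, later: Evaluation) -> dict:
    """Report the score change together with any components that differ."""
    names = sorted(set(earlier.components) | set(later.components))
    changed = [
        n for n in names
        if earlier.components.get(n) != later.components.get(n)
    ]
    return {
        "score_delta": round(later.score - earlier.score, 3),
        "changed_components": changed,
        # With no recorded changes, the two scores refer to the same configuration.
        "directly_comparable": not changed,
    }


january = Evaluation("January", 0.72, {
    "model": "model-x", "prompt": "p1",
    "retrieval_corpus": "c1", "moderation_rules": "m1",
})
june = Evaluation("June", 0.81, {
    "model": "model-x", "prompt": "p2",
    "retrieval_corpus": "c2", "moderation_rules": "m2",
})

print(compare_evaluations(january, june))
```

In this example the model is unchanged but three surrounding components differ, so the score difference cannot be attributed to any single cause without further controlled comparison.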

RAIDT handles this better than a generic AI governance approach because it does not stop at broad principles such as accountability or transparency. It ties those principles to the run itself. That means the evidence pack can show which components were active, and the score profile can be interpreted in light of that specific configuration rather than being treated as a floating claim about an abstract system.

Practical example in RAIDT terms

Consider a hospital using a GenAI assistant to draft discharge summaries for clinicians. In February, a run is performed using one prompt template, one retrieval index containing local discharge guidance, and one policy layer for redaction checks. By May, the hospital has updated the prompt to improve brevity, refreshed the retrieval index with newer guidance, and inserted an additional safety rule for medication references.

The run-level issue is that a later summary may differ for several reasons, but without component evidence the team cannot say which changes affected the result. The evidence needed includes prompt version IDs, retrieval index or corpus version identifiers, policy-layer version or rule-set references, timestamps, model endpoint metadata, and any wrapper or orchestration version used.
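
The evidence items listed above can be treated as a completeness checklist for each run record. The following sketch assumes a flat dictionary per run; the field names, the example values, and the `missing_evidence` helper are illustrative assumptions, not a RAIDT-defined schema.

```python
# Hypothetical field names mirroring the evidence items named in the text.
REQUIRED_FIELDS = (
    "prompt_version_id",
    "retrieval_index_version",
    "policy_layer_version",
    "timestamp",
    "model_endpoint",
    "orchestration_version",
)


def missing_evidence(record: dict) -> list:
    """Return the required fields that are absent or empty in a run record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Example run record with two gaps: no policy-layer version was captured,
# and the orchestration version was left blank.
february_run = {
    "prompt_version_id": "discharge-summary-v3",
    "retrieval_index_version": "local-guidance-2024-02",
    "timestamp": "2024-02-14T09:30:00Z",
    "model_endpoint": "model-x/2024-01",
    "orchestration_version": "",
}

print(missing_evidence(february_run))
# ['policy_layer_version', 'orchestration_version']
```

A check of this kind could be run when the evidence pack is assembled, so that gaps are found at the time of the run rather than during a later audit.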

The most affected RAIDT pillars are Auditability, Dependability, and Traceability, with Responsibility and Interpretability also implicated. Auditability depends on reconstructing the conditions of the run. Dependability depends on knowing whether performance changed because the system changed. Traceability depends on linking an output to the components that produced it. Capturing component drift improves governance readiness because supervisors, auditors, and clinical leads can distinguish real quality changes from undocumented configuration changes.

Detailed link to RAIDT

Component drift links to RAIDT in four ways.

First, it supports RAIDT's core idea that governance should attach to a specific run rather than to general claims about a model or product.
Second, it sharpens the meaning of run-level evidence by showing that the run includes the configured stack around the model, not only the final prompt and output.
Third, it affects the evidence pack and the score profile because both become more defensible when component versions and changes are explicitly recorded.
Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning by making it possible to explain why outcomes differ across time.

Component drift -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

In that chain, component drift is the reason version-aware evidence is needed; the evidence pack is where that information is assembled; the score profile is where interpretation depends on knowing whether the assessed run is comparable to earlier or later runs; and governance readiness is improved when those links are reviewable.

Link to the five RAIDT pillars

Responsibility

Component drift affects Responsibility because organisational actors remain accountable for the configured system they choose to deploy, update, and maintain. If changes are made without oversight or documentation, accountability becomes blurred.