S1.09 - Runtime_configuration

S1.09 - Runtime configuration

flowchart LR
    A["Model-centric governance<br/>misses configured behaviour"] --> B["RAIDT<br/>run-level evidence framework"]
    B --> C[["Runtime configuration<br/>prompts, settings, retrieval, tools, controls"]]
    C --> D[Evidence pack]
    C --> E[RAIDT score profile]
    C --> F[Reviewer reconstruction]
    D --> G[Audit readiness]
    E --> G
    F --> H[Contestability and organisational learning]
    I[Healthcare] --> C
    J[Finance] --> C
    K[Public services] --> C
    L[Cybersecurity] --> C
    M[Enterprise productivity] --> C

Star S1 - Origins, Background and History

Star context: Explains why RAIDT emerges from the recognition that GenAI behaviour is shaped not only by the model itself, but by prompts, retrieval settings, tool permissions, orchestration choices and review controls that vary from run to run. This item therefore links the historical background of the project to RAIDT's central move from principle-level AI governance to operational, run-level governance.

Academic picture

Definition / background

Runtime configuration is the set of operational parameters, components and control choices active when a generative AI system is used for a particular task. It includes not only obvious settings such as model choice or temperature, but also the system prompt, user prompt framing, retrieval source selection, tool availability, adapters or fine-tuned layers, guardrails, escalation rules, access permissions, workflow routing and human review steps. In other words, it is the practical assembly through which a run is produced.

Conceptually, this matters because GenAI behaviour is not governed solely by stable software logic, as the behaviour of many traditional information systems is. The same underlying model can generate very different outputs depending on how it is configured at runtime. A governance approach that evaluates only the model therefore misses a major determinant of performance, risk and accountability.

Within RAIDT, runtime configuration belongs at the centre of the framework because RAIDT treats the run, rather than the abstract model, as the unit of governance. A run-level evidence pack is incomplete if it cannot show how the system was configured when the output was produced. Similarly, the RAIDT score profile cannot be interpreted properly without knowing whether the run involved high randomness, external retrieval, broad tool permissions, weak oversight or strong review controls.

This item also differs from broader terms such as deployment architecture or system design. Deployment architecture describes the general environment in which the system operates; runtime configuration focuses on the specific settings and components active in a concrete instance of use. That distinction is essential for reviewability. RAIDT needs to know not merely how a system is usually deployed, but how this run was actually configured when a decision, recommendation or draft was generated.

Why this concept matters

Runtime configuration matters because it prevents a misleading model-centric view of GenAI governance. Without it, organisations may believe they have governed a system simply by approving a vendor, naming a model or publishing high-level principles. In practice, however, risk often enters through the configured use of that model: the prompt may overstate authority, retrieval may pull from outdated documents, a tool may be allowed to act without adequate checks, or a safety layer may be disabled for speed.

For organisations using GenAI in operational settings, this concept helps distinguish between abstract capability and situated behaviour. It provides a disciplined way to ask what was enabled, constrained, retrieved, instructed, reviewed and logged in the specific run under scrutiny. That is what allows governance to move from general assurance statements to concrete evidence.

If runtime configuration is missing from governance, failures become difficult to explain and improvements become difficult to target. Teams may know that an output was problematic but remain unable to determine whether the cause lay in prompting, retrieval quality, tool use, human oversight or model settings. RAIDT addresses this by making configuration an evidential object rather than a hidden implementation detail.

Key idea: Runtime configuration matters because GenAI governance becomes operational only when the actual settings and controls of a specific run are visible, reviewable and contestable.

What this item controls

Which model, model version or provider endpoint is used in the run.
Which system instructions, prompt templates and user inputs shape the output.
Which sampling or generation parameters influence variability, confidence style and determinism.
Which retrieval sources, indexes, document versions or knowledge bases can be accessed.
Which tools, APIs or agent actions are enabled, restricted or routed during the run.
Which safety filters, policy layers or escalation rules are active.
Which human checkpoints, approval steps or review obligations apply before use.
Which metadata, logs and version identifiers are captured for later reconstruction.
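
One way to make the list above concrete is to treat each item as a field in a single record captured per run. The following is a minimal sketch in Python, not a RAIDT-defined schema: the class name, every field name and all example values are hypothetical and chosen only to mirror the eight items above.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class RuntimeConfiguration:
    """Hypothetical per-run record; field names are illustrative, not a RAIDT schema."""

    model_id: str                    # model, model version or provider endpoint
    system_prompt_version: str       # system instructions in force for the run
    prompt_template_id: str          # prompt template applied to the user input
    generation_params: dict = field(default_factory=dict)  # e.g. temperature
    retrieval_sources: tuple = ()    # indexes or knowledge bases that can be accessed
    enabled_tools: tuple = ()        # tools, APIs or agent actions allowed
    safety_layers: tuple = ()        # filters, policy layers, escalation rules
    human_checkpoints: tuple = ()    # approval or review steps before use
    run_id: str = ""                 # identifier captured for later reconstruction

    def to_json(self) -> str:
        """Serialise with sorted keys so that two identical configurations match exactly."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values are invented for illustration only.
config = RuntimeConfiguration(
    model_id="example-model-v1",
    system_prompt_version="sys-v2",
    prompt_template_id="summary-template-v5",
    generation_params={"temperature": 0.2},
    retrieval_sources=("internal-knowledge-base",),
    human_checkpoints=("reviewer-sign-off",),
    run_id="run-0001",
)
print(config.to_json())
```

Making the record immutable (frozen) reflects the idea that a captured configuration is evidence about a past run and should not be edited after the fact.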

Practical example / likely audience question

Audience question

If we have already approved the model, why do we need to document every runtime configuration?

Answer

The concern behind this question is usually administrative burden: people assume that once the model has passed procurement, safety testing or policy approval, the governance problem has largely been solved. RAIDT rejects that assumption because the same approved model can behave very differently under different runtime conditions.

A direct answer is that model approval does not tell you how the model was actually used in the run being reviewed. A low-temperature summarisation prompt operating on a fixed internal knowledge base is materially different from a high-variability agentic workflow that retrieves from live web sources and calls external tools. Treating both cases as though they were governed simply because they share a model name would be analytically weak and operationally risky.

Consider a practical example. An enterprise team may use the same foundation model for two tasks: drafting internal meeting notes and generating customer-facing policy guidance. The first run may involve a fixed template, no external retrieval and light post-editing. The second may involve dynamic retrieval, a compliance prompt layer and mandatory human sign-off. RAIDT handles this better than generic AI governance because it documents the actual run configuration rather than relying on blanket statements that "the model is approved". That makes the resulting evidence more useful for audit, challenge and redesign.

Practical example in RAIDT terms

A healthcare trust uses a GenAI assistant to draft discharge instructions for clinicians. In one run, the assistant is configured with a prompt that asks for clear patient-friendly language, retrieval from the trust's internal medicines guidance, temperature set low for consistency, and a mandatory clinician review before anything is shown to the patient.

The run-level issue appears when a later incident reveals that the retrieval connector was pointed to an outdated guideline index and the prompt template had been modified to make the assistant sound more decisive. The underlying model had not changed, but the runtime configuration had changed in a way that increased the risk of overconfident and outdated advice.
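
The kind of drift described here can be surfaced by comparing the configuration that was approved with the configuration observed in the run. The sketch below assumes both are available as flat dictionaries; the function name `diff_configs`, the keys and the version labels are all hypothetical and are not taken from the incident or from any RAIDT tooling.

```python
def diff_configs(approved: dict, observed: dict) -> dict:
    """Return {setting: (approved_value, observed_value)} for every setting that differs."""
    keys = sorted(set(approved) | set(observed))
    return {
        key: (approved.get(key), observed.get(key))
        for key in keys
        if approved.get(key) != observed.get(key)
    }


# Invented values: the model is unchanged, but two runtime settings have drifted.
approved = {
    "model_id": "example-model-v1",
    "prompt_template_version": "discharge-v3",
    "retrieval_index": "medicines-guidance-current",
    "temperature": 0.2,
}
observed = {
    **approved,
    "prompt_template_version": "discharge-v4",
    "retrieval_index": "medicines-guidance-outdated",
}

changes = diff_configs(approved, observed)
for setting, (expected, actual) in changes.items():
    print(f"{setting}: approved={expected!r} observed={actual!r}")
```

In this illustration the model identifier does not appear in the result, which mirrors the point that a model-level check alone would have reported nothing unusual.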

The evidence needed in RAIDT terms would include the model and version identifier, system prompt version, prompt template hash, retrieval index version, document timestamp, temperature and related generation settings, enabled tool list, reviewer identity and sign-off record, and the output shown to the clinician. The most affected RAIDT pillars would be Dependability, Auditability and Traceability, with clear Responsibility implications for who approved the configuration and who reviewed the run. Capturing runtime configuration in this way improves governance readiness because the organisation can reconstruct the event, contest the adequacy of controls and implement targeted remediation rather than issuing vague assurances.
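
As a rough illustration of how such an evidence list might be checked, the sketch below computes a prompt template hash and reports which evidence items are absent from a run record. It is a minimal sketch under stated assumptions: the item names are hypothetical labels derived from the list above, SHA-256 is an assumed choice of hash, and the record values are invented.

```python
import hashlib

# Hypothetical labels for the evidence items named in the paragraph above.
REQUIRED_EVIDENCE = (
    "model_version",
    "system_prompt_version",
    "prompt_template_hash",
    "retrieval_index_version",
    "document_timestamp",
    "generation_settings",
    "enabled_tools",
    "reviewer_identity",
    "sign_off_record",
    "output_shown",
)


def template_hash(template_text: str) -> str:
    """Hash the template text so that any later modification is detectable (SHA-256 assumed)."""
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()


def missing_evidence(record: dict) -> list:
    """List required items that are absent or None; an empty tool list still counts as recorded."""
    return [name for name in REQUIRED_EVIDENCE if record.get(name) is None]


# Invented run record in which the reviewer details were never captured.
record = {
    "model_version": "example-model-v1",
    "system_prompt_version": "sys-v2",
    "prompt_template_hash": template_hash("Write clear, patient-friendly discharge instructions."),
    "retrieval_index_version": "medicines-guidance-outdated",
    "document_timestamp": "2024-01-15T09:30:00Z",
    "generation_settings": {"temperature": 0.2},
    "enabled_tools": [],
    "output_shown": "draft text shown to the clinician",
}
print(missing_evidence(record))
```

Treating an explicitly empty tool list as present is a deliberate choice in this sketch: recording that no tools were enabled is itself evidence, whereas a missing entry leaves the question open.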

Detailed link to RAIDT

Runtime configuration links to RAIDT in four ways.

First, it operationalises RAIDT's core claim that the governable unit is not the abstract model but the specific configured run in context.
Second, it defines a major part of what must be captured as run-level evidence in order to explain why a run produced the output that it did.
Third, it populates the evidence pack and shapes the RAIDT score profile, because different configurations change the level of assurance required across the five pillars.
Fourth, it supports reviewability, contestability, audit readiness and organisational learning by letting reviewers reconstruct and compare runs rather than rely on general vendor or policy claims.

Runtime configuration -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

Link to the five RAIDT pillars

Responsibility

Runtime configuration supports Responsibility because it makes visible who selected, approved or altered the settings that shaped the run. Without this, accountability can be displaced onto an abstract "AI system" rather than assigned to identifiable design and oversight choices.