C0.02 - Run

flowchart LR
    A[Background: model-level policy and documentation are too general] --> B[RAIDT: run-level evidence framework]
    B --> C[[Run: one configured GenAI use in context]]
    C --> D[Run-level evidence]
    D --> E[Evidence pack]
    E --> F[Reviewer reconstruction]
    C --> G[Score profile]
    G --> H[Governance readiness]
    F --> H
    I[Healthcare drafting] --> C
    J[Finance reporting] --> C
    K[Public-service correspondence] --> C
    L[Education support] --> C
    M[Enterprise productivity] --> C

Star C0 - RAIDT Core, Definition, Values, Claims and Innovation

Star context: Defines the project identity of RAIDT by specifying the unit that governance actually attaches to: one concrete GenAI use in organisational work, examined through run-level evidence rather than abstract model claims.

Definition / background

In RAIDT, a run is one configured use of a generative AI system for a specific task, at a specific time, in a specific organisational context. It is not merely the final answer produced by the model. A run includes the task framing, the prompt or instruction, the model and tool configuration, any retrieved or supplied context, the generated output, the human or automated checks that followed, and the organisational setting in which the result was used.
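As a minimal sketch of this definition, a run could be captured as one structured record whose fields mirror the components listed above. The class name `RunRecord`, the helper `missing_components`, and every field name are hypothetical illustrations rather than anything RAIDT prescribes, and the example values are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of one run; field names are illustrative only."""
    task_framing: str            # what the run was for
    prompt: str                  # prompt or instruction given to the system
    model_config: dict           # model and tool configuration
    context: list                # retrieved or supplied context items
    output: str                  # generated output
    checks: list                 # human or automated checks that followed
    organisational_setting: str  # setting in which the result was used
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


COMPONENTS = (
    "task_framing", "prompt", "model_config", "context",
    "output", "checks", "organisational_setting",
)


def missing_components(run: RunRecord) -> list:
    """Return the names of components with no recorded content.

    Assumption of this sketch: an empty value means "not yet evidenced".
    """
    return [name for name in COMPONENTS if not getattr(run, name)]


# Example with invented values: the follow-up checks were never recorded.
example = RunRecord(
    task_framing="Draft a discharge summary",
    prompt="Summarise the clinician notes for discharge.",
    model_config={"model": "example-model", "version": "unknown"},
    context=["clinician note text"],
    output="Draft summary text",
    checks=[],
    organisational_setting="Hospital discharge process",
)
print(missing_components(example))  # ['checks']
```

The point of the sketch is that the output is only one field among seven: a record holding the answer alone would leave most of the run unevidenced.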

Conceptually, the term matters because governance problems tend to arise at the level of actual use events. A policy may be well written and a model family may be well documented, yet a harmful, misleading, or non-compliant outcome usually emerges through one concrete execution in context. RAIDT therefore treats the run as the unit at which evidence should be gathered and governance should be assessed.

This distinguishes a run from related terms. It is narrower than a workflow, which may contain many AI and non-AI steps. It is more situated than a model card, which describes a system in general terms. It is more governable than a vague reference to "an AI use case", because it identifies a single reviewable instance. In RAIDT, that precision is what allows run-level evidence to be assembled into an evidence pack and translated into a score profile across the five pillars.

The run belongs centrally inside RAIDT because RAIDT's core claim is methodological: responsible governance becomes more credible when it is anchored in evidence about what was actually done. By defining the run clearly, RAIDT establishes the boundary of what is being evidenced, reviewed, contested, and improved.

Why this concept matters

The concept of the run solves a common governance failure in GenAI adoption: organisations talk about systems at too high a level of abstraction. They may know which model they procured and which policy they approved, but still be unable to explain what happened in a contested case, why a particular output was produced, or whether the right checks occurred before action was taken.

Treating the run as the unit of governance avoids several confusions. It prevents people from equating the model with the event, the output with the whole process, or policy compliance with evidential sufficiency. It also makes it possible to compare similar uses over time, identify recurring failure points, and show whether governance quality is improving in actual practice rather than only in rhetoric.

Without a clear concept of the run, GenAI governance becomes difficult to operationalise. Evidence collection becomes inconsistent, audit trails remain partial, and accountability is blurred across human decisions, prompts, tools, and model behaviour. RAIDT uses the run to move governance from principles and assertions towards evidence, reviewability, contestability, and continuous improvement.

Key idea: The run matters because it is the smallest meaningful unit of GenAI use that can be governed with evidence rather than assumption.

What this item captures

The bounded unit of GenAI use that RAIDT evaluates.
The relationship between prompt, configuration, context, output, and post-output checks.
The organisational circumstances that give a model interaction governance significance.
The point at which evidence can be collected, reconstructed, contested, and scored.
The conceptual basis for connecting technical execution to organisational accountability.
The starting unit from which patterns, risks, and readiness can later be analysed across many cases.

Practical example / likely audience question

Audience question

Why is RAIDT organised around the run rather than around the model, the policy, or the business process?

Answer

The concern behind this question is usually that the run may look too narrow. A supervisor, reviewer, or manager may worry that one isolated event cannot say much about governance quality. The direct answer is that RAIDT uses the run not because broader levels are irrelevant, but because governance failures become visible only when broad arrangements are instantiated in practice.

For example, an organisation might say that it uses an approved large language model under a responsible AI policy. That statement remains too general to resolve a dispute about one problematic summary, recommendation, or drafted response. To review the case properly, a reviewer needs to know which prompt was used, which context was retrieved, which model settings applied, what output was produced, what checks followed, and who acted on the result. That bounded event is the run.

RAIDT handles this better than a generic AI governance approach because it gives the reviewer an operational unit that can be evidenced. A generic approach may confirm that a policy exists or that a model was approved. RAIDT asks whether this particular use was sufficiently evidenced and governable. That is a much stronger basis for accountability, audit, and learning.

Practical example in RAIDT terms

Consider a healthcare trust using a GenAI assistant to draft discharge summaries from clinician notes. One clinician runs the system for a patient leaving hospital after a medication change. The model produces a discharge summary that omits a dosage adjustment. The governance issue is not simply that "the AI system was used in healthcare". The issue is that one specific run generated one specific clinical draft in one specific context.

The evidence needed includes the prompt template, the patient-note context supplied to the system, the model and version used, any retrieval or tool calls, the resulting draft, the clinician's edits, the approval step, and the final version sent onward. The most affected RAIDT pillars are Responsibility, because a clinical actor must remain accountable for the use; Dependability, because omission risk directly affects reliability; and Traceability, because the run must be reconstructable after the event. Auditability also matters, because a reviewer may need to test whether the omission arose from prompt design, missing context, or inadequate checking.
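To illustrate how retained evidence lets a reviewer test where the omission arose, the following sketch traces one fact through three stages of the run. The function `locate_omission`, its return labels, and the sample strings are all hypothetical and invented for illustration; a naive substring match stands in for what would in practice be a clinician's or auditor's judgement.

```python
def locate_omission(fact: str, context: str, draft: str, final: str) -> str:
    """Report where `fact` first goes missing across the stages of a run.

    Uses a case-insensitive substring match, which is an assumption of this
    sketch and far too crude for real clinical review.
    """
    fact = fact.lower()
    in_context = fact in context.lower()
    in_draft = fact in draft.lower()
    in_final = fact in final.lower()

    if not in_context:
        return "missing context"
    if not in_draft:
        if in_final:
            return "omitted in draft, corrected by checks"
        return "omitted in draft, not caught by checks"
    if not in_final:
        return "removed during editing"
    return "present throughout"


# Invented sample text: the dose change is in the notes but not in the draft.
notes = "Medication change: dose increased to 50 mg twice daily."
draft = "Patient discharged with updated medication."
final = draft  # approved without edits

print(locate_omission("50 mg", notes, draft, final))
# omitted in draft, not caught by checks
```

None of these distinctions can be drawn unless the supplied context, the draft, and the final version were all retained as part of the run.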

By defining the run clearly, RAIDT improves governance readiness. The organisation can review the exact event, identify where the failure entered the process, adjust templates or checks, and demonstrate a defensible improvement path rather than relying on generic assurances about safe deployment.

Detailed link to RAIDT

The run links to RAIDT in four ways.

First, it gives RAIDT its core unit of governance: one actual GenAI use in organisational context rather than a broad statement about a model or policy.
Second, it provides the boundary within which run-level evidence is gathered and interpreted.
Third, it is the object from which an evidence pack can be assembled and a score profile can be justified.
Fourth, it supports reviewability, contestability, audit readiness, and organisational learning because specific uses can be reconstructed, compared, and improved.

Run → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

This chain is central to RAIDT's design. The run is the starting point; without a clearly bounded run, the later governance outputs become weaker, more subjective, and harder to defend.

Link to the five RAIDT pillars

Responsibility

The run is where responsibility becomes concrete. It identifies the task, the actor, the intended use, and the point at which human judgement should frame, review, or approve GenAI outputs.