S2.06 - Continuous improvement

flowchart LR
    A["Static governance problem<br/>post-hoc review only<br/>recurring evidence gaps"] --> B["RAIDT<br/>run-level evidence framework"]
    B --> C[["Continuous improvement<br/>evidence-guided revision of future runs"]]
    C --> D[Evidence pack strengthened]
    C --> E[Score profile becomes actionable]
    C --> F[Reviewer reconstruction improved]
    C --> G[Organisational learning]
    C --> H[Governance readiness]
    I[Healthcare triage support] --> C
    J[Public-service case summaries] --> C
    K[Finance drafting] --> C
    L[Education feedback] --> C
    M[Cybersecurity analysis] --> C

Star S2 - Governance Meaning and Problem Context

Star context: Clarifies governance as oversight, control, accountability, reviewability, contestability, and continuous improvement, so RAIDT treats governance as an evidence-guided organisational practice rather than a vague ethics label.

Academic picture

Definition / background

Continuous improvement in RAIDT means using run-level evidence to refine how generative AI is configured, governed, reviewed, and used over time. In practical terms, weak scores, recurring evidence gaps, repeated reviewer concerns, or unstable task performance should trigger a change in the socio-technical arrangement around a run: for example, better prompt design, stricter logging, stronger retrieval governance, revised human-review checkpoints, clearer role accountability, or targeted user training.

The concept has roots in quality improvement, audit learning, safety management, and organisational learning, but RAIDT gives it a more specific governance meaning. It is not simply the idea that systems should get better. It is the claim that improvement should be traceable to documented runs, evidenced weaknesses, and defensible governance interventions. This matters because many AI-governance discussions stop at principles, while RAIDT asks what an organisation can actually review, reconstruct, challenge, and improve after a specific use of a GenAI system.

Inside RAIDT, continuous improvement belongs naturally with run-level evidence, evidence packs, and score profiles. A run-level evidence pack makes weaknesses visible in context. The five-pillar score profile makes patterns comparable across runs. Continuous improvement is the organisational response that converts those findings into better future practice. In that sense, it closes the loop between evidence collection and governance action.

It also differs from adjacent ideas such as optimisation or model fine-tuning. Continuous improvement in RAIDT may involve model or prompt changes, but it can also involve process redesign, reviewer assignment, access restrictions, escalation criteria, provenance capture, or documentation standards. The focus is therefore not only system performance, but governance readiness.

Why this concept matters

Continuous improvement matters because governance fails if evidence is collected but never used. An organisation may produce logs, scores, and review comments, yet still repeat the same weaknesses if there is no structured pathway from findings to intervention. RAIDT avoids that failure by treating review outputs as inputs to future governance design.

This concept also prevents a common confusion in GenAI governance: the assumption that auditability is enough on its own. Auditability tells an organisation what happened and how to inspect it. Continuous improvement adds the next step, namely how documented weaknesses alter future runs. Without that step, governance remains descriptive rather than corrective.

For organisations using GenAI in meaningful work, the absence of continuous improvement creates practical risk. Known prompt failures persist, documentation gaps recur, weak review rules remain unaddressed, and pillar scores become a reporting ritual instead of a governance mechanism. Continuous improvement makes RAIDT operational because it converts run-level evidence into concrete changes in controls, practices, and organisational learning.

Key idea: Continuous improvement matters in RAIDT because governance should not end with documenting a run; it should use run-level evidence to improve the next run.

What this item enables

It enables RAIDT findings to lead to corrective action rather than remaining static records.
It enables recurring weaknesses across runs to be identified as patterns rather than isolated incidents.
It enables prompts, workflows, review rules, and documentation practices to be revised on an evidence-informed basis.
It enables pillar scores to function as governance signals that trigger intervention.
It enables organisational learning by linking review outcomes to future design choices.
It enables PhD-level argumentation that RAIDT is not only evaluative but also improvement-oriented.
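
Illustrative sketch

The second and fourth points above (recurring weaknesses as patterns, pillar scores as governance signals) can be illustrated with a minimal Python sketch. This is not part of RAIDT itself: the scoring scale, the threshold, the recurrence rule, the function name `recurring_weak_pillars`, and the score values are all assumptions made for illustration, and only the four pillars named in this section appear in the example data.

```python
from collections import Counter

# Assumptions for illustration only: the source does not define a scoring scale.
WEAK_THRESHOLD = 2   # assumed: a score of 2 or below counts as weak
MIN_RECURRENCE = 2   # assumed: a weakness seen in 2+ runs counts as a pattern


def recurring_weak_pillars(runs, threshold=WEAK_THRESHOLD, min_runs=MIN_RECURRENCE):
    """Return pillars scored at or below `threshold` in at least `min_runs` runs."""
    weak_counts = Counter()
    for scores in runs:
        for pillar, score in scores.items():
            if score <= threshold:
                weak_counts[pillar] += 1
    return sorted(p for p, n in weak_counts.items() if n >= min_runs)


# Made-up score profiles for three runs of the same task.
runs = [
    {"Responsibility": 3, "Auditability": 2, "Dependability": 3, "Traceability": 1},
    {"Responsibility": 3, "Auditability": 3, "Dependability": 2, "Traceability": 2},
    {"Responsibility": 4, "Auditability": 2, "Dependability": 3, "Traceability": 1},
]

print(recurring_weak_pillars(runs))  # ['Auditability', 'Traceability']
```

In this sketch a single weak Dependability score is treated as an isolated incident, while repeated weak Auditability and Traceability scores are flagged as patterns that could justify an intervention. Where an organisation sets the threshold is itself a governance decision.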

Practical example / likely audience question

Audience question

Is RAIDT only post-hoc audit?

Answer

No. The concern behind the question is that a run-level evidence framework might appear to operate only after the event, producing documentation about completed use without changing future behaviour. RAIDT does include post-hoc review, but it is not limited to it. Its design makes review actionable.

A weak score or recurring evidence gap can justify a specific intervention before the next comparable run takes place. For example, if a drafting assistant repeatedly produces unsupported claims because retrieval provenance is missing, RAIDT does not merely record that weakness. It can justify a change in retrieval rules, a new requirement for source capture, an added reviewer checkpoint, or a narrower task boundary for future use.

This is stronger than a generic AI-governance approach because generic frameworks often identify principles such as accountability or safety without specifying the operational unit through which learning occurs. RAIDT uses the run as that unit. That makes the pathway from evidence to improvement much more concrete, reviewable, and governable.

Practical example in RAIDT terms

Consider a public-service team using a GenAI assistant to draft case summaries for housing-support assessments. In one run, the system produces a fluent summary, but the reviewer notices that important claimant circumstances were compressed and the provenance of key statements is unclear. The run-level issue is not merely that the output could have been better; it is that the evidence pack shows weak source traceability, an incomplete review note, and an over-confident final summary.

The evidence needed includes the prompt, the input materials used, retrieval or source references, timestamps, reviewer comments, the final edited output, and the RAIDT pillar scores for that run. The most affected pillars are Auditability, Traceability, and Dependability, with Responsibility also implicated because reviewer roles and escalation thresholds may need tightening.
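
One way to picture the evidence listed above is as a simple record with a completeness check. The sketch below is a hypothetical illustration, not a RAIDT specification: the class name `EvidencePack`, its field names, and the `missing_items` method are assumptions, and the example values are invented.

```python
from dataclasses import dataclass, field, fields


@dataclass
class EvidencePack:
    """Hypothetical record of the evidence items listed above for one run."""
    prompt: str = ""
    input_materials: list = field(default_factory=list)
    source_references: list = field(default_factory=list)
    timestamps: list = field(default_factory=list)
    reviewer_comments: list = field(default_factory=list)
    final_output: str = ""
    pillar_scores: dict = field(default_factory=dict)

    def missing_items(self):
        """Names of evidence items that are empty for this run."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Invented example: a run where source references and timestamps were not captured.
pack = EvidencePack(
    prompt="Summarise the case file for assessment.",
    input_materials=["case_file.pdf"],
    reviewer_comments=["Provenance of key statements unclear."],
    final_output="Edited summary text",
    pillar_scores={"Traceability": 1},
)

print(pack.missing_items())  # ['source_references', 'timestamps']
```

A check of this kind makes an evidence gap visible at the level of a single run, which is the precondition for treating it as a trigger for improvement.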

Continuous improvement then appears as a governance response. The organisation updates the summarisation prompt, requires explicit citation of case-file passages, adds a mandatory reviewer checklist for vulnerable-case indicators, and records that future runs of the same task should trigger escalation when provenance is incomplete. Governance readiness improves because the next run is better structured for review, challenge, and safe organisational use.
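
The revised rules in this example could be expressed as a small decision check applied to each future run. The following is a minimal sketch under stated assumptions: the function name `review_decision`, the three outcome labels, and the representation of a summary as (statement, citation) pairs are hypothetical, and the "hold" outcome for an incomplete checklist is an added assumption rather than something the example specifies.

```python
def review_decision(statements, checklist_complete):
    """Decide how a drafted summary should proceed through review.

    statements: list of (text, citation) pairs; citation is None or "" when
    no case-file passage is cited. All labels here are hypothetical.
    """
    uncited = [text for text, citation in statements if not citation]
    if uncited:
        # Provenance incomplete: escalate, as in the example's revised rule.
        return "escalate", uncited
    if not checklist_complete:
        # Assumed behaviour: hold the run until the reviewer checklist is done.
        return "hold", []
    return "proceed", []


# Invented example statements from a drafted case summary.
statements = [
    ("Claimant reports rent arrears since March.", "case file, p. 4"),
    ("Claimant has no dependants.", None),
]

decision, problems = review_decision(statements, checklist_complete=True)
print(decision, problems)  # escalate ['Claimant has no dependants.']
```

The point of the sketch is that the improvement response becomes an explicit, reviewable rule rather than an informal expectation placed on reviewers.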

Detailed link to RAIDT

Continuous improvement links to RAIDT in four ways.

First, it supports RAIDT's core idea that responsible GenAI governance should be based on evidence rather than assertion. Improvement becomes credible only when it is tied to documented weaknesses in actual runs.

Second, it depends on the run as the unit of governance. RAIDT does not ask organisations to improve AI use in the abstract; it asks them to examine a specific configured use at a specific time in a specific context, and to learn from that case.

Third, it uses RAIDT outputs directly. The evidence pack captures what happened, while the score profile helps identify where governance weaknesses are concentrated. Together, they provide a practical basis for redesigning prompts, workflows, review rules, and accountability arrangements.

Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning. When a reviewer, supervisor, or external stakeholder asks what changed after a weak run, RAIDT can show not only the evidence but also the improvement response.

Continuous improvement -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

Link to the five RAIDT pillars

Continuous improvement affects all five RAIDT pillars, but it has especially strong implications for Auditability, Dependability, and Traceability because those pillars often reveal where organisational adjustment is needed.

Responsibility

Continuous improvement supports Responsibility by making it clear who must act when weaknesses are found and who owns the resulting change.