Q256 — Monitoring — definition, example, and why it matters in RAIDT

← RAIDT · Star S8 - Implementation and Operations · primary item: S8.05 · Monitoring

G. Implementation & Operations | Ordered by mind-map priority: inner circles first, then operational detail.

Appears in sources

workshop_dense_100#slide 79

Answer

Monitoring in RAIDT means tracking the score profile and its underlying evidence after each run so that governance quality can be observed over time. At minimum, it tracks scores, evidence completeness, drift, recurring errors, and configuration changes. The idea is specifically run-level. The technical foundation argues that lifecycle dashboards and observability tools can show drift or failure rates yet still fail to preserve the bounded evidence needed to review a single disputed use. RAIDT addresses that gap by linking monitoring to the run-level evidence pack, so each monitored result remains reconstructable and contestable.
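A minimal sketch of how a run-level evidence pack and its completeness measure might be represented. The RAIDT sources cited here do not prescribe a schema, so the class name `EvidencePack`, the field names, and the list `REQUIRED_EVIDENCE` are illustrative assumptions only.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical evidence items; RAIDT does not fix this list.
REQUIRED_EVIDENCE = (
    "prompt_version",
    "policy_clause_version",
    "retrieved_ids",
    "output",
    "reviewer_decision",
)


@dataclass
class EvidencePack:
    """Evidence retained for one run (illustrative field names)."""
    run_id: str
    prompt_version: Optional[str] = None
    policy_clause_version: Optional[str] = None
    retrieved_ids: List[str] = field(default_factory=list)
    output: Optional[str] = None
    reviewer_decision: Optional[str] = None


def missing_evidence(pack: EvidencePack) -> List[str]:
    """Return the required items that are absent or empty for this run."""
    return [name for name in REQUIRED_EVIDENCE if not getattr(pack, name)]


def evidence_completeness(pack: EvidencePack) -> float:
    """Fraction of required evidence items present, between 0.0 and 1.0."""
    return 1 - len(missing_evidence(pack)) / len(REQUIRED_EVIDENCE)
```

Under these assumptions, a run that kept only its prompt version and output would score 0.4 on completeness, and `missing_evidence` would name the three items a reviewer could not reconstruct.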

Monitoring matters because of the nature of generative AI. Behaviour depends on prompts, retrieved context, adapters, safety settings, and human intervention, so governance cannot be assessed once and assumed to hold. RAIDT therefore monitors the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) across runs, using the anchors 1 = missing, 3 = partial, 5 = audit-ready to make trends visible and reviewable. Because influence methods, treated as governance interventions, can improve one governance outcome while weakening another, monitoring is the mechanism that detects when apparently minor prompt or corpus changes have shifted the organisation's governance posture. In this sense, monitoring is not an optional operational extra; it is how RAIDT turns a single score into organisational learning, incident preparedness, change control, and continuous improvement.
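One way to make pillar trends visible is to compare the mean score of the most recent runs against the runs just before them. The sketch below assumes each run is scored per pillar on the 1 to 5 anchor scale; the function name `pillar_drift`, the window size, and the threshold are hypothetical choices, not values taken from RAIDT.

```python
from statistics import mean

PILLARS = (
    "Responsibility",
    "Auditability",
    "Interpretability",
    "Dependability",
    "Traceability",
)


def pillar_drift(score_history, window=3, threshold=1.0):
    """Flag pillars whose mean score fell by at least `threshold`.

    score_history: list of {pillar: score} dicts, oldest run first.
    Compares the last `window` runs with the `window` runs before them.
    Returns {pillar: change} for flagged pillars only (change is negative).
    """
    if len(score_history) < 2 * window:
        return {}  # not enough runs to compare two windows
    earlier = score_history[-2 * window:-window]
    recent = score_history[-window:]
    flagged = {}
    for pillar in PILLARS:
        change = mean(r[pillar] for r in recent) - mean(r[pillar] for r in earlier)
        if change <= -threshold:
            flagged[pillar] = change
    return flagged
```

A fixed-window comparison is deliberately simple. It also shows the trade-off described above: a change that lifts one pillar while lowering another appears as a flag on the lowered pillar only, so reviewers still need to inspect the full profile.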

Practical example

In a public-service eligibility workflow, a caseworker uses GenAI to draft an explanation of whether an applicant meets a policy rule. RAIDT monitoring would not stop at a system dashboard showing usage volumes. Each run would retain the exact prompt version, policy clause version, retrieved passage identifiers, output, and reviewer decision. Over several weeks, the organisation might notice a sustained drop in Traceability and recurring errors in which outputs cite superseded policy wording after a corpus refresh. Because the run-level evidence pack is intact, the team can isolate the configuration change, correct the retrieval index, and tighten review thresholds until later runs return to an audit-ready pattern. The example shows why monitoring matters in RAIDT: it catches governance deterioration early, supports defensible correction, and preserves the evidence needed if a citizen later contests the decision.
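The isolation step in this example can be sketched as a search for the first run where a pillar leaves the audit-ready anchor, paired with a diff of the recorded configuration. The record layout (`run_id`, `config`, `scores`) and the function name are assumptions made for illustration; a real investigation would also examine the retrieved passages themselves.

```python
def first_config_change_at_drop(runs, pillar="Traceability", audit_ready=5):
    """Locate the first run where `pillar` falls below the audit-ready anchor.

    runs: list of dicts with hypothetical keys 'run_id', 'config' (dict)
    and 'scores' (dict), ordered oldest first.
    Returns (run_id, {setting: (old, new)}) or None if no drop is found.
    """
    for prev, curr in zip(runs, runs[1:]):
        was_ready = prev["scores"][pillar] >= audit_ready
        now_below = curr["scores"][pillar] < audit_ready
        if was_ready and now_below:
            keys = sorted(set(prev["config"]) | set(curr["config"]))
            changed = {
                k: (prev["config"].get(k), curr["config"].get(k))
                for k in keys
                if prev["config"].get(k) != curr["config"].get(k)
            }
            return curr["run_id"], changed
    return None


# Usage with made-up run records: the corpus refresh coincides with the drop.
runs = [
    {"run_id": "r1", "config": {"prompt": "v2", "corpus": "2024-01"}, "scores": {"Traceability": 5}},
    {"run_id": "r2", "config": {"prompt": "v2", "corpus": "2024-01"}, "scores": {"Traceability": 5}},
    {"run_id": "r3", "config": {"prompt": "v2", "corpus": "2024-03"}, "scores": {"Traceability": 3}},
]
result = first_config_change_at_drop(runs)
```

This only works because every run keeps its configuration alongside its score, which is the point of tying monitoring to the evidence pack rather than to aggregate dashboards.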

Sources in RAIDT papers

08-RAIDT_Foundations_M_V50
18-RAIDT-Technical-Foundation_M_v04