Q265 — Empirical programme — definition, example, and why it matters in RAIDT

Appears in sources

workshop_dense_100#slide 88

Answer

In RAIDT, an empirical programme is a structured evaluation that executes real or realistic runs, captures a run-level evidence pack for each run, and scores governance readiness against the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability). It is empirical because RAIDT does not leave governance readiness at the level of principle. Instead, it asks whether readiness can be observed from recorded run evidence and compared across domains, scenarios, and influence configurations. Each pillar is scored against the anchors 1=missing / 3=partial / 5=audit-ready. The run is the unit of governance, and the per-pillar score profile is retained so that cross-pillar trade-offs remain visible.
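The score profile described above can be sketched as a small data structure. This is a minimal illustration, not a RAIDT artefact: the class name `RunScoreProfile`, its methods, and the run identifier are hypothetical, and allowing the intermediate scores 2 and 4 between the stated anchors is an assumption. The pillar names and the anchors 1, 3, and 5 come from the text.

```python
from dataclasses import dataclass

# Pillars and anchors as named in the text.
PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


@dataclass
class RunScoreProfile:
    """Hypothetical container: one score per pillar for a single run."""
    run_id: str
    scores: dict[str, int]

    def __post_init__(self) -> None:
        if set(self.scores) != set(PILLARS):
            raise ValueError("profile must score exactly the five pillars")
        for pillar, score in self.scores.items():
            # Assumption: 2 and 4 are allowed as values between the anchors.
            if not isinstance(score, int) or not 1 <= score <= 5:
                raise ValueError(f"{pillar}: score must be an integer from 1 to 5")

    def weakest_pillars(self) -> list[str]:
        """Return the lowest-scoring pillars instead of averaging them away."""
        lowest = min(self.scores.values())
        return [p for p in PILLARS if self.scores[p] == lowest]

    def describe(self) -> list[str]:
        """Render each pillar with its anchor label where one applies."""
        return [
            f"{p}: {self.scores[p]} ({ANCHORS.get(self.scores[p], 'between anchors')})"
            for p in PILLARS
        ]


profile = RunScoreProfile("run-001", {
    "Responsibility": 3, "Auditability": 1, "Interpretability": 3,
    "Dependability": 5, "Traceability": 1,
})
for line in profile.describe():
    print(line)
print("Weakest:", profile.weakest_pillars())
```

Keeping the five scores side by side, rather than reducing them to one number, is what lets a reader see that this illustrative run is dependable but weakly evidenced for audit and tracing.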

A concrete example appears in the healthcare and ageing-society materials. Clinical note summarisation is examined under baseline prompting, structured prompting, RAG, LoRA/PEFT, RLHF-type alignment, and stacked conditions. Those materials show that governance readiness depends not only on model choice but also on instrumentation: prompt and version registries, retrieval snapshots, adapter lineage, preference logs, and reviewer checks materially strengthen Auditability and Traceability and make the run more inspectable. This matters in RAIDT because organisations need more than fluent outputs. They need evidence that supports audit sampling, change control, procurement review, remediation, and the ability to challenge decisions. The empirical programme therefore turns governance from a policy aspiration into a measurable, comparable, and improvable operational capability.

Practical example

Consider an ageing-services workflow in which GenAI drafts a finance eligibility explanation for an older adult or summarises a healthcare record for triage. Without RAIDT instrumentation, staff may receive a plausible output but lack the evidence needed to explain which prompt version, policy text, retrieval source, or safety setting shaped it. That weakens both contestability and organisational learning.

Under the empirical programme's logic, the organisation captures the full run-level evidence pack, scores the run, and compares it with alternative configurations. A configuration that adds reason codes, retrieval snapshots, adapter lineage, and reviewer sign-off can move the work from a partially evidenced run toward an audit-ready one. The empirical programme thus shows, with run evidence rather than slogans, how governance can be strengthened in high-consequence service settings.
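The capture, score, and compare steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the RAIDT scoring method: the `RUBRIC` mapping of evidence artefacts to pillars, the artefact names, and the rule "none present = 1, some = 3, all = 5" are all hypothetical. Only two pillars are shown because the materials name Auditability and Traceability as the ones this instrumentation strengthens.

```python
# Hypothetical rubric: which evidence artefacts count toward which pillar.
# The mapping is an assumption made for illustration only.
RUBRIC = {
    "Auditability": {"prompt_version", "reason_codes", "reviewer_sign_off"},
    "Traceability": {"prompt_version", "retrieval_snapshot", "adapter_lineage"},
}


def score_pillar(required: set[str], pack: set[str]) -> int:
    """Assumed rule: 1 if no required evidence is present, 5 if all, else 3."""
    present = required & pack
    if not present:
        return 1
    if present == required:
        return 5
    return 3


def score_run(pack: set[str]) -> dict[str, int]:
    """Score one run's evidence pack against every pillar in the rubric."""
    return {pillar: score_pillar(req, pack) for pillar, req in RUBRIC.items()}


def compare(baseline: set[str], candidate: set[str]) -> dict[str, tuple[int, int]]:
    """Pair each pillar's baseline score with the candidate configuration's score."""
    before, after = score_run(baseline), score_run(candidate)
    return {pillar: (before[pillar], after[pillar]) for pillar in RUBRIC}


# Evidence captured by two configurations of the same workflow (illustrative).
baseline_pack = {"prompt_version"}
instrumented_pack = {
    "prompt_version", "reason_codes", "retrieval_snapshot",
    "adapter_lineage", "reviewer_sign_off",
}

for pillar, (before, after) in compare(baseline_pack, instrumented_pack).items():
    print(f"{pillar}: {before} -> {after}")
```

In this toy comparison the baseline run is only partially evidenced on both pillars, while the instrumented configuration reaches the audit-ready anchor. A real rubric would need criteria agreed by the organisation rather than simple presence checks.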

Sources in RAIDT papers

09-RAIDT_Empirical_M_V50
20-RAIDT_AgeingSoc_M_V50
21-RAIDT_Sector_Playbook_Healthcare_V2