Q095 — Why does the ageing calibration matter for RAIDT?

← RAIDT · Star S10 - Empirical Programme, Domains and Sector Playbooks · primary item: S10.15 · Ageing calibration

Ageing calibration shows RAIDT can adapt to sector conditions without losing its stable run-level core.

Appears in sources

qa_deck_100#slide 97 · Empirical programme, calibration, procurement, and assurance

Answer

Ageing calibration matters because RAIDT scores governance readiness at the level where GenAI is actually used, the individual run, and ageing-related services change what counts as acceptable governance at that level. The ageing-society paper argues that these services are not merely another deployment domain: they combine higher vulnerability, uneven digital inclusion, and frequent decisions where contestation is relevant. In that setting RAIDT keeps the run as the unit of governance, but the evidential burden rises. A run-level evidence pack must support technical reconstruction and must also show uncertainty, escalation, accessible explanation, and routes for challenging outcomes. This changes how the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) are read in practice.

That matters empirically as well as conceptually. The worked healthcare example in the ageing-society paper shows that prompt-only systems can improve wording or clarity yet still leave Auditability and Traceability weak, because the underlying evidence is incomplete. In the reported healthcare scores, baseline prompting produced a composite mean of 3.2, with Auditability and Traceability at 2.0, whereas the instrumented LoRA, RAG, alignment, and stacked conditions raised governance readiness to between 4.8 and 5.0 by logging retrieval snapshots, adapter lineage, hashes, and review records. This is why RAIDT treats influence methods as governance interventions, and why ageing calibration matters: it prevents organisations from mistaking fluent outputs for defensible governance, especially where older adults may need reasons, handover, and post-hoc review.
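The arithmetic behind the baseline figure can be sketched as follows. This is a minimal illustration that assumes the composite is the unweighted mean of the five pillar scores, which the sources cited here do not state. The Responsibility, Interpretability, and Dependability values are hypothetical placeholders, chosen only so that the example reproduces the reported composite of 3.2 with Auditability and Traceability at 2.0. The `floor` threshold of 3.0 is likewise an assumption.

```python
from statistics import mean

PILLARS = (
    "Responsibility",
    "Auditability",
    "Interpretability",
    "Dependability",
    "Traceability",
)


def composite(scores: dict) -> float:
    """Unweighted mean over the five pillars (an assumption, not a published formula)."""
    missing = [p for p in PILLARS if p not in scores]
    if missing:
        raise ValueError(f"missing pillar scores: {missing}")
    return round(mean(scores[p] for p in PILLARS), 2)


def weak_pillars(scores: dict, floor: float = 3.0) -> list:
    """Pillars scoring below an assumed floor, in canonical pillar order."""
    return [p for p in PILLARS if scores[p] < floor]


# Auditability and Traceability at 2.0 come from the reported baseline.
# The other three values are hypothetical placeholders, not reported data.
baseline = {
    "Responsibility": 4.0,
    "Auditability": 2.0,
    "Interpretability": 4.0,
    "Dependability": 4.0,
    "Traceability": 2.0,
}

print(composite(baseline))
print(weak_pillars(baseline))
```

Under these assumptions the composite alone looks moderate, while the per-pillar view exposes the two evidence-dependent pillars as the weak points, which is the distinction the healthcare example draws.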

Practical example

A hospital deploys GenAI to summarise notes for a 72-year-old patient presenting with chest pain. Without ageing calibration, the system might output a neat summary that appears clinically plausible but omits uncertainty language, review status, or escalation triggers. Staff see a useful draft, but nothing is recorded that would later show whether the model was prompted to surface red flags, which version ran, or why a warning was absent.

With ageing calibration, the evidence pack for the same run records the prompt version, model configuration, any retrieval used, the output hash, and reviewer notes, plus explicit wording on limits and escalation. The score profile can then be examined against the five pillars. If Responsibility or Dependability is weak, the workflow requires human review before the summary informs triage. The calibration therefore converts a convenient drafting tool into a governed clinical support process.
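One way to picture the evidence pack and the review gate described above is the sketch below. It is illustrative only: the `EvidencePack` class, its field names, the `requires_human_review` function, the choice of SHA-256 for the output hash, and the 3.0 threshold are all assumptions made for this example, not a schema or rule taken from the RAIDT papers.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidencePack:
    """Hypothetical run-level record; field names are illustrative."""
    run_id: str
    prompt_version: str
    model_config: dict
    retrieval_snapshot_ids: tuple
    output_text: str
    limits_and_escalation: str
    reviewer_notes: tuple = ()

    @property
    def output_hash(self) -> str:
        # SHA-256 is an assumed choice; the source only says "output hash".
        return hashlib.sha256(self.output_text.encode("utf-8")).hexdigest()


def requires_human_review(
    scores: dict,
    gated: tuple = ("Responsibility", "Dependability"),
    floor: float = 3.0,
) -> bool:
    """True when any gated pillar falls below the assumed floor."""
    return any(scores[p] < floor for p in gated)


pack = EvidencePack(
    run_id="run-0001",                      # hypothetical identifier
    prompt_version="triage-summary-v3",     # hypothetical version label
    model_config={"temperature": 0.0},
    retrieval_snapshot_ids=(),
    output_text="Draft summary: chest pain, cause not yet established.",
    limits_and_escalation="Draft only; escalate to a clinician before triage.",
)

# Hypothetical scores for this run; a weak Responsibility score triggers review.
scores = {"Responsibility": 2.0, "Dependability": 4.0}
print(pack.output_hash[:12], requires_human_review(scores))
```

Keeping the hash derived from the stored output, rather than stored separately, means a later reviewer can recompute it and confirm that the summary on file is the one the run actually produced.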

Sources in RAIDT papers

20-RAIDT_AgeingSoc_M_V50

09-RAIDT_Empirical_M_V50