S10.07 - Healthcare

flowchart LR
    A["Healthcare pressures:<br/>documentation burden<br/>safety sensitivity<br/>fragmented records<br/>regulatory scrutiny"] --> B["RAIDT<br/>run-level evidence framework"]
    B --> C[["Healthcare<br/>domain playbook for governed GenAI use"]]
    H[Clinical note summarisation]
    I[Discharge drafting]
    J[Safety-case drafting]
    K[Red-flag detection]
    H --> C
    I --> C
    J --> C
    K --> C
    C --> D[Run-level evidence pack]
    C --> E[Five-pillar score profile]
    C --> F["Reviewer reconstruction<br/>and contestability"]
    C --> G["Governance readiness<br/>and organisational learning"]
    B --> L[Evidence over assertion]
    C --> L

Star S10 - Empirical Programme, Domains and Sector Playbooks

Star context: Shows how RAIDT is operationalised in a high-stakes domain where generative AI outputs can shape clinical documentation, escalation, review, and patient-facing decisions, making run-level evidence especially important.

Academic picture

Definition / background

Healthcare, in the RAIDT context, refers to the use of the framework within clinical, administrative, and safety-relevant healthcare work where generative AI outputs may influence records, communication, triage, review, or organisational judgement. It is therefore a domain playbook rather than a single use case. The concept matters because healthcare combines strong potential value from generative AI with unusually high expectations around accountability, human oversight, patient safety, data handling, and reconstructability.

Conceptually, this item sits at the intersection of sector-specific governance and run-level evidence. Many governance discussions treat healthcare as a special sector because of regulation, ethics, and risk. RAIDT accepts that premise, but makes it operational by asking what evidence exists for each actual run. A run in healthcare is not just "using an LLM in a hospital"; it is one configured use for one task, at one time, for one purpose, under a particular level of review and with specific source material.

This distinguishes the item from broader ideas such as "AI in healthcare", "clinical AI governance", or "trustworthy medical AI". Those labels are often policy-level or model-level. In RAIDT, healthcare is defined through the practical question of whether a reviewer can inspect a specific run, understand what happened, assess whether controls were followed, and judge whether the resulting output was suitable for use in context.

The item belongs inside RAIDT because healthcare is a strong test case for the framework's central claim: responsible governance improves when the run is treated as the unit of evidence. In this domain, the evidence pack can capture prompt and template version, authorised source inputs, model and system configuration, reviewer identity or role, escalation thresholds, output artefacts, and the grounds on which an output was accepted, amended, or rejected. The five-pillar score profile then makes visible where a healthcare workflow is governance-ready and where it remains weak.
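As a rough illustration of the evidence listed above, the following Python sketch models one evidence pack as an immutable record. The class name, field names, and example values are hypothetical and chosen only to mirror the paragraph; they are not a published RAIDT schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class ReviewOutcome(Enum):
    """How the reviewer disposed of the output (assumed three-way outcome)."""
    ACCEPTED = "accepted"
    AMENDED = "amended"
    REJECTED = "rejected"


@dataclass(frozen=True)
class EvidencePack:
    """One record per run. All field names are illustrative assumptions."""
    run_id: str
    task: str
    prompt_template_version: str
    source_input_ids: tuple[str, ...]   # authorised source inputs only
    model_configuration: dict           # model and system configuration
    reviewer_role: str                  # identity or role of the reviewer
    escalation_threshold: str
    output_artefact_ref: str            # pointer to the stored output
    outcome: ReviewOutcome
    outcome_grounds: str                # why it was accepted, amended, or rejected
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Hypothetical example values for a single run.
pack = EvidencePack(
    run_id="run-0001",
    task="ward-round summary draft",
    prompt_template_version="summary-template-v3",
    source_input_ids=("note-17", "note-18"),
    model_configuration={"model": "example-model", "temperature": 0.0},
    reviewer_role="ward clinician",
    escalation_threshold="any medication change",
    output_artefact_ref="drafts/run-0001.txt",
    outcome=ReviewOutcome.AMENDED,
    outcome_grounds="Follow-up instruction added by reviewer.",
)
print(pack.run_id, pack.outcome.value)
```

Freezing the record is a design choice in this sketch: once a run has been reviewed, the evidence about it should be appended to or superseded, not edited in place.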

Why this concept matters

Healthcare matters in RAIDT because it exposes the limits of abstract AI principles faster than many other sectors. A statement that a system is safe, fair, or accountable is not enough when a generated summary may omit a contraindication, soften a warning, or create false confidence in a downstream decision-maker. The domain therefore forces governance to move from general declarations to evidence about actual use.

This concept avoids an important confusion: the belief that sector sensitivity can be handled purely by stronger policy language. In practice, healthcare organisations need mechanisms that show what the model was asked to do, what source basis it used, what it produced, who reviewed it, and how exceptions were handled. Without that level of detail, organisations may claim oversight while lacking reviewability.

If the concept is missing, the likely risk is performative governance. A hospital or supplier may report that human review exists, but not be able to show how it was applied in a particular run, whether the reviewer saw source evidence, or whether the output crossed a safety threshold requiring escalation. RAIDT treats healthcare as a domain in which those governance gaps become visible and contestable.

For organisations using generative AI, this matters because healthcare work often includes mixed-risk tasks. Some uses are low consequence and administrative; others are clinically adjacent; some may become safety critical once integrated into workflow. RAIDT helps separate those cases and calibrate governance controls accordingly, supporting a move from principles to operational governance readiness.
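One way to picture this calibration is a tiered mapping in which each higher-consequence tier inherits the controls of the tiers below it. The sketch below assumes three tiers taken from the wording above; the tier names and control labels are hypothetical placeholders, not RAIDT-defined terms.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    """Assumed tiers, ordered by consequence of error."""
    ADMINISTRATIVE = 1
    CLINICALLY_ADJACENT = 2
    SAFETY_CRITICAL = 3


# Controls added at each tier (hypothetical labels for illustration).
ADDED_CONTROLS = {
    RiskTier.ADMINISTRATIVE: {"evidence_pack"},
    RiskTier.CLINICALLY_ADJACENT: {"named_reviewer_role", "reviewer_sees_sources"},
    RiskTier.SAFETY_CRITICAL: {"review_before_use", "escalation_threshold"},
}


def required_controls(tier: RiskTier) -> set[str]:
    """Return the cumulative controls for a tier and every tier below it."""
    controls: set[str] = set()
    for level in RiskTier:
        if level <= tier:
            controls |= ADDED_CONTROLS[level]
    return controls


print(sorted(required_controls(RiskTier.CLINICALLY_ADJACENT)))
```

If a task is later integrated into a workflow where it becomes safety critical, re-running `required_controls` with the new tier shows which additional controls the workflow now lacks.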

Key idea: Healthcare matters in RAIDT because high-stakes use requires governance to be evidenced at the level of each run, not asserted at the level of the system or policy.

What this item enables

A sector-specific playbook for applying RAIDT to clinical, administrative, and safety-related workflows.
Calibration of controls according to the consequences of error, omission, ambiguity, or delayed review.
Clear definition of reviewer roles, hand-off points, escalation conditions, and acceptable-use boundaries.
Run-level reconstruction of how a healthcare output was generated, checked, amended, and used.
Comparison of governance readiness across different healthcare scenarios, teams, models, and deployment configurations.
Organisational learning from repeated runs, near misses, exceptions, and patterns of reviewer intervention.
Translation of abstract ideas such as responsibility and auditability into concrete evidence requirements for healthcare settings.

Practical example / likely audience question

Audience question

How does RAIDT work in healthcare without pretending that an evidence pack can guarantee clinical safety?

Answer

The concern behind the question is sensible: healthcare is full of context, tacit knowledge, changing patient conditions, and professional judgement. A framework such as RAIDT should not be presented as if documentation alone can make a generative AI use safe. The direct answer is that RAIDT does not replace clinical governance; it makes the use of generative AI reviewable within clinical governance.

A practical example is clinical note summarisation. Suppose a model drafts a concise ward-round summary from a defined subset of authorised source notes. RAIDT would record the task definition, prompt and template version, model configuration, time of run, source documents used, output generated, output hash, reviewer role, and whether the clinician accepted, corrected, or rejected the draft. If a critical medication change was omitted, the run can be reviewed to determine whether the source text contained the information, whether the prompt constrained the task properly, whether the output was misleadingly fluent, and whether the human review step was adequate.
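Two of the checks described above can be sketched in a few lines of Python: computing an output hash so that the reviewed draft can later be shown to be unaltered, and a first-pass triage of whether an omitted item was present in the source notes at all. The function names are hypothetical, and the substring match is a deliberately naive stand-in for whatever clinical text matching a real workflow would need.

```python
import hashlib


def output_hash(text: str) -> str:
    """SHA-256 digest of the generated output, stored with the run record."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def locate_omission(term: str, sources: list[str], output: str) -> str:
    """First-pass triage of a suspected omission (naive substring match)."""
    needle = term.lower()
    if needle in output.lower():
        return "present in output"
    if any(needle in source.lower() for source in sources):
        return "in source, omitted from output"
    return "absent from source"


# Hypothetical, non-clinical example text.
sources = ["Day 3: Drug A stopped. Drug B started.", "Patient mobile."]
draft = "Patient mobile. Drug B started."

print(output_hash(draft)[:12])
print(locate_omission("Drug A stopped", sources, draft))
```

The triage result only narrows the question. Deciding whether the prompt was too restrictive or the human review step was inadequate still requires a reviewer to read the run record.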

This is stronger than a generic AI governance approach because generic approaches often stop at policy, training, or high-level human oversight statements. RAIDT asks for inspectable run evidence. In healthcare, that difference matters because the key governance question is rarely "Did the organisation have an AI policy?" and more often "Can we reconstruct what happened in this case, assess the controls used, and improve the workflow before the same problem recurs?"

Practical example in RAIDT terms

Consider a hospital team using a generative AI assistant to draft discharge summaries from structured discharge notes, medication changes, and clinician-authored progress entries. The aim is to save time and improve consistency, but the run-level risk is that a plausible summary may omit follow-up instructions or misstate a medication change if the source basis is incomplete or the prompt compresses the material too aggressively.

In RAIDT terms, the evidence needed includes the approved task definition, the exact prompt and template version, the authorised input bundle, the model and wrapper version, time stamps, the generated summary, reviewer edits, rejection reasons where relevant, and the final sign-off status. Responsibility is affected because the accountable reviewer and escalation threshold must be clear. Auditability and Traceability are affected because the organisation needs to reconstruct which inputs and configurations produced the summary. Interpretability is affected because clinicians must understand the role and limits of the output. Dependability is affected because repeated runs should show whether the workflow performs consistently and where failures cluster.
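The pillar-by-pillar reading above can be turned into a simple gap check: for each pillar, list the evidence items that a run record is missing. The mapping of evidence items to pillars below is an assumption made for illustration, as are the key names; the text does not prescribe a fixed allocation or a scoring formula, so the sketch reports gaps and does not compute a score.

```python
# Hypothetical allocation of evidence items to the five pillars.
PILLAR_EVIDENCE = {
    "Responsibility": ["accountable_reviewer", "escalation_threshold", "sign_off_status"],
    "Auditability": ["prompt_template_version", "model_wrapper_version", "timestamps"],
    "Traceability": ["input_bundle", "generated_summary", "reviewer_edits"],
    "Interpretability": ["stated_output_limits"],
    "Dependability": ["repeat_run_results"],
}


def pillar_gaps(evidence: dict) -> dict[str, list[str]]:
    """Return, per pillar, the evidence keys that are absent or None."""
    gaps: dict[str, list[str]] = {}
    for pillar, required in PILLAR_EVIDENCE.items():
        missing = [key for key in required if evidence.get(key) is None]
        if missing:
            gaps[pillar] = missing
    return gaps


# Hypothetical record for one discharge-summary run.
run_evidence = {
    "accountable_reviewer": "discharging clinician",
    "escalation_threshold": "any medication change",
    "sign_off_status": "signed",
    "prompt_template_version": "discharge-v2",
    "model_wrapper_version": "wrapper-1.4",
    "timestamps": {"generated": "2024-01-01T10:00:00Z"},
    "input_bundle": ["discharge-note", "med-changes"],
    "generated_summary": "summary text",
    "reviewer_edits": [],  # empty list: reviewed, no edits needed
}

print(pillar_gaps(run_evidence))
```

An empty `reviewer_edits` list counts as evidence here, because "reviewed with no changes" is itself a recorded outcome; only absent or `None` values are treated as gaps.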

The item improves governance readiness by making the domain-specific controls explicit. Rather than treating discharge-summary drafting as a generic productivity task, the healthcare framing shows that the workflow must be bounded, reviewable, and evidenced in a way that is proportionate to downstream patient risk.

Detailed link to RAIDT

Healthcare links to RAIDT in four ways.

First, it anchors RAIDT's core idea that governance should be attached to real organisational uses of generative AI rather than to abstract policy claims.
Second, it makes the run central, because healthcare governance depends on understanding one specific use for one defined task in one particular context.
Third, it gives practical shape to the evidence pack and score profile by specifying the forms of evidence, review, and exception handling needed in a high-stakes domain.
Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning by allowing specific runs to be inspected, challenged, and improved over time.

Healthcare → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

In this chain, healthcare is the domain context that determines why the run needs careful controls; run-level evidence records what actually happened; the evidence pack organises that record for review; the score profile translates the record into a structured governance judgement; and governance readiness reflects whether the workflow is fit for responsible organisational use.

Link to the five RAIDT pillars

Responsibility

Healthcare strongly affects Responsibility because roles, authority, and acceptable delegation must be explicit. It must be clear who may initiate a run, who may review the output, what level of expertise is required, and when the output cannot be used without escalation.
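A minimal sketch of how such role rules might be made explicit and checkable follows. The policy structure, role names, and the boolean escalation trigger are all hypothetical simplifications; a real deployment would derive them from local clinical governance arrangements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponsibilityPolicy:
    """Who may do what for one task (illustrative fields only)."""
    may_initiate: frozenset[str]
    may_review: frozenset[str]
    escalate_to: str


def review_route(policy: ResponsibilityPolicy, reviewer_role: str,
                 escalation_triggered: bool) -> str:
    """Decide what may happen to an output given the reviewer and any trigger."""
    if reviewer_role not in policy.may_review:
        return "blocked: reviewer role not authorised"
    if escalation_triggered:
        return f"escalate to {policy.escalate_to}"
    return "reviewer may accept, amend, or reject"


# Hypothetical policy for discharge-summary drafting.
policy = ResponsibilityPolicy(
    may_initiate=frozenset({"ward clerk", "junior doctor"}),
    may_review=frozenset({"junior doctor", "consultant"}),
    escalate_to="consultant",
)

print(review_route(policy, "junior doctor", escalation_triggered=True))
```

Keeping initiation and review as separate role sets reflects the point above that the person who may start a run is not necessarily the person with the expertise to approve its output.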