Q237 — Dependability — definition, example, and why it matters in RAIDT

Appears in sources

workshop_dense_100#slide 60

Answer

In RAIDT, Dependability means that a configured generative-AI use behaves stably and safely enough to justify reliance under expected conditions. The Foundations paper defines it as stable, reliable performance across repeated runs, minor prompt variation, and known failure modes such as hallucination, inconsistency, and unsafe completions. This is why RAIDT insists on the run as the unit of governance: Dependability belongs to one configured use in context, not to the model in the abstract. A single successful answer is not sufficient evidence, because runtime behaviour is shaped by prompts, retrieval, adapters, alignment layers, and human oversight.

Dependability matters because RAIDT is an evidence-centred framework rather than a narrative one. The run-level evidence pack lets reviewers examine whether stability claims are backed by repeated-run outputs, dispersion measures, prompt-perturbation records, monitoring signals, and versioned configuration data. In practice, this evidence feeds the score profile across the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability), making it possible to see when a run is well documented but still unreliable. The papers also treat influence methods as governance interventions: structured prompts, retrieval augmentation, LoRA-style adaptation, and alignment controls can improve or weaken Dependability, so their versions and settings must be recorded. RAIDT therefore makes Dependability important not only for technical reliability, but for organisational learning, escalation, and defensible use over time.
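As an illustration of what "repeated-run outputs" and "dispersion measures" could look like in an evidence pack, the following is a minimal sketch. The RAIDT papers cited here do not prescribe a particular metric; the two measures below (modal agreement and mean pairwise token-set distance) and every function name are assumptions chosen for simplicity.

```python
from collections import Counter
from itertools import combinations


def modal_agreement(outputs):
    """Share of runs that produced the most common output (1.0 = all identical)."""
    if not outputs:
        raise ValueError("need at least one output")
    counts = Counter(outputs)
    return counts.most_common(1)[0][1] / len(outputs)


def jaccard(a, b):
    """Jaccard similarity of the lowercase whitespace-token sets of two texts."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty outputs are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def mean_pairwise_dispersion(outputs):
    """Average (1 - Jaccard) over all pairs of runs; 0.0 means no variation."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 0.0  # a single run carries no dispersion information
    return sum(1.0 - jaccard(a, b) for a, b in pairs) / len(pairs)
```

Both functions take the list of texts returned by repeated runs of one configured use. Recording the resulting numbers next to the prompt, retrieval, and model versions is what would let a reviewer compare stability before and after a configuration change; a real deployment would likely need a task-specific similarity measure rather than token overlap.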

Practical example

In healthcare discharge summarisation, a hospital may use a structured prompt plus retrieval from internal clinical guidance. If repeated runs with the same patient notes consistently produce the same medication cautions, the same uncertainty statement, and the same escalation flags, the workflow is more dependable. If a minor prompt wording change removes a warning about pending test results, Dependability is weak even if the text remains fluent and the run is fully logged.

This matters because clinicians rely on the output in a time-pressured environment. RAIDT would not ask only whether the answer sounds plausible; it would ask whether the organisation can show stable behaviour, detect instability after model or retrieval changes, and intervene before unsafe variation enters routine practice.
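The prompt-perturbation scenario in this example can be sketched as a simple check: for each prompt variant, did any run drop a required safety element? This is a hypothetical illustration, not a method from the RAIDT papers. The function names, the substring matching, and the idea of a fixed list of required flags are all assumptions; a production check would need clinically validated criteria.

```python
def missing_flags(output, required_flags):
    """Return the required flags that do not appear in the output text.

    Matching is a case-insensitive substring test (an assumption made
    for brevity; it will miss paraphrased warnings).
    """
    text = output.lower()
    return [flag for flag in required_flags if flag.lower() not in text]


def perturbation_report(outputs_by_variant, required_flags):
    """Map each prompt variant to the sorted flags dropped in at least one run.

    outputs_by_variant: dict of variant name -> list of output texts
    from repeated runs of that variant.
    """
    report = {}
    for variant, outputs in outputs_by_variant.items():
        dropped = set()
        for output in outputs:
            dropped.update(missing_flags(output, required_flags))
        report[variant] = sorted(dropped)
    return report


if __name__ == "__main__":
    # Entirely made-up outputs, used only to show the shape of the check.
    runs = {
        "baseline": ["Medication caution noted. Pending test results: awaiting."],
        "reworded": ["Medication caution noted."],
    }
    print(perturbation_report(runs, ["medication caution", "pending test results"]))
```

A variant with a non-empty list is a candidate Dependability weakness even when every run is fluent and fully logged, which is the distinction the example draws between Auditability and Dependability.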

Sources in RAIDT papers

08-RAIDT_Foundations_M_V50
00-RAIDT_Scoring_v1
13-RAIDT-Evidence-Review_M_v10