Q187 — What RAIDT produces and what gets scored

The evidence pack is the proof object; the five-pillar profile is the evaluative reading of that object.

Appears in sources

workshop_dense_100#slide 10

Answer

RAIDT produces two linked outputs for a material GenAI use. The first is a run-level evidence pack: a structured, auditable record of what happened in one configured use in context. In the papers, this includes identifiers and timestamps, user or system role, task and domain labels, the full prompt and template versions, model provider and version, decoding settings, enabled tools, retrieved passages where applicable, the output itself, hashes or provenance markers, and the human or automated checks applied afterwards. It is not a free-text assertion; it is a governed record that supports reconstruction, contestability, sampling, and post-incident review.
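
A minimal sketch of how such a record might be represented, assuming Python dataclasses and a SHA-256 hash as the provenance marker. The class name EvidencePack, every field name, and the JSON serialisation are illustrative assumptions; the RAIDT papers list the kinds of content, not a schema.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import hashlib
import json


@dataclass
class EvidencePack:
    """Hypothetical run-level evidence pack; all field names are assumptions."""
    run_id: str
    actor_role: str            # user or system role
    task_label: str
    domain_label: str
    prompt: str                # full prompt as sent
    template_version: str
    model_provider: str
    model_version: str
    decoding_settings: dict    # e.g. temperature, max tokens
    enabled_tools: list
    retrieved_passages: list   # empty when no retrieval was used
    output: str
    checks: list               # human or automated checks applied afterwards
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def output_hash(self) -> str:
        # Provenance marker: SHA-256 of the output text (an assumed choice).
        return hashlib.sha256(self.output.encode("utf-8")).hexdigest()

    def to_record(self) -> str:
        # Serialise with sorted keys so the stored record is stable across runs.
        record = asdict(self)
        record["output_sha256"] = self.output_hash()
        return json.dumps(record, sort_keys=True)
```

Storing the hash alongside the output lets a later reviewer confirm that the text under review is the text the run actually produced.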

The second output is the RAIDT score profile for that same run. What gets scored is not the model in the abstract and not the organisation's policy rhetoric; the scored object is the run and the sufficiency of its evidence. Reviewers assess the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) using anchors 1=missing / 3=partial / 5=audit-ready. The profile can be summarised with a composite score, but the dimension pattern is retained so weaknesses remain visible. This matters because different influence methods, when used as governance interventions, can strengthen one pillar while leaving another weak. RAIDT therefore produces both the evidential object and the governance-readiness assessment of that object, enabling comparison across repeated runs, configurations, and deployment settings.
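
The relationship between the per-pillar pattern and the composite can be sketched as follows. This is an illustration under stated assumptions: the composite is taken as an unweighted mean, and scores are integers from 1 to 5 with labelled anchors at 1, 3, and 5. The source does not specify the aggregation rule or whether intermediate values are permitted, and the names ScoreProfile, composite, and weakest are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean

PILLARS = (
    "Responsibility",
    "Auditability",
    "Interpretability",
    "Dependability",
    "Traceability",
)
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


@dataclass
class ScoreProfile:
    """Hypothetical score profile for one run; structure is an assumption."""
    run_id: str
    scores: dict  # pillar name -> integer score

    def __post_init__(self):
        # Every pillar must be scored exactly once, on the assumed 1..5 scale.
        if set(self.scores) != set(PILLARS):
            raise ValueError("scores must cover exactly the five pillars")
        for pillar, value in self.scores.items():
            if value not in range(1, 6):
                raise ValueError(f"{pillar}: score {value} is outside 1..5")

    def composite(self) -> float:
        # Assumed aggregation: unweighted mean across the five pillars.
        return mean(self.scores[p] for p in PILLARS)

    def weakest(self) -> list:
        # Keep the dimension pattern visible: name the lowest-scoring pillars.
        low = min(self.scores.values())
        return [p for p in PILLARS if self.scores[p] == low]
```

Reporting weakest() next to composite() reflects the point above: a run can have a respectable composite while one pillar remains at the "missing" anchor.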

Practical example

In an HR shortlist exercise, a team asks GenAI to draft candidate justifications and interview questions. Under RAIDT, the run-level evidence pack would store the prompt template version, the job criteria version, the model deployment identifier, and any PEFT or LoRA adapter version used to stabilise behaviour. If a candidate later contests the shortlist, the organisation has a reconstructable record rather than only a polished paragraph.

What gets scored is that specific run. Traceability depends on whether the versioned artefacts were preserved; Interpretability depends on whether the justification clearly links to documented criteria; Auditability depends on whether the run can be reconstructed. Using anchors 1=missing / 3=partial / 5=audit-ready, the organisation can produce a score profile that shows whether the governance record is genuinely reviewable, not merely whether the prose sounds plausible.
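
For the HR example, a reviewer aid for the Traceability pillar might look like the sketch below. It is not RAIDT's scoring rule: the mapping from preserved artefacts to an anchor, the key names, and the function suggest_traceability_anchor are all assumptions made for illustration, and the final score remains a reviewer judgement.

```python
# Versioned artefacts named in the HR example; key names are hypothetical.
REQUIRED_ARTEFACTS = (
    "prompt_template_version",
    "job_criteria_version",
    "model_deployment_id",
)


def suggest_traceability_anchor(pack: dict) -> int:
    """Suggest an anchor (1, 3, or 5) from which artefacts were preserved."""
    required = list(REQUIRED_ARTEFACTS)
    # An adapter version is only expected when a PEFT or LoRA adapter was used.
    if pack.get("adapter_used"):
        required.append("adapter_version")

    present = [key for key in required if pack.get(key)]
    if not present:
        return 1  # missing: nothing to reconstruct from
    if len(present) < len(required):
        return 3  # partial: some versioned artefacts were lost
    return 5      # audit-ready: every expected artefact is preserved


shortlist_run = {
    "prompt_template_version": "v4",
    "job_criteria_version": "2024-03",
    "model_deployment_id": "deploy-eu-1",
    "adapter_used": True,
    "adapter_version": "",  # adapter was used but its version was not stored
}
suggested = suggest_traceability_anchor(shortlist_run)
```

In this illustrative run the adapter version was not stored, so the sketch suggests the "partial" anchor even though the other artefacts are intact.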

Sources in RAIDT papers

00-RAIDT_Wording_v2
11-RAIDT_Academic_Logic_M_v11