Q176 — What practical checklist should now be answerable before paper-level decisions are made?

RAIDT · Star S12 - Programme Architecture and Supervisory Navigation · primary item: S12.03 · Three-paper arc

Appears in sources

integrated_82#Q4.26

Answer

Before any paper-level decision about whether material belongs in foundations, empirical validation, or policy pathways, RAIDT implies that a practical checklist should already be answerable at the level of the configured use. Because RAIDT treats the run as the unit of governance, reviewers should be able to state what the material run is, why it is governance-significant, which influence methods were active as governance interventions, and whether the run-level evidence pack is bounded, reconstructable, and integrity-protected. At minimum, that pack should identify the prompt or template version, the model and deployment configuration, retrieved context or snapshot identifiers where used, the output and associated hashes, the checks performed, and any human oversight or escalation decisions.
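As an illustration only, the minimum contents of a run-level evidence pack could be sketched as a small record with a completeness check. The class name, field names, and the choice of SHA-256 are assumptions made for this sketch; RAIDT does not prescribe a schema or a hash algorithm here.

```python
import hashlib
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class EvidencePack:
    """Hypothetical run-level evidence pack; field names are illustrative."""
    run_id: str
    prompt_template_version: str
    model_id: str
    deployment_config: str
    output_text: str
    checks_performed: list[str] = field(default_factory=list)
    oversight_decision: str = ""  # e.g. "accepted", "edited", "rejected", "escalated"
    retrieval_snapshot_id: Optional[str] = None  # only required where retrieval is used

    def output_hash(self) -> str:
        # SHA-256 is an assumed choice for integrity protection of the output.
        return hashlib.sha256(self.output_text.encode("utf-8")).hexdigest()

    def missing_fields(self, uses_retrieval: bool) -> list[str]:
        # A field counts as missing when it is empty; the snapshot identifier
        # is only required when the run actually used retrieved context.
        missing = [
            f.name
            for f in fields(self)
            if f.name != "retrieval_snapshot_id" and not getattr(self, f.name)
        ]
        if uses_retrieval and not self.retrieval_snapshot_id:
            missing.append("retrieval_snapshot_id")
        return missing
```

An empty result from `missing_fields` would indicate only that the pack is bounded in the sense listed above. It says nothing about whether the recorded values are accurate, which remains a matter for review.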

The same checklist should then ask whether the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) can be scored from evidence rather than narrative assurance. In practice, this means that a score profile should be defensible against the anchors 1=missing / 3=partial / 5=audit-ready, with clear evidence pointers for each pillar. Reviewers should be able to answer: was the run appropriately constrained and overseen; can it be reconstructed later; is the explanation decision-usable; is the configuration stable enough under repeat use; and can claims be traced to sources, versions, and controls?
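One way to make "scored from evidence" concrete is a check that flags any pillar whose score has no evidence pointer behind it. The sketch below is an assumption-laden illustration: the function name, the pointer format, the acceptance of intermediate scores (2 and 4), and the rule that any score above 1 needs at least one pointer are all choices made for this example, not part of the RAIDT rubric as stated.

```python
PILLARS = (
    "Responsibility",
    "Auditability",
    "Interpretability",
    "Dependability",
    "Traceability",
)

# Anchors as given in the text; intermediate scores are assumed to be allowed.
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


def undefended_pillars(profile):
    """Return pillars whose score is absent, out of range, or unsupported.

    `profile` maps pillar name -> (score, list of evidence pointers).
    Assumed rule: a score above 1 with no pointers is narrative assurance.
    """
    problems = []
    for pillar in PILLARS:
        score, pointers = profile.get(pillar, (None, []))
        if score is None or not 1 <= score <= 5:
            problems.append(pillar)
        elif score > 1 and not pointers:
            problems.append(pillar)
    return problems
```

For example, a profile that scores Dependability at 5 with an empty pointer list would be returned as undefended, which mirrors the point that a score must be traceable to evidence before it is accepted.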

Only when those questions are answerable does a paper-level judgement become methodologically sound. Paper 08 can then justify the construct and rubric, later empirical work can validate measurement across repeated runs and configurations, and policy work can demonstrate interoperability without collapsing back into abstract compliance claims.

Practical example

In a public-service eligibility workflow, a team uses GenAI to draft explanations of why a claimant was deemed ineligible. Before deciding that the case supports a stronger programme claim, the reviewer applies a RAIDT checklist. Which policy clause version was retrieved? Is the retrieval snapshot stored? Are the prompt template and model deployment IDs logged? Did a human reviewer accept, edit, or reject the draft? Does the output separate facts from assumptions? Has uncertainty been stated?

If those answers are present, the run-level evidence pack can support a credible score profile. Auditability and Traceability improve only if the exact clause, version, run ID, timestamps, and hashes are preserved. Responsibility depends on documented reviewer authority and escalation. Dependability depends on repeat-run evidence rather than one polished example. If the file contains only a persuasive explanation with no evidence pointers, the checklist fails and no strong paper-level claim should be made.
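The pass or fail logic of this example can be sketched as a simple gate over the checklist questions. The item names, the `review_gate` function, and the pointer strings are hypothetical; the sketch assumes each question is answered by pointing at a location in the evidence pack and not by free-text assertion.

```python
# Checklist items paraphrased from the eligibility example; names are illustrative.
CHECKLIST = (
    "policy_clause_version_identified",
    "retrieval_snapshot_stored",
    "prompt_template_id_logged",
    "model_deployment_id_logged",
    "human_review_decision_recorded",
    "facts_separated_from_assumptions",
    "uncertainty_stated",
)


def review_gate(answers: dict[str, str]) -> tuple[bool, list[str]]:
    """Return (passed, unanswered items).

    `answers` maps a checklist item to an evidence pointer. A missing or
    blank pointer means the question is not answerable from the file.
    """
    unanswered = [item for item in CHECKLIST if not answers.get(item, "").strip()]
    return (not unanswered, unanswered)


# A file holding only a persuasive explanation supplies no pointers at all.
passed, gaps = review_gate({})
print(passed, len(gaps))
```

Under these assumptions, a file with no evidence pointers fails every item, so no strong paper-level claim would be supported by it.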

Sources in RAIDT papers

08-RAIDT_Foundations_M_V50
11-RAIDT_Academic_Logic_M_v11
12-RAIDT_DSR_Theory_M_v8