Q152 — What does manual RAIDT implementation look like?

Answer

Manual RAIDT implementation looks like a structured review workflow built around reconstructable artefacts rather than around system automation. The papers make clear that RAIDT does not begin with dashboards; it begins with a bounded evidentiary object. Reviewers assemble a run-level evidence pack for a specific configured use and then assess governance readiness from that pack, not from memory and not from broad policy statements. In manual form, the pack is assembled by people rather than middleware, but it still has to preserve the same core fields: prompts, outputs, configuration details, provenance material where relevant, checks performed, and review notes.
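A hand-assembled pack can still follow a fixed shape. The sketch below is one possible layout, not a schema taken from the papers: the class name `EvidencePack`, the field names, the `run_id` and `configured_use` identifiers, and the choice to treat provenance as optional (because it is recorded only "where relevant") are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EvidencePack:
    """One run-level evidence pack for a specific configured use (hypothetical schema)."""
    run_id: str
    configured_use: str
    prompts: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    configuration: dict[str, str] = field(default_factory=dict)
    provenance: list[str] = field(default_factory=list)  # only where relevant
    checks_performed: list[str] = field(default_factory=list)
    review_notes: str = ""


# Assumption: provenance is optional, every other core field must be filled in.
REQUIRED = ("prompts", "outputs", "configuration", "checks_performed", "review_notes")


def missing_fields(pack: EvidencePack) -> list[str]:
    """Return the required fields that the reviewer has left empty."""
    return [name for name in REQUIRED if not getattr(pack, name)]


# Illustrative values only.
pack = EvidencePack(
    run_id="run-001",
    configured_use="note summarisation pilot",
    prompts=["Summarise the attached note."],
    outputs=["Draft summary text."],
    configuration={"deployment": "pilot"},
)
print(missing_fields(pack))  # ['checks_performed', 'review_notes']
```

A check like this does not judge quality; it only tells the reviewer that the pack is not yet complete enough to score.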

The practical routine is straightforward. A reviewer logs one material run, preserves the relevant artefacts, and then scores the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) against a rubric. The papers emphasise that reviewers should record evidence pointers so that another person can later inspect the basis of each judgement. Manual implementation is therefore well suited to pilots, calibration exercises, and supervision workshops, because it keeps the governance logic visible. The output is a score profile, not a loose narrative, and each score stays tied to the rubric anchors: 1 = missing, 3 = partial, 5 = audit-ready.
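The scoring step can be sketched as a small validation routine. This is a minimal illustration under stated assumptions: the dictionary layout, the function name `validate_profile`, and the decision to accept the intermediate scores 2 and 4 between the named anchors are not specified in the papers.

```python
PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


def validate_profile(profile: dict[str, dict]) -> list[str]:
    """Return a list of problems; an empty list means the profile is usable."""
    problems = []
    for pillar in PILLARS:
        entry = profile.get(pillar)
        if entry is None:
            problems.append(f"{pillar}: not scored")
            continue
        score = entry.get("score")
        # Assumption: whole numbers 1-5 are allowed, with 1, 3 and 5 as anchors.
        if type(score) is not int or not 1 <= score <= 5:
            problems.append(f"{pillar}: score must be an integer from 1 to 5")
        # Every score needs at least one pointer to inspectable evidence.
        if not entry.get("evidence"):
            problems.append(f"{pillar}: no evidence pointer recorded")
    return problems


# Illustrative profile with one missing pointer and one unscored pillar.
profile = {
    "Responsibility": {"score": 3, "evidence": ["form section 4"]},
    "Auditability": {"score": 5, "evidence": ["saved output file"]},
    "Interpretability": {"score": 3, "evidence": []},
    "Dependability": {"score": 1, "evidence": ["no checks recorded"]},
}
for problem in validate_profile(profile):
    print(problem)
```

Rejecting a score that has no evidence pointer is what separates a score profile from a loose narrative: a second reviewer can follow each pointer back to the pack.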

What distinguishes good manual implementation from ad hoc note-taking is that it explicitly treats influence methods as governance interventions. If retrieval, structured prompting, LoRA-style adaptation, or alignment constraints affected the run, they must be written into the record because RAIDT evaluates governed use in context, not an abstract model description.
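One way to make this rule checkable is to compare the influence methods known to have affected a run with the entries actually written into the record. The sketch below is illustrative only: the category identifiers, the `InfluenceEntry` type, and the function name are hypothetical, and the example description is invented.

```python
from dataclasses import dataclass

# Categories named in the text; the identifiers themselves are assumptions.
INFLUENCE_KINDS = {
    "retrieval",
    "structured_prompting",
    "adaptation",            # e.g. LoRA-style adaptation
    "alignment_constraint",
}


@dataclass(frozen=True)
class InfluenceEntry:
    kind: str         # one of INFLUENCE_KINDS
    description: str  # what was applied in this run, in the reviewer's words


def undocumented_influences(used: set[str], recorded: list[InfluenceEntry]) -> set[str]:
    """Return influence kinds that affected the run but have no usable entry."""
    unknown = used - INFLUENCE_KINDS
    if unknown:
        raise ValueError(f"unknown influence kinds: {sorted(unknown)}")
    # An entry with a blank description does not count as documentation.
    documented = {e.kind for e in recorded if e.description.strip()}
    return used - documented


used = {"retrieval", "structured_prompting"}
recorded = [InfluenceEntry("retrieval", "Passages drawn from the local guideline index")]
print(undocumented_influences(used, recorded))  # {'structured_prompting'}
```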

Practical example

Consider a healthcare note summarisation pilot in which clinicians are testing whether a GenAI assistant can draft a summary for a high-risk presentation. The team does not yet have automated logging, so each run is captured manually in a reviewer form. The form records the prompt template ID, the saved output, the model deployment note, any safety wording, the uncertainty statement shown to the clinician, and whether a human escalation flag was triggered.

A supervisor then scores the run. Responsibility is judged partly from whether the note preserves uncertainty and escalation guidance; Auditability is judged from whether the run can be reconstructed; Interpretability is judged from whether the summary separates facts from assumptions. This manual workflow is slow, but it is suitable for a supervised pilot because it exposes governance weaknesses before the organisation builds them into an automated process.
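Even without automated logging, the reviewer form in this example can be kept in a simple tabular log so that runs stay comparable. The following sketch assumes a CSV file with one column per form item; the column names and the `append_run` helper are hypothetical, and the sample values are invented.

```python
import csv
import io

# One column per item on the reviewer form described above (names are assumptions).
FORM_FIELDS = [
    "prompt_template_id", "saved_output", "model_deployment_note",
    "safety_wording", "uncertainty_statement", "escalation_flag",
]


def append_run(log, form: dict[str, str], write_header: bool = False) -> None:
    """Append one manually captured run to a CSV log, rejecting incomplete forms."""
    missing = [name for name in FORM_FIELDS if name not in form]
    if missing:
        raise ValueError(f"form is missing: {missing}")
    writer = csv.DictWriter(log, fieldnames=FORM_FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(form)


# An in-memory buffer stands in for the log file in this example.
buffer = io.StringIO()
append_run(buffer, {
    "prompt_template_id": "PT-01",
    "saved_output": "Draft summary text",
    "model_deployment_note": "pilot deployment",
    "safety_wording": "",  # may be empty if none was shown
    "uncertainty_statement": "Findings are provisional",
    "escalation_flag": "yes",
}, write_header=True)
print(buffer.getvalue())
```

The form must name every column, even when a value such as the safety wording is empty, so that a gap in the record is a visible decision and not an omission.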
