Q145 — What are governance interventions in RAIDT?

← RAIDT · Star S6 - Influence Methods as Governance Interventions · primary item: S6.01 · Governance interventions

Appears in sources

integrated_82#Q3.19

Answer

In RAIDT, governance interventions are influence levers that do two things at once: they shape model behaviour and they shape evidence capture. That is why this branch treats influence methods as governance interventions rather than as mere optimisation tricks. Across the papers, prompting, PEFT/LoRA, RAG, and RLHF/DPO are all treated as controllable modules whose effects must be visible in outputs, logs, reviewer judgements, and lifecycle documentation. A method qualifies as a governance intervention when it is specified, versioned, reviewable, and capable of leaving an auditable trace.

The four papers differentiate the interventions by the kind of governance value they add. Prompting formalises instruction, role, and disclosure logic. LoRA localises behavioural change into modular, hashable adapters that can be versioned, rolled back, and described in adapter cards. RAG introduces provenance by binding claims to retrieved sources, retrieval policies, and index lineage. RLHF and DPO reshape tone, safety, and norm-conformance, but require annotation governance, reward or preference provenance, and reviewer controls so that alignment claims do not become opaque. The common RAIDT expectation is that each intervention leaves artefacts that can support scrutiny.
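As an illustration of what "hashable adapters" can mean in practice, the following minimal Python sketch records a content hash in an adapter-card-like record and later checks that the weights on hand still match it. The `AdapterCard` fields, the function names, and the placeholder byte strings are assumptions made for this example; the RAIDT papers are not quoted as prescribing this schema.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AdapterCard:
    """Hypothetical adapter card; field names are illustrative only."""
    name: str
    version: str
    base_model: str
    sha256: str


def hash_adapter(weights: bytes) -> str:
    """Return the SHA-256 hex digest of the serialised adapter weights."""
    return hashlib.sha256(weights).hexdigest()


def make_card(name: str, version: str, base_model: str, weights: bytes) -> AdapterCard:
    """Build a card that binds a version label to the exact weight bytes."""
    return AdapterCard(name, version, base_model, hash_adapter(weights))


def verify(card: AdapterCard, weights: bytes) -> bool:
    """True only if the weights match the hash recorded in the card."""
    return hash_adapter(weights) == card.sha256


# Placeholder bytes stand in for real adapter files (assumption).
weights_v1 = b"placeholder bytes standing in for adapter weights v1"
weights_v2 = b"placeholder bytes standing in for adapter weights v2"

card = make_card("tone-adapter", "1.0.0", "base-llm", weights_v1)
print(json.dumps(asdict(card), indent=2))
print(verify(card, weights_v1))  # True: same bytes as recorded
print(verify(card, weights_v2))  # False: a different adapter was loaded
```

A mismatch between the recorded and recomputed hash is the kind of signal that would let a reviewer see that an unapproved or altered adapter was in use, which is what makes rollback and version approval checkable rather than asserted.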

For that reason, RAIDT makes the run the unit of governance. A governance intervention must populate a run-level evidence pack: prompt identifiers, adapter lineage where applicable, retrieval records, reward or preference identifiers where applicable, output hashes, reviewer forms, and adjudication notes. Supervisors then read the resulting score profile against the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability) and the scoring anchors 1 = missing, 3 = partial, 5 = audit-ready. If a lever changes output behaviour but cannot support this evidential chain, the papers imply that it remains performance tuning rather than a fully realised governance intervention.
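The run-level evidence pack can be pictured as a simple record with a completeness check. The sketch below is one possible shape, written in Python under stated assumptions: the field names, the choice of which fields are always required, the treatment of adjudication notes as optional, and the reading of the anchors as an integer 1 to 5 scale are all illustrative and not taken from the RAIDT papers.

```python
from dataclasses import dataclass, field
from typing import Optional

PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


@dataclass
class EvidencePack:
    """Hypothetical run-level evidence pack; field names are assumptions."""
    run_id: str
    prompt_id: Optional[str] = None
    adapter_lineage: Optional[str] = None                 # where applicable
    retrieval_records: list = field(default_factory=list)
    preference_ids: list = field(default_factory=list)    # where applicable
    output_hash: Optional[str] = None
    reviewer_forms: list = field(default_factory=list)
    adjudication_notes: list = field(default_factory=list)  # optional here


def missing_fields(pack: EvidencePack, uses_adapter: bool,
                   uses_retrieval: bool, uses_preferences: bool) -> list:
    """List the evidence gaps for the levers this run actually used."""
    gaps = []
    if not pack.prompt_id:
        gaps.append("prompt_id")
    if uses_adapter and not pack.adapter_lineage:
        gaps.append("adapter_lineage")
    if uses_retrieval and not pack.retrieval_records:
        gaps.append("retrieval_records")
    if uses_preferences and not pack.preference_ids:
        gaps.append("preference_ids")
    if not pack.output_hash:
        gaps.append("output_hash")
    if not pack.reviewer_forms:
        gaps.append("reviewer_forms")
    return gaps


def check_scores(scores: dict) -> list:
    """Flag pillars that are unscored or outside the assumed 1-5 scale."""
    problems = []
    for pillar in PILLARS:
        if pillar not in scores:
            problems.append(f"{pillar}: no score")
        elif scores[pillar] not in range(1, 6):
            problems.append(f"{pillar}: score out of range")
    return problems


pack = EvidencePack(run_id="run-0001", prompt_id="prompt-v3",
                    output_hash="abc123", reviewer_forms=["form-17"])
gaps = missing_fields(pack, uses_adapter=True,
                      uses_retrieval=False, uses_preferences=False)
print(gaps)  # ['adapter_lineage']
```

In this example the run used an adapter but recorded no lineage for it, so the pack reports a gap. Under the reading given above, such a run has changed behaviour without completing the evidential chain.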

Practical example

Consider a finance team generating adverse-action letters. A plain prompt that asks for a courteous explanation may improve readability, but on its own it is weak as a governance intervention. A RAIDT-compliant intervention would add a versioned prompt, a LoRA adapter that stabilises the institution's approved tone, retrieval of policy templates or factor definitions, and a run log that stores hashes and reviewer scores.

In that configuration, the bank can inspect why a letter was produced, which source factors informed it, whether the adapter version was approved, and how reviewers scored the explanation. The intervention has not only influenced the text; it has organised the evidence needed for internal audit, complaint handling, and external review.

Sources in RAIDT papers

05-RAIDT_LoRA_V2
06-RAIDT_RAG_V1
07-RAIDT_RLHF_V1