S6.01 - Governance interventions

flowchart LR
    A["Traditional limitation:<br/>influence methods treated as technical tuning only"] --> B["RAIDT<br/>run-level evidence framework"]
    B --> C[["Governance interventions<br/>operational controls shaping behaviour and evidence capture"]]
    C --> D[Run-level evidence pack]
    C --> E[Five-pillar score profile]
    C --> F[Reviewability and contestability]
    D --> G[Reviewer reconstruction]
    E --> H[Governance readiness]
    I[Structured prompting]
    J[Provenance-first RAG]
    K[Adapter lineage]
    L["Healthcare / finance / public service use cases"]
    I --> C
    J --> C
    K --> C
    L --> C

Star S6 - Influence Methods as Governance Interventions

Star context: This star treats prompting, RAG, adapter methods, preference-tuning controls, and stacked influence as governance-relevant interventions because each changes what a run can do, what evidence can be captured, and what a reviewer can later reconstruct inside RAIDT.


Academic picture
Definition / background

Governance interventions are design or control choices that intentionally shape how a generative AI run behaves and how that behaviour can be evidenced, reviewed, and governed. In Star S6, the term covers influence methods such as prompting, structured prompting, role assignment, retrieval configuration, adapter-based customisation, preference-tuning controls, and combinations of these methods when they materially affect run outcomes. The central point is that these are not merely performance-tuning devices. They are interventions because they alter what the system is likely to produce, which constraints are active, which sources may be drawn upon, and what traces can be collected afterwards.

Conceptually, this matters because many discussions of AI governance stop at policy principles or model-level claims. RAIDT instead asks what happened in a specific run and what evidence exists to support review. Under that logic, an intervention is governance-relevant when it changes the behaviour of the run in a way that should be documented, assessed, and made contestable. A prompt template, a retrieval filter, an adapter, or a preference profile shaped by Direct Preference Optimization (DPO) each changes the operational conditions of use. That makes the intervention part of governance, not just part of engineering.

This item therefore belongs inside RAIDT because RAIDT treats the run as the unit of governance. If a run is the object of review, then the interventions shaping that run must be identifiable and evidentially visible. Governance interventions sit between abstract governance intentions and practical run behaviour. They connect organisational rules, technical controls, and evidential outputs.

The relationship to RAIDT's practical outputs is direct. Governance interventions affect what enters the run-level evidence pack, how the five-pillar profile is scored, and how a reviewer reconstructs whether the run was responsible, auditable, interpretable, dependable, and traceable. In short, governance interventions are the operational means through which high-level governance intentions become inspectable at run level.

Why this concept matters

This concept matters because organisations often assume governance is something added after a model has produced an output. RAIDT rejects that assumption. In practice, governance is already being enacted through intervention choices that shape prompts, retrieval, adaptation layers, output constraints, and review routes. Naming these choices as governance interventions makes them visible and governable.

The concept also prevents a common confusion: the belief that technical configuration belongs only to engineering, while governance belongs only to policy or compliance. In organisational use, that separation breaks down. The same intervention that improves relevance or efficiency may also affect explainability, source provenance, reproducibility, and reviewer confidence. If governance interventions are not recognised, organisations may have policies that sound robust while actual runs remain weakly evidenced and difficult to reconstruct.

Without this concept, risk accumulates quietly. Teams may be unable to explain why outputs changed, which controls were active, whether a retrieval source was approved, or whether a fine-tuned component altered the model's behaviour in an undocumented way. By framing influence methods as governance interventions, RAIDT helps move from principle statements to operational governance, where control choices can be justified, documented, compared, and improved.

Key idea: Governance interventions matter because they turn technical influence choices into reviewable evidence about how a specific GenAI run was shaped and governed.

What this item controls
Practical example / likely audience question

Audience question

Why does RAIDT include prompting, retrieval-augmented generation (RAG), parameter-efficient fine-tuning (PEFT), or alignment-style controls within governance interventions rather than treating them as separate technical details?

Answer

The concern behind the question is usually that governance should focus on policies, accountabilities, or outcomes, while implementation details should remain in the engineering layer. RAIDT takes a different and more operationally defensible view. If a technical choice changes what the model can access, how it is guided, what behaviours are preferred, or what can later be reconstructed, then that choice is already functioning as a governance intervention.

The direct answer is that these components determine the conditions under which a run takes place. A prompt template can narrow or widen acceptable outputs. A RAG pipeline can restrict or expand the evidence base the model draws from. A LoRA adapter can embed task-specific tendencies. A preference-tuning control can shift how the model handles uncertainty, refusal, or tone. Each of these changes the governance posture of the run because each affects behaviour, evidential visibility, and reviewability.

A practical example is a compliance drafting assistant. A generic governance approach might say only that the organisation has responsible AI principles and a human reviewer. RAIDT asks a more probing question: what exact interventions shaped this run? Was a controlled prompt used? Did the system retrieve only approved regulatory sources? Was a domain adapter active? Were refusal or escalation preferences encoded? RAIDT handles the issue better because it requires those shaping conditions to be captured as evidence, not left as assumptions.

Practical example in RAIDT terms

Consider a healthcare trust using a generative AI assistant to draft discharge-summary text for clinicians. The use case appears straightforward, but the run-level governance issue is not simply whether the output looks fluent. The issue is which interventions shaped the run: a structured clinical prompt, a retrieval layer restricted to approved hospital protocols, and an adapter trained on local documentation style.

In RAIDT terms, the evidence needed would include the run identifier, active prompt template version, retrieval source list, adapter version or lineage reference, timestamp, reviewer role, and any escalation or override notes. The affected pillars are especially Auditability, Dependability, and Traceability, because reviewers need to know whether the output was produced under approved conditions and whether those conditions can be reconstructed later. Responsibility and Interpretability are also implicated because the trust must show who authorised the intervention design and how the run can be explained to internal reviewers.
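
The evidence items listed above can be pictured as one structured record per run. The following is a minimal sketch, not a RAIDT schema: the class name `RunEvidence`, the field names, and every example value are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class RunEvidence:
    """One run-level evidence record (illustrative field names, not a RAIDT schema)."""
    run_id: str
    prompt_template_version: str
    retrieval_sources: tuple[str, ...]
    adapter_lineage_ref: str
    timestamp: str  # ISO 8601, UTC
    reviewer_role: str
    escalation_notes: tuple[str, ...] = ()

    def to_json(self) -> str:
        # Sorted keys keep the serialised record stable for later comparison.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# All values below are hypothetical placeholders.
record = RunEvidence(
    run_id="run-0001",
    prompt_template_version="discharge-summary-prompt@v3",
    retrieval_sources=("approved-protocols/sepsis", "approved-protocols/discharge"),
    adapter_lineage_ref="local-style-adapter@2024-01",
    timestamp=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc).isoformat(),
    reviewer_role="consultant clinician",
)
print(record.to_json())
```

The record is frozen so that it cannot be edited after capture, which fits the idea that a reviewer reconstructs the run from what was recorded at the time.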

Governance interventions improve readiness here by turning hidden configuration choices into explicit evidence objects. Instead of merely claiming the assistant is governed, the organisation can show which interventions were active in the run, why they were selected, what they constrained, and how they supported safe clinical documentation practice.

Detailed link to RAIDT

Governance interventions link to RAIDT in four ways.

First, they connect directly to RAIDT's core idea that governance should be grounded in evidenced operational use rather than in abstract model claims alone.
Second, they shape the run itself by defining the configured conditions under which a specific task is performed at a specific time and in a specific context.
Third, they determine what must be captured in the evidence pack and influence how the five-pillar score profile should be justified.
Fourth, they support reviewability, contestability, audit readiness, and organisational learning because they make behavioural shaping choices visible for later reconstruction.

Governance interventions → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

In RAIDT, this chain matters because governance interventions are the bridge between technical design choices and institutional accountability. Without that bridge, the evidence pack is incomplete and the score profile risks becoming a superficial summary rather than a reviewable governance artefact.
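
One way to picture the step from evidence pack to score profile is a simple coverage check per pillar. The sketch below is a toy under stated assumptions: RAIDT's actual scoring rubric is not specified in this item, and the mapping `EXPECTED_EVIDENCE`, the field names, and the function `coverage_profile` are all hypothetical. It measures only whether evidence is present, not whether the intervention was adequate.

```python
# Hypothetical mapping from pillar to evidence fields a reviewer might expect.
# This is an illustration of the chain, not the RAIDT scoring rubric.
EXPECTED_EVIDENCE = {
    "Responsibility": ["authorised_by", "reviewer_role"],
    "Auditability": ["prompt_template_version", "retrieval_sources"],
    "Interpretability": ["intervention_rationale"],
    "Dependability": ["adapter_lineage_ref", "escalation_notes"],
    "Traceability": ["run_id", "timestamp"],
}


def coverage_profile(evidence_pack: dict) -> dict:
    """Return, per pillar, the fraction of expected evidence fields that are present."""
    profile = {}
    for pillar, fields in EXPECTED_EVIDENCE.items():
        # Missing keys and empty values both count as absent evidence.
        present = [f for f in fields if evidence_pack.get(f) not in (None, "", [], ())]
        profile[pillar] = len(present) / len(fields)
    return profile


# Hypothetical, deliberately incomplete evidence pack.
pack = {
    "run_id": "run-0001",
    "timestamp": "2024-01-15T09:30:00+00:00",
    "prompt_template_version": "v3",
    "retrieval_sources": [],
    "reviewer_role": "compliance officer",
}
profile = coverage_profile(pack)
print(profile)
```

A low coverage value signals an incomplete evidence pack, which is the situation the chain is meant to expose before a score profile is presented as a governance artefact.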

Link to the five RAIDT pillars

Governance interventions affect all five pillars, but they are especially strong for Responsibility, Auditability, and Traceability because they define who authorised a control, what it changed, and whether that change can later be reconstructed.

Responsibility

Governance interventions clarify who designed, approved, deployed, and reviewed the shaping conditions of a run. They prevent responsibility from being reduced to a vague statement that a model was used under policy.

Example evidence / implication:

Auditability

Auditability improves when interventions are explicitly logged and versioned, because reviewers can inspect how a run was configured rather than guessing from outputs alone.

Example evidence / implication:

Interpretability

Governance interventions support interpretability by making behavioural shaping mechanisms more legible. They do not make a model fully transparent, but they make the run more understandable.

Example evidence / implication:

Dependability

Dependability is affected because interventions are often the means by which organisations stabilise outputs, reduce unsafe variation, and enforce acceptable operating conditions.

Example evidence / implication:

Traceability

Traceability is strengthened when an organisation can reconstruct which intervention stack was active in a given run and how that stack changed over time.

Example evidence / implication:
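
As a hedged illustration of reconstructing how an intervention stack changed between runs, the sketch below compares two recorded stacks. The function name `diff_intervention_stacks`, the dictionary keys, and the version labels are assumptions, not part of RAIDT.

```python
def diff_intervention_stacks(earlier: dict, later: dict) -> dict:
    """Report which intervention settings were added, removed, or changed between two runs."""
    added = {k: later[k] for k in later.keys() - earlier.keys()}
    removed = {k: earlier[k] for k in earlier.keys() - later.keys()}
    changed = {
        k: (earlier[k], later[k])
        for k in earlier.keys() & later.keys()
        if earlier[k] != later[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical stacks recorded for two runs of the same use case.
run_a = {"prompt_template": "v3", "retrieval_scope": "approved-protocols", "adapter": "style@1"}
run_b = {"prompt_template": "v4", "retrieval_scope": "approved-protocols", "policy_wrapper": "refusal@1"}

stack_diff = diff_intervention_stacks(run_a, run_b)
print(stack_diff)
```

A comparison of this kind only works if each run's stack was captured at run time, which is the traceability requirement described above.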

Why this item is more than a generic concept

In general AI governance, governance interventions may be discussed loosely as safeguards, controls, or socio-technical measures around an AI system. In RAIDT, the term is more precise and more operational. It refers to identifiable design and control choices that shape a specific run and that therefore must be reflected in run-level evidence.

That RAIDT meaning is stronger because it does not stop at saying an intervention exists. It asks whether the intervention can be evidenced, reviewed, compared, and tied to governance outcomes. In other words, the concept is not just classificatory. It is evidential. A governance intervention in RAIDT is meaningful because it can be linked to the evidence pack, the score profile, and the organisation's ability to defend a run under scrutiny.

Common misunderstanding

Misunderstanding

Governance interventions are just another name for technical optimisation methods.

Correction

That is too narrow. A method becomes a governance intervention when it shapes what the system can do and what a reviewer can know about that run. For example, a retrieval restriction to approved policy documents is not simply a performance tweak. It is a governance decision because it constrains the evidence base of the run, affects output legitimacy, and creates a recordable control condition. RAIDT therefore treats the intervention as part of governance architecture, not as an incidental engineering detail.
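
The retrieval-restriction example can be sketched as a filter that both constrains the evidence base and emits a record of the control it applied. This is a minimal sketch under assumptions: `APPROVED_SOURCES`, the document structure, and the record keys are hypothetical, and no particular retrieval library is implied.

```python
# Hypothetical registry of approved policy sources.
APPROVED_SOURCES = {"policy/data-protection", "policy/records-retention"}


def restrict_to_approved(retrieved: list[dict]) -> tuple[list[dict], dict]:
    """Keep only documents from approved sources and return a record of the control applied."""
    kept = [doc for doc in retrieved if doc["source"] in APPROVED_SOURCES]
    rejected = {doc["source"] for doc in retrieved if doc["source"] not in APPROVED_SOURCES}
    control_record = {
        "control": "retrieval_restriction",
        "approved_sources": sorted(APPROVED_SOURCES),
        "sources_used": sorted({doc["source"] for doc in kept}),
        "sources_rejected": sorted(rejected),
    }
    return kept, control_record


# Hypothetical retrieval results.
candidates = [
    {"source": "policy/data-protection", "text": "..."},
    {"source": "web/forum-post", "text": "..."},
]
kept_docs, control = restrict_to_approved(candidates)
print(control)
```

The second return value is the point of the example: the same step that filters sources also produces the recordable control condition.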

Boundary and limitation

Governance interventions do not by themselves prove that a run was good, safe, lawful, or fair. An intervention can be well documented and still be poorly chosen, weakly implemented, or misaligned with the task. Likewise, logging an intervention does not replace human review, sector-specific assurance, or institutional accountability.

There is also a practical limit: some interventions are easier to document than to interpret. A prompt template may be legible, whereas the behavioural effect of a preference-tuned layer may be harder to explain fully. RAIDT handles this limitation by separating visibility from adequacy. The framework does not assume that intervention evidence solves all governance questions; it makes those questions easier to review, contest, and improve over time.

Implementation levels

Manual implementation

A researcher or small team can implement this manually by recording, for each run, which prompt pattern, retrieval setup, model variant, adapter, and reviewer control was active. A simple evidence template or structured note can capture the intervention stack and the reason it was used.
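
A manual template of this kind could be as simple as a fixed list of fields rendered into a structured note, with gaps made visible. The sketch below is illustrative only: the field list follows the items named above, while the names `TEMPLATE_FIELDS` and `render_note` and the example entries are assumptions.

```python
# Fields taken from the description above; the identifiers themselves are illustrative.
TEMPLATE_FIELDS = [
    "prompt_pattern",
    "retrieval_setup",
    "model_variant",
    "adapter",
    "reviewer_control",
    "reason",
]


def render_note(run_id: str, entries: dict) -> str:
    """Render a plain-text evidence note, flagging any field that was left blank."""
    lines = [f"Run: {run_id}"]
    for name in TEMPLATE_FIELDS:
        value = entries.get(name) or "NOT RECORDED"
        lines.append(f"{name}: {value}")
    return "\n".join(lines)


# Hypothetical, partly completed note.
note = render_note(
    "run-0007",
    {"prompt_pattern": "structured clinical prompt v3", "reason": "local documentation standard"},
)
print(note)
```

Flagging unrecorded fields explicitly keeps a missing entry from being read as "not applicable" during later review.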

Semi-automated implementation

Semi-automated implementation uses metadata fields, standardised prompt libraries, retrieval-source registries, adapter version labels, and review checklists so that intervention details are captured consistently without relying entirely on free-text reporting.
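
A semi-automated check might compare a run's metadata against the registries mentioned above. The following is a sketch under assumptions: the registry contents, the metadata keys, and the function `validate_metadata` are hypothetical and stand in for whatever prompt library, source registry, and checklist an organisation actually maintains.

```python
# Hypothetical registries; real ones would be maintained outside the code.
PROMPT_LIBRARY = {"clinical-summary": {"v2", "v3"}}
SOURCE_REGISTRY = {"approved-protocols"}
ADAPTER_LABELS = {"local-style@1"}


def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems found when checking run metadata against the registries."""
    problems = []
    name, version = meta["prompt_template"], meta["prompt_version"]
    if version not in PROMPT_LIBRARY.get(name, set()):
        problems.append(f"unregistered prompt version: {name}@{version}")
    for source in meta["retrieval_sources"]:
        if source not in SOURCE_REGISTRY:
            problems.append(f"unregistered retrieval source: {source}")
    adapter = meta.get("adapter")
    if adapter is not None and adapter not in ADAPTER_LABELS:
        problems.append(f"unregistered adapter: {adapter}")
    if not meta.get("review_checklist_complete", False):
        problems.append("review checklist incomplete")
    return problems


# Hypothetical metadata for one run.
meta = {
    "prompt_template": "clinical-summary",
    "prompt_version": "v4",
    "retrieval_sources": ["approved-protocols", "public-web"],
    "adapter": "local-style@1",
    "review_checklist_complete": True,
}
problems = validate_metadata(meta)
print(problems)
```

Returning a list of problems rather than raising on the first one lets a reviewer see every gap in a single pass.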

Fully automated implementation

At scale, a platform or orchestration layer can automatically log intervention states for each run, including prompt version, tool configuration, retrieval scope, active adapters, policy wrappers, user role, timestamps, and downstream review outcomes. A governance dashboard can then map those records into evidence packs and RAIDT pillar scores.
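
Automatic capture could take the form of a wrapper around the function that performs the run. The sketch below is one possible shape, not a description of any existing platform: `RUN_LOG`, `logged_run`, `draft_text`, and the intervention keys are hypothetical, and an in-memory list stands in for a real log store.

```python
import copy
import functools
import uuid
from datetime import datetime, timezone

RUN_LOG: list[dict] = []  # in-memory stand-in for a real log store


def logged_run(intervention_state: dict):
    """Decorator factory: record the intervention state each time the wrapped function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, user_role: str, **kwargs):
            entry = {
                "run_id": str(uuid.uuid4()),
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "user_role": user_role,
                # Deep copy so later edits to the configuration do not alter past records.
                "interventions": copy.deepcopy(intervention_state),
                "review_outcome": None,  # filled in later by a downstream review step
            }
            # The entry is appended before the call, so failed runs are logged too.
            RUN_LOG.append(entry)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@logged_run({
    "prompt_version": "v3",
    "retrieval_scope": "approved-protocols",
    "active_adapters": ["local-style@1"],
})
def draft_text(request: str) -> str:
    return f"DRAFT: {request}"  # placeholder for a model call


result = draft_text("summarise discharge plan", user_role="clinician")
print(result, RUN_LOG[-1]["run_id"])
```

Because logging happens in the wrapper, the person invoking the run does not need to remember to record anything, which is the main difference from the manual and semi-automated levels.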

Practical use in the RAIDT project

This item is central to how RAIDT explains the relationship between technical system design and governance evidence. In Paper 08 Foundations, it helps justify why run conditions must be treated as governance-relevant rather than as hidden implementation detail. In Paper 09 Empirical Validation, it supports comparison across intervention patterns by asking which combinations improve reviewability, consistency, and evidential quality. In Paper 10 Policy Pathways, it helps translate abstract governance requirements into operational controls that organisations can actually implement.

It also supports sector playbooks by showing how different domains should evidence different intervention choices. In the evidence pack, it identifies which shaping mechanisms were active in the run. In the scoring rubric, it helps justify why a run scores strongly or weakly on particular pillars. For supervision, viva defence, and journal positioning, this item clarifies that RAIDT is not anti-technical and not merely policy-facing; it is about making technical influence choices governable through evidence.

Key audience questions to prepare for

Q1. Why does RAIDT treat influence methods as governance interventions rather than as engineering details?

Because the methods shape behaviour, source access, output constraints, and the possibility of later reconstruction. If they alter what happened in the run, they are governance-relevant.

Q2. Does identifying a governance intervention mean the intervention is automatically good?

No. RAIDT distinguishes between evidencing an intervention and judging whether it was appropriate, effective, proportionate, or defensible.

Q3. Why is the run the right level for analysing governance interventions?

Because interventions often vary by task, context, user role, and timing. Run level is where those concrete operational conditions become visible and reviewable.

Q4. How does this help an organisation beyond compliance language?

It gives the organisation a structured way to show what controls were active, how they shaped behaviour, and how reviewers can test or contest those claims using evidence.

Q5. What is the main practical benefit of this concept for RAIDT?

It turns hidden configuration choices into governance artefacts that can enter the evidence pack, inform the score profile, and support continuous improvement.

Suggested citation concepts to support this item
Short explanation for presentation

Governance interventions are the practical design choices that shape how a GenAI run behaves and how that behaviour can later be reviewed. In RAIDT, this includes methods such as prompt structure, retrieval configuration, adapters, and alignment-style controls. The key point is that these are not treated as mere optimisation details. They are governance-relevant because they affect what the system can access, how it responds, what evidence can be captured, and whether reviewers can reconstruct the conditions of use. By treating interventions as part of run-level evidence, RAIDT connects technical configuration to the evidence pack, the five-pillar score profile, and organisational readiness for audit, contestation, and continuous improvement.

One-line takeaway

Governance interventions are the operational choices that shape a GenAI run and therefore must be evidenced in RAIDT if governance claims are to be reviewable.

Related items in influence methods as governance interventions
Mentioned in reference-paper summaries (2)

Paper summaries live in Port/93-References/pdf_summaries/. Each file listed below contains the key term at least once.
