S6.13 - Stacked influence

flowchart LR
    A[Background: model-only claims hide configuration effects] --> B[RAIDT: run-level evidence framework]
    B --> C[[Stacked influence]]
    H[Structured prompting] --> C
    I[Provenance-first RAG] --> C
    J[LoRA / adapter tuning] --> C
    K[RLHF-type / DPO controls] --> C
    N[Public-service casework] --> C
    O[Healthcare documentation] --> C
    P[Enterprise knowledge assistant] --> C
    C --> D[Run-level evidence pack]
    C --> E[RAIDT score profile]
    D --> F[Reviewer reconstruction]
    D --> L[Organisational learning]
    E --> G[Governance readiness]
    E --> M[Policy alignment]

Star S6 - Influence Methods as Governance Interventions

Star context: Positions prompting, RAG, PEFT/LoRA, RLHF/DPO and stacked influence as governance-relevant interventions whose combined use shapes the evidence, scoring, and reviewability of a RAIDT run rather than replacing RAIDT itself.

Academic picture

Definition / background

Stacked influence refers to the combined effect of multiple behaviour-shaping interventions within the same GenAI run. In practical terms, a single run may be shaped by structured prompting, retrieval-augmented generation, adapter-based tuning such as LoRA, and preference or alignment controls such as RLHF-type or DPO-style optimisation. The key point is that the observed output is not attributable to one method alone; it emerges from the interaction of several configured influences.

Conceptually, this idea comes from the reality of modern GenAI deployment. Organisations rarely rely on a foundation model in a completely unmodified state. They add prompts, system instructions, retrieval layers, domain adaptation, filtering, and policy controls in order to improve usefulness, safety, and consistency. In engineering terms this is often treated as a stack. In governance terms, however, the stack matters because each layer changes what should count as evidence and what must be reviewed.

Within RAIDT, stacked influence matters because RAIDT treats the run as the unit of governance. A run is not just "a model response"; it is a configured use of a model for a specific task, at a specific time, in a specific context. If that run depends on several influence methods, then the evidence pack must show which ones were active, how they were configured, and how they jointly affected the five-pillar score profile.
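One way to make "the configured run" concrete is to derive a stable identifier from the active configuration, so that two runs can be compared by stack rather than by model name alone. The sketch below is illustrative only: the function name `stack_fingerprint`, the field names, and every configuration value are assumptions, not part of RAIDT.

```python
import hashlib
import json


def stack_fingerprint(config: dict) -> str:
    """Return a SHA-256 digest of a run's influence configuration."""
    # sort_keys makes the digest independent of key insertion order
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical configuration values, for illustration only
run_config = {
    "model_version": "base-model-v1",
    "prompt_template": "housing-reply-v3",
    "retrieval_index": "policy-manuals-current",
    "adapter": "casework-lora",
    "alignment_policy": "no-legal-advice-v2",
}

# Same stack, keys in a different order: fingerprint is unchanged
same_stack = dict(reversed(list(run_config.items())))

# One layer changed: fingerprint differs, so the run is a different configuration
changed_stack = {**run_config, "prompt_template": "housing-reply-v4"}
```

Under this assumption, a change to any single layer yields a new fingerprint even when the model version stays the same, which mirrors the point that the run, not the model, is the unit being governed.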

This makes stacked influence different from a generic statement that "multiple methods improve performance". In RAIDT, the concept is tied to run-level evidence, reviewer reconstruction, contestability, and audit readiness. A stacked configuration may improve performance or governance scores, but it also increases evidential complexity because the run can only be understood properly if the stack is documented as an integrated configuration rather than a loose list of components.

Why this concept matters

Stacked influence matters because it prevents organisations from making over-simple claims about where quality, safety, or reliability comes from. Without this concept, a team may attribute good results to the model alone when the actual improvement comes from retrieval quality, prompt structure, adapter tuning, or an alignment layer. That creates weak accountability and makes later review difficult.

It also helps avoid a common governance mistake: treating every intervention as if it were independent. In practice, stacked methods interact. A retrieval layer may improve factual grounding, but only if the prompt asks the system to use retrieved content appropriately. An adapter may improve domain fit, but only if the alignment layer does not suppress the relevant behaviour. RAIDT needs the stacked view because governance must examine both the components and their interaction inside the executed run.
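The claim that layers interact rather than simply add up can be checked with a small ablation design: score every combination of layers and compare the joint gain with the sum of the individual gains. The following is a minimal sketch under stated assumptions. The layer names, the helper names `ablation_grid` and `interaction_effect`, and the scores are all made up for illustration; no real evaluation results are implied.

```python
from itertools import combinations

# Hypothetical layer labels for the four influence methods discussed here
LAYERS = ("prompt", "rag", "adapter", "alignment")


def ablation_grid(layers):
    """Yield every subset of layers, from the bare model to the full stack."""
    for size in range(len(layers) + 1):
        for subset in combinations(layers, size):
            yield frozenset(subset)


def interaction_effect(scores, a, b):
    """Joint gain of a and b over baseline, minus the sum of their solo gains."""
    base = scores[frozenset()]
    gain_a = scores[frozenset({a})] - base
    gain_b = scores[frozenset({b})] - base
    gain_ab = scores[frozenset({a, b})] - base
    return gain_ab - (gain_a + gain_b)


# Made-up scores (0-100) for illustration only
scores = {
    frozenset(): 50,
    frozenset({"prompt"}): 55,
    frozenset({"rag"}): 55,
    frozenset({"prompt", "rag"}): 70,
}

# Positive value: the two layers together gain more than their solo gains combined
effect = interaction_effect(scores, "prompt", "rag")
```

A non-zero interaction effect is the quantitative counterpart of the point above: retrieval helps more when the prompt tells the system how to use retrieved content, so testing each layer in isolation would understate or misattribute the gain.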

For organisations using GenAI, this concept supports operational governance rather than abstract principle statements. It helps teams decide what to log, what to test, what to compare, and what to present to reviewers. It also shows why stronger performance claims should be matched by stronger evidence requirements.

Key idea: Stacked influence matters because better outcomes in GenAI often come from combined interventions, and RAIDT makes those combinations governable through run-level evidence rather than unsupported system claims.

What this item explains

Why stacked configurations often outperform single interventions in both task performance and governance scoring.
How several influence methods can be complementary rather than redundant within one run.
Why each added layer increases evidential burden as well as potential capability.
How a reviewer should connect prompt design, retrieval settings, adapter lineage, and alignment controls in one evidence narrative.
Why RAIDT score profiles should reflect the executed configuration rather than a model-only description.
How stacked influence turns from an engineering pattern into a governance issue once reviewability and audit readiness matter.

Practical example / likely audience question

Audience question

Why do stacked configurations usually score better than single controls?

Answer

The concern behind this question is usually that a higher score might be mistaken for proof that a system is simply "better" in a general sense. The direct answer is more specific: stacked configurations often score better because different interventions strengthen different governance-relevant properties at the same time. A prompt can narrow the task, RAG can improve factual grounding and provenance, LoRA can adapt behaviour to a domain, and RLHF-type or DPO controls can shape refusals, tone, or preference alignment.

A practical example is a public-service drafting assistant used to prepare first-pass responses for housing-benefit appeals. The team might use a structured prompt to enforce answer format, a retrieval layer to pull current policy text, a LoRA adapter to reflect local drafting style, and a preference-tuned safety layer to reduce inappropriate advice. The output can appear stronger across several RAIDT pillars because the run is more constrained, more grounded, and more consistent than a baseline model call.

RAIDT handles this better than a generic AI governance approach because it does not stop at saying that a stack exists. It requires the run-level evidence pack to connect the layers. Reviewers can see which prompt version was active, which documents were retrieved, which adapter was loaded, which policy control applied, and whether the combined configuration actually justified the observed score profile. In other words, RAIDT turns "the stack helped" into a reviewable claim.

Practical example in RAIDT terms

Consider a local authority using GenAI to draft initial responses to citizen housing-support queries. The run uses a structured prompt, retrieval from current policy manuals, a domain-specific LoRA adapter for casework language, and a safety alignment layer that blocks unsupported legal advice.

The run-level issue is not merely whether the answer looks good. The real governance question is which layers of the stack shaped the answer, and in what way, and whether the configuration can be defended if the response is challenged. RAIDT would therefore require evidence such as the prompt template version, retrieval index or document identifiers, retrieval timestamp, adapter name and lineage reference, active safety-policy configuration, model version, user role, task context, output text, and any human correction or escalation decision.
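The evidence items listed above could be captured as a single structured record per run. The sketch below shows one possible shape. The class name `RunEvidence`, the field names, and all example values are assumptions chosen to mirror the list in the text, not a schema defined by RAIDT.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RunEvidence:
    """One run's evidence record; frozen so it cannot be edited after logging."""
    model_version: str
    prompt_template_version: str
    retrieved_document_ids: Tuple[str, ...]
    retrieval_timestamp: str  # ISO 8601 string
    adapter_name: str
    adapter_lineage_ref: str
    safety_policy_config: str
    user_role: str
    task_context: str
    output_text: str
    human_decision: Optional[str] = None  # correction or escalation, if any

    def to_json(self) -> str:
        """Serialise the record with sorted keys for stable diffs between runs."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# Hypothetical values, for illustration only
record = RunEvidence(
    model_version="base-model-v1",
    prompt_template_version="housing-reply-v3",
    retrieved_document_ids=("policy-manual-sec-4", "policy-manual-sec-7"),
    retrieval_timestamp="2024-01-01T09:00:00+00:00",
    adapter_name="casework-lora",
    adapter_lineage_ref="adapter-registry/casework-lora/1",
    safety_policy_config="no-legal-advice-v2",
    user_role="caseworker",
    task_context="housing-support query, first-pass draft",
    output_text="(draft response text)",
    human_decision="edited before sending",
)
```

Keeping all layers in one record, rather than in separate logs per component, is what lets a reviewer read the stack as an integrated configuration.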

The affected pillars are broad. Responsibility is affected because ownership must be clear across prompt design, knowledge curation, and adapter deployment. Auditability and Traceability are strongly affected because a reviewer must reconstruct the stack. Interpretability is affected because the explanation of the output depends on understanding how the layers interacted. Dependability is affected because performance may improve, but only if the stack behaves consistently under repeat use and policy updates.

In governance-readiness terms, stacked influence improves assurance only when the interaction is documented. If the authority can show that the run was grounded in current policy, constrained by a standard prompt, adapted for the casework domain, and logged with full lineage, then the evidence pack becomes suitable for review, challenge, and organisational learning. If those links are missing, the same stack becomes harder rather than easier to govern.
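The condition "assurance only when the interaction is documented" can be expressed as a simple completeness check: for each active layer, confirm that the evidence fields linking it to the run are present. This is a minimal sketch, assuming a hypothetical mapping `REQUIRED_LINKS` and hypothetical field names; an organisation would define its own.

```python
# Hypothetical mapping from each influence layer to the evidence it must supply
REQUIRED_LINKS = {
    "prompt": ("prompt_template_version",),
    "rag": ("retrieved_document_ids", "retrieval_timestamp"),
    "adapter": ("adapter_name", "adapter_lineage_ref"),
    "alignment": ("safety_policy_config",),
}


def missing_links(active_layers, evidence):
    """Return {layer: [missing fields]} for each active layer lacking evidence."""
    gaps = {}
    for layer in active_layers:
        # A field counts as missing if it is absent or empty
        absent = [name for name in REQUIRED_LINKS[layer] if not evidence.get(name)]
        if absent:
            gaps[layer] = absent
    return gaps


# Illustrative, incomplete evidence for a run that used all four layers
evidence = {
    "prompt_template_version": "housing-reply-v3",
    "retrieved_document_ids": ["policy-manual-sec-4"],
    "retrieval_timestamp": "",
    "adapter_name": "casework-lora",
}

gaps = missing_links(["prompt", "rag", "adapter", "alignment"], evidence)
```

In this example the prompt layer is fully documented while the retrieval, adapter, and alignment layers each have a gap, which is the situation in which the same stack becomes harder rather than easier to govern.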

Detailed link to RAIDT

Stacked influence links to RAIDT in four ways.

First, it reinforces RAIDT's core idea that governance should focus on the configured run rather than the abstract model.
Second, it makes the run-level unit explicit because each run must record which influence methods were active at execution time.
Third, it expands the evidence pack and shapes the score profile because evidence must connect the interaction of components, not just list them separately.
Fourth, it supports reviewability, contestability, audit readiness, and organisational learning because reviewers can reconstruct which layer likely contributed to success, failure, or drift.

Stacked influence -> Run configuration -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

In this sense, stacked influence is not peripheral to RAIDT. It is one of the clearest examples of why RAIDT needs run-level evidence. The more behaviour is shaped by combinations of interventions, the less adequate model-level description becomes.

Link to the five RAIDT pillars

This item has its strongest direct effects on Auditability, Dependability, and Traceability, but it also has important implications for Responsibility and Interpretability.

Responsibility

Stacked influence matters for Responsibility because combined interventions create distributed design responsibility. Someone must own the prompt pattern, someone must own the retrieval corpus, someone must approve the adapter, and someone must approve the alignment or policy controls.
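Distributed ownership of this kind can be recorded as a small register that is checked before a run is approved. The sketch below is an assumption about how such a check might look; the function name `unowned_layers`, the layer labels, and the owner names are hypothetical.

```python
from typing import List, Mapping, Sequence


def unowned_layers(active_layers: Sequence[str], owners: Mapping[str, str]) -> List[str]:
    """Return the active layers that have no named owner in the register."""
    return [layer for layer in active_layers if not owners.get(layer, "").strip()]


# Hypothetical ownership register, for illustration only
owners = {
    "prompt": "service design team",
    "rag": "knowledge curation team",
    "adapter": "",  # adapter deployed, but no approver recorded
}

gaps = unowned_layers(["prompt", "rag", "adapter", "alignment"], owners)
```

Here the adapter and alignment layers would be flagged, since one has a blank owner and the other is absent from the register.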