S6.02 - Baseline prompting

flowchart LR
    A1[Minimal prompt use] --> B[RAIDT - run-level evidence framework]
    A2[No comparison point] --> B
    A3[Claims of improvement without evidence] --> B
    B --> C[[Baseline prompting - reference condition]]
    C --> D[Run-level evidence pack]
    C --> E[Five-pillar score profile]
    C --> F[Reviewer reconstruction]
    C --> G[Organisational learning]
    C --> H[Governance move: evidence over assertion]
    H --> I[Audit readiness and contestability]
    J[Healthcare drafting] --> C
    K[Finance compliance] --> C
    L[Public services] --> C
    M[Education support] --> C
    N[Enterprise productivity] --> C

Star S6 - Influence Methods as Governance Interventions

Star context: Positions baseline prompting as the reference condition against which stronger influence methods can be assessed, so that RAIDT can show how governance readiness changes when additional controls are introduced.

Academic picture

Definition / background

Baseline prompting is the minimally enhanced prompt condition used as the reference point for evaluating a generative AI run before more explicit governance interventions are added. In practice, it is the prompt configuration that captures what the system would do under ordinary or default instruction conditions, without the extra scaffolding introduced by structured prompting, role specification, retrieval augmentation, or fine-tuned intervention layers.

Conceptually, baseline prompting inherits the logic of a control condition from experimental design. The point is not that the baseline is perfect, neutral, or context-free. The point is that it provides a stable comparison state from which a researcher, auditor, or governance reviewer can judge whether later interventions produce meaningful improvements in output quality, interpretability, safety, dependability, or traceability.

In generative AI governance, this matters because organisations frequently describe control measures without showing what those measures changed. A baseline prompt allows the evaluator to distinguish between performance that was already achievable with a simple instruction and performance that genuinely depends on a stronger governance mechanism. This reduces overclaiming and improves analytic discipline.

Within RAIDT, baseline prompting belongs inside the framework because RAIDT treats the run as the unit of governance. A run-level evidence pack is more defensible when it can show the unenhanced prompt condition alongside the governed condition, explain the differences in evidence, and connect those differences to the five-pillar score profile. Baseline prompting therefore supports comparison, attribution, and reviewability rather than acting as an end-state governance solution.

Baseline prompting also differs from related terms. It is not identical to zero-shot prompting, because a baseline may still contain a short task instruction or local formatting request. It is not the same as prompting in general, because its role is comparative rather than simply operational. In RAIDT terms, baseline prompting is the reference condition that helps establish whether additional influence methods are improving governance readiness or merely adding procedural complexity.

Why this concept matters

Baseline prompting matters because governance claims are weak when they lack a comparator. If an organisation says that structured prompting, RAG, or another intervention improves responsible use, the obvious follow-up question is: compared with what? A baseline prompt answers that question by establishing the simplest documented run condition against which changes can be assessed.

This avoids a common confusion in AI governance, namely the assumption that any visible control automatically creates better governance. Some controls improve performance, some improve documentation, and some do both, but without a baseline the organisation cannot demonstrate which effect has occurred. Baseline prompting therefore prevents unsupported assertions about improvement.

If baseline prompting is missing, two risks appear. First, governance teams may mistake ordinary model capability for the effect of a specific intervention. Second, they may underestimate the residual risks that remain even after a control is added, because they do not know what the uncontrolled or minimally controlled condition looked like. In both cases, the result is weaker audit readiness and weaker organisational learning.

For organisations using generative AI in real work, baseline prompting helps translate abstract governance principles into operational comparison. It supports evidence-led review, helps supervisors and examiners understand what changed between runs, and gives a practical basis for improving governance interventions over time.

Key idea: Baseline prompting matters because it gives RAIDT a documented comparison point from which governance improvement can be evidenced rather than merely claimed.

What this item enables

A documented reference condition for comparing governed and less-governed runs.
More credible claims about the value of structured prompting, RAG, PEFT/LoRA, or other influence methods.
Clearer attribution of changes in output quality, reliability, and traceability.
Stronger run-level evidence packs because the pre-intervention condition is visible.
Better scoring discipline across the five RAIDT pillars by reducing unsupported assumptions.
Organisational learning about where prompting alone is sufficient and where stronger controls are necessary.

Practical example / likely audience question

Audience question

Why include a baseline?

Answer

The concern behind this question is usually that a baseline appears too simple to be useful. The direct answer is that baseline prompting is useful precisely because it is simple: it reveals what the model can do before governance scaffolding is added, and therefore shows what extra intervention is genuinely contributing.

For example, suppose a team introduces a structured prompt template for drafting compliance summaries. If they only evaluate the templated version, they may conclude that the template produces trustworthy outputs. However, without a baseline they cannot tell whether the trustworthiness came from the template itself, from the underlying model, or from reviewer correction after the fact. A documented baseline run makes this comparison visible.

RAIDT handles this better than a generic AI governance approach because it ties the comparison to a specific run, a specific context, and a specific evidence pack. The question is not only whether a prompt seems better in general, but whether the evidence at run level shows a measurable governance improvement that can be reviewed, contested, and repeated.
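The attribution problem in the compliance-summary example can be sketched in a few lines. This is a minimal illustration, not part of RAIDT itself: it assumes a reviewer has assigned a quality score to three documented conditions (the baseline run, the templated run before review, and the templated run after reviewer correction). The function name, the 0-10 scale, and the numbers are all hypothetical.

```python
def attribute_improvement(baseline: float,
                          templated_raw: float,
                          templated_reviewed: float) -> dict[str, float]:
    """Split an observed final score into three parts.

    All inputs are reviewer-assigned scores on an assumed 0-10 scale.
    This is a naive difference, not a causal estimate.
    """
    return {
        # What the model already achieved under minimal instruction.
        "already_in_baseline": baseline,
        # Change seen when only the prompt template is added.
        "template_effect": templated_raw - baseline,
        # Change added by reviewer correction after the fact.
        "reviewer_effect": templated_reviewed - templated_raw,
    }


# Illustrative numbers only; these are not measured results.
effects = attribute_improvement(baseline=6.0,
                                templated_raw=7.0,
                                templated_reviewed=9.0)
for source, value in effects.items():
    print(f"{source}: {value:+.1f}")
```

Without the baseline score, the first two entries cannot be separated, which is the overclaiming risk described above. A single pair of runs is only indicative; repeated runs would be needed before treating the differences as stable.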

Practical example in RAIDT terms

Consider a healthcare administration use case in which a generative AI tool drafts patient discharge letters for clinician review. A baseline-prompting run uses a simple instruction such as "Draft a discharge letter from the supplied notes." The run-level issue is that the output may be fluent but inconsistent in structure, may omit any explanation of uncertainty, and may not make its dependence on the source notes visible.

In RAIDT terms, the evidence needed would include the exact baseline prompt, the model and version used, the task context, the input notes available to the model, the output produced, reviewer comments, and a comparison against a more controlled run such as a structured prompt or provenance-first RAG configuration. The pillars most affected are Dependability, Interpretability, and Traceability, with Responsibility and Auditability also engaged once reviewers must justify deployment decisions.
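The evidence items listed above could be captured in a simple record so that gaps are visible before review. The sketch below is one possible shape, not a RAIDT-defined schema: the class name `RunEvidence`, the field names, the `missing_fields` helper, and all placeholder values are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# Text fields that must be non-empty (assumed from the list above).
REQUIRED_TEXT = ("prompt", "model", "model_version",
                 "task_context", "input_notes", "output")


@dataclass
class RunEvidence:
    """One run's evidence record (hypothetical structure)."""
    run_id: str
    condition: str                      # e.g. "baseline" or "structured_prompt"
    prompt: str                         # the exact prompt text used
    model: str
    model_version: str
    task_context: str
    input_notes: str                    # or a reference to where they are stored
    output: str
    reviewer_comments: list[str] = field(default_factory=list)
    compared_with: Optional[str] = None  # run_id of the more controlled run


def missing_fields(record: RunEvidence) -> list[str]:
    """Return the names of evidence items that have not been recorded."""
    gaps = [name for name in REQUIRED_TEXT
            if not getattr(record, name).strip()]
    if not record.reviewer_comments:
        gaps.append("reviewer_comments")
    if record.compared_with is None:
        gaps.append("compared_with")
    return gaps


# Placeholder values only; no real clinical data or model is implied.
baseline = RunEvidence(
    run_id="run-001",
    condition="baseline",
    prompt="Draft a discharge letter from the supplied notes.",
    model="example-model",
    model_version="",                   # deliberately left blank
    task_context="Discharge letter drafting for clinician review",
    input_notes="<supplied notes>",
    output="<model output>",
    compared_with="run-002",
)

print(missing_fields(baseline))         # ['model_version', 'reviewer_comments']
print(json.dumps(asdict(baseline), indent=2))
```

Recording the comparison run's identifier on the baseline record keeps the pairing explicit, so a reviewer can reconstruct which governed run the baseline was meant to be judged against.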

Baseline prompting improves governance readiness here because it shows the organisation what the model does under minimal instruction before stronger safeguards are introduced. That evidence helps determine whether later controls genuinely improve consistency, explanation quality, traceability to source material, and reviewer confidence, rather than simply making the workflow look more formal.

Detailed link to RAIDT

Baseline prompting links to RAIDT in four ways.

First, it supports RAIDT's core idea that governance should be grounded in evidence about actual configured uses of generative AI rather than general claims about models.
Second, it links directly to the run because the baseline must be specified for a particular task, time, model configuration, and organisational context.
Third, it strengthens both the evidence pack and the score profile by making it possible to compare a minimally controlled condition with a more governed one.
Fourth, it improves reviewability, contestability, audit readiness, and organisational learning because reviewers can see what changed, why it changed, and whether the change improved governance performance.

Baseline prompting -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

The chain matters because baseline prompting is the starting comparator that turns later intervention claims into assessable governance evidence.
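The comparison step in this chain can be illustrated as a per-pillar difference between two score profiles. The sketch below assumes integer scores on a hypothetical 0-4 scale; the source does not specify RAIDT's scoring scale, and the numbers and the helper name `profile_delta` are invented for illustration.

```python
PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")


def profile_delta(baseline: dict[str, int],
                  governed: dict[str, int]) -> dict[str, int]:
    """Return governed minus baseline for each of the five pillars."""
    missing = [p for p in PILLARS if p not in baseline or p not in governed]
    if missing:
        # An incomplete profile cannot support a comparison claim.
        raise ValueError(f"Score profile incomplete for: {', '.join(missing)}")
    return {p: governed[p] - baseline[p] for p in PILLARS}


# Illustrative scores only (assumed 0-4 scale); not measured results.
baseline_profile = {"Responsibility": 2, "Auditability": 1,
                    "Interpretability": 1, "Dependability": 2,
                    "Traceability": 1}
governed_profile = {"Responsibility": 2, "Auditability": 3,
                    "Interpretability": 2, "Dependability": 3,
                    "Traceability": 3}

delta = profile_delta(baseline_profile, governed_profile)
unchanged = [p for p, d in delta.items() if d == 0]

for pillar, change in delta.items():
    print(f"{pillar:<17}{change:+d}")
print("No change from baseline:", unchanged)
```

Pillars with no change are worth as much attention as those that improve: they show where the added control contributed procedure but no evidenced governance gain.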

Link to the five RAIDT pillars

Responsibility

Baseline prompting helps clarify what level of human and organisational responsibility is needed when the system is used with minimal control. It exposes whether governance claims rely too heavily on trust in the model rather than accountable oversight.