C0.04 - Evidence_pack

C0.04 - Evidence pack

flowchart LR
    A["Fragmented evidence<br/>logs, prompts, approvals, policy references"] --> B["RAIDT<br/>run-level evidence framework"]
    H["Practical evidence elements<br/>run ID, task, prompts, settings, outputs, review, retention"] --> C[["Evidence pack<br/>structured record of one run"]]
    B --> C
    C --> D[Reviewer reconstruction]
    C --> E[RAIDT score profile]
    C --> F[Contestability and audit readiness]
    E --> G[Governance readiness and organisational learning]

Star C0 - RAIDT Core, Definition, Values, Claims and Innovation

Star context: Defines the project identity of RAIDT by showing that the framework's first practical governance artefact is a structured evidence object for one run, not merely a policy statement or assurance claim.


Definition / background

An evidence pack is the structured, review-ready record of one RAIDT run. It assembles the run-level evidence associated with a specific use of a generative AI system and presents it in a form that can be inspected by a supervisor, auditor, manager, regulator, complaint handler, or researcher. In practical terms, the pack links identifiers, task context, prompts, configuration, retrieval or source material, model settings, outputs, integrity checks, human review actions, decisions, and retention rules.
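As a minimal sketch of how the elements listed above could be held in one structured record, the following uses a Python dataclass. The class name `EvidencePack`, every field name, and the sample values are assumptions made for illustration; RAIDT as described here does not prescribe a schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Optional
import json


@dataclass
class EvidencePack:
    """Illustrative record of one run. Field names are assumptions, not a RAIDT schema."""
    run_id: str
    task_context: str
    prompts: list[str]
    model_settings: dict[str, str]
    outputs: list[str]
    source_material: list[str] = field(default_factory=list)
    integrity_checks: list[str] = field(default_factory=list)
    review_actions: list[str] = field(default_factory=list)
    decision: Optional[str] = None
    retention_rule: Optional[str] = None

    def to_json(self) -> str:
        """Serialise the pack so a reviewer or another tool can read it."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical example values for one run.
pack = EvidencePack(
    run_id="run-0001",
    task_context="Draft reply to a customer complaint",
    prompts=["Summarise the complaint and propose a reply."],
    model_settings={"model": "example-model", "temperature": "0.2"},
    outputs=["Draft reply text"],
    review_actions=["Edited by case handler", "Approved by supervisor"],
    decision="accepted with edits",
    retention_rule="retain for the complaint-handling period",
)
print(pack.to_json())
```

Keeping human review actions and the decision in the same record as the technical settings reflects the point that the pack is a governance object rather than a raw trace.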

This distinction matters conceptually. Run-level evidence is the underlying evidential material produced by or around one run. The evidence pack is the organised governance object created from that material. In other words, run-level evidence is the substance, while the evidence pack is the structured presentation of that substance for review and judgement. The pack is therefore more than a log bundle and more governance-oriented than a raw technical trace.

Within RAIDT, the evidence pack is one of the framework's two practical outputs, alongside the five-pillar score profile. The pack supports the score profile by giving reviewers a documented basis for judging Responsibility, Auditability, Interpretability, Dependability, and Traceability. Without an evidence pack, a score profile risks becoming insufficiently justified; without run-level evidence, an evidence pack cannot be meaningfully assembled.

This item belongs inside RAIDT Core because it shows how the framework turns the abstract commitment to evidence over assertion into an operational artefact. RAIDT is not only a way of saying that governance should be evidence-based. It specifies what that evidence should look like when organised for real organisational scrutiny.

Why this concept matters

The evidence pack solves a practical governance problem: even when organisations collect fragments of information about GenAI use, they often cannot present those fragments as one coherent proof object. When a supervisor asks what happened in a specific run, when a complaint must be handled, or when an internal review is triggered, disconnected records are difficult to interpret and easy to contest. The evidence pack provides a structured answer.

It also avoids a common confusion between having data and having reviewable governance evidence. Logs, prompts, screenshots, approval emails, and version numbers may all exist, but if they are not assembled into a meaningful record, governance remains weak. The evidence pack converts scattered traces into an inspectable unit that can support explanation, challenge, and organisational learning.

If the evidence pack is missing, organisations may struggle to justify decisions, compare runs, explain why a score was assigned, or show that human oversight actually occurred. The result is often a return to assertion: people say a process was controlled, but cannot demonstrate it clearly. RAIDT uses the evidence pack to make operational governance visible.

Key idea: The evidence pack matters because it turns scattered run-level traces into a structured proof object that supports review, challenge, scoring, and governance readiness.

Practical example / likely audience question

Audience question

Is the evidence pack just an archive of logs and documents, or does it do something more specific in RAIDT?

Answer

The concern behind this question is that organisations already generate many records, and a new governance artefact may look like extra bureaucracy. The direct answer is that the evidence pack is not simply a store of miscellaneous files. It is a structured, review-oriented package that assembles the evidential pieces of one run into a form that another person can inspect and use.

For example, imagine a financial-services team using GenAI to draft a customer complaint response. Raw traces may exist across several places: the prompt in one interface, the model version in system logs, the draft reply in a document, and approval comments in email or ticketing software. An evidence pack brings those pieces together and shows the run as one governance event. A reviewer can then see what the task was, what the model produced, how staff intervened, what checks were performed, and why the final response was accepted.

RAIDT handles this better than a generic AI governance approach because it does not stop at saying that evidence exists somewhere in the organisation. It requires a run-level proof object that can be reconstructed, inspected, and linked to scoring and governance readiness. That makes the evidence pack operational rather than merely archival.

Practical example in RAIDT terms

Consider a public-services setting in which a caseworker uses GenAI to draft a summary of a citizen's housing-support case before a supervisory review meeting. The GenAI use case is administratively useful, but the run-level issue is whether the summary accurately reflects the case record, avoids unsupported inferences, and can be defended if the citizen later challenges the decision process.

The evidence needed includes the run identifier, task purpose, prompt template, source case notes, any retrieval context, the model and configuration used, the generated summary, the caseworker's edits, the supervisor's comments, and the final decision about whether the draft could be used. Responsibility is affected because the organisation must show who reviewed and approved the summary. Auditability is affected because a later reviewer must be able to reconstruct the run. Interpretability is affected because the pack should show how the summary emerged from the prompt and source record. Dependability is affected because the organisation must assess whether the drafting process is consistently reliable. Traceability is affected because the run must remain linked to the relevant actor, artefacts, and decision stage.

The evidence pack improves governance readiness because it gives the organisation a defensible record for internal review, appeal handling, training improvement, and policy refinement. Instead of relying on a vague statement that staff used AI appropriately, the organisation can show what happened in one concrete case.

Detailed link to RAIDT

The evidence pack links to RAIDT in four ways.

First, it gives concrete form to RAIDT's core value of evidence over assertion by turning one GenAI use event into an inspectable governance artefact.
Second, it depends on the run and its run-level evidence, because the pack is assembled around one specific configured use of GenAI in context.
Third, it is one of RAIDT's two practical outputs and provides the documented basis from which a RAIDT score profile can be justified.
Fourth, it supports reviewability, contestability, audit readiness, and organisational learning because it gives reviewers a structured object for reconstruction and comparison.

Run -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

Link to the five RAIDT pillars

Responsibility

The evidence pack supports Responsibility by showing who initiated the run, who reviewed it, who approved or rejected it, and what organisational purpose the run served.

Example evidence / implication:

Auditability

This item has a particularly strong effect on Auditability because the evidence pack is the main object a reviewer inspects when reconstructing a run after the event.

Example evidence / implication:

Interpretability

The evidence pack supports Interpretability by preserving enough contextual detail to explain how an output emerged in practice, even if the internal model remains only partly interpretable.

Example evidence / implication:

Dependability

The evidence pack supports Dependability by enabling repeated examination of whether comparable runs produce acceptable outputs under controlled conditions.

Example evidence / implication:

Traceability

The evidence pack is also central to Traceability because it connects the run to time, actor, tool version, source materials, and downstream decisions or actions.

Example evidence / implication:

The evidence pack affects all five pillars, but it is especially strong for Auditability and Traceability because those pillars depend directly on whether the run can be inspected as a coherent record.
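One way to make the pillar links concrete is to check, per pillar, whether the pack contains the kinds of evidence that pillar relies on. The sketch below is only illustrative: the mapping `PILLAR_FIELDS`, the field names, and the function `evidence_gaps` are assumptions loosely derived from the pillar descriptions above, not a RAIDT-defined rubric.

```python
# Hypothetical mapping from each pillar to the pack fields that support it.
PILLAR_FIELDS = {
    "Responsibility": ["initiated_by", "reviewed_by", "approved_by", "purpose"],
    "Auditability": ["run_id", "prompt", "output", "review_notes"],
    "Interpretability": ["prompt", "source_materials", "output"],
    "Dependability": ["model_version", "settings", "checks_performed"],
    "Traceability": ["timestamp", "initiated_by", "model_version",
                     "source_materials", "downstream_decision"],
}


def evidence_gaps(pack: dict) -> dict[str, list[str]]:
    """Return, for each pillar, the supporting fields that are absent or empty."""
    gaps = {}
    for pillar, fields in PILLAR_FIELDS.items():
        missing = [name for name in fields if not pack.get(name)]
        if missing:
            gaps[pillar] = missing
    return gaps


# Hypothetical pack for the housing-support summary example.
example = {
    "run_id": "run-0002",
    "timestamp": "2024-01-01T09:00:00Z",
    "purpose": "Summarise housing-support case for supervisory review",
    "initiated_by": "caseworker",
    "reviewed_by": "supervisor",
    "prompt": "Summarise the case notes.",
    "source_materials": ["case notes"],
    "model_version": "example-model-v1",
    "settings": {"temperature": 0.2},
    "checks_performed": ["summary compared with case notes"],
    "output": "Draft summary",
    "review_notes": "Two unsupported inferences removed",
}
gaps = evidence_gaps(example)
print(gaps)
```

A gap report of this kind does not score the run. It only shows a reviewer where the documented basis for a pillar judgement is thin.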

Why this item is more than a generic concept

In general AI governance, an evidence pack might be understood loosely as a folder of supporting documents, compliance records, or assurance artefacts. In RAIDT, it has a more specific and operational meaning: it is the structured record of one run, assembled from run-level evidence so that the run can be reconstructed, challenged, scored, and reviewed.

The RAIDT meaning is more operational because the evidence pack is tied to a defined unit of governance, the run, and to explicit downstream uses such as the score profile and governance-readiness assessment. It is therefore not just documentation storage. It is a practical mechanism for accountable review.

Common misunderstanding

Misunderstanding

If an evidence pack exists, it proves that the run was compliant, correct, or safe.

Correction

An evidence pack does not prove that a run was good. It proves that the run can be examined. A strong pack may reveal that a process failed, that oversight was weak, or that a decision should be challenged. For example, a pack may clearly show that a GenAI-generated case summary was reviewed too late or against incomplete source material. That is still valuable governance evidence. In RAIDT, the pack supports judgement and contestability; it does not replace judgement.

Boundary and limitation

The evidence pack does not replace broader AI governance mechanisms such as procurement due diligence, model evaluation, staff training, legal compliance review, or system-level risk assessment. It also does not guarantee that all relevant facts were captured. If evidence capture is poorly designed, inconsistent, or incomplete, the resulting pack may give only a partial view of the run.

There is also a proportionality challenge. Rich evidence packs are particularly useful in higher-risk or contested settings, but excessive capture for low-significance tasks may create burden, privacy concerns, or poor data quality. RAIDT handles this limitation by treating the evidence pack as a structured governance object that should be proportionate to context, sensitivity, and review need rather than indiscriminately maximal.
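Proportionality could be expressed as different minimum contents for different levels of risk. The following is a sketch under stated assumptions: the three tier names, the fields required at each tier, and the helper `missing_for_tier` are all hypothetical choices for illustration, and an organisation would set its own.

```python
# Hypothetical minimum contents; lighter packs for lower-significance tasks.
BASE_FIELDS = {"run_id", "purpose", "prompt", "output"}
REQUIRED_BY_TIER = {
    "low": BASE_FIELDS,
    "medium": BASE_FIELDS | {"model_version", "review_notes"},
    "high": BASE_FIELDS | {
        "model_version", "review_notes", "source_materials",
        "approval_status", "retention_decision",
    },
}


def missing_for_tier(pack: dict, tier: str) -> list[str]:
    """List the fields a pack still needs to meet the given tier's minimum."""
    required = REQUIRED_BY_TIER[tier]
    return sorted(name for name in required if not pack.get(name))


light_pack = {
    "run_id": "run-0003",
    "purpose": "Reword an internal meeting note",
    "prompt": "Make this note more concise.",
    "output": "Shortened note",
}
print(missing_for_tier(light_pack, "low"))
print(missing_for_tier(light_pack, "high"))
```

The same pack is sufficient for a low-significance task and incomplete for a high-risk one, which is the sense in which capture is proportionate rather than maximal.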

Implementation levels

Manual implementation

A researcher or small team can create an evidence pack manually using a template that records the run identifier, purpose, prompt, inputs, settings, outputs, review notes, approval status, and retention decision. This is feasible for pilot studies, supervision, viva preparation, or high-value use cases.

Semi-automated implementation

Semi-automated implementation can combine structured forms, metadata capture, and checklist-based review. For example, a wrapper or workflow form can automatically attach timestamps, model identifiers, and prompt fields, while requiring a human reviewer to complete decision notes and sign-off fields.
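A wrapper of the kind described might look like the sketch below. It assumes a caller-supplied `generate` function standing in for whatever GenAI call the organisation uses; `capture_run`, `sign_off`, `is_review_complete`, and the record keys are hypothetical names, not part of any RAIDT tooling.

```python
from datetime import datetime, timezone
from typing import Callable
import uuid


def capture_run(prompt: str, model_id: str, generate: Callable[[str], str]) -> dict:
    """Run the model call and attach metadata automatically.

    Review fields are left empty on purpose: a human must complete them.
    """
    output = generate(prompt)
    return {
        "run_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,
        "prompt": prompt,
        "output": output,
        "decision_notes": None,
        "signed_off_by": None,
    }


def sign_off(record: dict, reviewer: str, notes: str) -> dict:
    """Return a copy of the record with the human review fields completed."""
    if not reviewer.strip() or not notes.strip():
        raise ValueError("reviewer and decision notes are both required")
    return {**record, "signed_off_by": reviewer, "decision_notes": notes}


def is_review_complete(record: dict) -> bool:
    """True only when a named reviewer has recorded decision notes."""
    return bool(record["signed_off_by"] and record["decision_notes"])


# Usage with a stand-in generator instead of a real model call.
record = capture_run("Summarise the case.", "example-model-v1",
                     lambda p: "draft: " + p)
reviewed = sign_off(record, "supervisor", "Checked against case notes")
print(is_review_complete(record), is_review_complete(reviewed))
```

Separating automatic capture from human sign-off keeps the division of labour visible: the tool records what it can observe, and the reviewer remains accountable for the decision.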

Fully automated implementation

At scale, a platform, orchestration layer, or governance pipeline can assemble evidence packs automatically from run metadata, stored artefacts, review actions, version records, and scoring inputs. In this form, the pack becomes a reusable governance object for dashboards, audits, incident review, and cross-team learning.
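Automated assembly can be sketched as gathering records that share a run identifier from several stores. In this illustration the stores are plain in-memory lists and the names `assemble_pack`, `sources`, and `missing_sources` are assumptions; a real pipeline would query its own logging, document, and ticketing systems.

```python
def assemble_pack(run_id: str, sources: dict[str, list[dict]]) -> dict:
    """Collect every record for one run and note which sources had nothing."""
    pack = {"run_id": run_id, "missing_sources": []}
    for name, records in sources.items():
        matches = [r for r in records if r.get("run_id") == run_id]
        if matches:
            pack[name] = matches
        else:
            # Recording the absence keeps the gap visible to a reviewer.
            pack["missing_sources"].append(name)
    return pack


# Hypothetical stores holding fragments from two different runs.
sources = {
    "run_metadata": [
        {"run_id": "run-0004", "model_version": "example-model-v1"},
        {"run_id": "run-0005", "model_version": "example-model-v1"},
    ],
    "artefacts": [{"run_id": "run-0004", "output": "Draft summary"}],
    "review_actions": [{"run_id": "run-0005", "approved_by": "supervisor"}],
}
assembled = assemble_pack("run-0004", sources)
print(assembled["missing_sources"])
```

Listing the empty sources explicitly, rather than silently omitting them, fits the point that a pack may give only a partial view of the run when capture is incomplete.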

Practical use in the RAIDT project

Within the RAIDT project, this item is important for Paper 08 Foundations because it shows how RAIDT turns the abstract notion of evidence-based governance into a concrete artefact. It is equally relevant to Paper 09 Empirical Validation because practical testing of RAIDT depends on whether evidence packs can be assembled consistently and judged usefully across cases.

For Paper 10 Policy Pathways, the evidence pack provides a bridge between conceptual governance claims and implementable organisational controls. It also matters for sector playbooks because different domains will populate the pack differently while preserving the same RAIDT logic. In the scoring rubric, the evidence pack provides the basis for justified scoring rather than impressionistic assessment. In influence methods and governance interventions, it helps explain how organisations can move from broad AI principles to practical review mechanisms.

For PhD supervision, viva defence, and journal positioning, the evidence pack is especially useful because it answers a simple but demanding question: what does RAIDT actually produce that another person can inspect? The answer is that RAIDT produces a structured run-level evidence object that supports scrutiny, comparison, and learning.

Key audience questions to prepare for

Q1. Why is the evidence pack a distinct concept if RAIDT already has run-level evidence?

Run-level evidence refers to the underlying material associated with a run. The evidence pack is the structured assembly of that material into a review-ready object. The distinction matters because governance depends not only on capturing evidence, but also on presenting it coherently for inspection.

Q2. Is every GenAI interaction supposed to generate a full evidence pack?

Not necessarily at the same depth. RAIDT implies proportionality. Higher-stakes, externally visible, or contestable uses justify richer packs, while lower-significance tasks may require a lighter structure. The key principle is that the pack should be sufficient for reconstruction and review relative to risk.

Q3. How does the evidence pack relate to the score profile?

The score profile is the evaluative summary across the five RAIDT pillars. The evidence pack is the supporting basis for that evaluation. Without a pack, the score profile can become weakly evidenced or difficult to defend.
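The dependency of the score profile on the pack can be illustrated by requiring each pillar score to cite evidence that actually exists in the pack. This is a hedged sketch: the profile layout, the numeric scores, and the function `unjustified_scores` are assumptions, since the scoring scale is not specified here.

```python
def unjustified_scores(profile: dict, pack: dict) -> list[str]:
    """Return pillars whose score cites no evidence, or evidence absent from the pack."""
    weak = []
    for pillar, entry in profile.items():
        refs = entry.get("evidence", [])
        if not refs or any(not pack.get(ref) for ref in refs):
            weak.append(pillar)
    return weak


# Hypothetical pack contents and score profile (score values are illustrative).
pack_fields = {
    "reviewed_by": "supervisor",
    "prompt": "Summarise the case notes.",
    "output": "Draft summary",
}
profile = {
    "Responsibility": {"score": 3, "evidence": ["reviewed_by", "approved_by"]},
    "Auditability": {"score": 4, "evidence": ["prompt", "output"]},
    "Traceability": {"score": 2, "evidence": []},
}
print(unjustified_scores(profile, pack_fields))
```

Here the Responsibility score cites an approval that the pack does not contain and the Traceability score cites nothing, so both would be difficult to defend on review.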

Q4. Can an evidence pack include human judgement and not just technical traces?

Yes. In RAIDT it should. Human edits, reviewer comments, approval decisions, and reasons for acceptance or rejection are part of the governance significance of the run. Excluding them would make the pack less useful for accountability.

Q5. What makes the evidence pack valuable to organisations beyond compliance?

It supports learning as well as assurance. Evidence packs help organisations analyse incidents, compare practice across teams, refine workflows, train staff, and identify where governance controls are strong or weak in real use rather than in theory.

Short explanation for presentation

The evidence pack is one of RAIDT's two practical outputs and can be understood as the structured record of one GenAI run. It takes the raw material of run-level evidence and assembles it into a review-ready object that another person can inspect. That is important because organisations often have scattered traces of AI use, but not a coherent record that supports scrutiny. In RAIDT, the evidence pack links task context, prompts, settings, source materials, outputs, human review, decisions, and retention rules. This makes the run reconstructable and gives a defensible basis for assigning a five-pillar RAIDT score profile. In supervision, policy, or practice discussions, the evidence pack is useful because it shows that RAIDT does not merely advocate better governance in principle. It specifies a concrete artefact through which governance can be reviewed, challenged, and improved.

One-line takeaway

The evidence pack is the structured review object for one GenAI run: it is how RAIDT turns run-level evidence into inspectable governance evidence.
