C0.08 - Core_value_reviewability

C0.08 - Core value: reviewability

flowchart LR
    A[Fragmented records and memory-based reconstruction] --> B[RAIDT<br/>Run-level evidence framework]
    A2[Generic policy claims without case-level proof] --> B
    B --> C[[Core value: reviewability<br/>Inspectable run after the event]]
    F1[Healthcare]
    F2[Finance]
    F3[Education]
    F4[Public services]
    F5[Enterprise productivity]
    F1 --> C
    F2 --> C
    F3 --> C
    F4 --> C
    F5 --> C
    C --> D[Run-level evidence pack]
    C --> E[RAIDT score profile]
    C --> H[Reviewer reconstruction]
    D --> I[Contestability]
    E --> J[Governance readiness]
    H --> K[Organisational learning]
    J --> L[Audit-ready oversight]

Star C0 - RAIDT Core, Definition, Values, Claims and Innovation

Star context: This item defines one of the core values that gives RAIDT its practical governance character. Within the C0 star, reviewability explains why RAIDT treats the run as the unit of analysis: governance should leave behind enough structured evidence for another person to inspect, reconstruct, and evaluate what happened in organisational GenAI use.


Definition / background

Reviewability means that a run can be examined after the fact by someone who was not the original operator and who may not share the operator's memory, assumptions, or role. In RAIDT, this value is essential because governance cannot depend on recollection, informal explanation, or trust in the user's intentions alone. A responsible governance framework must make it possible for another person to inspect what task was attempted, under what conditions, with which model and configuration, what evidence was captured, and how the output was interpreted or acted upon.

Conceptually, reviewability sits close to ideas such as transparency, auditability, documentation, and procedural accountability, but it is not identical to any one of them. Transparency may refer broadly to openness, while documentation may simply mean that records exist. Reviewability is more demanding: it asks whether those records are sufficiently organised and complete to support meaningful later inspection. A run may be documented and still not be reviewable. Likewise, a system may be auditable at a high level while an individual use instance remains difficult to reconstruct.

This is why the concept belongs centrally inside RAIDT. RAIDT treats the run as the unit of governance, so reviewability must be established at the level where real GenAI work actually occurs. The run-level evidence pack provides the artefacts needed for inspection, while the five-pillar score profile provides a structured way of interpreting how governance quality appears in that run. Reviewability is therefore both a value and a design requirement: the framework must capture evidence in a form that enables later scrutiny rather than merely asserting that governance took place.

Reviewability also has a strong relationship with the five RAIDT pillars. Responsibility depends on being able to inspect role allocation and decision ownership. Auditability depends on whether an independent reviewer can examine a run systematically. Interpretability depends on whether the reasons for outputs and decisions can be made intelligible in context. Dependability depends on being able to examine consistency, failure, and control. Traceability depends on linking outputs back to prompts, models, settings, actors, and timestamps. For that reason, reviewability is best understood as a cross-cutting core value that helps make the entire RAIDT architecture usable for governance.

Why this concept matters

Organisations increasingly use GenAI inside workflows that matter, yet many of those uses remain difficult to inspect once the moment of use has passed. If a problematic output appears, a manager, auditor, supervisor, or affected stakeholder may ask what happened. Without reviewability, the answer often becomes partial and unreliable: the user remembers some details, the output survives, but the surrounding run conditions are missing. That is a weak basis for governance, assurance, or learning.

Reviewability matters because it solves a practical governance problem. It allows later reviewers to move from a vague narrative about what probably happened to a more grounded reconstruction of what actually happened in the run. This is especially important when GenAI is used in settings with compliance exposure, service risk, public accountability, or quality assurance obligations. In such contexts, an organisation does not simply need outputs; it needs inspectable evidence about how outputs were produced and handled.

The concept also prevents a common confusion in AI governance: the belief that a policy, a principle statement, or a training declaration is enough. Those artefacts matter, but they do not by themselves make a specific use of GenAI reviewable. RAIDT closes that gap by operationalising reviewability through run-level evidence, evidence packs, and structured scoring. This helps shift governance away from principle-only compliance and towards operational governance that can survive scrutiny.

Key idea: Reviewability matters because it makes organisational GenAI use inspectable after the event at the level of the individual run, rather than leaving governance to memory, trust, or generic policy claims.

What this item enables
Practical example / likely audience question

Audience question

Who benefits from reviewability in RAIDT, and is it only useful for formal auditors?

Answer

The concern behind this question is that reviewability might sound like a narrow compliance feature designed only for audit teams. In practice, the beneficiary group is much wider. Formal auditors do benefit because reviewability supports systematic inspection, but so do supervisors, compliance teams, operational managers, model risk teams, service owners, affected users, and internal quality reviewers. Any party that needs to understand, assess, defend, question, or improve a GenAI-supported action benefits from a run being reviewable.

The direct answer is that reviewability serves both assurance and learning. It allows someone to check whether a run followed appropriate governance expectations, but it also helps a team understand where process quality was strong or weak. For example, if a staff member uses a GenAI system to draft a policy memo and the final document later contains a problematic claim, reviewability allows the organisation to inspect the run context, the prompt framing, the evidence captured, the human verification steps, and the basis for acceptance. Without that structure, the team is left with only the final text and an incomplete recollection of how it was produced.

RAIDT handles this issue better than a generic AI governance approach because it does not stop at saying that AI use should be documented or monitored. It ties reviewability to the run itself and connects that run to a structured evidence pack and score profile. That means the question "who benefits?" can be answered operationally: anyone who must inspect, reconstruct, challenge, or improve a real use instance benefits from RAIDT's form of reviewability.

Practical example in RAIDT terms

Consider a public service department using a GenAI assistant to draft responses to citizen complaints about housing support. One run involves a caseworker asking the system to summarise a complaint history, generate a draft response, and suggest whether the complaint should be escalated. The run-level issue is not only whether the draft looks plausible, but whether the department can later review how the draft was produced if the citizen challenges the response.

The evidence needed includes the task purpose, prompt wording, relevant case context, model or tool used, timestamp, staff role, any retrieval or source material provided, the generated draft, the edits made by the human caseworker, and the reason the final response was approved. The affected RAIDT pillars are Responsibility, because ownership and approval matter; Auditability, because an independent reviewer must be able to inspect the run; Interpretability, because the reviewer must understand why the draft took the form it did; Dependability, because the organisation must assess whether the process behaves consistently; and Traceability, because the draft and final response must be linked back to the run conditions.
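The evidence items listed above can be sketched as a single record type. This is a minimal illustration in Python, not a schema defined by RAIDT: the class name `RunEvidence`, every field name, and the sample values are hypothetical and simply mirror the items named in the paragraph.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class RunEvidence:
    """Hypothetical run-level evidence record; field names are illustrative only."""
    run_id: str
    task_purpose: str
    prompt: str
    case_context: str
    model: str
    staff_role: str
    generated_draft: str
    human_edits: str
    approval_reason: str
    # Retrieval or source material supplied to the model, if any.
    source_material: list[str] = field(default_factory=list)
    # Capture time in UTC, recorded when the record is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


record = RunEvidence(
    run_id="run-0001",
    task_purpose="Draft a response to a housing support complaint",
    prompt="Summarise the complaint history and draft a reply.",
    case_context="Complaint history for one citizen case",
    model="example-assistant",
    staff_role="caseworker",
    generated_draft="Dear resident, thank you for contacting us.",
    human_edits="Softened tone; corrected the payment date.",
    approval_reason="Checked against the case file by the caseworker.",
    source_material=["complaint-history.txt"],
)

# A plain dictionary is easy to store as JSON or as a spreadsheet row.
pack = asdict(record)
```

Keeping the record flat and explicit means a later reviewer can see at a glance which items were captured for the run and which were left empty.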

Reviewability improves governance readiness here because it allows the department to move beyond saying that staff are trained and careful. It can show what happened in the run, how the output was handled, and whether the process supports later scrutiny. That becomes valuable in complaint resolution, internal assurance, policy refinement, and external accountability.

Detailed link to RAIDT

Reviewability links to RAIDT in four ways.

First, it reinforces RAIDT's core idea that responsible GenAI governance should be based on evidence from actual use rather than policy assertions alone.
Second, it depends on the run being captured as a concrete governance unit that can be reconstructed after the event.
Third, it becomes operational through the run-level evidence pack and is interpreted through the RAIDT score profile rather than through informal commentary alone.
Fourth, it directly supports contestability, audit readiness, and organisational learning because later reviewers can inspect evidence instead of relying on memory.

Reviewability -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

This chain matters because reviewability is not an isolated moral value in RAIDT. It is implemented through evidence capture and structured interpretation, which is why it contributes directly to governance readiness rather than remaining a purely conceptual aspiration.

Link to the five RAIDT pillars

Responsibility

Reviewability strengthens Responsibility because it makes role allocation, approval, oversight, and exception handling inspectable after the run.

Example evidence / implication:

Auditability

Reviewability has its strongest direct relationship with Auditability because a run cannot be meaningfully audited if it cannot first be reconstructed and examined.

Example evidence / implication:

Interpretability

Reviewability supports Interpretability by ensuring that later reviewers can understand not just what the model produced, but how the run context shaped the output and how humans interpreted it.

Example evidence / implication:

Dependability

Reviewability contributes to Dependability because repeat failures, unstable practices, and weak controls become visible only when runs can be examined consistently over time.

Example evidence / implication:

Traceability

Reviewability depends heavily on Traceability because later inspection requires links between the run, the model, the prompt, the output, the actor, and the decision context.

Example evidence / implication:

Reviewability most strongly affects Auditability and Traceability, but it also has material implications for the other pillars because meaningful governance review usually spans all five.

Why this item is more than a generic concept

In general AI governance, reviewability may mean little more than keeping records so someone can look back later if needed. In RAIDT, it means something more operational and more demanding. A run is reviewable only when the evidence is sufficiently structured, contextualised, and preserved to let another person reconstruct and evaluate the use instance.

That RAIDT meaning is more operational because it is tied to run-level evidence rather than abstract documentation duties. It is also more analytically useful because it links directly to evidence packs, scoring, and governance readiness. In other words, RAIDT does not treat reviewability as a vague virtue. It turns it into a practical test of whether governance survives scrutiny at the level where GenAI is actually used.

Common misunderstanding

Misunderstanding

Reviewability simply means storing logs or keeping a copy of the output.

Correction

That is too weak. Logs and outputs may be part of reviewability, but they do not guarantee that a later reviewer can understand what happened in the run. A stored answer without the task context, prompt conditions, human role, model setting, and acceptance rationale is often not meaningfully reviewable. For example, saving only a generated clinical note draft would not allow a reviewer to judge whether the draft was appropriate, whether the source context was sufficient, or whether a clinician checked it properly. In RAIDT, reviewability requires enough structured evidence to support informed reconstruction and assessment, not mere record retention.
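The difference between a stored output and a reviewable run can be illustrated with a simple completeness check. The sketch below is an assumption-laden example, not part of RAIDT: the field names in `REQUIRED_FIELDS` and the function `missing_evidence` are hypothetical, and the check only tests whether evidence is present, not whether it is adequate.

```python
# Hypothetical minimum evidence for a reviewable run, taken from the items
# named in the correction above. Presence is necessary, not sufficient.
REQUIRED_FIELDS = (
    "task_context",
    "prompt",
    "human_role",
    "model_setting",
    "output",
    "acceptance_rationale",
)


def missing_evidence(record: dict) -> list[str]:
    """Return the required fields that are absent or blank in the record."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(record.get(name) or "").strip()
    ]


# A saved output on its own: most of the run conditions are missing.
output_only = {"output": "Draft clinical note"}

# A run captured with its surrounding conditions.
full_run = {
    "task_context": "Discharge summary for one patient episode",
    "prompt": "Draft a discharge note from the attached observations.",
    "human_role": "clinician",
    "model_setting": "example-model, default settings",
    "output": "Draft clinical note",
    "acceptance_rationale": "Checked against the chart by the clinician.",
}

gaps_output_only = missing_evidence(output_only)
gaps_full_run = missing_evidence(full_run)
```

A check like this can flag runs that are clearly not reviewable, but a human reviewer still has to judge whether the captured evidence is sufficient for reconstruction.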

Boundary and limitation

Reviewability does not prove that a run was correct, safe, fair, or compliant. It only makes later inspection possible and more rigorous. A run can be highly reviewable and still reveal poor judgement, weak evidence, or harmful outputs. In that sense, reviewability is a governance enabler, not a guarantee of good outcomes.

It also depends on conditions that may not always hold. Evidence can be incomplete, logs can be inconsistently captured, privacy constraints may limit what is retained, and reviewers may still disagree about interpretation. Highly dynamic workflows may also make reconstruction difficult if data sources, tools, or contextual inputs change quickly. RAIDT handles this limitation by combining reviewability with the wider evidence pack and the five-pillar score profile, so that gaps can be made visible, discussed explicitly, and improved over time rather than hidden behind false confidence.

Implementation levels

Manual implementation

A researcher or small team can implement reviewability manually by using structured run templates, saving prompts and outputs, recording task context, noting model details, and documenting who reviewed or approved the result. Even a spreadsheet or markdown-based evidence pack can improve reviewability substantially if the fields are consistent.
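As a rough illustration of the manual approach, the following Python sketch appends runs to a spreadsheet-style CSV log and rejects rows that omit a field, which is one way to keep the fields consistent. The column names in `FIELDS` and the function `append_run` are hypothetical choices for this example, not a RAIDT template.

```python
import csv
import io

# Hypothetical column set for a manual evidence log.
FIELDS = [
    "run_id", "date", "task_context", "model",
    "prompt", "output", "reviewer", "approved",
]


def append_run(buffer, row: dict) -> None:
    """Append one run to a CSV log, refusing rows with missing fields."""
    missing = [name for name in FIELDS if name not in row]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    if buffer.tell() == 0:
        # First write to an empty log: emit the header row once.
        writer.writeheader()
    # DictWriter also raises ValueError if the row has unknown extra keys.
    writer.writerow(row)


# An in-memory buffer stands in for a file opened in append mode.
sheet = io.StringIO()
append_run(sheet, {
    "run_id": "run-0001",
    "date": "2025-01-15",
    "task_context": "Policy memo draft",
    "model": "example-model",
    "prompt": "Draft a memo on remote working.",
    "output": "Memo draft v1",
    "reviewer": "team lead",
    "approved": "yes",
})
rows = sheet.getvalue().splitlines()
```

With a real file, the same function could be used by opening the log with `open(path, "a", newline="")`.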

Semi-automated implementation

Semi-automated implementation can use metadata capture, form-based evidence collection, versioned templates, and lightweight dashboards that assemble run artefacts into a standard evidence pack. This reduces omission risk and makes later review faster and more consistent.

Fully automated implementation

At scale, reviewability can be embedded in a system wrapper, orchestration layer, logging pipeline, or governance platform that automatically captures run identifiers, timestamps, model configuration, prompt variants, outputs, human interventions, approval states, and scoring cues. In that form, RAIDT becomes an operational governance layer that supports large-volume review, escalation, analytics, and assurance across many organisational runs.
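One way to picture the system-wrapper option is a thin function wrapper that records evidence on every model call. The sketch below is illustrative only: `reviewable`, `fake_model`, the record keys, and the configuration values are all hypothetical, and `fake_model` stands in for whatever real model client an organisation uses.

```python
import uuid
from datetime import datetime, timezone
from typing import Callable


def reviewable(
    model_call: Callable[[str], str],
    model_config: dict,
    log: list,
) -> Callable[[str], str]:
    """Wrap a model call so each invocation appends an evidence record."""
    def wrapped(prompt: str) -> str:
        output = model_call(prompt)
        log.append({
            "run_id": str(uuid.uuid4()),
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "model_config": dict(model_config),  # copy, so later changes do not alter history
            "prompt": prompt,
            "output": output,
            "approval_state": "pending",  # to be updated by a human reviewer
        })
        return output
    return wrapped


def fake_model(prompt: str) -> str:
    """Stand-in for a real model client (assumption for this example)."""
    return f"DRAFT: {prompt}"


run_log: list[dict] = []
assistant = reviewable(
    fake_model,
    {"model": "example-model", "temperature": 0.2},
    run_log,
)
draft = assistant("Summarise the complaint history")
```

This sketch records only successful calls and keeps the log in memory. A production wrapper would also need to capture failures and human interventions and to write to durable storage.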

Practical use in the RAIDT project

This item is useful across the RAIDT project because it clarifies why the framework is more than a taxonomy of governance ideas. In Paper 08 Foundations, reviewability helps define the normative logic of RAIDT: governance should be inspectable at the level of actual use. In Paper 09 Empirical Validation, it can guide the assessment of whether participants and organisations can realistically reconstruct and evaluate runs using the evidence provided. In Paper 10 Policy Pathways, it supports translation from abstract AI governance principles into operational requirements for documentation, oversight, and assurance.

It is also directly useful in the evidence pack and scoring rubric because reviewability can be treated as a practical design criterion for what evidence should be retained and how it should be organised. In sector playbooks and governance interventions, it helps explain to practitioners why structured evidence capture is not bureaucratic excess but a necessary condition for defensible use. For supervision, viva defence, and journal positioning, this item helps differentiate RAIDT from generic responsible AI discourse by showing that the framework makes a reviewable run the basis of governance.

Key audience questions to prepare for

Q1. How is reviewability different from traceability?

Traceability is about linking artefacts, actors, steps, and outputs across the run. Reviewability is about whether those linked materials are sufficient for another person to inspect and assess the run meaningfully. Traceability supports reviewability, but traceability alone does not guarantee it.

Q2. Does reviewability require full data retention for every run?

No. Reviewability requires sufficient evidence for later inspection, not unlimited retention. What counts as sufficient depends on context, risk, and governance need. RAIDT helps make that sufficiency question explicit rather than assuming that either total retention or minimal retention is always appropriate.
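The idea that sufficiency depends on risk can be made explicit with a small lookup. The tiers and field sets below are purely hypothetical, since RAIDT as described here does not prescribe them. The sketch only shows how an organisation might write its own sufficiency rule down.

```python
# Hypothetical risk tiers and the evidence each tier requires.
BASE = {"prompt", "output", "model"}
REQUIRED_BY_RISK = {
    "low": BASE,
    "medium": BASE | {"task_context", "reviewer"},
    "high": BASE | {"task_context", "reviewer", "source_material", "approval_reason"},
}


def retention_gaps(risk: str, captured: list[str]) -> list[str]:
    """Return evidence items still required for a run at the given risk tier."""
    return sorted(REQUIRED_BY_RISK[risk] - set(captured))


captured = ["prompt", "output", "model"]
low_gaps = retention_gaps("low", captured)
high_gaps = retention_gaps("high", captured)
```

The same captured evidence can be sufficient for a low-risk run and insufficient for a high-risk one, which is the point of treating sufficiency as a contextual question.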

Q3. Why not rely on organisational policy and user declarations instead?

Because policy and declarations state expectations, but they do not reconstruct what happened in a specific run. RAIDT adds operational evidence so governance can be examined in practice rather than assumed from compliance language.

Q4. Is reviewability mainly for regulated sectors?

It is especially important there, but not limited to them. Any organisational setting in which GenAI outputs shape work quality, decisions, records, or stakeholder outcomes can benefit from reviewability because later scrutiny and learning are common governance needs.

Q5. Can a run be reviewable even if the model itself is not fully interpretable?

Yes. RAIDT does not require full insight into internal model mechanics for a run to be reviewable. It requires enough contextual and procedural evidence to inspect how the model was used, what it produced, how humans handled it, and whether governance controls were followed.

Suggested citation concepts to support this item
Short explanation for presentation

Reviewability is a core RAIDT value because responsible governance requires more than policy statements or user assurances. It requires that a later reviewer be able to inspect what happened in a specific GenAI run. In RAIDT, that means the run must leave behind enough structured evidence to reconstruct the task, context, prompts, model use, human decisions, and output handling. This matters because organisations often face questions only after the event: why was this output accepted, what evidence supported it, and who checked it? RAIDT answers those questions at run level through evidence packs and a five-pillar score profile. So reviewability is not just about record keeping. It is what allows governance to become inspectable, challengeable, and improvable in real organisational use.

One-line takeaway

Reviewability is the capacity to inspect and reconstruct a specific GenAI run after the event; RAIDT provides it by tying governance to run-level evidence rather than to assertion alone.

Related items in RAIDT core, definition, values, claims and innovation
Anchored questions

No anchored questions were present in the original note.
