S9.09 - Internal audit

flowchart LR
    A[Traditional audit problem<br/>Policy statements without inspectable runs] --> B[RAIDT<br/>Run-level evidence framework]
    A2[Fragmented records<br/>Weak reconstruction of GenAI use] --> B
    K[Finance, healthcare, public services,<br/>procurement, enterprise productivity] --> C
    B --> C[[Internal audit<br/>Structured review of material GenAI runs]]
    C --> D[Run-level evidence pack]
    C --> E[RAIDT score profile]
    C --> F[Reviewer reconstruction]
    C --> G[Exception testing]
    D --> H[Reviewability]
    E --> I[Comparative oversight]
    F --> J[Audit readiness]
    G --> L[Organisational learning]
    I --> M[Policy alignment]
    H --> J
    J --> L
    L --> M

Star S9 - Policy, Standards and Assurance

Star context: Connects RAIDT to policy instruments, standards, assurance, procurement, audit and organisational accountability by showing how formal oversight can inspect concrete GenAI runs rather than rely only on high-level policy claims.

Academic picture

Definition / background

Internal audit is an independent organisational review function that evaluates whether governance arrangements, controls, and risk responses are designed appropriately and are operating effectively. In conventional governance settings, internal audit samples transactions, decisions, controls, and exceptions to assess whether the organisation can evidence compliance, accountability, and improvement. In the context of generative AI, the challenge is that many important uses are highly contextual, fast-moving, and poorly documented, which makes them difficult to inspect after the fact.

Within RAIDT, internal audit becomes the disciplined practice of sampling material GenAI runs, inspecting the evidence associated with those runs, comparing their RAIDT score profiles, and identifying both local failures and systemic governance weaknesses. The item therefore belongs inside RAIDT because RAIDT treats the run as the unit of governance. A run is not just an output event; it is an inspectable organisational event with context, configuration, evidence, and downstream implications.

This matters conceptually because internal audit is distinct from policy drafting, external assurance, and incident investigation. Policy states expectations. Assurance may attest to overall adequacy. Incident response deals with harms or failures once they emerge. Internal audit, by contrast, tests whether the control environment is genuinely functioning in practice. RAIDT strengthens that role by giving internal audit a structured evidential object: the run-level evidence pack and the accompanying five-pillar score profile.

The relationship to the five pillars is direct. Responsibility clarifies ownership and approval. Auditability supports inspection and control testing. Interpretability helps reviewers understand why a run took the form it did. Dependability supports assessment of consistency and operational robustness. Traceability enables reconstruction across prompts, outputs, actors, time, and workflow steps. Together, these make internal audit more than a generic review activity: they make it a practical method for inspecting the governance quality of GenAI use in organisational work.

Why this concept matters

Internal audit matters because organisations often claim to govern AI responsibly without being able to show how particular uses were reviewed, approved, challenged, or evidenced. When governance remains at the level of policy statements or annual control narratives, important GenAI activity can escape meaningful inspection. RAIDT addresses this gap by making materially significant runs sampleable and reviewable.

The concept also avoids a common confusion in AI governance: the idea that auditing AI means only auditing the model, the vendor, or the policy framework. In practice, many organisational risks arise in situated use: how the system was configured, what task it was used for, what data were supplied, who relied on the output, and whether the use complied with local controls. Internal audit matters because it examines these operational realities rather than only abstract governance claims.

If internal audit is missing, organisations may have no defensible way to detect recurring evidence gaps, weak approval pathways, inconsistent human oversight, or poor alignment between policy and day-to-day use. That absence increases exposure to regulatory challenge, procurement failure, weak assurance, and poor organisational learning. RAIDT gives internal audit a concrete basis for testing whether governance is actually functioning at the point where GenAI is used.

Key idea: Internal audit matters in RAIDT because it turns each material GenAI run into an inspectable governance event rather than leaving accountability at the level of policy assertion.

What this item enables

Sampling of materially significant GenAI runs in the same disciplined way that internal audit samples transactions, controls, and exceptions.
Testing of evidence completeness, including whether key metadata, prompts, outputs, approvals, and review records are present.
Comparison of RAIDT score profiles across teams, tasks, tools, vendors, time periods, and risk classes.
Identification of control design weaknesses, such as missing approval points, inconsistent documentation, or unclear ownership.
Detection of control operation failures, such as required reviews not being performed or evidence not being retained.
Escalation from run-level findings to wider governance actions, including policy updates, training, assurance reviews, and incident response.
Organisational learning by showing repeated patterns of weakness rather than treating each problematic run as an isolated anomaly.
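
The step from run-level findings to repeated patterns can be sketched in a few lines. This is a minimal illustration, not part of RAIDT itself: the finding labels, the `(run_id, finding_type)` record shape, and the `min_runs` threshold are all assumptions chosen for the example.

```python
from collections import Counter

def recurring_findings(findings, min_runs=2):
    """Return finding types that appear in at least `min_runs` distinct runs.

    `findings` is an iterable of (run_id, finding_type) pairs. Both the pair
    shape and the finding labels are hypothetical, not defined by RAIDT.
    """
    # De-duplicate so a finding logged twice against one run counts once.
    distinct = {(run_id, finding_type) for run_id, finding_type in findings}
    counts = Counter(finding_type for _, finding_type in distinct)
    return {ftype: n for ftype, n in counts.items() if n >= min_runs}

# Illustrative findings from four sampled runs (invented data).
findings = [
    ("run-001", "missing_reviewer_signoff"),
    ("run-002", "missing_reviewer_signoff"),
    ("run-002", "missing_reviewer_signoff"),  # duplicate log entry
    ("run-003", "prompt_not_retained"),
    ("run-004", "missing_reviewer_signoff"),
]

recurring = recurring_findings(findings)
print(recurring)  # {'missing_reviewer_signoff': 3}
```

A finding type that recurs across several runs is a candidate for a control design issue, while a single occurrence may be a local operating failure; the threshold for that distinction is an audit judgement, not something the code decides.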

Practical example / likely audience question

Audience question

Why sample GenAI runs like transactions instead of simply auditing the AI policy, the model, or the supplier once?

Answer

The concern behind this question is understandable: many people assume AI governance can be evaluated adequately by checking whether a policy exists, whether a vendor has documentation, or whether a model passed an initial assessment. That approach is incomplete because organisational risk often arises in use, not only in design. A model may be acceptable in principle while a particular run is poorly scoped, weakly evidenced, used with the wrong data, or relied upon without appropriate human review.

The direct answer is that each material GenAI use becomes an inspectable governance event in RAIDT. Sampling runs allows internal audit to see whether controls actually operated when work was performed. For example, an organisation may require human sign-off for AI-assisted drafting in a high-risk function. A policy review alone shows that the requirement exists. A run-level audit shows whether the sign-off happened, whether the prompt context was appropriate, whether the output was checked, and whether the evidence retained is sufficient for reconstruction.

RAIDT handles this better than a generic AI governance approach because it gives internal audit a structured object of review rather than an abstract compliance narrative. Instead of asking only, "Do you have an AI policy?", the auditor can ask, "Show me ten material runs from this process, their evidence packs, their score profiles, and the exceptions logged against them." That is a much stronger basis for assurance, challenge, and improvement.
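
The request for "ten material runs from this process" can be made reproducible so that another reviewer can re-perform the selection. The sketch below assumes a hypothetical `RunRef` record with a `material` flag set by the organisation's own materiality criteria; neither the record nor the flag is prescribed by RAIDT.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class RunRef:
    run_id: str
    process: str
    material: bool  # hypothetical flag from the organisation's materiality criteria

def sample_material_runs(runs, process, k=10, seed=2024):
    """Return up to k material runs from one process, chosen reproducibly."""
    # Sort first so the result does not depend on the order runs were loaded in.
    population = sorted(
        (r for r in runs if r.process == process and r.material),
        key=lambda r: r.run_id,
    )
    rng = random.Random(seed)  # fixed seed so the selection can be re-performed
    return rng.sample(population, min(k, len(population)))

# Illustrative population: 40 runs, every second one flagged as material.
runs = [
    RunRef(f"run-{i:03d}", "supplier-risk-summary", material=(i % 2 == 0))
    for i in range(40)
]

sample = sample_material_runs(runs, "supplier-risk-summary")
print([r.run_id for r in sample])
```

Simple random sampling is only one option; an audit function might equally stratify by risk class or select judgementally, and the seed would normally be recorded in the audit working papers.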

Practical example in RAIDT terms

Consider an enterprise finance team using a GenAI assistant to draft first-pass summaries of supplier risk for procurement committees. The use case is not fully autonomous decision-making, but it still influences organisational judgement. A run-level issue arises when one summary is produced without a clear record of the source documents provided to the model, the prompt template used, or the human review performed before the summary was circulated.

In RAIDT terms, internal audit would sample this run and ask for the evidence pack: task purpose, date and time, user role, model or tool version, prompt or instruction set, input sources, output artefact, reviewer identity, approval status, and any exception notes. The auditor would then inspect the RAIDT score profile to see whether weaknesses had already been signalled in Responsibility, Auditability, or Traceability.
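
The evidence pack requested above can be represented as a simple record, with a completeness test that lists what is missing. This is a sketch under assumptions: the field names paraphrase the items listed in the text, the example values are invented, and treating exception notes as optional is an illustrative choice rather than a RAIDT rule.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class EvidencePack:
    # Field names are illustrative paraphrases of the items listed in the text.
    task_purpose: Optional[str] = None
    timestamp: Optional[str] = None
    user_role: Optional[str] = None
    model_version: Optional[str] = None
    prompt: Optional[str] = None
    input_sources: Optional[list] = None
    output_artefact: Optional[str] = None
    reviewer_identity: Optional[str] = None
    approval_status: Optional[str] = None
    exception_notes: Optional[str] = None

# Assumption: exception notes exist only when an exception was raised.
OPTIONAL_FIELDS = {"exception_notes"}

def missing_evidence(pack):
    """Return the names of required fields that are absent or empty."""
    missing = []
    for f in fields(pack):
        if f.name in OPTIONAL_FIELDS:
            continue
        if getattr(pack, f.name) in (None, "", []):
            missing.append(f.name)
    return missing

# The supplier-risk summary run: no record of sources, prompt, or human review.
pack = EvidencePack(
    task_purpose="First-pass supplier risk summary",
    timestamp="2024-01-15T10:30:00Z",          # invented value
    user_role="finance analyst",
    model_version="assistant-v1 (hypothetical)",
    output_artefact="supplier_risk_summary.docx",
)

print(missing_evidence(pack))
# ['prompt', 'input_sources', 'reviewer_identity', 'approval_status']
```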

The pillars most obviously affected are Responsibility, because ownership and approval must be clear; Auditability, because the run must be inspectable; Dependability, because the summary may shape procurement decisions and so needs to be consistent and robust; and Traceability, because reviewers must be able to reconstruct what informed the output. Internal audit improves governance readiness here by converting a vague concern about "AI risk in procurement" into a concrete test of whether a specific use was governed well enough to withstand challenge.

Detailed link to RAIDT

Internal audit links to RAIDT in four ways.

First, it operationalises RAIDT's core idea that governance should attach to concrete runs rather than remain at the level of generic AI principles. Internal audit needs a stable unit of inspection, and RAIDT provides one.

Second, it links directly to the run and to run-level evidence. Each sampled run can be reconstructed in context, allowing internal audit to examine what was done, by whom, with what inputs, under which constraints, and with what downstream reliance.

Third, it links to the evidence pack and the RAIDT score profile. The evidence pack gives documentary substance to the run, while the score profile gives internal audit a comparative lens for spotting patterns, exceptions, and areas requiring escalation.
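
As an illustration of the score profile acting as a comparative lens, the sketch below averages pillar scores per team and flags pillars that fall below a threshold. The 0 to 5 scale, the threshold of 3.0, the team names, and the scores are all assumptions made for the example; the source does not specify a scoring scale.

```python
from collections import defaultdict
from statistics import mean

PILLARS = ("responsibility", "auditability", "interpretability",
           "dependability", "traceability")

def mean_profiles(scored_runs):
    """Average each pillar score per team from (team, profile) pairs."""
    grouped = defaultdict(lambda: defaultdict(list))
    for team, profile in scored_runs:
        for pillar in PILLARS:
            grouped[team][pillar].append(profile[pillar])
    return {
        team: {pillar: mean(scores[pillar]) for pillar in PILLARS}
        for team, scores in grouped.items()
    }

def weak_pillars(profiles, threshold=3.0):
    """Return, per team, the pillars whose mean score is below the threshold."""
    flagged = {}
    for team, profile in profiles.items():
        weak = [p for p in PILLARS if profile[p] < threshold]
        if weak:
            flagged[team] = weak
    return flagged

# Invented scores on an assumed 0-5 scale.
scored_runs = [
    ("finance", dict(responsibility=4, auditability=2, interpretability=3,
                     dependability=4, traceability=2)),
    ("finance", dict(responsibility=4, auditability=3, interpretability=4,
                     dependability=4, traceability=1)),
    ("legal", dict(responsibility=4, auditability=4, interpretability=3,
                   dependability=4, traceability=4)),
]

profiles = mean_profiles(scored_runs)
weak = weak_pillars(profiles)
print(weak)  # {'finance': ['auditability', 'traceability']}
```

The same grouping could be applied to tasks, tools, vendors, time periods, or risk classes by changing the key used in place of the team name.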

Fourth, it links to reviewability, contestability, audit readiness, and organisational learning. Internal audit is not only about detecting defects; it is about producing findings that improve policy, controls, training, procurement choices, and the maturity of organisational governance over time.

Internal audit → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

In this chain, internal audit is the reviewing mechanism that turns RAIDT artefacts into practical accountability. RAIDT makes runs inspectable; internal audit makes that inspectability organisationally consequential.

Link to the five RAIDT pillars

Responsibility

Internal audit is strongly linked to Responsibility because auditors need to know who initiated the run, who authorised its use, who reviewed the output, and who is accountable for any downstream decision or action. Without clear responsibility, findings cannot be assigned, remediated, or escalated effectively.