S9.09 - Internal audit

flowchart LR
    A[Traditional audit problem<br/>Policy statements without inspectable runs] --> B[RAIDT<br/>Run-level evidence framework]
    A2[Fragmented records<br/>Weak reconstruction of GenAI use] --> B
    K[Finance, healthcare, public services,<br/>procurement, enterprise productivity] --> C
    B --> C[[Internal audit<br/>Structured review of material GenAI runs]]
    C --> D[Run-level evidence pack]
    C --> E[RAIDT score profile]
    C --> F[Reviewer reconstruction]
    C --> G[Exception testing]
    D --> H[Reviewability]
    E --> I[Comparative oversight]
    F --> J[Audit readiness]
    G --> L[Organisational learning]
    I --> M[Policy alignment]
    H --> J
    J --> L
    L --> M

Star S9 - Policy, Standards and Assurance

Star context: Connects RAIDT to policy instruments, standards, assurance, procurement, audit and organisational accountability by showing how formal oversight can inspect concrete GenAI runs rather than rely only on high-level policy claims.


Academic picture
Definition / background

Internal audit is an independent organisational review function that evaluates whether governance arrangements, controls, and risk responses are designed appropriately and are operating effectively. In conventional governance settings, internal audit samples transactions, decisions, controls, and exceptions to assess whether the organisation can evidence compliance, accountability, and improvement. In the context of generative AI, the challenge is that many important uses are highly contextual, fast-moving, and poorly documented, which makes them difficult to inspect after the fact.

Within RAIDT, internal audit becomes the disciplined practice of sampling material GenAI runs, inspecting the evidence associated with those runs, comparing their RAIDT score profiles, and identifying both local failures and systemic governance weaknesses. The item therefore belongs inside RAIDT because RAIDT treats the run as the unit of governance. A run is not just an output event; it is an inspectable organisational event with context, configuration, evidence, and downstream implications.

This matters conceptually because internal audit is distinct from policy drafting, external assurance, and incident investigation. Policy states expectations. Assurance may attest to overall adequacy. Incident response deals with harms or failures once they emerge. Internal audit, by contrast, tests whether the control environment is genuinely functioning in practice. RAIDT strengthens that role by giving internal audit a structured evidential object: the run-level evidence pack and the accompanying five-pillar score profile.

The relationship to the five pillars is direct. Responsibility clarifies ownership and approval. Auditability supports inspection and control testing. Interpretability helps reviewers understand why a run took the form it did. Dependability supports assessment of consistency and operational robustness. Traceability enables reconstruction across prompts, outputs, actors, time, and workflow steps. Together, these make internal audit more than a generic review activity: they make it a practical method for inspecting the governance quality of GenAI use in organisational work.

Why this concept matters

Internal audit matters because organisations often claim to govern AI responsibly without being able to show how particular uses were reviewed, approved, challenged, or evidenced. When governance remains at the level of policy statements or annual control narratives, important GenAI activity can escape meaningful inspection. RAIDT addresses this gap by making materially significant runs sampleable and reviewable.

The concept also avoids a common confusion in AI governance: the idea that auditing AI means only auditing the model, the vendor, or the policy framework. In practice, many organisational risks arise in situated use: how the system was configured, what task it was used for, what data were supplied, who relied on the output, and whether the use complied with local controls. Internal audit matters because it examines these operational realities rather than only abstract governance claims.

If internal audit is missing, organisations may have no defensible way to detect recurring evidence gaps, weak approval pathways, inconsistent human oversight, or poor alignment between policy and day-to-day use. That absence increases exposure to regulatory challenge, procurement failure, weak assurance, and poor organisational learning. RAIDT gives internal audit a concrete basis for testing whether governance is actually functioning at the point where GenAI is used.

Key idea: Internal audit matters in RAIDT because it turns each material GenAI run into an inspectable governance event rather than leaving accountability at the level of policy assertion.

What this item enables
Practical example / likely audience question

Audience question

Why sample GenAI runs like transactions instead of simply auditing the AI policy, the model, or the supplier once?

Answer

The concern behind this question is understandable: many people assume AI governance can be evaluated adequately by checking whether a policy exists, whether a vendor has documentation, or whether a model passed an initial assessment. That approach is incomplete because organisational risk often arises in use, not only in design. A model may be acceptable in principle while a particular run is poorly scoped, weakly evidenced, used with the wrong data, or relied upon without appropriate human review.

The direct answer is that each material GenAI use becomes an inspectable governance event in RAIDT. Sampling runs allows internal audit to see whether controls actually operated when work was performed. For example, an organisation may require human sign-off for AI-assisted drafting in a high-risk function. A policy review alone shows that the requirement exists. A run-level audit shows whether the sign-off happened, whether the prompt context was appropriate, whether the output was checked, and whether the evidence retained is sufficient for reconstruction.

RAIDT handles this better than a generic AI governance approach because it gives internal audit a structured object of review rather than an abstract compliance narrative. Instead of asking only, "Do you have an AI policy?", the auditor can ask, "Show me ten material runs from this process, their evidence packs, their score profiles, and the exceptions logged against them." That is a much stronger basis for assurance, challenge, and improvement.
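
The auditor's request for "ten material runs from this process" can be sketched as a reproducible, risk-based sample. This is a minimal illustration under stated assumptions: the `RunRef` record, its `material` flag, and the fixed random seed are hypothetical and are not defined by RAIDT in this item.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRef:
    """Hypothetical pointer to a logged GenAI run."""
    run_id: str
    process: str
    material: bool  # assumed to be set by the organisation's materiality rule


def sample_material_runs(runs, process, k=10, seed=0):
    """Draw up to k material runs from one process, reproducibly."""
    population = [r for r in runs if r.process == process and r.material]
    # A fixed seed lets a second reviewer re-draw exactly the same sample.
    rng = random.Random(seed)
    return rng.sample(population, min(k, len(population)))


# Illustrative data only: 40 logged runs, every second one flagged as material.
runs = [RunRef(f"run-{i:03d}", "procurement", material=(i % 2 == 0))
        for i in range(40)]
sample = sample_material_runs(runs, "procurement")
print(len(sample))  # 10
```

Drawing the sample from a documented population with a recorded seed is one way to make the selection itself reviewable, so that the audit sample cannot be mistaken for a hand-picked set of well-governed runs.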

Practical example in RAIDT terms

Consider an enterprise finance team using a GenAI assistant to draft first-pass summaries of supplier risk for procurement committees. The use case is not fully autonomous decision-making, but it still influences organisational judgement. A run-level issue arises when one summary is produced without a clear record of the source documents provided to the model, the prompt template used, or the human review performed before the summary was circulated.

In RAIDT terms, internal audit would sample this run and ask for the evidence pack: task purpose, date and time, user role, model or tool version, prompt or instruction set, input sources, output artefact, reviewer identity, approval status, and any exception notes. The auditor would then inspect the RAIDT score profile to see whether weaknesses had already been signalled in Responsibility, Auditability, or Traceability.
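
The evidence pack requested above can be sketched as a simple record with a completeness check. The field names mirror the items listed in the preceding paragraph, but the class, the function, and the sample values are illustrative assumptions rather than a RAIDT-defined schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class EvidencePack:
    """Illustrative evidence pack; field names are assumptions."""
    task_purpose: Optional[str] = None
    timestamp: Optional[str] = None
    user_role: Optional[str] = None
    model_version: Optional[str] = None
    prompt: Optional[str] = None
    input_sources: Optional[list] = None
    output_artefact: Optional[str] = None
    reviewer_identity: Optional[str] = None
    approval_status: Optional[str] = None
    exception_notes: Optional[str] = None  # may legitimately be empty


def missing_evidence(pack):
    """Return the names of required fields that are absent or empty."""
    optional = {"exception_notes"}
    return [f.name for f in fields(pack)
            if f.name not in optional and not getattr(pack, f.name)]


# Hypothetical values modelled on the supplier-risk summary described above.
pack = EvidencePack(
    task_purpose="First-pass supplier risk summary",
    timestamp="2024-05-01T10:00:00Z",
    user_role="Procurement analyst",
    model_version="assistant-v1",
    output_artefact="summary.docx",
    approval_status="circulated",
)
print(missing_evidence(pack))
# ['prompt', 'input_sources', 'reviewer_identity']
```

In this sketch the three gaps correspond to the run-level issue in the example: no record of the prompt template, the source documents, or the human review.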

The pillars most obviously affected are Responsibility, because ownership and approval must be clear; Auditability, because the run must be inspectable; Dependability, because the summary may shape procurement decisions; and Traceability, because reviewers must be able to reconstruct what informed the output. Internal audit improves governance readiness here by converting a vague concern about "AI risk in procurement" into a concrete test of whether a specific use was governed well enough to withstand challenge.

Detailed link to RAIDT

Internal audit links to RAIDT in four ways.

First, it operationalises RAIDT's core idea that governance should attach to concrete runs rather than remain at the level of generic AI principles. Internal audit needs a stable unit of inspection, and RAIDT provides one.

Second, it links directly to the run and to run-level evidence. Each sampled run can be reconstructed in context, allowing internal audit to examine what was done, by whom, with what inputs, under which constraints, and with what downstream reliance.

Third, it links to the evidence pack and the RAIDT score profile. The evidence pack gives documentary substance to the run, while the score profile gives internal audit a comparative lens for spotting patterns, exceptions, and areas requiring escalation.

Fourth, it links to reviewability, contestability, audit readiness, and organisational learning. Internal audit is not only about detecting defects; it is about producing findings that improve policy, controls, training, procurement choices, and the maturity of organisational governance over time.
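
The comparative use of the score profile described in the third link can be illustrated with a small sketch. The 0 to 4 scale, the threshold, and the run data are assumptions made for illustration; this item does not specify how RAIDT scores are scaled or calculated.

```python
from statistics import mean

PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")


def pillar_means(profiles):
    """Average each pillar across the sampled runs."""
    return {p: mean(prof[p] for prof in profiles.values()) for p in PILLARS}


def flag_runs(profiles, threshold):
    """Map each run id to the pillars that score below the threshold."""
    flagged = {}
    for run_id, prof in profiles.items():
        weak = [p for p in PILLARS if prof[p] < threshold]
        if weak:
            flagged[run_id] = weak
    return flagged


# Hypothetical score profiles on an assumed 0-4 scale.
profiles = {
    "run-001": dict(zip(PILLARS, (4, 3, 3, 4, 3))),
    "run-002": dict(zip(PILLARS, (2, 1, 3, 3, 1))),
    "run-003": dict(zip(PILLARS, (4, 4, 2, 3, 4))),
}
print(flag_runs(profiles, threshold=2))
# {'run-002': ['Auditability', 'Traceability']}
```

Per-pillar averages support the search for patterns across a process, while the per-run flags point the auditor to specific runs that may need escalation.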

Internal audit → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

In this chain, internal audit is the reviewing mechanism that turns RAIDT artefacts into practical accountability. RAIDT makes runs inspectable; internal audit makes that inspectability organisationally consequential.

Link to the five RAIDT pillars

Responsibility

Internal audit is strongly linked to Responsibility because auditors need to know who initiated the run, who authorised its use, who reviewed the output, and who is accountable for any downstream decision or action. Without clear responsibility, findings cannot be assigned, remediated, or escalated effectively.

Example evidence / implication:

Auditability

This is the most directly affected pillar. Internal audit depends on the existence, completeness, and accessibility of evidence that allows a reviewer to inspect whether controls were designed well and operated as intended.

Example evidence / implication:

Interpretability

Interpretability matters because internal audit must understand what the run was trying to achieve, how the output was framed, and why a reviewer considered it acceptable. The goal is not full model transparency in every case, but practical intelligibility for governance review.

Example evidence / implication:

Dependability

Internal audit uses Dependability to assess whether the run and the surrounding process were robust enough for the organisational setting. This is especially important where outputs shape decisions, communications, or operational actions.

Example evidence / implication:

Traceability

Traceability is essential because internal audit needs to reconstruct the lifecycle of a run from initiation to use. Without traceability, findings are weak, disputes are hard to resolve, and learning is fragmented.

Example evidence / implication:

Internal audit touches all five pillars, but it relies most heavily on Auditability and Traceability. Those pillars make the audit function operational, while Responsibility, Interpretability, and Dependability help explain whether the run was governed appropriately.

Why this item is more than a generic concept

In general AI governance, internal audit may mean a periodic review of policy documents, high-level controls, or vendor assurances. That is useful but limited. It can confirm that governance structures exist without proving that they work in day-to-day practice.

In RAIDT, internal audit is more operational because it is tied to run-level evidence. The auditor is not restricted to reviewing intentions or control descriptions; the auditor can inspect specific uses, compare evidence quality across runs, and assess whether the governance system works where risk actually materialises. This gives the concept more precision, more contestability, and greater practical value for accountability.

Common misunderstanding

Misunderstanding

Internal audit of GenAI is just retrospective paperwork that adds bureaucracy after the real work has already happened.

Correction

That view misunderstands the role of internal audit in a governance system. Good internal audit does not merely archive forms after the fact; it tests whether controls are meaningful, whether evidence is sufficient, and whether recurring weaknesses are being addressed. For example, if auditors repeatedly find that AI-assisted outputs are used without documented human review, the result is not only a compliance note. It is a governance finding that can trigger redesign of workflow controls, retraining, or stricter approval rules. In RAIDT, internal audit becomes a feedback mechanism for improving practice, not just a passive record-keeping exercise.

Boundary and limitation

Internal audit does not prove that a GenAI system is universally safe, fair, or correct. It also does not replace technical evaluation, legal advice, external assurance, or incident response. What it can do is test whether the organisation's governance controls are evidenced and operating effectively in sampled runs.

Its effectiveness depends on the quality of evidence capture, the appropriateness of sampling, and the maturity of the surrounding control environment. If run records are incomplete, if materiality thresholds are poorly defined, or if the audit function lacks enough technical understanding, important issues may still be missed. RAIDT handles this limitation by standardising what a run should contain, making comparison easier, and enabling internal audit to escalate from isolated observations to systemic governance findings.
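
The step from isolated observations to systemic findings can be sketched as a simple rate test over the sampled runs. The 30% escalation threshold, the control names, and the finding records are hypothetical; an organisation would set its own rule.

```python
from collections import Counter


def classify_findings(findings, sample_size, systemic_rate=0.3):
    """Label each failed control as 'isolated' or 'systemic'.

    findings: iterable of (run_id, control) pairs, one per control failure.
    systemic_rate: assumed share of sampled runs that triggers escalation.
    """
    # set() ensures a control is counted at most once per run.
    counts = Counter(control for _, control in set(findings))
    return {
        control: "systemic" if n / sample_size >= systemic_rate else "isolated"
        for control, n in counts.items()
    }


# Hypothetical findings from a sample of ten runs.
findings = [
    ("run-001", "human_review"),
    ("run-004", "human_review"),
    ("run-007", "human_review"),
    ("run-009", "human_review"),
    ("run-007", "input_sources"),
]
result = classify_findings(findings, sample_size=10)
print(result["human_review"], result["input_sources"])  # systemic isolated
```

A rule of this kind only works when run records share a common structure, which is why standardised run content is a precondition for comparison.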

Implementation levels

Manual implementation

A researcher or small team can implement this manually by selecting a set of material GenAI uses, collecting the evidence associated with each run, and reviewing them against a RAIDT checklist or scoring rubric. This may involve shared folders, structured note templates, and manual reviewer sign-off.

Semi-automated implementation

Semi-automated implementation adds structured metadata, standard evidence-pack templates, form-based capture, and dashboards that help auditors filter runs by risk, function, or exception status. This reduces inconsistency and makes periodic sampling more realistic.

Fully automated implementation

At scale, a platform, wrapper, orchestration layer, or governance pipeline can log run metadata automatically, bind outputs to workflow identifiers, generate evidence packs, calculate RAIDT score profiles, and surface anomalies for internal audit review. In that model, internal audit focuses less on evidence collection and more on testing control design, exception handling, and organisational response.
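
As a rough sketch of the wrapper idea, a decorator can record run metadata each time a GenAI function is called. Everything named here (`governed_run`, `RUN_LOG`, the field names, the workflow identifier, and the placeholder `draft_summary` function) is an assumption for illustration; a real deployment would write to a tamper-evident store and call an actual model.

```python
import hashlib
import uuid
from datetime import datetime, timezone
from functools import wraps

RUN_LOG = []  # stand-in for an append-only evidence store


def governed_run(workflow_id, model_version):
    """Decorator that records run metadata around a GenAI call."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(prompt, *, user_role):
            output = fn(prompt)
            RUN_LOG.append({
                "run_id": str(uuid.uuid4()),
                "workflow_id": workflow_id,  # binds the output to a workflow
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "user_role": user_role,
                "model_version": model_version,
                # Hashes allow later checks that stored artefacts are unchanged.
                "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
                "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
                "reviewer_identity": None,  # completed later by human sign-off
            })
            return output
        return wrapper
    return decorator


@governed_run(workflow_id="PROC-SUMMARY", model_version="assistant-v1")
def draft_summary(prompt):
    # Placeholder for a real model call.
    return f"DRAFT: {prompt}"


text = draft_summary("Summarise supplier risk", user_role="Procurement analyst")
print(len(RUN_LOG))  # 1
```

Because capture happens at the point of use, the auditor's effort shifts from assembling evidence to testing whether fields such as `reviewer_identity` were actually completed before the output was relied upon.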

Practical use in the RAIDT project

In the RAIDT project, this item helps explain how the framework travels from conceptual governance into institutional oversight. In Paper 08 Foundations, internal audit helps justify why the run is the correct unit for practical governance inspection. In Paper 09 Empirical Validation, it offers a way to test whether RAIDT outputs support consistent review, comparison, and challenge across real cases. In Paper 10 Policy Pathways, it provides a bridge from framework design to organisational assurance, procurement scrutiny, and accountability mechanisms.

The item is also useful for sector playbooks because supervisors, reviewers, and practitioners often ask who will actually inspect the evidence once it is produced. Internal audit is one strong answer. For viva defence and journal positioning, it shows that RAIDT is not only descriptive or ethical; it is designed to support reviewable governance practice inside organisations.

Key audience questions to prepare for

Q1. Is RAIDT trying to replace internal audit?

No. RAIDT is better understood as an enabling framework for internal audit in GenAI settings. It provides the evidential structure that allows auditors to inspect AI-supported work more rigorously.

Q2. Why is internal audit needed if a supplier already provides assurance documentation?

Supplier assurance is useful, but it usually does not show how the tool was used in a specific organisational context. Internal audit examines situated use, local controls, and downstream reliance inside the organisation.

Q3. Does every GenAI run need to be audited?

No. Internal audit usually works through materiality, risk-based selection, and sampling. RAIDT helps because it makes individual runs comparable and therefore suitable for targeted review rather than universal inspection.

Q4. What does internal audit look for in a RAIDT evidence pack?

It looks for enough context and documentation to reconstruct the run, test whether required controls operated, and determine whether any weakness is isolated or systemic.

Q5. How does this strengthen governance readiness?

It strengthens readiness by making the organisation able to answer difficult questions with evidence: what happened, who approved it, what was relied upon, what controls operated, and what has been learned from exceptions over time.

Suggested citation concepts to support this item
Short explanation for presentation

Internal audit is important in RAIDT because it gives organisations a practical way to inspect whether GenAI governance is really working in use. Instead of reviewing policy statements alone, auditors can sample material runs, inspect the evidence pack, compare the RAIDT score profile, and test whether controls such as approval, review, documentation, and traceability actually operated. That matters because many governance failures appear not in the abstract model description but in situated organisational use. RAIDT therefore makes internal audit more operational: each run becomes an inspectable governance event. This supports assurance, contestability, organisational learning, and policy refinement. For supervision or viva discussion, the key point is that RAIDT does not stop at principles; it creates evidence that internal oversight functions can genuinely review.

One-line takeaway

Internal audit is the organisational review mechanism that makes RAIDT's run-level evidence inspectable for accountability, assurance, and governance improvement.

Related items in policy, standards and assurance
Mentioned in reference-paper summaries (5)

Paper summaries live in Port/93-References/pdf_summaries/. Each file listed below contains the key term at least once.
