Q284 — How can it be implemented and mapped to audit, procurement and policy?

RAIDT · Star S9 - Policy, Standards and Assurance · primary item: S9.05 · Interoperability

Appears in sources

workshop_table17#tag-band S8–S9 · 145–165 min

Answer

Implementation starts by deciding which generative AI uses are material enough to govern at run level, then capturing the minimum evidential object for each such use. In RAIDT, that means treating the run as the unit of governance and building a run-level evidence pack from anchored artefacts rather than narrative recollection. The policy pathways paper describes a staged operating model: manual pilots can begin with saved prompts, outputs, and reviewer forms; partial automation can verify evidence completeness from logs and repositories; and fuller automation can use an orchestration layer or wrapper to record run IDs, model and prompt versions, retrieval snapshots, hashes, and safety checks. The resulting run is then scored across the five pillars (Responsibility, Auditability, Interpretability, Dependability, Traceability), retaining the full score profile and using anchors 1=missing / 3=partial / 5=audit-ready to guide review, calibration, and escalation.
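A run-level evidence pack and its score profile can be sketched as a small data structure. This is a minimal illustration, not the schema from the RAIDT papers: the class name `RunEvidencePack`, its field names, the acceptance of intermediate scores 2 and 4, and the escalation threshold of 3 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field

# The five RAIDT pillars and the scoring anchors named in the text.
PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")
ANCHORS = {1: "missing", 3: "partial", 5: "audit-ready"}


@dataclass
class RunEvidencePack:
    """Hypothetical evidence pack for one governed run (field names assumed)."""
    run_id: str
    model_version: str
    prompt_version: str
    retrieval_snapshot: str
    output_text: str
    safety_checks_passed: bool
    scores: dict = field(default_factory=dict)

    @property
    def output_hash(self) -> str:
        # Anchor the output with a SHA-256 digest instead of relying on recollection.
        return hashlib.sha256(self.output_text.encode("utf-8")).hexdigest()

    def score(self, pillar: str, value: int) -> None:
        # Assumption: integer scores 1-5 are allowed, with 1/3/5 as anchors.
        if pillar not in PILLARS:
            raise ValueError(f"unknown pillar: {pillar}")
        if value not in range(1, 6):
            raise ValueError("score must be an integer from 1 to 5")
        self.scores[pillar] = value

    def profile(self) -> dict:
        # Keep the full profile; unscored pillars appear as None.
        return {p: self.scores.get(p) for p in PILLARS}

    def needs_escalation(self, threshold: int = 3) -> list:
        # Assumption: unscored pillars and scores below the threshold are escalated.
        return [p for p in PILLARS
                if self.scores.get(p) is None or self.scores[p] < threshold]


pack = RunEvidencePack("run-001", "model-v1", "prompt-v1",
                       "snapshot-2024-01", "Example output", True)
pack.score("Auditability", 5)
pack.score("Traceability", 1)
print(pack.profile())
print(pack.needs_escalation())
```

Keeping the profile as a per-pillar mapping, rather than collapsing it into one number, lets a reviewer see which pillar is weak before deciding whether to calibrate or escalate.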

Mapping to audit, procurement, and policy follows from that shared evidence object. For audit, RAIDT gives internal auditors a bounded unit they can sample like a transaction, reconstruct after incidents, and compare across workflows; the audit paper is clear that this is different from a programme audit or a system audit. For procurement, the same evidence fields become tender thresholds, supplier disclosure requirements, audit rights, retention duties, and incident-reporting clauses, allowing buyers to demand governance-grade evidence without demanding proprietary source code. For policy and regulation, the evidence pack becomes an interoperability crosswalk: one set of records can populate sandbox learning, conformity assessment, technical documentation, market surveillance, and post-deployment supervision. Implementation therefore works best as an evidence pipeline: capture, score, calibrate, contract, sample, and update the mapping as frameworks evolve.
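The crosswalk idea can be illustrated with a simple lookup that reports which evidence fields each governance destination still lacks. The destinations below reuse terms from the text, but the field names and the assignment of fields to destinations are hypothetical and chosen only to show the mechanism; a real crosswalk would be derived from the relevant contract clauses and frameworks.

```python
# Hypothetical crosswalk: governance destination -> evidence fields it needs.
# Both the field names and the groupings are illustrative assumptions.
CROSSWALK = {
    "audit: run sampling": {"run_id", "output_hash"},
    "audit: incident reconstruction": {
        "run_id", "model_version", "prompt_version", "retrieval_snapshot",
    },
    "procurement: supplier disclosure": {"model_version", "safety_checks"},
    "policy: technical documentation": {
        "model_version", "prompt_version", "safety_checks",
    },
}


def coverage(available_fields):
    """Return, per destination, the sorted list of fields still missing."""
    available = set(available_fields)
    return {dest: sorted(required - available)
            for dest, required in CROSSWALK.items()}


# A manual pilot that only records run IDs and prompt versions.
for destination, missing in coverage({"run_id", "prompt_version"}).items():
    status = "covered" if not missing else "missing " + ", ".join(missing)
    print(f"{destination}: {status}")
```

Because every destination reads from the same field set, updating the mapping as frameworks evolve means editing the crosswalk rather than the capture process.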

Practical example

Consider a local authority procuring a generative AI service to support citizen enquiries. In the pilot phase, staff capture prompts, outputs, reviewer notes, and escalation records manually for a defined set of high-impact runs. Once the process stabilises, the supplier adds logging for run IDs, model versions, retrieval snapshots, and output hashes so the evidence pack is more reliable and less labour-intensive.

That same pack then supports three governance routes. Procurement writes the required evidence fields and audit rights into the contract. Internal audit samples a subset of runs and reviews the score profile for weak Auditability or Traceability. The policy team uses the same records in a sandbox or supervisory discussion to show how oversight, documentation, and monitoring operate in practice. The implementation burden is therefore incurred once, while the governance benefit is reused many times.
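The internal audit step in this example can be sketched as a reproducible sample followed by a check on the two pillars of interest. The function names, the fixed seed, the threshold of 3, and the choice to treat an unscored pillar as 1 (missing) are assumptions for illustration, not a prescribed RAIDT procedure.

```python
import random


def sample_runs(run_ids, k, seed):
    """Draw a reproducible sample of run IDs so the selection can be re-performed."""
    rng = random.Random(seed)          # a fixed seed makes the sample repeatable
    ordered = sorted(run_ids)          # a stable order before sampling
    return rng.sample(ordered, min(k, len(ordered)))


def weak_pillars(profile, pillars=("Auditability", "Traceability"), threshold=3):
    """List the watched pillars scoring below the threshold.

    Assumption: a pillar with no recorded score is treated as 1 (missing).
    """
    return [p for p in pillars if profile.get(p, 1) < threshold]


# Hypothetical score profiles keyed by run ID.
profiles = {
    "run-001": {"Auditability": 5, "Traceability": 4},
    "run-002": {"Auditability": 2, "Traceability": 5},
    "run-003": {"Auditability": 4},
}

for run_id in sample_runs(profiles, k=2, seed=42):
    print(run_id, weak_pillars(profiles[run_id]))
```

Recording the seed alongside the audit working papers would let a second reviewer regenerate the same sample.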

Sources in RAIDT papers

10-RAIDT_Policy_Pathways_M_V50
16-RAIDT-Audit-Accountability_M_v05