S11.07 - Evidence capture feasibility
flowchart LR
A[Closed platforms and missing metadata] --> B[RAIDT<br/>run-level evidence framework]
H[Prompts, timestamps, outputs,<br/>review notes, wrappers, logs] --> C[[Evidence capture feasibility<br/>can the run be reconstructed?]]
B --> C
C --> D[Evidence pack]
C --> E[RAIDT score profile]
D --> F[Reviewer reconstruction<br/>and contestability]
E --> G[Governance readiness<br/>and organisational learning]
I[Procurement and implementation choices] --> C
Star S11 - Boundaries, Limitations and Future Questions
Star context: Clarifies a practical boundary of RAIDT by showing that governance quality depends partly on whether a platform, workflow, or organisational setting can actually produce the evidence needed for run-level review.
Academic picture
Definition / background
Evidence capture feasibility is the practical question of whether enough relevant evidence can be recorded for a particular generative AI run to be reconstructed, reviewed, and evaluated. In RAIDT, the issue is not whether an organisation would like better documentation in principle, but whether the technical platform, workflow design, and organisational controls make evidence capture possible at the level of the individual run.
The concept matters because RAIDT treats the run as the unit of governance. A run-level evidence pack and a five-pillar score profile depend on the existence of a usable evidential record. If prompts, settings, source materials, outputs, review actions, or timestamps cannot be captured reliably, then the organisation cannot fully justify its governance claims for that run. In that sense, evidence capture feasibility is a condition of governance visibility.
This concept is different from general logging, transparency, or documentation quality. Logging may exist but still be inadequate for governance purposes if it omits contextual details, human interventions, or output versions. Transparency may be claimed at the vendor or policy level without giving an organisation access to the artefacts needed to reconstruct one concrete run. Evidence capture feasibility therefore sits between infrastructure capability and governance method: it asks whether the environment can support RAIDT's evidential demands.
Within RAIDT, the concept belongs in Boundaries, Limitations and Future Questions because it prevents overclaiming. RAIDT does not solve missing evidence by rhetorical force. If a platform does not expose sufficient metadata, if review steps occur outside the system, or if implementation is weak, RAIDT makes that limitation visible. The framework remains useful precisely because it can show when low Auditability or Traceability scores reflect a real evidence gap rather than a failure of interpretation.
Why this concept matters
Evidence capture feasibility matters because many organisations adopt generative AI tools whose evidential affordances are uneven, opaque, or poorly aligned with governance requirements. A governance framework that ignores this issue risks assuming that evidence can always be produced after the fact. In practice, the gap often surfaces only when a problematic output, contested decision, or review request forces the organisation to discover what was never captured.
The concept also prevents a common confusion between governance design and governance executability. An organisation may have a strong policy, a clear responsible-use statement, and a well-written assurance narrative, yet still be unable to reconstruct a run because its toolchain does not retain prompts, versioned outputs, or reviewer actions. RAIDT uses evidence capture feasibility to separate aspirational governance from operationally supportable governance.
If this concept is missing, organisations may overestimate their audit readiness, underestimate procurement risk, and misinterpret weak evidence as a minor documentation inconvenience rather than a structural limitation. By foregrounding feasibility, RAIDT helps move governance from principle statements to realistic operational judgement.
Key idea: Evidence capture feasibility matters because RAIDT can govern only what an organisation can meaningfully evidence at the level of the individual run.
What this item explains
- Whether a given platform or workflow can capture the artefacts needed for run-level governance.
- Why missing metadata is not merely a technical inconvenience but a governance limitation.
- How weak evidence capture constrains the quality of the evidence pack and the defensibility of the score profile.
- Why low scores on Auditability or Traceability may reveal infrastructure or procurement problems rather than reviewer weakness.
- Which parts of a run are most at risk of becoming invisible, such as prompt history, context, tool settings, output revisions, or human review actions.
- How organisations can distinguish between what is theoretically desirable to record and what is operationally feasible to capture.
- Why feasibility should influence implementation planning, vendor selection, workflow design, and proportional governance expectations.
Practical example / likely audience question
Audience question
What if the platform cannot log everything?
Answer
The concern behind this question is that RAIDT might appear to assume ideal technical visibility. The direct answer is no: RAIDT does not require perfection, but it does require that evidence limitations be made explicit rather than hidden. If a platform cannot log everything, the missing evidence becomes a governance fact about that implementation environment.
For example, an organisation may use a vendor chatbot that stores final outputs but does not retain prompt history, configuration details, or reviewer edits. In that case, RAIDT can still be applied, but the resulting evidence pack will be thinner and the score profile should reflect that limitation, especially in Auditability and Traceability. The issue is not that RAIDT has failed. The issue is that the platform does not support the level of evidence capture needed for stronger governance assurance.
This is where RAIDT is more useful than a generic AI governance approach. A generic approach may stop at recommending better documentation. RAIDT turns the limitation into an assessable governance finding. It shows that the gap may need to be addressed through procurement requirements, wrapper design, workflow redesign, logging infrastructure, or policy constraints on which tools are acceptable for certain classes of work.
Practical example in RAIDT terms
Consider an enterprise productivity setting in which staff use a generative AI assistant to draft contract summaries for internal procurement teams. The use case is attractive because it speeds up first-pass review of supplier terms, but the run-level issue is that the chosen platform only stores the final generated summary and a timestamp. It does not preserve the original prompt, attached contract excerpt, model settings, or the sequence of edits made by the employee before the summary is circulated.
The evidence needed for stronger RAIDT governance would include the task purpose, the source clause text supplied to the model, the prompt or instruction template, the model and version used, the generated draft, any employee edits, review comments from legal staff, and the final decision on whether the summary could be relied upon. Responsibility is affected because accountability for review and sign-off becomes harder to demonstrate. Auditability is affected because a later reviewer cannot reconstruct how the summary emerged. Interpretability is affected because the reasoning context of the output is under-documented. Dependability is affected because recurring output quality problems cannot be analysed properly across runs. Traceability is affected because the chain from source text to generated and approved output is incomplete.
In governance-readiness terms, the organisation's position improves when the tool is wrapped with structured templates and logging controls, or when staff are required to submit source excerpts, prompts, and reviewer notes into an evidence form before relying on the output. RAIDT therefore makes the limitation actionable: it identifies what additional evidence infrastructure is required before the workflow can be treated as strongly governable.
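The contract-summary example above can be made concrete with a small sketch. It is illustrative only: the `RunEvidence` class, its field names, and the `missing_artefacts` helper are hypothetical and are not defined by RAIDT. The sketch lists the artefacts named in the example and reports which ones a platform failed to capture.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RunEvidence:
    """Hypothetical record of the artefacts named in the contract-summary example."""
    task_purpose: Optional[str] = None
    source_clause_text: Optional[str] = None
    prompt_template: Optional[str] = None
    model_and_version: Optional[str] = None
    generated_draft: Optional[str] = None
    employee_edits: Optional[str] = None
    legal_review_comments: Optional[str] = None
    final_decision: Optional[str] = None
    timestamp: Optional[str] = None


def missing_artefacts(record: RunEvidence) -> list[str]:
    """Return the names of artefacts that were never captured for this run."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# The platform in the example keeps only the generated summary and a timestamp.
platform_only = RunEvidence(
    generated_draft="Summary of supplier termination clause ...",
    timestamp="2024-01-15T10:30:00Z",
)
gaps = missing_artefacts(platform_only)
print(f"{len(gaps)} of {len(fields(RunEvidence))} artefacts missing: {gaps}")
```

A manual evidence form or wrapper would, under this assumption, fill the remaining fields so that the list of missing artefacts shrinks before the summary is relied upon.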
Detailed link to RAIDT
Evidence capture feasibility links to RAIDT in four ways.
First, it tests RAIDT's core idea that responsible governance should be grounded in evidence from actual runs rather than broad claims about tools or policies.
Second, it determines whether the run can function as a meaningful unit of governance, because a run that cannot be evidenced adequately cannot be reviewed in depth.
Third, it shapes the quality of the evidence pack and the confidence with which a RAIDT score profile can be justified across the five pillars.
Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning by revealing where evidence infrastructure is sufficient and where governance is being constrained by technical or process limitations.
Evidence capture feasibility → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness
Link to the five RAIDT pillars
Responsibility
Evidence capture feasibility supports Responsibility by determining whether the organisation can show who initiated, reviewed, approved, or relied on a run and under what authority.
Example evidence / implication:
- Named user role, reviewer role, or approver role attached to the run record.
- Evidence of whether human checking, escalation, or sign-off was required, and whether it was adequately captured.
Auditability
This item has a very strong effect on Auditability because poor evidence capture directly weakens the ability of another person to reconstruct and inspect the run after the event.
Example evidence / implication:
- Availability or absence of prompts, source inputs, outputs, timestamps, and review records.
- Clear indication that low auditability reflects missing capture capacity rather than only poor reviewer practice.
Interpretability
Evidence capture feasibility affects Interpretability because explanation of an output depends partly on whether the practical conditions of generation were recorded.
Example evidence / implication:
- Prompt wording, instruction templates, and contextual notes that make the output intelligible in use.
- Recognition that interpretability is constrained when the system exposes only the final answer without the surrounding run context.
Dependability
This item affects Dependability by influencing whether repeated failures, inconsistencies, or process weaknesses can be detected across comparable runs.
Example evidence / implication:
- Ability to compare runs over time using captured versions, review outcomes, and error notes.
- Reduced confidence in process reliability when evidence gaps make failure analysis incomplete.
Traceability
Evidence capture feasibility is especially central to Traceability because it determines whether the organisation can connect the run to its inputs, outputs, actors, timing, and downstream use.
Example evidence / implication:
- Linkage between source materials, generated artefacts, and final approved outputs.
- Explicit recognition that traceability is structurally limited if the platform hides or discards key metadata.
Evidence capture feasibility affects all five pillars, but it is most immediately decisive for Auditability and Traceability because those pillars deteriorate quickly when run artefacts cannot be retained or reconstructed.
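One way to picture the pillar links is as a lookup from each pillar to the example evidence listed for it above. The sketch below is a hedged illustration: the artefact labels and the `evidence_gaps` function are hypothetical names chosen for this example, not a RAIDT-defined schema, and the output describes gaps only rather than scores.

```python
from __future__ import annotations

# Illustrative mapping from each pillar to the example evidence listed above.
# The artefact labels are hypothetical, not a RAIDT-defined schema.
PILLAR_EVIDENCE = {
    "Responsibility": {"user_role", "reviewer_role", "approver_role"},
    "Auditability": {"prompt", "source_inputs", "output", "timestamp", "review_record"},
    "Interpretability": {"prompt", "instruction_template", "contextual_notes"},
    "Dependability": {"output_versions", "review_outcome", "error_notes"},
    "Traceability": {"source_inputs", "output", "approved_output"},
}


def evidence_gaps(captured: set[str]) -> dict[str, list[str]]:
    """For each pillar, list the example artefacts that were not captured."""
    return {
        pillar: sorted(needed - captured)
        for pillar, needed in PILLAR_EVIDENCE.items()
    }


# A platform that keeps only the final output and a timestamp.
gaps = evidence_gaps({"output", "timestamp"})
for pillar, missing in gaps.items():
    print(f"{pillar}: missing {missing}")
```

How a given gap should affect a pillar score remains a reviewer judgement; the sketch only makes the missing artefacts explicit per pillar.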
Why this item is more than a generic concept
In general AI governance, evidence capture feasibility might mean whether an organisation has enough documentation or logging to support oversight in a broad sense. In RAIDT, it has a more precise and operational meaning: whether one specific run can yield enough structured evidence to support an evidence pack, justify a five-pillar score profile, and withstand meaningful review.
The RAIDT meaning is more operational because it does not treat missing evidence as a vague implementation problem. It treats it as a measurable governance limitation attached to concrete runs, workflows, platforms, and deployment choices. That makes the concept useful for practice, procurement, and audit readiness rather than for abstract discussion alone.
Common misunderstanding
Misunderstanding
If evidence capture feasibility is low, RAIDT cannot be used.
Correction
Low feasibility does not make RAIDT irrelevant; it makes RAIDT diagnostically important. The framework can still assess the run, but it should show that the governance weakness lies in limited capture capability. For example, if a public-sector chatbot records only a final answer and not the prompt or source retrieval context, RAIDT can still be applied to that constrained record. The resulting assessment should then state clearly that weak Auditability and Traceability arise from missing evidence infrastructure. This is valuable because it turns an invisible limitation into an actionable governance finding.
Boundary and limitation
Evidence capture feasibility does not guarantee that the captured evidence is accurate, sufficient for every purpose, or ethically uncomplicated. A platform may expose many logs while still omitting crucial contextual meaning, and extensive capture may introduce privacy, retention, or proportionality concerns. The concept therefore does not prove governance quality by itself.
It also does not replace wider governance tasks such as model evaluation, legal review, procurement due diligence, workflow design, or staff training. Evidence capture feasibility tells us whether the conditions exist for reconstructable run-level evidence, not whether the underlying model is correct, fair, or safe in general.
Evidence capture may prove infeasible in settings where evidence is distributed across multiple tools, where human actions occur off-platform, or where vendor restrictions prevent access to needed artefacts. RAIDT handles this by treating incomplete capture as part of the assessment itself. The framework does not conceal the limitation; it exposes it and shows where implementation or procurement change is needed.
Implementation levels
Manual implementation
A researcher or small team can apply this concept manually by using a structured run sheet that records prompts, source material, outputs, timestamps, reviewer identity, and decision notes outside the platform when the platform itself is limited. This is burdensome but often sufficient for pilot studies, viva examples, or small-scale governance trials.
Semi-automated implementation
Semi-automated implementation can use templates, wrappers, forms, and workflow checkpoints that capture some metadata automatically while asking users to complete contextual fields manually. For example, a browser-based wrapper around a GenAI tool might save prompt text and output versions while a review form records purpose, reliance level, and approval status.
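A minimal sketch of the wrapper idea follows. It is a Python analogue of the browser-based wrapper described above, and every name in it (`capture_run`, `fake_model`, the log fields) is hypothetical. The wrapper records prompt, output, and timestamp automatically and leaves the contextual fields empty for a reviewer to complete.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Callable


def capture_run(generate: Callable[[str], str], prompt: str, log: list) -> str:
    """Call a text generator and append the prompt, output, and timestamp to a log."""
    output = generate(prompt)
    log.append({
        "run_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "output": output,
        # Contextual fields a reviewer completes later through a form.
        "purpose": None,
        "reliance_level": None,
        "approval_status": None,
    })
    return output


def fake_model(prompt: str) -> str:
    """Stand-in for a real GenAI call, used only to make the sketch runnable."""
    return f"DRAFT SUMMARY for: {prompt}"


run_log: list = []
capture_run(fake_model, "Summarise clause 4.2", run_log)
run_log[0]["purpose"] = "First-pass supplier review"  # manual form entry
print(json.dumps(run_log[0], indent=2))
```

Keeping the manual fields in the same record as the automatic ones means an unfilled field stays visible as `None` instead of disappearing from the evidence.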
Fully automated implementation
At scale, a platform, orchestration layer, or governance pipeline can automatically capture run identifiers, model details, prompt templates, source references, outputs, review actions, and scoring inputs. A dashboard can then assemble evidence packs, flag low-feasibility workflows, and inform procurement or architecture decisions about which tools are acceptable for high-accountability uses.
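The flagging step can be sketched as a simple aggregate over run records. The required-field list, the `capture_rate` measure, and the 0.5 threshold below are all assumptions made for illustration; RAIDT does not prescribe them, and a real deployment would set them in proportion to the risk class of the work.

```python
from __future__ import annotations

from collections import defaultdict

# Hypothetical set of fields a complete run record would contain.
REQUIRED_FIELDS = (
    "run_id", "model", "prompt_template", "source_refs", "output", "review_action",
)


def capture_rate(run: dict) -> float:
    """Fraction of required fields present with a non-empty value."""
    present = sum(1 for name in REQUIRED_FIELDS if run.get(name))
    return present / len(REQUIRED_FIELDS)


def flag_low_feasibility(runs: list[dict], threshold: float = 0.5) -> list[str]:
    """Return workflows whose mean capture rate falls below the threshold."""
    rates = defaultdict(list)
    for run in runs:
        rates[run["workflow"]].append(capture_rate(run))
    return sorted(w for w, r in rates.items() if sum(r) / len(r) < threshold)


runs = [
    {"workflow": "contract-summaries", "run_id": "r1", "output": "..."},
    {"workflow": "contract-summaries", "run_id": "r2", "output": "..."},
    {"workflow": "policy-drafting", "run_id": "r3", "model": "m-1",
     "prompt_template": "t1", "source_refs": ["doc-7"], "output": "...",
     "review_action": "approved"},
]
print(flag_low_feasibility(runs))
```

In this toy data the contract-summary workflow captures two of six fields per run and is flagged, while the fully captured policy-drafting workflow is not.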
Practical use in the RAIDT project
Within the RAIDT project, this item is important for Paper 08 Foundations because it clarifies a central boundary condition: RAIDT depends on evidential access to runs, but it does not assume that such access is universally available. The concept therefore sharpens the framework's realism and guards against overclaiming about what responsible governance can achieve in closed or weakly instrumented environments.
For Paper 09 Empirical Validation, evidence capture feasibility is likely to be a major empirical variable. It can explain differences in score confidence across settings and help distinguish whether weak governance readiness is due to poor practice, poor tooling, or both. This is especially useful when comparing pilots, sector workflows, or implementation models.
For Paper 10 Policy Pathways, the item provides a route from conceptual governance to procurement and implementation guidance. It can inform policy recommendations about minimum evidence-capture requirements, acceptable platform features, documentation obligations, and proportional controls for different risk classes. It is also directly useful in sector playbooks, evidence-pack design, scoring-rubric justification, supervisor explanation, viva defence, and journal positioning because it shows that RAIDT is attentive to infrastructural constraints rather than assuming frictionless accountability.
Key audience questions to prepare for
Q1. Does evidence capture feasibility mean every GenAI tool must record everything?
No. RAIDT does not require maximal logging in every case. It requires sufficient, proportionate evidence for the level of governance scrutiny that the task demands. The key issue is adequacy for reconstruction and review, not indiscriminate data capture.
Q2. Is low feasibility mainly a technical issue?
Not only. It can be technical, but it can also arise from procurement choices, workflow design, weak review processes, fragmented tool use, or organisational decisions not to retain key artefacts. RAIDT treats all of these as governance-relevant causes.
Q3. Why not just compensate for missing platform logs with policy statements?
Because policy cannot recreate missing run artefacts after the event. A strong policy may explain intended practice, but it cannot prove what happened in one contested or high-stakes run if the evidence was never captured.
Q4. How does this help with viva or supervisor scrutiny?
It shows that RAIDT is not naively claiming universal observability. You can explain that the framework is designed to reveal when governance is limited by evidence infrastructure and to convert that limitation into a practical recommendation about implementation or procurement.
Q5. What is the governance value of scoring a run when evidence is incomplete?
The value lies in making incompleteness visible and assessable. A low-confidence or lower-scoring assessment can still show where a workflow fails to support auditability, traceability, and accountable review, which is often exactly the governance finding an organisation needs.
Suggested citation concepts to support this item
- generative AI logging limitations in organisational use
- AI auditability and evidence capture requirements
- traceability challenges in large language model deployments
- vendor opacity and governance of generative AI platforms
- sociotechnical infrastructure for AI accountability
- documentation and reconstruction of AI-assisted decision processes
- procurement requirements for auditable AI systems
- human oversight records in generative AI workflows
- evidence sufficiency in AI governance and assurance
- operational constraints on AI audit readiness
Short explanation for presentation
Evidence capture feasibility asks a simple but crucial question inside RAIDT: can this GenAI run actually be evidenced well enough to support review? RAIDT treats the run as the unit of governance, so the framework depends on being able to capture prompts, inputs, outputs, settings, review actions, and other context in a reconstructable way. If a platform hides those artefacts or a workflow fails to retain them, RAIDT does not ignore the gap. Instead, it turns that limitation into a governance finding, often reflected in weaker Auditability and Traceability. This makes the concept especially important for procurement, implementation design, and audit readiness. In short, evidence capture feasibility keeps RAIDT realistic: it shows that responsible governance depends not only on having principles, but on having the practical means to produce run-level evidence.
One-line takeaway
Evidence capture feasibility is the practical possibility of recording enough of a GenAI run to support RAIDT's evidence pack, score profile, and governance readiness judgement.
Related items in boundaries, limitations and future questions
Anchored questions
No anchored questions are currently listed in the source item.