S11.02 - Limitations

flowchart LR
    A["Traditional overclaims<br/>Guarantee correctness<br/>Score equals compliance<br/>Evidence removes risk"] --> B["RAIDT<br/>Run-level evidence framework"]
    B --> C[["Limitations<br/>Bounded claims about what RAIDT can and cannot establish"]]
    C --> D[Run-level evidence pack]
    C --> E[RAIDT score profile]
    C --> F[Reviewer reconstruction]
    D --> G[Reviewability and contestability]
    E --> H[Governance readiness]
    F --> I[Organisational learning]
    J[Healthcare drafting]
    K[Finance compliance support]
    L[Public-sector casework]
    M["Prompts, outputs, timestamps, reviewer notes, approvals"]
    J --> C
    K --> C
    L --> C
    M --> C

Star S11 - Boundaries, Limitations and Future Questions

Star context: Clarifies the disciplined boundary of RAIDT by showing that the framework improves governance readiness around specific GenAI runs, but does not guarantee truth, remove domain risk, certify legality, or replace expert judgement.

Academic picture

Definition / background

In RAIDT, limitations are the explicit constraints on what the framework can validly claim, infer, or assure. They specify that RAIDT cannot guarantee factual correctness, remove domain-specific risk, provide legal certification, or replace expert professional judgement. This is not a weakness in the casual sense; it is a disciplined statement about the scope of a governance framework whose purpose is to improve evidence, reviewability, and accountability around real uses of generative AI.

Conceptually, limitations differ from failures, errors, and boundary conditions. A failure is something that goes wrong in practice. An error is a wrong output, judgement, or process step. A boundary condition defines the context within which a framework can sensibly operate. Limitations, by contrast, describe what RAIDT cannot deliver even when it is working as intended. They therefore protect the framework from overclaiming and protect users, reviewers, and supervisors from misunderstanding the meaning of RAIDT outputs.

This matters in GenAI governance because evidence and scoring can easily be misread as proof of correctness. RAIDT produces a run-level evidence pack and a five-pillar score profile across Responsibility, Auditability, Interpretability, Dependability, and Traceability. Those outputs can show whether a run is better documented, more reviewable, or more governance-ready than another. They do not prove that the generated content is true, fair, lawful, clinically safe, or fit for deployment in all contexts.
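
The distinction between what a score profile can compare and what it cannot prove can be sketched in code. The following is a minimal illustration, not a RAIDT specification: the class `ScoreProfile`, the functions `more_governance_ready` and `interpret`, the 0-5 scale, and the rule that one run is "more governance-ready" only if it scores at least as high on every pillar are all assumptions made for this example. Only the five pillar names and the list of unproven properties come from the text.

```python
from dataclasses import dataclass

# Pillar names come from the text; the 0-5 scale is a hypothetical choice.
PILLARS = ("Responsibility", "Auditability", "Interpretability",
           "Dependability", "Traceability")

# Properties the text says a RAIDT profile does not prove.
NOT_ESTABLISHED = ("true", "fair", "lawful", "clinically safe",
                   "fit for deployment in all contexts")


@dataclass(frozen=True)
class ScoreProfile:
    run_id: str
    scores: dict  # pillar name -> score on the assumed 0-5 scale

    def __post_init__(self):
        # A profile is only meaningful if all five pillars are scored.
        missing = [p for p in PILLARS if p not in self.scores]
        if missing:
            raise ValueError(f"missing pillar scores: {missing}")


def more_governance_ready(a: ScoreProfile, b: ScoreProfile) -> bool:
    """Assumed rule: a is at least as high as b on every pillar, higher on one."""
    at_least = all(a.scores[p] >= b.scores[p] for p in PILLARS)
    strictly = any(a.scores[p] > b.scores[p] for p in PILLARS)
    return at_least and strictly


def interpret(profile: ScoreProfile) -> str:
    """Summarise a profile and state what it does not establish."""
    summary = ", ".join(f"{p}={profile.scores[p]}" for p in PILLARS)
    limits = ", ".join(NOT_ESTABLISHED)
    return (f"Run {profile.run_id}: {summary}. "
            f"This does not prove the content is {limits}.")
```

In this sketch the comparison function answers only a relative governance question about two runs, and the interpretation string always carries the limitation with the scores, so a profile cannot be quoted without its boundary.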

Limitations therefore belong centrally within RAIDT. They frame how run-level evidence should be interpreted, how evidence packs should be used, and how score profiles should be discussed in supervision, organisational review, and academic argument. Without an explicit limitations item, RAIDT could be misunderstood as a guarantee system rather than as an evidence-based governance framework.

Why this concept matters

This concept matters because governance frameworks fail intellectually and practically when they promise certainty that they cannot deliver. In GenAI contexts, organisations often want a method that reduces ambiguity, especially in high-stakes domains. RAIDT helps by making individual runs more reviewable and contestable, but it must also make clear that better governance evidence is not the same thing as guaranteed correctness.

The concept also avoids a serious confusion between governance readiness and outcome validity. A run may be well documented, properly reviewed, and appropriately scored, yet still contain a substantive mistake that only a domain expert can detect. Conversely, a technically strong output can emerge from a poorly governed run. The limitations item helps supervisors, practitioners, and reviewers keep those dimensions analytically separate.
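
One way to keep the two dimensions analytically separate is to record them as independent fields that are never derived from one another. The sketch below assumes a hypothetical `RunAssessment` record and a `describe` helper; neither name is part of RAIDT, and the wording of the labels is illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunAssessment:
    # Drawn from RAIDT evidence and scoring.
    governance_ready: bool
    # Drawn from domain expert review; None means no expert has judged it yet.
    expert_found_valid: Optional[bool] = None


def describe(a: RunAssessment) -> str:
    """Report both dimensions side by side without inferring one from the other."""
    gov = "well governed" if a.governance_ready else "poorly governed"
    if a.expert_found_valid is None:
        out = "substantive validity not yet judged by an expert"
    elif a.expert_found_valid:
        out = "output judged valid by an expert"
    else:
        out = "output judged mistaken by an expert"
    return f"{gov}; {out}"
```

For instance, `describe(RunAssessment(True, False))` reports a well-governed run whose output an expert judged mistaken, and `describe(RunAssessment(False, True))` reports the converse case. The design choice is that governance readiness never fills in a missing expert verdict.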

If this item is missing, organisations may treat a strong RAIDT profile as a proxy for truth, legal sufficiency, or professional adequacy. That creates a false sense of assurance and can lead to misuse of scores, weak escalation practice, and poor communication with decision-makers. By stating its own limitations, RAIDT moves governance from vague confidence to disciplined operational realism.

Key idea: Limitations matter because RAIDT improves the quality of governance evidence, not the certainty of the underlying world that the evidence describes.

What this item explains

The difference between governance evidence and guaranteed substantive correctness.
Why RAIDT outputs must be interpreted as support for review, not as replacements for expert judgement.
What a run-level evidence pack and score profile can legitimately show.
Why high scores do not equal legal certification, factual truth, or domain safety.
How explicit limits make RAIDT more credible in PhD supervision, policy discussion, and organisational adoption.
Why responsible AI governance requires bounded claims rather than framework overstatement.

Practical example / likely audience question

Audience question

If RAIDT captures rich evidence and gives a five-pillar score, why can it not guarantee that the output is correct or compliant?

Answer

The concern behind this question is that structured evidence and scoring can look like a seal of approval. The direct answer is that RAIDT evaluates governance readiness around a specific run, not the ultimate truth-value or legal validity of the generated content. A well-evidenced run can still produce a wrong answer, and a well-scored process can still require human correction, escalation, or rejection.

For example, a university might use GenAI to draft guidance for a student visa query. RAIDT can capture the prompt, policy sources consulted, tool version, reviewer comments, edits made, and approval decision. It can therefore show whether the institution used the tool in a reviewable and accountable way. What RAIDT cannot do is certify that the guidance is legally correct under immigration law. That judgement still depends on expert review, current policy interpretation, and, where necessary, formal legal advice.

RAIDT handles this better than a generic AI governance approach because it makes the boundary explicit at run level. Rather than implying that governance artefacts automatically settle correctness or compliance questions, RAIDT shows precisely what evidence exists, how the run was governed, and where human expertise must still intervene.

Practical example in RAIDT terms

Consider a healthcare setting in which a clinician uses a GenAI assistant to draft discharge instructions after a hospital visit. The use case is efficient and operationally plausible, but the run-level issue is that the drafted instructions may still contain a clinically misleading statement, an omitted warning, or phrasing inappropriate for the patient's condition.

The evidence needed for RAIDT would include the task purpose, prompt template, relevant de-identified source notes, model or tool version, timestamp, generated draft, clinician edits, final approved discharge text, and notes on whether review or escalation occurred. Responsibility is affected because a named clinician or team must remain accountable for approving the communication. Auditability and Traceability are affected because the run must be reconstructable after the event. Interpretability is affected because reviewers need to understand how the draft arose from the available instructions and inputs. Dependability is affected because repeated safe performance cannot be assumed simply because documentation exists.
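
The evidence items listed above could be checked for presence with a short routine. This is a minimal sketch under stated assumptions: the field names, the dictionary representation of an evidence pack, and the function `missing_evidence` are hypothetical and chosen only to mirror the list in the paragraph. The check confirms that each item was recorded; it says nothing about whether the discharge text is clinically correct.

```python
# Hypothetical field names mirroring the evidence items named in the text.
REQUIRED_FIELDS = (
    "task_purpose",
    "prompt_template",
    "deidentified_source_notes",
    "model_or_tool_version",
    "timestamp",
    "generated_draft",
    "clinician_edits",
    "final_approved_text",
    "review_or_escalation_notes",
)


def missing_evidence(pack: dict) -> list:
    """Return required fields that are absent, None, or blank in the pack."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = pack.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing
```

An empty result means the run can be reconstructed from its records, which supports Auditability and Traceability. A clinician who made no edits would need to record that explicitly (for example as "none") for the field to count as present in this sketch.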

The limitations item improves governance readiness here by forcing the organisation to say something precise: RAIDT can show whether the discharge-instruction run was well governed and well evidenced, but it cannot itself certify that the final clinical advice was medically correct. That distinction is exactly what makes the governance claim credible.

Detailed link to RAIDT

The limitations item links to RAIDT in four ways.

First, it protects the core RAIDT idea from being overstated by making clear that the framework is about evidence-based governance of GenAI use, not about producing certainty from uncertainty.

Second, it ties directly to the run because limitations apply to what can be inferred from one governed use event, even when that event is well documented.

Third, it disciplines interpretation of the evidence pack and the RAIDT score profile by clarifying that these outputs support judgement, reconstruction, and comparison rather than guaranteeing substantive validity.

Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning because reviewers can assess both what the evidence shows and what remains outside the framework's evidential reach.

Limitations -> bounded run-level claims -> evidence pack -> RAIDT score profile -> governance readiness without overclaiming

Link to the five RAIDT pillars

Responsibility

Limitations strongly affect Responsibility because the item makes clear that human and organisational accountability cannot be delegated to the framework itself. RAIDT can support responsible oversight, but it cannot become the accountable actor.