S4.04 - Task and domain label

flowchart LR
    A["Vague AI use descriptions<br/>mixed tasks and domains"] --> B["RAIDT<br/>run-level evidence framework"]
    A2["Generic governance claims<br/>without situated context"] --> B
    B --> C[[Task and domain label]]
    H["Task type<br/>summarisation, explanation, triage, Q&A"] --> C
    I["Domain<br/>healthcare, finance, policy, cybersecurity"] --> C
    J[Risk band and reviewer expertise] --> C
    C --> D["Evidence pack<br/>right evidence for the right run"]
    C --> E["Score profile<br/>context-sensitive pillar scoring"]
    C --> F["Reviewer reconstruction<br/>clearer audit trail"]
    C --> G["Governance readiness<br/>reviewability, contestability, learning"]

Star S4 - Evidence Architecture and Artefacts

Star context: Specifies the concrete fields and artefacts that make a run record inspectable, including the classification cues that determine which evidence, thresholds, and review expectations apply to a given GenAI run.

Academic picture

Definition / background

Task and domain labels classify a run, for example as healthcare summarisation, finance explanation, policy Q&A, or cybersecurity triage. In RAIDT terms, this means recording both the kind of work the GenAI system was used to perform and the substantive domain in which that work took place.

Conceptually, the task label answers the question, "What is the system being asked to do in this run?" The domain label answers the question, "In what organisational, professional, or sectoral setting is that task being performed?" These are related but not identical. A summarisation task may appear in healthcare, law, education, or public administration, yet the governance implications change because the stakes, norms, and harm pathways differ by domain.

This distinction matters in GenAI governance because many apparent performance issues are actually context issues. The same model output may be acceptable in a low-stakes brainstorming context and unacceptable in a regulated or safety-critical context. RAIDT therefore treats task and domain labels as part of the evidence architecture of a run, not as optional descriptive metadata. Without them, a reviewer cannot know which rubric anchors, escalation thresholds, or evidence requirements should apply.

Within RAIDT, task and domain labels belong inside run-level evidence because RAIDT governs configured use, not abstract models. They help connect the evidence pack to the score profile across Responsibility, Auditability, Interpretability, Dependability, and Traceability. In effect, they are part of the classification layer that makes later judgement operationally meaningful.
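As a minimal sketch of how the two labels might sit inside a run record, the Python below models them as separate fields. The names TaskType, Domain, and RunLabel are hypothetical and are not defined by RAIDT; the enumerated values are simply the examples used in this section.

```python
from dataclasses import dataclass
from enum import Enum


class TaskType(Enum):
    """What the system is being asked to do in this run (illustrative values)."""
    SUMMARISATION = "summarisation"
    EXPLANATION = "explanation"
    TRIAGE = "triage"
    QUESTION_ANSWERING = "question_answering"


class Domain(Enum):
    """The setting in which the task is performed (illustrative values)."""
    HEALTHCARE = "healthcare"
    FINANCE = "finance"
    POLICY = "policy"
    CYBERSECURITY = "cybersecurity"


@dataclass(frozen=True)
class RunLabel:
    """Hypothetical classification attached to one run record."""
    task: TaskType
    domain: Domain

    def key(self) -> str:
        # A stable string that later steps could use to look up expectations.
        return f"{self.domain.value}/{self.task.value}"


label = RunLabel(task=TaskType.SUMMARISATION, domain=Domain.HEALTHCARE)
print(label.key())  # healthcare/summarisation
```

Keeping task and domain as two fields, rather than one combined string, reflects the point above that the same task can appear in different domains with different governance implications.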

Why this concept matters

Task and domain labels solve a basic but often neglected governance problem: organisations cannot review a run properly if they do not know what kind of activity the run represented. Without this classification, reviewers are forced to apply generic expectations to context-specific work. That weakens comparison across runs, distorts scoring, and makes it harder to justify why one case triggered stronger review than another.

The concept also prevents a common confusion between system capability and situated use. A model may be capable of many things, but RAIDT evaluates one configured run for one concrete task in one context. The task and domain labels make that unit of analysis visible. This is essential for moving from broad principles such as fairness, accountability, or safety toward operational governance that can be audited, challenged, and improved.

If the item is missing, several risks follow: sector-specific harms may be missed; inappropriate benchmarks may be used; reviewers with the wrong expertise may be assigned; and organisations may overstate governance maturity because they appear to have controls that are not actually matched to the run under review. In practice, the label is a small field with large downstream consequences.

Key idea: Task and domain labels matter because they tell RAIDT which governance expectations belong to a specific run, turning generic oversight into context-sensitive evidence and review.

What this item captures

The type of work performed in the run, such as summarisation, explanation, classification, drafting, triage, or question answering.
The substantive domain or sector in which the work occurs, such as healthcare, finance, law, education, social care, public services, or cybersecurity.
The context needed to select appropriate evidence expectations, review criteria, and escalation thresholds.
The basis for comparing like with like across runs rather than mixing dissimilar use cases into one governance category.
The conditions under which pillar scoring should be interpreted differently because domain stakes and task demands vary.
The organisational framing needed for reviewer assignment, policy mapping, and learning across repeated deployments.
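One way to picture how the label selects review expectations is a lookup keyed by task and domain. This is a sketch under stated assumptions: the ReviewExpectation fields, the EXPECTATIONS table, and every value in it are hypothetical placeholders, not thresholds or roles prescribed by RAIDT.

```python
from typing import NamedTuple


class ReviewExpectation(NamedTuple):
    """Hypothetical expectations attached to one (task, domain) pair."""
    risk_band: str
    reviewer_expertise: str
    required_evidence: tuple


# Placeholder entries only; a real table would come from organisational policy.
EXPECTATIONS = {
    ("summarisation", "healthcare"): ReviewExpectation(
        risk_band="high",
        reviewer_expertise="clinician",
        required_evidence=("prompt_version", "output_hash", "reviewer_notes"),
    ),
    ("summarisation", "internal_meetings"): ReviewExpectation(
        risk_band="low",
        reviewer_expertise="team_lead",
        required_evidence=("prompt_version",),
    ),
}


def expectations_for(task: str, domain: str) -> ReviewExpectation:
    """Return the expectations for a labelled run, or fail loudly if unlabelled."""
    try:
        return EXPECTATIONS[(task, domain)]
    except KeyError:
        raise LookupError(
            f"No review expectations registered for task={task!r}, domain={domain!r}"
        ) from None


print(expectations_for("summarisation", "healthcare").risk_band)  # high
```

The lookup deliberately raises an error for an unknown pair instead of falling back to a generic default, since applying generic expectations to context-specific work is the failure this item is meant to prevent.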

Practical example / likely audience question

Audience question

Why classify tasks?

Answer

The concern behind the question is usually that classification sounds bureaucratic or redundant, as though a reviewer can simply inspect the prompt and output and work out what happened. In reality, that assumption breaks down quickly in organisational settings. A prompt may look similar across runs, while the governance significance differs because the intended task and domain differ.

The direct answer is that scoring anchors and risk thresholds depend on domain and task type. A model-generated summary for internal meeting notes is not governed in the same way as a model-generated summary of a patient record. Likewise, explanatory output in a finance setting may need to support stricter documentation and review than explanatory output in a low-stakes training context.

RAIDT handles this better than generic AI governance approaches because it binds the classification to a run-level evidence record. Instead of saying, "Our organisation uses GenAI in healthcare," RAIDT can say, "This run was healthcare summarisation, performed at this time, with this prompt lineage, this evidence pack, and these applicable review expectations." That shift makes the governance claim inspectable.

Practical example in RAIDT terms

Consider a hospital team using a GenAI assistant to summarise referral letters before a clinician review. The use case is summarisation, but the domain is healthcare, not generic office productivity. At run level, the issue is that the output may omit a clinically salient symptom, medication conflict, or safeguarding concern even if the summary appears fluent.

In RAIDT terms, the run record should therefore label the task as clinical summarisation and the domain as healthcare. The evidence needed includes the prompt version, retrieved context if any, output hash, reviewer notes, and a domain-appropriate judgement of completeness and risk. Responsibility is affected because accountability for use must be clearly allocated. Dependability is affected because the run must perform consistently under clinical expectations. Auditability and Traceability are affected because a reviewer must be able to reconstruct why this run was handled under healthcare controls rather than a generic summarisation rubric. The label improves governance readiness by ensuring that the run enters the right evidence pathway from the start.
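The evidence listed above can be sketched as a simple completeness check on a run record. The field names, the REQUIRED_FIELDS tuple, and the use of SHA-256 for the output hash are assumptions made for illustration; RAIDT as described here does not specify a schema or a hash function.

```python
import hashlib

# Hypothetical required fields for a healthcare summarisation run.
# Retrieved context is left out because the text treats it as optional ("if any").
REQUIRED_FIELDS = ("task", "domain", "prompt_version", "output_hash", "reviewer_notes")


def output_hash(text: str) -> str:
    """Hash the model output so a reviewer can confirm it was not altered later."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def missing_evidence(record: dict) -> list:
    """Return required fields that are absent or empty in the run record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "task": "summarisation",
    "domain": "healthcare",
    "prompt_version": "v3",  # placeholder version identifier
    "output_hash": output_hash("Summary of referral letter ..."),
    "reviewer_notes": "",  # reviewer has not yet recorded a judgement
}

print(missing_evidence(record))  # ['reviewer_notes']
```

A check like this only confirms that evidence is present. The domain-appropriate judgement of completeness and risk still has to come from a reviewer with the relevant expertise.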

Detailed link to RAIDT

The task and domain label links to RAIDT in four ways.

First, it links to RAIDT's core idea that governance should apply to situated runs rather than to abstract claims about a model or product.
Second, it links directly to run-level evidence because it classifies the specific use that the evidence record is meant to describe.
Third, it links to both the evidence pack and the score profile because task and domain determine which evidence is expected and how pillar judgements should be interpreted.
Fourth, it links to reviewability, contestability, audit readiness, and organisational learning because the classification makes runs comparable, challengeable, and reviewable across repeated use.

Task and domain label -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness

In short, the item helps RAIDT decide what kind of run is being governed before asking whether that run is responsible, auditable, interpretable, dependable, or traceable.
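To illustrate how the classification makes runs comparable, the sketch below groups run records by their task and domain labels so that like is compared with like. The record layout and the run identifiers are hypothetical.

```python
from collections import defaultdict


def group_by_label(runs: list) -> dict:
    """Group run IDs by (task, domain) so reviews compare similar runs only."""
    groups = defaultdict(list)
    for run in runs:
        groups[(run["task"], run["domain"])].append(run["run_id"])
    return dict(groups)


# Hypothetical run records; only the labels and an ID are shown.
runs = [
    {"run_id": "r1", "task": "summarisation", "domain": "healthcare"},
    {"run_id": "r2", "task": "summarisation", "domain": "finance"},
    {"run_id": "r3", "task": "summarisation", "domain": "healthcare"},
    {"run_id": "r4", "task": "triage", "domain": "cybersecurity"},
]

for label, run_ids in group_by_label(runs).items():
    print(label, run_ids)
```

Here the two healthcare summarisation runs fall into one group, while the finance summarisation run is kept apart even though the task type is the same.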

Link to the five RAIDT pillars

Responsibility

Task and domain labels support Responsibility by clarifying the context in which obligations apply. A run in a high-stakes domain may require stricter approval, stronger human oversight, or clearer role allocation than a similar task in a low-stakes setting.