S5.01 - Responsibility

flowchart LR
    A["Background problem:<br/>principles without run-level proof"] --> B["RAIDT<br/>run-level evidence framework"]
    A2["Traditional limitation:<br/>unclear authority, weak boundaries,<br/>informal oversight"] --> B
    B --> C[["Responsibility<br/>justified, bounded, authorised,<br/>and overseen use"]]
    H["Practical fields:<br/>healthcare, finance, education,<br/>public services, enterprise workflows"] --> C
    C --> D["Evidence pack:<br/>purpose, limits, approvals,<br/>escalation, safety controls"]
    C --> E["Score profile:<br/>Responsibility plus four-pillar view"]
    C --> I["Governance move:<br/>evidence over assertion"]
    D --> F["Reviewability and contestability"]
    E --> G["Governance readiness"]
    D --> J["Organisational learning and policy alignment"]

Star S5 - RAIDT Pillars and Scoring

Star context: Responsibility is one of RAIDT's five governance pillars. It defines whether a specific GenAI run was appropriate, bounded, authorised, and overseen, so that scoring reflects accountable use rather than technical performance alone.


Academic picture
Definition / background

Responsibility in RAIDT asks whether a particular run of a generative AI system was appropriate, bounded, safe, authorised, and overseen in relation to the task being performed. It is therefore concerned with accountable use, not simply with whether the model generated a plausible or useful output. The concept matters because a technically successful answer can still be organisationally irresponsible if it was used outside scope, without authority, without safeguards, or without a route for escalation and review.

Conceptually, Responsibility sits at the intersection of governance, risk ownership, and justified action. In general AI governance discourse, responsibility is often discussed at the level of principles, organisational values, or high-level policy statements. RAIDT narrows the focus to the run as the unit of governance. That shift matters because organisations do not govern generative AI in the abstract; they govern situated uses by named roles, for specific tasks, at specific moments, under concrete constraints.

Responsibility differs from adjacent terms. It is not the same as auditability, which concerns whether a reviewer can inspect and reconstruct what happened. It is not the same as traceability, which concerns whether relevant artefacts, decisions, and transformations can be followed across the run. It is also not identical to dependability, which asks whether the process and outcome are sufficiently stable and reliable for the intended use. Responsibility instead asks whether the run should have occurred in the way it did, under the conditions it did, with the oversight it did.

Within RAIDT, Responsibility belongs inside the five-pillar profile because governance readiness requires more than records. A strong evidence pack with weak responsibility controls would still expose the organisation to misuse, unmanaged risk, and contestable decisions. For that reason, Responsibility connects the evidence pack to organisational accountability: it shows whether the run was normatively and procedurally justified, and whether that justification can be defended to supervisors, auditors, managers, and affected stakeholders.

Why this concept matters

Responsibility solves a recurring governance problem in organisational GenAI use: the gap between having general principles and being able to justify one concrete use. Without this concept, teams often assume that approved access to a tool, a broad acceptable-use policy, or a manager's informal endorsement is enough. RAIDT makes that assumption visible and testable by asking for run-specific evidence that the use was appropriate, bounded, and overseen.

The concept also prevents a common confusion between usefulness and legitimacy. A run may save time and produce a convincing output, but still be irresponsible if it involves sensitive information, bypasses review, exceeds delegated authority, or creates downstream risk for staff, clients, patients, students, or citizens. Responsibility therefore acts as a governance filter on model use, not as a quality judgement on text generation alone.

If Responsibility is missing, organisations drift towards principle-washing: they can claim alignment with responsible AI values while lacking evidence that real uses are governed in a disciplined way. By contrast, RAIDT operationalises responsibility through evidence that can be inspected, challenged, and improved over time. That move is central to RAIDT's wider aim of shifting governance from assertion to documented, reviewable practice.

Key idea: Responsibility matters because it shows whether a specific GenAI run was justified and governable, rather than merely possible or productive.

What this item captures
Practical example / likely audience question

Audience question

If RAIDT already records prompts, outputs, and logs, why does it need a separate Responsibility pillar?

Answer

The concern behind this question is the assumption that documentation alone equals good governance. It does not. Logs can show what happened, but they do not by themselves show whether the run should have happened in that form, under that authority, and within those limits.

The direct answer is that Responsibility addresses the legitimacy of use, whereas logs mainly support reconstruction of use. A run can be perfectly logged and still be irresponsible. For example, a member of staff might use a generative AI system to draft advice for a vulnerable service user without checking whether that use case is permitted, whether the data entered were appropriate, or whether a qualified reviewer was required before the output was acted on.

RAIDT handles this issue better than a generic AI governance approach because it asks for evidence at the level of the specific run. Rather than relying on broad policy language such as "staff should use AI responsibly", RAIDT looks for concrete indicators: a purpose statement, task scope, approval route, safety constraints, escalation rules, reviewer role, and conditions of use for the generated output. That is more operational, more defensible, and more useful in supervision, audit, and post-hoc review.

Practical example in RAIDT terms

Consider a healthcare administration team using a GenAI assistant to draft patient appointment follow-up letters. The use case appears low-risk because it is administrative rather than diagnostic, but a specific run becomes governance-sensitive when staff include details about treatment pathways, missed appointments, or vulnerability indicators.

The run-level issue is not simply whether the output reads well. The real question is whether this use of GenAI was authorised for patient communication, whether the prompt content stayed within approved data boundaries, whether a human reviewer had to approve the final letter, and whether escalation rules existed if the generated draft introduced clinically inappropriate wording.

The required evidence would include the stated purpose of the run, the approved task category, the user's role, any data-handling restrictions, the review requirement before sending, the relevant policy link, and the decision rule for when GenAI drafting must not be used. Responsibility is the primary pillar here, but Auditability, Traceability, and Dependability are also affected because the organisation must reconstruct the run, understand what was generated, and trust that the process is stable enough for repeated administrative use.
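The evidence items listed for this run can be pictured as one structured record. The following is a minimal sketch only: the class name `ResponsibilityRecord`, its field names, and the example values are hypothetical illustrations, not a RAIDT-defined schema, and the policy link is a placeholder.

```python
from dataclasses import dataclass, asdict


@dataclass
class ResponsibilityRecord:
    """Hypothetical run-level record; field names are illustrative only."""
    purpose: str                          # stated purpose of the run
    task_category: str                    # approved task category
    user_role: str                        # role of the person initiating the run
    data_restrictions: list[str]          # data-handling restrictions that applied
    review_required_before_sending: bool  # must a human approve the final letter?
    policy_link: str                      # pointer to the relevant policy
    do_not_use_rule: str                  # decision rule for when GenAI drafting must not be used


record = ResponsibilityRecord(
    purpose="Draft appointment follow-up letter",
    task_category="administrative patient communication",
    user_role="healthcare administrator",
    data_restrictions=["no treatment pathway details", "no vulnerability indicators"],
    review_required_before_sending=True,
    policy_link="<link to relevant policy>",  # placeholder, not a real reference
    do_not_use_rule="Do not use GenAI drafting when clinical wording is needed",
)

# A plain dictionary form that could be attached to an evidence pack.
evidence_entry = asdict(record)
```

Keeping the record explicit per run, rather than implied by a general policy, is what lets a reviewer later check each claim against the run itself.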

In governance-readiness terms, Responsibility improves the organisation's position by showing that GenAI was used under defined authority with explicit boundaries. That is more defensible than claiming after the fact that the tool was merely "helping with drafting".

Detailed link to RAIDT

Responsibility links to RAIDT in four ways.

First, it expresses RAIDT's core idea that governance should focus on the run, not only on the model or policy environment.
Second, it attaches accountability to the specific configured use of GenAI for a task, at a time, in a context.
Third, it feeds the evidence pack and score profile by requiring explicit evidence of purpose, authority, boundaries, oversight, and control.
Fourth, it strengthens reviewability, contestability, audit readiness, and organisational learning because a challenged run can be justified in governance terms rather than defended by informal explanation.

Responsibility → Run-level evidence → Evidence pack → RAIDT score profile → Governance readiness

In this chain, Responsibility is the pillar that asks whether the run was appropriately governed before its outputs are treated as usable organisational artefacts.

Link to the five RAIDT pillars

Responsibility

Responsibility is the primary pillar because it establishes whether the run was appropriate, authorised, bounded, and overseen.

Example evidence / implication:

Auditability

Auditability supports Responsibility by allowing reviewers to inspect whether the claimed controls and approvals can actually be verified after the event.

Example evidence / implication:

Interpretability

Interpretability matters because a responsible run also requires that relevant users and reviewers can understand how the output should be read, limited, and used.

Example evidence / implication:

Dependability

Dependability affects Responsibility because an authorised process is still weak if it behaves erratically or cannot be relied upon for the intended task.

Example evidence / implication:

Traceability

Traceability allows the organisation to follow the artefacts, decisions, and transformations associated with a responsible run.

Example evidence / implication:

Responsibility is therefore the normative anchor of the five pillars, but it depends on the others to make accountability inspectable in practice.

Why this item is more than a generic concept

In general AI governance, responsibility may mean moral accountability, broad organisational ownership, or compliance with stated principles. In RAIDT, it means something more operational and narrower: whether one specific run was justified, bounded, authorised, and overseen with sufficient evidence to support review.

The RAIDT meaning is more operational because it is tied to run-level evidence. It does not stop at saying that someone in the organisation is responsible for AI use in general. Instead, it asks who was responsible for this run, what conditions applied, what limits were set, what review path existed, and what evidence proves those claims. That shift makes the concept usable in governance design, scoring, supervision discussions, and empirical validation.

Common misunderstanding

Misunderstanding

Responsibility just means that a human remains "in the loop", so if a person saw the output, the run counts as responsible.

Correction

Human presence alone is not enough. A person can be involved in a process without clear authority, without understanding the task boundary, and without meaningful capacity to challenge or stop use. In RAIDT, Responsibility requires more than nominal human involvement: it requires defined roles, bounded use, explicit oversight conditions, and evidence that these controls applied to the run.

For example, if an analyst glances at a generated report summary before sending it onward, that does not by itself show responsible governance. A responsible run would also specify whether the analyst was authorised to approve the content, whether the use case was permitted, what should trigger escalation, and whether the summary could influence a consequential decision.
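The difference between nominal and meaningful oversight can be sketched as a simple gap check. This is an illustrative sketch under stated assumptions: `ReviewEvidence`, its fields, and `oversight_gaps` are hypothetical names chosen to mirror the conditions in the analyst example, not part of any RAIDT specification.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewEvidence:
    """Hypothetical evidence about the human review of one run."""
    reviewer_saw_output: bool
    reviewer_authorised_to_approve: bool = False
    use_case_permitted: bool = False
    escalation_triggers: list[str] = field(default_factory=list)
    consequence_assessed: bool = False  # was influence on consequential decisions considered?


def oversight_gaps(ev: ReviewEvidence) -> list[str]:
    """Return the oversight conditions that are not evidenced for this run."""
    checks = [
        (ev.reviewer_saw_output, "no human saw the output"),
        (ev.reviewer_authorised_to_approve, "reviewer not authorised to approve the content"),
        (ev.use_case_permitted, "use case not confirmed as permitted"),
        (bool(ev.escalation_triggers), "no escalation triggers defined"),
        (ev.consequence_assessed, "influence on consequential decisions not assessed"),
    ]
    return [message for ok, message in checks if not ok]


# An analyst who only glanced at the summary: a human was present, but nothing else is evidenced.
glance_only = ReviewEvidence(reviewer_saw_output=True)
print(oversight_gaps(glance_only))
```

In this sketch a human seeing the output satisfies only one of five conditions, which is the point of the correction above.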

Boundary and limitation

Responsibility does not prove that a run was ethically perfect, legally unchallengeable, or free from harm. It also does not replace specialist review in domains where clinical, legal, safeguarding, or regulatory judgement is required. A run can score reasonably on Responsibility and still have weaknesses in interpretability, dependability, or downstream outcome quality.

The concept may also fail if evidence is superficial or performative. For instance, a template may contain a purpose statement and approval tick-boxes, but if users routinely complete them without reflection, the presence of documentation can create false confidence. RAIDT handles this limitation by treating Responsibility as evidence-based and reviewable rather than self-certified. Its strength therefore depends on the quality, specificity, and inspectability of the supporting evidence.

Implementation levels

Manual implementation

A researcher or small team can apply Responsibility manually by using a structured note or checklist for each run. The note should record the task purpose, user role, authority to use GenAI, boundaries on input data, expected review requirements, and any escalation triggers. This approach is feasible for pilot studies, small-scale evidence packs, and early RAIDT demonstrations.

Semi-automated implementation

Responsibility can be semi-automated through templates, metadata fields, and structured review workflows. For example, a run form can require completion of purpose, task category, sensitivity level, approved use case, and reviewer assignment before the run is marked complete. This reduces omission while still leaving substantive judgement with human reviewers.
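A run form of this kind might be gated as in the sketch below. The field names follow the list in the paragraph above, but the dictionary layout, `missing_fields`, and `mark_complete` are assumptions made for illustration rather than an existing RAIDT tool.

```python
# Hypothetical required fields, taken from the example form described above.
REQUIRED_FIELDS = (
    "purpose",
    "task_category",
    "sensitivity_level",
    "approved_use_case",
    "reviewer",
)


def missing_fields(form: dict) -> list[str]:
    """List required fields that are absent, None, or blank."""
    return [name for name in REQUIRED_FIELDS if not str(form.get(name) or "").strip()]


def mark_complete(form: dict) -> dict:
    """Return a copy of the form marked complete, or refuse if fields are missing."""
    missing = missing_fields(form)
    if missing:
        raise ValueError(f"Run cannot be marked complete; missing: {', '.join(missing)}")
    return {**form, "status": "complete"}
```

The gate only checks that the fields were filled in. Whether the stated purpose or use case is actually acceptable remains a judgement for the assigned reviewer.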

Fully automated implementation

At scale, Responsibility can be embedded in a wrapper, orchestration layer, or governance platform that enforces policy-linked controls. The system can require approved task categories, assign reviewers based on risk level, block prohibited data types, log escalation events, and attach responsibility metadata directly to the evidence pack and scoring pipeline. In that form, Responsibility becomes part of the operational governance infrastructure rather than an after-the-fact narrative.
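The controls described above could sit in a thin pre-run check such as the following sketch. Everything named here is hypothetical: the category, data-type, and reviewer values are invented for illustration, and the call to the GenAI system itself is deliberately left out.

```python
from dataclasses import dataclass, field

# Hypothetical policy tables; a real deployment would load these from governed policy sources.
APPROVED_TASK_CATEGORIES = {"admin_letter", "meeting_summary"}
PROHIBITED_DATA_TYPES = {"diagnosis", "safeguarding_note"}
REVIEWER_BY_RISK = {"low": "team_lead", "high": "clinical_governance_officer"}


@dataclass
class GovernedRun:
    """Hypothetical responsibility metadata attached to one run."""
    task_category: str
    risk_level: str
    data_types: set[str]
    reviewer: str = ""
    escalations: list[str] = field(default_factory=list)
    allowed: bool = False


def apply_controls(run: GovernedRun) -> GovernedRun:
    """Check the run against policy, assign a reviewer, and log escalation events."""
    if run.task_category not in APPROVED_TASK_CATEGORIES:
        run.escalations.append(f"unapproved task category: {run.task_category}")
    blocked = sorted(run.data_types & PROHIBITED_DATA_TYPES)
    if blocked:
        run.escalations.append(f"prohibited data types: {', '.join(blocked)}")
    reviewer = REVIEWER_BY_RISK.get(run.risk_level)
    if reviewer is None:
        run.escalations.append(f"unknown risk level: {run.risk_level}")
    else:
        run.reviewer = reviewer
    # The run may proceed only if no control raised an escalation.
    run.allowed = not run.escalations
    return run
```

For example, `apply_controls(GovernedRun("admin_letter", "low", {"appointment_date"}))` would be allowed and assigned to `team_lead`, while the same task carrying `diagnosis` data would be blocked with a logged escalation. The returned object is the responsibility metadata that could be attached to the evidence pack.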

Practical use in the RAIDT project

In Paper 08 Foundations, Responsibility helps define the conceptual claim that governance must attach to the run as the actionable unit of analysis. In Paper 09 Empirical Validation, it provides one of the dimensions that can be assessed across real uses to test whether RAIDT produces meaningful discrimination between weakly and strongly governed runs. In Paper 10 Policy Pathways, it is the bridge from framework language to deployable organisational controls, because it translates principle-level commitments into implementable approval, oversight, and escalation structures.

Within the broader RAIDT project, Responsibility is also useful for sector playbooks, evidence-pack design, scoring rubrics, and governance interventions. It gives supervisors and examiners a clear answer to the question, "What does RAIDT add beyond logging?" It also supports journal positioning by showing that the framework contributes an operational account of accountable GenAI use, rather than another abstract statement of responsible AI principles.

Key audience questions to prepare for

Q1. How is Responsibility different from ordinary compliance?

Compliance usually asks whether a rule or policy exists and has been followed in a broad sense. Responsibility in RAIDT asks whether a specific run was justified, bounded, authorised, and overseen with evidence that can be reviewed later.

Q2. Can a run be auditable but still irresponsible?

Yes. A run may be well logged and easy to reconstruct, yet still fall outside authorised use, lack adequate oversight, or involve inappropriate data. Auditability helps inspect the run; Responsibility helps judge whether the run was legitimate.

Q3. Why score Responsibility separately instead of embedding it everywhere?

A separate pillar prevents accountability issues from disappearing inside broader technical or procedural assessments. It ensures that governance legitimacy is visible, discussable, and comparable across runs.

Q4. Does Responsibility require heavy bureaucracy for every GenAI use?

No. The level of evidence should be proportionate to the task and risk. Low-stakes runs may need lightweight purpose and boundary documentation, whereas higher-stakes uses need formal approval, clearer escalation, and stronger review controls.

Q5. What is the minimum evidence that supports Responsibility in one run?

At minimum, the organisation should be able to show the run's purpose, who initiated it, why the use case was permitted, what boundaries applied, what review or oversight was expected, and how outputs were controlled before use.
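The answers to Q4 and Q5 can be combined into one small sketch: a minimum evidence set for every run, with extra items for higher-stakes uses. The two-tier split, the item names, and both function names are assumptions made for illustration. RAIDT as described here does not fix these tiers.

```python
# Minimum evidence from Q5, expressed as hypothetical item names.
MINIMUM_EVIDENCE = frozenset({
    "purpose",
    "initiator",
    "permission_basis",
    "boundaries",
    "expected_oversight",
    "output_controls",
})

# Additional items for higher-stakes uses, following Q4 (names are illustrative).
HIGHER_STAKES_EXTRAS = frozenset({"formal_approval", "escalation_route", "review_control"})


def required_evidence(high_stakes: bool) -> frozenset[str]:
    """Evidence items expected for a run, scaled to its stakes."""
    return (MINIMUM_EVIDENCE | HIGHER_STAKES_EXTRAS) if high_stakes else MINIMUM_EVIDENCE


def missing_evidence(provided: set[str], high_stakes: bool) -> list[str]:
    """Sorted list of expected evidence items not present in the run's evidence pack."""
    return sorted(required_evidence(high_stakes) - set(provided))
```

A run that supplies only the minimum set would show no gaps when treated as low-stakes, but three gaps when treated as high-stakes, which keeps the documentation burden proportionate.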

Suggested citation concepts to support this item
Short explanation for presentation

Responsibility in RAIDT asks whether a specific GenAI run was appropriate, authorised, bounded, and overseen. That matters because useful output is not the same as legitimate use. An organisation may have broad AI policies, but RAIDT tests whether one actual run was carried out under clear purpose, role allocation, safety limits, and review conditions. In practice, this means Responsibility is supported by evidence such as purpose statements, approval routes, escalation rules, policy links, and controls on how outputs are used. The value of the pillar is that it moves responsible AI from principle-level claims to inspectable run-level evidence. In a supervision or viva setting, the key point is that RAIDT makes accountability operational: it shows not only what the system produced, but whether the organisation can justify having used it in that way.

One-line takeaway

Responsibility is the RAIDT pillar that tests whether a GenAI run was justified, bounded, authorised, and overseen, because accountable governance must be evidenced at run level.

Related items in RAIDT pillars and scoring
Mentioned in reference-paper summaries (5)

Paper summaries live in Port/93-References/pdf_summaries/. Each file listed below contains the key term at least once.

Anchored questions