S11.05 - Privacy and data protection
flowchart LR
    A["Traditional AI governance problem:<br/>need evidence but logging can expose sensitive data"] --> B["RAIDT:<br/>Run-level evidence framework"]
    B --> C[["Privacy and data protection:<br/>governs capture, minimisation, access, retention"]]
    C --> D["Evidence pack:<br/>redacted or controlled run record"]
    C --> E["RAIDT score profile:<br/>privacy-aware governance judgement"]
    D --> F[Reviewer reconstruction]
    D --> G[Contestability]
    E --> H[Audit readiness]
    E --> I[Organisational learning]
    J[Healthcare] --> C
    K[Finance] --> C
    L[Public services] --> C
    M[Education] --> C
    N[Enterprise productivity] --> C
Star S11 - Boundaries, Limitations and Future Questions
Star context: This item marks a practical boundary for RAIDT: evidence improves governance only if the evidence pack itself is handled in ways that protect sensitive data, respect proportionality, and avoid creating new organisational harms through logging.
Academic picture
Definition / background
Privacy and data protection in RAIDT refer to the disciplined governance of any sensitive information that may enter the run record, the evidence pack, or the surrounding review process. Because RAIDT treats the run as the unit of governance, it necessarily pays attention to the concrete artefacts generated by one configured use of a generative AI system for a specific task, at a specific time, in a specific context. Those artefacts may include prompts, outputs, attached sources, model settings, reviewer notes, user roles, and downstream actions. Any of these can contain personal data, confidential organisational material, or other restricted content.
Conceptually, this item sits at the intersection of information governance, responsible AI, records management, and risk control. It is concerned not only with whether data are lawfully processed, but also with whether evidence capture is proportionate to the governance purpose. That distinction matters. A system can generate rich evidence while still creating avoidable privacy risk if it logs too much, stores it too long, or grants overly broad access.
Within RAIDT, privacy and data protection therefore differ from a generic compliance statement. They become run-level design questions: what should be captured, what should be redacted, who should see it, how long should it be retained, and how can reviewers reconstruct a run without exposing unnecessary detail. This makes the concept structurally important to both the evidence pack and the five-pillar score profile.
The item belongs in RAIDT because responsible governance cannot rely on evidence collection alone. Evidence must itself be governed. Otherwise, a framework intended to improve accountability may introduce new harms by creating sensitive records without adequate minimisation, access control, or retention discipline.
Why this concept matters
Privacy and data protection matter because governance systems often fail in one of two opposite ways: either they capture too little evidence to support review, or they capture so much detail that the evidence pack becomes a new source of institutional risk. RAIDT is specifically designed to avoid this false choice. It seeks evidence that is sufficient for reviewability and contestability, but proportionate to the sensitivity of the run.
This concept prevents a common confusion in AI governance: the assumption that more logging is always better. In practice, indiscriminate logging can expose personal data, leak confidential business information, reveal protected case material, or create discoverable records that were never intended for broad internal circulation. If privacy controls are absent, the evidence pack may become legally difficult to manage, ethically questionable, or operationally unusable.
For organisations using generative AI, this item turns privacy from a broad principle into an operational governance discipline. It helps teams decide when raw prompts should be masked, when source excerpts should be abstracted, when access should be role-based, when retention should be shortened, and when evidence capture should be limited to metadata rather than content. In that sense, it moves AI governance from general aspiration to implementable control.
Key idea: Privacy and data protection matter in RAIDT because run-level evidence is only governance-ready when the evidence pack is informative enough to review, but restrained enough to avoid creating unnecessary exposure.
What this item controls
- The scope of run evidence captured from prompts, outputs, source materials, user context, and system configuration.
- The degree of data minimisation applied before evidence enters the pack.
- The redaction, masking, pseudonymisation, or abstraction of sensitive fields.
- The access rights granted to reviewers, auditors, managers, and system operators.
- The retention period and deletion rules for evidence linked to completed runs.
- The conditions under which raw content is stored versus when metadata-only capture is more appropriate.
- The balance between accountability needs and exposure risk across different organisational settings.
- The practical governability of RAIDT outputs in high-sensitivity domains.
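The controls listed above can be expressed as a small per-run policy. The following is a minimal sketch, not a RAIDT specification: the class name EvidencePolicy, the three sensitivity labels, the role names, and the retention figures are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidencePolicy:
    """Hypothetical policy attached to a run before evidence is stored."""
    capture_mode: str          # "raw", "redacted", or "metadata_only"
    allowed_roles: frozenset   # roles permitted to view the evidence
    retention_days: int        # days before the evidence is due for deletion


# Illustrative values only; a real organisation would set these itself.
POLICIES = {
    "low": EvidencePolicy("raw", frozenset({"reviewer", "auditor", "manager"}), 365),
    "medium": EvidencePolicy("redacted", frozenset({"reviewer", "auditor"}), 180),
    "high": EvidencePolicy("metadata_only", frozenset({"auditor"}), 90),
}


def choose_policy(sensitivity: str) -> EvidencePolicy:
    """Return the policy for a sensitivity label.

    Unknown labels fall back to the most restrictive policy (fail closed).
    """
    return POLICIES.get(sensitivity, POLICIES["high"])
```

Falling back to the strictest policy for an unrecognised label is a design choice in this sketch: an unclassified run is treated as sensitive until someone decides otherwise.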
Practical example / likely audience question
Audience question
Can logging create risk?
Answer
Yes. The misconception behind the question is that logging is inherently protective because it supports audit and review. In reality, logging can create a second-order governance problem: the records captured to make a run accountable may themselves contain sensitive personal, commercial, legal, or clinical information. If those records are over-detailed, widely accessible, or retained for too long, the governance mechanism becomes a source of risk.
In RAIDT terms, the direct answer is that evidence capture must be governed just as carefully as model use. For example, a team using a generative AI assistant to draft case summaries may wish to retain enough information to reconstruct why a specific output was accepted or challenged. However, storing the full prompt thread with names, identifiers, and attached case notes in a broadly accessible evidence repository would create privacy exposure. A better RAIDT approach is to capture the relevant run metadata, selected excerpts, redacted content, the decision rationale, and the reviewer trail needed for reconstruction.
RAIDT handles this issue better than a generic AI governance approach because it does not stop at a principle such as "protect data". It asks what evidence is needed for this run, who needs to inspect it, and what minimum record supports reviewability without unnecessary disclosure. That makes privacy and data protection operational rather than rhetorical.
Practical example in RAIDT terms
Consider a healthcare provider using a generative AI tool to help draft discharge summaries from clinician notes. The use case is valuable because it reduces administrative burden and speeds documentation. The run-level issue, however, is immediate: prompts and outputs may contain patient identifiers, diagnoses, medication details, and clinician annotations.
In RAIDT, the organisation should not assume that the full prompt-output exchange can simply be logged into a standard evidence repository. Instead, it would define the evidence needed for governance readiness: the task description, model version, prompt template class, timestamp, operator role, reviewer decision, flagged risks, and redacted excerpts sufficient to explain why the output was accepted, amended, or rejected. Where full content must be stored, access control and retention rules become part of the governed run record.
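A minimal sketch of how such a minimised record might be built from a raw run is shown below. The field names follow the list in the preceding paragraph; the functions redact and build_review_record, the two masking patterns, and the sample data are assumptions made for illustration. Real clinical de-identification needs far more than two patterns.

```python
import re


def redact(text, known_names):
    """Mask listed names and long digit runs (illustrative patterns only)."""
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    return re.sub(r"\b\d{6,}\b", "[ID]", text)


def build_review_record(raw_run, known_names, excerpt_chars=120):
    """Build the review-ready record; the raw prompt and output are not copied."""
    return {
        "task_description": raw_run["task_description"],
        "model_version": raw_run["model_version"],
        "prompt_template_class": raw_run["prompt_template_class"],
        "timestamp": raw_run["timestamp"],
        "operator_role": raw_run["operator_role"],
        "reviewer_decision": raw_run["reviewer_decision"],
        "flagged_risks": list(raw_run["flagged_risks"]),
        # Redact first, then truncate, so a name is never cut in half and missed.
        "redacted_excerpt": redact(raw_run["output"], known_names)[:excerpt_chars],
    }


# Hypothetical run data for the discharge-summary example.
raw_run = {
    "task_description": "Draft discharge summary from clinician notes",
    "model_version": "model-x-2024-01",
    "prompt_template_class": "discharge_summary_v1",
    "timestamp": "2024-01-01T09:00:00Z",
    "operator_role": "ward_clerk",
    "reviewer_decision": "amended",
    "flagged_risks": ["medication detail"],
    "prompt": "Notes for Jane Doe, record 12345678 ...",
    "output": "Discharge summary for Jane Doe, record 12345678: stable on warfarin.",
}

record = build_review_record(raw_run, ["Jane Doe"])
print(record["redacted_excerpt"])
```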
The pillars most affected are Responsibility, Auditability, Dependability, and Traceability, with Interpretability also relevant where reviewers must understand how sensitive context shaped the output. The item improves governance readiness because it allows the healthcare provider to reconstruct the run, justify its oversight process, and support audit or challenge without converting the evidence pack into an uncontrolled patient-data archive.
Detailed link to RAIDT
Privacy and data protection link to RAIDT in four ways.
First, they reinforce RAIDT's core idea that governance should be grounded in concrete evidence from specific runs rather than broad claims about systems in general.
Second, they shape what can responsibly be captured at the run level, including how sensitive prompts, outputs, and contextual metadata are minimised and governed.
Third, they determine whether the evidence pack and score profile can be used safely in review, assurance, and escalation processes.
Fourth, they support reviewability, contestability, audit readiness, and organisational learning by ensuring that evidence remains accessible to the right people without becoming unnecessarily exposed to the wrong ones.
Privacy and data protection -> Run-level evidence -> Evidence pack -> RAIDT score profile -> Governance readiness
In other words, this item ensures that RAIDT's move towards evidence does not undermine the very governance quality it is meant to improve.
Link to the five RAIDT pillars
Responsibility
Privacy and data protection strengthen Responsibility by requiring those who design, approve, and operate GenAI use cases to define acceptable evidence practices in advance rather than treating data handling as an afterthought.
Example evidence / implication:
- Documented rules for what categories of run content may be captured, redacted, or excluded.
- Clear assignment of authority for approving access, retention, and disclosure decisions.
Auditability
This item has a strong effect on Auditability because evidence is only auditable if it is available in a controlled form that reviewers can legitimately inspect. Excessive secrecy undermines audit, but excessive disclosure undermines governability.
Example evidence / implication:
- Access logs showing who viewed sensitive run records and for what purpose.
- Review-ready evidence packs containing redacted or minimised material rather than indiscriminate raw dumps.
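The access-log idea can be sketched as follows. This is an in-memory illustration under stated assumptions: the class EvidenceStore, its method names, and the role labels are hypothetical, and a real deployment would use a durable, tamper-resistant log.

```python
from datetime import datetime, timezone


class EvidenceStore:
    """Hypothetical store that records every attempt to view a run record."""

    def __init__(self):
        self._records = {}    # record_id -> (record, allowed_roles)
        self.access_log = []  # one entry per attempt, granted or refused

    def put(self, record_id, record, allowed_roles):
        self._records[record_id] = (record, frozenset(allowed_roles))

    def view(self, record_id, user, role, purpose):
        record, allowed = self._records[record_id]
        granted = role in allowed
        # Log before deciding, so refused attempts are also auditable.
        self.access_log.append({
            "record_id": record_id,
            "user": user,
            "role": role,
            "purpose": purpose,
            "granted": granted,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        if not granted:
            raise PermissionError(f"role {role!r} may not view {record_id}")
        return record
```

Requiring a stated purpose on each view is what lets the log answer both "who viewed this record" and "for what purpose".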
Interpretability
Privacy and data protection affect Interpretability when understanding an output depends on contextual information that may itself be sensitive. RAIDT therefore needs ways to preserve interpretive value without exposing unnecessary detail.
Example evidence / implication:
- Structured summaries explaining relevant input context without reproducing full confidential text.
- Reviewer notes clarifying how sensitive source material influenced judgement on output quality or risk.
Dependability
This item supports Dependability by reducing the operational risk that governance processes themselves become fragile, non-compliant, or unusable because sensitive data have been mishandled. Dependable governance requires evidence practices that can be sustained safely over time.
Example evidence / implication:
- Retention and deletion rules that prevent evidence repositories from becoming unmanaged risk accumulators.
- Controls showing that sensitive runs can still be reviewed consistently under constrained access conditions.
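A retention rule of this kind can be sketched in a few lines. The record fields (run_id, captured_at, retention_days) and the function names are assumptions for illustration; the sketch only identifies what is due for deletion and leaves the deletion itself to the storage layer.

```python
from datetime import datetime, timedelta, timezone


def is_expired(captured_at, retention_days, now):
    """True once the retention period for a record has fully elapsed."""
    return now >= captured_at + timedelta(days=retention_days)


def purge_expired(records, now):
    """Split records into those still retained and the ids due for deletion."""
    kept, deleted_ids = [], []
    for record in records:
        if is_expired(record["captured_at"], record["retention_days"], now):
            deleted_ids.append(record["run_id"])
        else:
            kept.append(record)
    return kept, deleted_ids


# Hypothetical records with a 90-day retention tag.
records = [
    {"run_id": "run-001", "retention_days": 90,
     "captured_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"run_id": "run-002", "retention_days": 90,
     "captured_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
]

kept, deleted_ids = purge_expired(records, datetime(2024, 6, 1, tzinfo=timezone.utc))
print([r["run_id"] for r in kept], deleted_ids)
```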
Traceability
Privacy and data protection strongly affect Traceability because RAIDT must preserve a credible chain from run conditions to review outcomes while ensuring that the trace does not expose more data than necessary.
Example evidence / implication:
- Metadata links connecting a decision outcome to the relevant run, model configuration, reviewer action, and redacted evidence.
- Versioned records showing when sensitive elements were masked, transformed, or restricted from wider circulation.
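One possible way to keep such a trace without carrying raw content is to store a digest of the raw material alongside the redacted version. This is a sketch of a swapped-in technique, not something the framework prescribes: the function names and the mask_version field are hypothetical. Note that a digest of short or guessable content can be reversed by guessing, so it supports integrity checks but is not pseudonymisation in itself.

```python
import hashlib


def digest(text):
    """SHA-256 hex digest of the text, used as an integrity reference."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def trace_entry(run_id, model_config, reviewer_action,
                raw_content, redacted_content, mask_version):
    """Link a decision to its run without embedding the raw content."""
    return {
        "run_id": run_id,
        "model_config": model_config,
        "reviewer_action": reviewer_action,
        "raw_sha256": digest(raw_content),     # verifiable against the sealed original
        "redacted_content": redacted_content,  # what wider reviewers may see
        "mask_version": mask_version,          # which masking rules were applied
    }


# Hypothetical values for illustration.
raw = "Jane Doe, record 12345678"
entry = trace_entry("run-001", "model-x-2024-01", "accepted",
                    raw, "[NAME], record [ID]", "mask-rules-v2")
```

An authorised reviewer holding the original can recompute the digest and confirm that the trace refers to exactly that content.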
This item affects all five pillars, but its strongest operational influence is on Auditability and Traceability because those pillars depend most directly on the design of the evidence record.
Why this item is more than a generic concept
In general AI governance, privacy and data protection often remain high-level commitments about lawful processing, confidentiality, or user rights. In RAIDT, the meaning is more operational. The question is not only whether privacy matters, but how privacy is enacted in relation to one specific run and the evidence required to review it.
That difference matters because RAIDT makes privacy testable through governance artefacts. A team can inspect whether prompts were minimised, whether sensitive content was redacted, whether access was restricted, whether retention was justified, and whether reviewers could still reconstruct the run. The RAIDT meaning is therefore more concrete than a generic policy statement because it is tied directly to run-level evidence.
Common misunderstanding
Misunderstanding
If an organisation wants strong AI accountability, it should retain as much run data as possible.
Correction
More retention does not automatically produce better governance. It may instead produce a large, sensitive archive that is difficult to control, costly to review, and risky to disclose. For example, a public-sector team might log complete prompt histories to show diligence, only to find that those records contain identifiers, case details, and legally sensitive material that most reviewers should never see. RAIDT corrects this by asking what minimum evidence supports reconstruction, challenge, and assurance for the run in question. Good governance is evidence-sufficient, not evidence-maximal.
Boundary and limitation
Privacy and data protection do not prove that a run was fair, accurate, lawful in every respect, or socially acceptable. They also do not remove the need for upstream data governance, model procurement scrutiny, security controls, or domain-specific compliance review. An evidence pack can be carefully protected and still describe a poor or harmful use of GenAI.
This item may also be difficult to implement when interpretive review genuinely depends on seeing highly sensitive source content. In such cases, minimisation has limits, and governance must rely on stricter access control, secure review environments, or differential evidence views for different audiences. RAIDT handles this limitation by making privacy a design decision within the evidence workflow rather than assuming one uniform disclosure model for all runs.
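Differential evidence views can be sketched as a per-role field filter over one stored record. The role names, field names, and the VIEW_FIELDS mapping are illustrative assumptions; unknown roles receive an empty view.

```python
# Hypothetical mapping from role to the fields that role may see.
VIEW_FIELDS = {
    "clinical_reviewer": {"run_id", "reviewer_decision", "redacted_excerpt", "raw_content"},
    "auditor": {"run_id", "reviewer_decision", "redacted_excerpt"},
    "manager": {"run_id", "reviewer_decision"},
}


def view_for(record, role):
    """Return only the fields the given role is allowed to see."""
    allowed = VIEW_FIELDS.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


# Hypothetical stored record.
record = {
    "run_id": "run-001",
    "reviewer_decision": "amended",
    "redacted_excerpt": "Summary for [NAME]",
    "raw_content": "Summary for Jane Doe",
}

print(view_for(record, "manager"))
```

In this sketch the highly sensitive field is visible only to the role whose interpretive review depends on it, which mirrors the case where minimisation alone is not enough.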
Implementation levels
Manual implementation
A researcher or small team can apply this item manually by using a run template that prompts explicit decisions on data sensitivity, redaction, access permissions, and retention period before evidence is stored. Raw prompts and outputs can be reviewed by a limited number of authorised people and then converted into a minimised record for wider governance use.
Semi-automated implementation
Semi-automated implementation can use structured metadata fields, evidence-pack templates, redaction checklists, access labels, and retention tags. This allows teams to standardise privacy decisions across runs while still relying on human judgement for edge cases such as mixed personal and commercial sensitivity.
Fully automated implementation
At scale, a platform or orchestration layer can classify run content, mask sensitive fields, separate raw and review-ready evidence views, enforce role-based access, log reviewer access, and trigger retention or deletion workflows automatically. In a mature RAIDT deployment, privacy controls become embedded in the evidence pipeline rather than added afterwards.
Practical use in the RAIDT project
This item is useful across the RAIDT project because it helps explain why run-level evidence must be governed as well as captured. In a foundations-oriented paper, it supports the claim that accountability requires disciplined evidence design rather than indiscriminate logging. In empirical validation, it provides criteria for assessing whether organisations can actually implement RAIDT in sensitive settings. In policy and pathway work, it shows how RAIDT can align evidence practices with real institutional obligations around confidentiality, proportionality, retention, and internal review.
It is also valuable for sector playbooks and scoring rubrics because privacy questions vary by domain while the underlying governance logic remains stable. For viva defence or supervisor discussion, this item helps answer a predictable challenge: whether a framework based on evidence capture risks creating new harm through documentation. RAIDT's answer is that evidence collection is itself a governed intervention.
Key audience questions to prepare for
Q1. Does RAIDT require storing raw prompts and outputs for every run?
No. RAIDT requires enough evidence to support reviewability and governance judgement, not universal retention of raw content. In sensitive contexts, a minimised or redacted record may be more appropriate than full transcript storage.
Q2. How is this different from ordinary data protection compliance?
Ordinary compliance often operates at organisational or system level. RAIDT adds a run-level governance perspective by asking what evidence should exist for one specific AI-supported task and how that evidence can be safely reviewed.
Q3. Does privacy protection weaken auditability?
Not if designed well. Poorly governed privacy controls can obstruct review, but proportionate redaction, role-based access, and structured metadata can preserve auditability while reducing unnecessary exposure.
Q4. Why place this item in a star about boundaries and limitations?
Because it marks a limit on evidence capture. RAIDT cannot simply assume that more documentation is always better; it must acknowledge that some evidence practices become harmful or impractical if privacy is ignored.
Q5. What would strong implementation look like in practice?
Strong implementation would show sensitivity classification, evidence minimisation rules, controlled access, retention schedules, and reviewer workflows that still allow challenge and reconstruction. The key test is whether accountability is improved without turning the evidence pack into an uncontrolled sensitive-data store.
Suggested citation concepts to support this item
- AI governance data minimisation
- accountable logging in generative AI systems
- privacy-preserving audit trails for AI
- evidence retention and access control in AI governance
- confidential data handling in human-AI decision support
- run-level accountability for generative AI
- records management for AI-generated outputs
- proportionality in AI assurance and documentation
- reviewer access control for sensitive AI evidence
- privacy by design in organisational AI deployment
Short explanation for presentation
Privacy and data protection in RAIDT mean that the evidence used to govern a generative AI run must itself be governed. Because RAIDT works at run level, it may capture prompts, outputs, settings, sources, reviewer notes, and decision trails. That is valuable for accountability, but it can also create risk if the evidence pack contains personal, confidential, or otherwise sensitive information. The point is therefore not to log everything. It is to capture enough evidence to support reviewability, contestability, and audit readiness while minimising unnecessary exposure. In practice, this means decisions about redaction, access control, retention, and metadata design are part of governance, not an administrative afterthought. This item shows that evidence-based AI governance only works if evidence collection is proportionate and privacy-aware.
One-line takeaway
Privacy and data protection are the governance controls that make RAIDT's run-level evidence usable, reviewable, and proportionate without turning accountability records into a new source of harm.
Related items in boundaries, limitations and future questions
Anchored questions
- Audience question: Can logging create risk? Answer: yes, which is why the evidence pack itself must be governed.
Mentioned in reference-paper summaries (1)
Paper summaries live in Port/93-References/pdf_summaries/. Each file listed below contains the key term at least once.
UNM-005__3458723.md