The regulatory landscape for AI is evolving rapidly across jurisdictions. The EU AI Act establishes the world's first comprehensive AI regulation with risk-based tiers. GDPR already applies to LLM systems that process personal data. US executive orders set voluntary frameworks, while sector-specific regulations (HIPAA, financial services, education) add domain-specific requirements. Engineers must design systems that can adapt to this shifting landscape.
1. EU AI Act Risk Tiers
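The Act defines four tiers: unacceptable risk (banned practices such as social scoring), high risk (uses listed in Annex III, e.g. hiring and credit decisions), limited risk (transparency obligations, covering most general chatbots), and minimal risk. A minimal sketch of mapping LLM use cases to tiers follows; the use-case names and their tier assignments are illustrative assumptions, not legal determinations:

```python
# Illustrative mapping of LLM use cases to EU AI Act risk tiers.
# The categories and assignments are simplified examples, not legal advice.
RISK_TIERS = {
    "unacceptable": ["social_scoring", "subliminal_manipulation"],
    "high": ["hiring_screening", "credit_scoring", "medical_triage"],
    "limited": ["customer_chatbot", "content_generation"],
    "minimal": ["spam_filtering", "code_autocomplete"],
}


def classify_use_case(use_case: str) -> str:
    """Return the risk tier for a known use case, else 'unclassified'."""
    for tier, cases in RISK_TIERS.items():
        if use_case in cases:
            return tier
    return "unclassified"


print(classify_use_case("customer_chatbot"))   # limited
print(classify_use_case("hiring_screening"))   # high
```

In practice the tier depends on the deployment context, not the model: the same LLM is limited risk as a general chatbot but high risk when screening job applicants.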
2. GDPR Requirements for LLM Systems
| GDPR Article | Requirement | LLM Implication |
|---|---|---|
| Art. 6 | Lawful basis for processing | Need legal basis for training on personal data |
| Art. 13-14 | Transparency | Disclose AI use; explain automated decisions |
| Art. 17 | Right to erasure | Must be able to remove individual's data from model |
| Art. 22 | Automated decision-making | Human review for decisions with legal/significant effects |
| Art. 25 | Data protection by design | Privacy-preserving training, PII filtering |
| Art. 35 | DPIA required | Impact assessment before deploying LLM systems |
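Article 22's human-oversight requirement can be enforced at the routing layer: decisions flagged as having legal or similarly significant effects go to a human review queue instead of being auto-applied. A minimal sketch, using a hypothetical `Decision` record whose fields are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical decision record; field names are illustrative."""
    subject_id: str
    outcome: str
    legal_or_significant_effect: bool


def route_decision(decision: Decision) -> str:
    """Route decisions with legal/significant effects to human review (GDPR Art. 22)."""
    if decision.legal_or_significant_effect:
        return "human_review_queue"
    return "auto_approved"


print(route_decision(Decision("u1", "loan_denied", True)))    # human_review_queue
print(route_decision(Decision("u2", "faq_answer", False)))    # auto_approved
```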
```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ComplianceChecker:
    """Check LLM deployment against regulatory requirements."""

    checks: dict = field(default_factory=lambda: {
        "gdpr_lawful_basis": False,
        "transparency_notice": False,
        "dpia_completed": False,
        "human_oversight": False,
        "data_retention_policy": False,
        "erasure_mechanism": False,
        "audit_logging": False,
        "bias_assessment": False,
    })

    def mark_complete(self, check_name: str):
        if check_name in self.checks:
            self.checks[check_name] = True

    def report(self) -> dict:
        """Summarize compliance status and list outstanding checks."""
        total = len(self.checks)
        completed = sum(self.checks.values())
        missing = [k for k, v in self.checks.items() if not v]
        return {
            "score": f"{completed}/{total}",
            "compliant": completed == total,
            "missing": missing,
            "assessed_at": datetime.now(timezone.utc).isoformat(),
        }


checker = ComplianceChecker()
checker.mark_complete("gdpr_lawful_basis")
checker.mark_complete("audit_logging")
print(checker.report())
```
3. Sector-Specific Regulations
```python
import json


def get_sector_requirements(sector: str) -> dict:
    """Return regulatory requirements by sector for LLM deployments."""
    requirements = {
        "healthcare": {
            "regulations": ["HIPAA", "FDA guidance on AI/ML", "21 CFR Part 11"],
            "requirements": [
                "PHI de-identification before LLM processing",
                "BAA with cloud/API providers",
                "Clinical validation for diagnostic support",
                "Audit trail for all AI-assisted decisions",
            ],
        },
        "finance": {
            "regulations": ["SR 11-7", "ECOA", "FCRA", "SEC guidance"],
            "requirements": [
                "Model risk management documentation",
                "Fair lending compliance testing",
                "Explainability for credit decisions",
                "Independent model validation",
            ],
        },
        "education": {
            "regulations": ["FERPA", "COPPA", "state AI laws"],
            "requirements": [
                "Student data privacy protection",
                "Parental consent for minors (COPPA)",
                "Transparency about AI use in grading",
                "Opt-out mechanisms for students",
            ],
        },
    }
    return requirements.get(sector, {"error": "Sector not found"})


print(json.dumps(get_sector_requirements("healthcare"), indent=2))
```
GDPR Article 17 (right to erasure) creates a fundamental challenge for LLMs: you cannot simply delete a person's data from a trained model's weights. Compliance may require machine unlearning techniques (Section 26.11), data filtering before training, or architectural solutions that separate personal data from model parameters.
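Data filtering before training can be sketched as a redaction pass over the corpus. The regexes below are deliberately simplistic assumptions for illustration; production pipelines use dedicated PII-detection tooling rather than a handful of patterns:

```python
import re

# Simplistic PII patterns for illustration only. Real pipelines rely on
# purpose-built PII detectors; these regexes will miss many cases.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\+?\d[\d\s().-]{7,}\d\b"),
}


def redact_pii(text: str) -> str:
    """Replace detected PII spans with typed placeholders before training."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


print(redact_pii("Contact Jane at jane@example.com or +44 20 7946 0958."))
# Contact Jane at [EMAIL] or [PHONE].
```

Redaction at ingestion time avoids the harder problem entirely: data that never enters the training set never needs to be unlearned.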
The EU AI Act classifies general-purpose AI (GPAI) models separately. Providers of GPAI models with systemic risk (trained with more than 10^25 FLOPs) face additional obligations including red-teaming, incident reporting, and cybersecurity requirements.
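To see how the threshold applies, training compute is often estimated with the rule of thumb of roughly 6 FLOPs per parameter per training token — an approximation from the scaling-laws literature, not part of the Act's text:

```python
# Rough training-compute estimate using the common ~6 * params * tokens
# approximation (an assumption, not the Act's methodology), checked
# against the EU AI Act's 10^25 FLOP systemic-risk threshold.

SYSTEMIC_RISK_THRESHOLD = 1e25  # FLOPs, per the EU AI Act


def estimated_training_flops(n_params: float, n_tokens: float) -> float:
    return 6 * n_params * n_tokens


def has_systemic_risk(n_params: float, n_tokens: float) -> bool:
    return estimated_training_flops(n_params, n_tokens) >= SYSTEMIC_RISK_THRESHOLD


# A 70B-parameter model trained on 2T tokens: ~8.4e23 FLOPs, below threshold.
flops = estimated_training_flops(70e9, 2e12)
print(f"{flops:.2e} FLOPs, systemic risk: {has_systemic_risk(70e9, 2e12)}")
```

By this estimate, only the very largest frontier training runs cross the threshold today, but regulators can revise it downward.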
Design for the most restrictive regulation you might face, not just your current jurisdiction. If your LLM application processes data from EU residents, GDPR applies regardless of where your servers are located. Building compliance into the architecture from the start is far cheaper than retrofitting it later.
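The "strictest rule wins" principle can be mechanized as a policy merge: collect each jurisdiction's settings and keep the most restrictive value of each. The jurisdiction names and settings below are illustrative assumptions:

```python
# Merge per-jurisdiction policies, keeping the most restrictive setting.
# Jurisdictions and setting names are illustrative, not a real rule set.
policies = {
    "eu": {"retention_days": 30, "human_review": True, "pii_filtering": True},
    "us": {"retention_days": 365, "human_review": False, "pii_filtering": True},
}


def strictest_policy(policies: dict) -> dict:
    merged = {}
    for policy in policies.values():
        for key, value in policy.items():
            if key not in merged:
                merged[key] = value
            elif isinstance(value, bool):
                merged[key] = merged[key] or value   # require if any jurisdiction requires
            else:
                merged[key] = min(merged[key], value)  # shortest retention wins
    return merged


print(strictest_policy(policies))
# {'retention_days': 30, 'human_review': True, 'pii_filtering': True}
```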
Knowledge Check
1. What are the four risk tiers in the EU AI Act?
2. How does GDPR Article 22 affect LLM-based decision systems?
3. What is a DPIA and when is it required for LLM systems?
4. Why do financial services LLM deployments face unique regulatory challenges?
5. Why should you design for the most restrictive applicable regulation?
Key Takeaways
- The EU AI Act establishes four risk tiers; most LLM chatbots fall under Limited Risk (transparency required) but domain-specific uses may be High Risk.
- GDPR applies to any LLM processing personal data and creates obligations around consent, transparency, erasure, and automated decision-making.
- US regulation relies on executive orders, voluntary commitments, and sector-specific rules (HIPAA, SR 11-7, FERPA) rather than comprehensive AI legislation.
- Conduct a DPIA before deploying LLM systems that process personal data at scale or make decisions affecting individuals.
- Design for the most restrictive regulation you may face; compliance is cheaper to build in than to retrofit.
- GDPR's right to erasure creates a fundamental challenge for LLMs that may require machine unlearning or architectural separation of personal data.