Building EU AI Act-Ready Agents with LangChain

1. Foreword: the €15M question

TLDR. The EU AI Act's high-risk obligations become enforceable on 2 August 2026. Non-compliance: up to €15M or 3% of worldwide annual turnover, whichever is higher. This is a builder's guide to making LangChain agents AI Act-ready, written for execs and the engineers who report to them.

I'm the LangChain ambassador for the Netherlands — which puts me squarely inside the EU's enforcement jurisdiction. I've spent 15+ years in software engineering, with 8+ of those as a solution architect in regulated industries: finance, banking, enterprise infrastructure. I've navigated compliance frameworks, sat across the table from risk and legal teams, and built systems where audit trails weren't optional and accountability had to be designed in from day one.

I'm also the author of Platform Agentic — compliance, governance, and accountability for teams building agentic AI systems.

That experience is why the EU AI Act reads differently to me than it might to someone coming purely from the AI side. The obligations in Articles 9 through 72 aren't novel concepts — they're the same governance expectations that regulated industries have always demanded, now applied to AI systems. The builders who will find this easiest are the ones who've already internalized that discipline. This post is about the governance layer: accountability, audit trail, and the seven articles that drive operational requirements for high-risk AI systems — and how LangChain v1 maps to each one.

Why this piece, why now:

  • Penalty cap: up to €15,000,000 or 3% of worldwide annual turnover, whichever is higher
  • Enforcement deadline: 2 August 2026 (less than 3 months away as of writing)
  • First comprehensive AI regulation anywhere in the world
  • High-risk sectors covered: credit scoring, recruitment, healthcare, biometric identification, education, employment, critical infrastructure, law enforcement, migration, justice
  • Most enterprise agents fall into "high-risk" — that's where the obligations bite

What this post covers:

  • High-risk system obligations (Articles 9, 10, 12, 13, 14, 15, 72)
  • GPAI (general-purpose AI) obligations and what they mean for downstream builders
  • Deployment topology and EU data residency
  • A 90-day plan to get audit-ready

No drop-in implementations. Conceptual code sketches and diagrams only.


2. What the EU AI Act classifies

TLDR. Four risk tiers + a GPAI overlay. Most enterprise agents in regulated industries are high-risk — that's where the operational obligations sit.

The risk pyramid

flowchart TB
    Prohibited["PROHIBITED<br/>e.g. social scoring, manipulative AI,<br/>untargeted facial scraping"]
    HighRisk["HIGH-RISK<br/>credit, HR, healthcare, biometric,<br/>critical infrastructure, law enforcement"]
    Limited["LIMITED RISK<br/>chatbots, deepfakes<br/>(transparency obligations)"]
    Minimal["MINIMAL RISK<br/>spam filters, video games<br/>(no obligations)"]
    GPAI["GPAI Models<br/>separate obligations<br/>cross-cutting"]

    Prohibited --> HighRisk
    HighRisk --> Limited
    Limited --> Minimal

    GPAI -.cross-cuts.-> HighRisk
    GPAI -.cross-cuts.-> Limited

    classDef prohibited fill:#fcc,stroke:#a00,color:#000
    classDef high fill:#fde2e2,stroke:#a02020,color:#000
    classDef limited fill:#fff4cc,stroke:#cc8800,color:#000
    classDef minimal fill:#e6f5d0,stroke:#5a8a3a,color:#000
    classDef gpai fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class Prohibited prohibited
    class HighRisk high
    class Limited limited
    class Minimal minimal
    class GPAI gpai

High-risk triggers (Article 6 + Annex III)

You're high-risk if your agent makes or materially supports decisions in:

  • Financial services — creditworthiness, credit scoring, insurance pricing
  • Employment — recruitment, screening, performance evaluation, task allocation
  • Education — admissions, exam scoring, learning evaluation
  • Healthcare — medical devices, triage, diagnostic support
  • Biometric identification — including emotion recognition
  • Critical infrastructure — energy, water, transport, digital infrastructure
  • Law enforcement & migration — risk assessment, document verification
  • Justice & democratic processes — judicial decision support, election integrity

Quick reality check for builders

  • A finance copilot that approves loans → high-risk
  • An HR agent that screens CVs → high-risk
  • A clinical triage assistant → high-risk
  • A customer support chatbot → limited risk (transparency obligation only — disclose it's an AI)
  • An internal documentation search agent → minimal risk

If you're unsure, assume high-risk and downgrade with legal counsel.


3. The seven articles that matter

TLDR. Seven articles drive the operational requirements for high-risk systems. Each maps cleanly onto a LangChain v1 primitive.

Master crosswalk

flowchart LR
    subgraph Articles["EU AI Act Articles"]
        direction TB
        A9["Art. 9<br/>Risk Management"]
        A10["Art. 10<br/>Data Governance"]
        A12["Art. 12<br/>Event Logging"]
        A13["Art. 13<br/>Transparency"]
        A14["Art. 14<br/>Human Oversight"]
        A15["Art. 15<br/>Accuracy & Resilience"]
        A72["Art. 72<br/>Post-Market Monitoring"]
    end
    subgraph LC["LangChain v1 Capabilities"]
        direction TB
        Eval["Online Evaluators<br/>+ Custom Thresholds"]
        PII["Bias Evaluators<br/>+ PII Middleware"]
        Trace["LangSmith Tracing<br/>+ Retention Tiers"]
        Studio["LangSmith Studio<br/>Visual Execution Graphs"]
        Interrupt["LangGraph interrupt<br/>+ Annotation Queues"]
        AdvEval["Correctness +<br/>Adversarial Evaluators"]
        Drift["Drift Detection<br/>+ Dashboards"]
    end
    A9 --> Eval
    A10 --> PII
    A12 --> Trace
    A13 --> Studio
    A14 --> Interrupt
    A15 --> AdvEval
    A72 --> Drift

Article 9 — Risk management

TLDR. A living risk management system across the development lifecycle — not a one-time document.

Requires:

  • Identify, analyze, evaluate, and mitigate known and reasonably foreseeable risks
  • Continuously updated, not snapshot-in-time
  • Test datasets representative of intended use

LangChain v1 capability:

  • Online evaluators scoring production traffic against custom thresholds
  • Custom evaluators for domain-specific risks (financial accuracy, clinical safety)
  • Webhook → PagerDuty alerts when thresholds breach
  • Risk register kept in sync with evaluator outputs

flowchart LR
    Identify["Identify Risks"] --> Analyze["Analyze"]
    Analyze --> Mitigate["Mitigate"]
    Mitigate --> Monitor["Monitor in Production"]
    Monitor --> Identify

    classDef cycle fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class Identify,Analyze,Mitigate,Monitor cycle
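
In LangSmith terms, that loop is evaluators plus thresholds. A minimal sketch against the SDK's evaluate interface; the agent stub, the dataset name, and the clinical_safety heuristic are illustrative assumptions, not prescribed patterns:

from langsmith.evaluation import evaluate

def triage_agent(inputs: dict) -> dict:
    # Stand-in for your real agent invocation
    return {"output": "Escalate to a clinician before adjusting dosage."}

# Domain-specific risk evaluator (toy heuristic): flag outputs that give
# dosage guidance without recommending clinical review.
def clinical_safety(run, example):
    text = (run.outputs or {}).get("output", "").lower()
    unsafe = "dosage" in text and "clinician" not in text
    return {"key": "clinical_safety", "score": 0 if unsafe else 1}

results = evaluate(
    triage_agent,
    data="foreseeable-risk-scenarios",  # assumed dataset of Article 9 risk cases
    evaluators=[clinical_safety],
)
# Sync aggregate scores into the risk register; fire a webhook (PagerDuty etc.)
# when a score breaches your declared threshold.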

Article 10 — Data governance & bias

TLDR. Data quality, representativeness, and explicit bias examination across protected characteristics.

Requires:

  • Documented data sources and provenance
  • Bias examination across protected characteristics: race, gender, age, religion, nationality, disability, sexual orientation
  • Documented mitigation steps for identified bias

LangChain v1 capability:

  • Bias and fairness evaluators (LangSmith ships templates for the protected characteristics above)
  • PII Middleware — prevents leakage of protected attributes in inputs and outputs
  • Trace dataset documentation in LangSmith for evaluation provenance
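
In code, this is mostly middleware configuration. A conceptual sketch of a v1 agent with PII handling; the PIIMiddleware arguments reflect my reading of the LangChain v1 middleware API and should be verified against current docs:

from langchain.agents import create_agent
from langchain.agents.middleware import PIIMiddleware

agent = create_agent(
    model="openai:gpt-4o",  # illustrative model string
    tools=[],               # your tool registry
    middleware=[
        # Redact email addresses before they reach the model or the trace;
        # refuse to process inputs containing raw card numbers.
        PIIMiddleware("email", strategy="redact"),
        PIIMiddleware("credit_card", strategy="block"),
    ],
)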

Article 12 — Automatic event logging

TLDR. Automatic logging across the system's lifetime, sufficient to identify risks and support post-market monitoring.

Requires:

  • Logs spanning the full system lifecycle
  • Inputs, outputs, timestamps, agent context
  • Sufficient detail for deployer oversight and regulatory inspection

LangChain v1 capability:

  • End-to-end tracing — every LLM call, tool invocation, and reasoning step
  • Structured metadata — timestamps, inputs, outputs, agent context
  • Retention tiers — 14-day base / 400-day extended / bulk export for archival
  • EU residency — LangSmith EU SaaS, BYOC, or self-hosted

flowchart LR
    Agent["Agent Execution"] -->|trace| LS["LangSmith"]
    LS --> Base["Base traces<br/>14 days"]
    LS --> Ext["Extended traces<br/>400 days"]
    Ext --> Archive["Bulk export<br/>long-term archival"]

    classDef store fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class LS,Base,Ext,Archive store
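
Residency and logging are mostly configuration. A sketch of an EU-resident tracing setup using LangSmith's standard environment variables and its documented EU endpoint; the project name and metadata are illustrative:

import os

# Route every trace to the EU region so logs never leave the jurisdiction
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_ENDPOINT"] = "https://eu.api.smith.langchain.com"
os.environ["LANGSMITH_API_KEY"] = "<your-key>"
os.environ["LANGSMITH_PROJECT"] = "cv-screening-prod"  # illustrative

from langsmith import traceable

@traceable(metadata={"system": "cv-screener", "risk_tier": "high"})
def screen_candidate(cv_text: str) -> dict:
    # Your agent invocation goes here; every LLM call, tool invocation,
    # and timestamp lands in the trace automatically.
    return {"decision": "escalate_to_human"}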

Article 13 — Transparency to deployers

TLDR. Outputs must be interpretable enough that deployers can use the system appropriately.

Requires:

  • Clear instructions for use
  • Information about capabilities, limitations, and known failure modes
  • Outputs interpretable enough to act on or override

LangChain v1 capability:

  • LangGraph Studio — visual execution graph showing state transitions, tool calls, decisions
  • Full reasoning traces — every step inspectable
  • Documented agent specifications — inputs, outputs, tool registry, system prompt
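
One lightweight way to keep that specification versioned with the agent is a spec card. A hypothetical structure (neither the Act nor LangChain prescribes this shape):

from dataclasses import dataclass

@dataclass
class AgentSpecCard:
    """Article 13 information, versioned alongside the agent code."""
    name: str
    intended_use: str
    capabilities: list[str]
    known_limitations: list[str]
    failure_modes: list[str]
    tool_registry: list[str]
    oversight_instructions: str

card = AgentSpecCard(
    name="loan-pre-assessment-agent",
    intended_use="Advisory pre-assessment of consumer loan applications.",
    capabilities=["document extraction", "affordability calculation"],
    known_limitations=["no fraud detection", "EUR-denominated income only"],
    failure_modes=["hallucinated figures on low-resolution scanned PDFs"],
    tool_registry=["fetch_credit_report", "calculate_dti"],
    oversight_instructions="All decline recommendations require human review.",
)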

Article 14 — Human oversight

TLDR. Humans must be able to understand, intervene in, override, and interrupt the system. Not theatrical — measurable.

Requires:

  • Oversight measures designed into the system architecture
  • Humans able to intervene at decision points
  • Auditable trail of oversight events

LangChain v1 capability:

  • LangGraph interrupt — pause, inspect, modify, resume at any node
  • LangSmith annotation queues — structured feedback fields, not one-click approve
  • Webhooks for incident routing
  • Durable runtime — checkpointing, exactly-once execution, resume-from-exact-point

flowchart LR
    Agent["Agent reaches<br/>state-change tool"] --> Int["LangGraph interrupt"]
    Int --> Q["Annotation Queue"]
    Q --> Reviewer["Human reviewer<br/>structured feedback"]
    Reviewer -->|approve| Resume["Resume from checkpoint"]
    Reviewer -->|reject| Halt["Halt + log decision"]

    classDef ok fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class Int,Q,Reviewer,Resume,Halt ok
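
A conceptual sketch of that flow with LangGraph's public API: interrupt() pauses the graph at a checkpoint with a payload for the reviewer, and Command(resume=...) continues from that exact point. The loan node and payload fields are illustrative:

from typing import TypedDict
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt

class State(TypedDict):
    application_id: str
    decision: str

def decide_loan(state: State) -> dict:
    # Pause before the state-changing step; the reviewer sees this payload
    # and responds with structured feedback.
    verdict = interrupt({"action": "approve_loan",
                         "application_id": state["application_id"]})
    return {"decision": "approved" if verdict["approve"] else "halted"}

graph = (
    StateGraph(State)
    .add_node("decide_loan", decide_loan)
    .add_edge(START, "decide_loan")
    .add_edge("decide_loan", END)
    .compile(checkpointer=InMemorySaver())
)

config = {"configurable": {"thread_id": "app-42"}}
graph.invoke({"application_id": "42", "decision": ""}, config)  # pauses at interrupt
graph.invoke(Command(resume={"approve": True}), config)         # resumes from checkpoint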

Article 15 — Accuracy & adversarial resilience

TLDR. Declared accuracy levels and demonstrable protection against common attack surfaces.

Requires:

  • Stated accuracy metrics relevant to the use case
  • Adversarial resilience (prompt injection, jailbreak, data poisoning)
  • Consistency over the system's lifetime

LangChain v1 capability:

  • Correctness, exact match, plan adherence, task completion evaluators
  • Prompt injection, jailbreaking evaluators (LangSmith templates)
  • API leakage, code injection evaluators for tool-calling agents
  • Adversarial evaluation suites — run before every release
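
The suite can run as a release gate. A sketch assuming a LangSmith dataset of injection attempts ("prompt-injection-suite" is an illustrative name) and a toy leak check:

from langsmith.evaluation import evaluate

def agent_under_test(inputs: dict) -> dict:
    # Stand-in for your real agent invocation
    return {"output": "I can't help with that."}

# Toy leak check: the agent must not reveal its system prompt or secrets
# when the dataset input tries to inject instructions.
def resists_injection(run, example):
    text = (run.outputs or {}).get("output", "").lower()
    leaked = any(m in text for m in ("system prompt", "api key"))
    return {"key": "resists_injection", "score": 0 if leaked else 1}

results = evaluate(
    agent_under_test,
    data="prompt-injection-suite",  # assumed adversarial dataset
    evaluators=[resists_injection],
)
# Gate the release: fail CI when the aggregate score drops below target.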

Article 72 — Post-market monitoring

TLDR. Continuous monitoring of production behavior with incident reporting to authorities.

Requires:

  • Continuous monitoring of system behavior
  • Drift detection
  • Incident reporting to national supervisory authorities

LangChain v1 capability:

  • Online evaluators with custom thresholds
  • Drift detection dashboards
  • Webhooks → incident response system
  • Audit dashboards for internal compliance and regulator-facing reporting
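
The monitoring job can be as simple as a scheduled script that pulls the last 24 hours of runs and raises an incident above a threshold. Client.list_runs is a real SDK call; the project name, threshold, and webhook URL are illustrative:

from datetime import datetime, timedelta, timezone

import requests
from langsmith import Client

client = Client()
since = datetime.now(timezone.utc) - timedelta(hours=24)

runs = list(client.list_runs(
    project_name="cv-screening-prod",  # illustrative project
    start_time=since,
    is_root=True,
))
error_rate = sum(r.error is not None for r in runs) / max(len(runs), 1)

if error_rate > 0.02:  # assumed incident threshold
    requests.post("https://incidents.example.com/hook", json={
        "system": "cv-screening-agent",
        "error_rate": error_rate,
        "window_hours": 24,
    })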

4. GPAI obligations

TLDR. If you train and distribute a general-purpose AI model, you have separate obligations. If you build agents on top of a GPAI model (most LangChain users), you don't — but you inherit downstream effects.

Provider obligations (Articles 51–55)

If you provide a GPAI model to others:

  • Technical documentation — capabilities, limitations, training process
  • Information for downstream developers — what they need to integrate responsibly
  • Copyright compliance — disclose training data sources, respect opt-outs
  • Energy & resource use disclosure — compute, energy consumption
  • Systemic risk threshold — models above 10²⁵ FLOPs of training compute have additional obligations: safety evaluations, adversarial testing, incident reporting

Most LangChain users are NOT GPAI providers

flowchart LR
    Foundation["Foundation Model<br/>(GPT, Claude, Gemini, etc.)"] -->|API call| YourAgent["Your LangChain Agent"]
    YourAgent --> EndUser["End User"]

    Foundation -.GPAI obligations.- ProviderRole["Model Provider<br/>(Anthropic, OpenAI, Google, etc.)"]
    YourAgent -.AI Act high-risk obligations.- DownstreamRole["You — Downstream Developer"]

    classDef provider fill:#ffe5b4,stroke:#996600,color:#000
    classDef downstream fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class ProviderRole provider
    class DownstreamRole downstream

Implications for you:

  • Picking a model provider = picking docs you'll cite in your technical documentation
  • Providers that publish detailed model cards and evaluation suites make your life easier
  • Fine-tuning extensively and redistributing the fine-tuned model can shift you into the provider role — get legal advice if you're close to that line

5. Deployment & data residency

TLDR. The Act doesn't mandate EU-only data residency, but for high-risk systems you'll often need it for adjacent reasons — GDPR, sector rules, audit posture.

flowchart TB
    subgraph Options["LangSmith Deployment Options"]
        direction LR
        Cloud["Managed Cloud<br/>(US region)"]
        EU["LangSmith EU<br/>(EU region SaaS)"]
        BYOC["Bring Your Own Cloud<br/>BYOC"]
        SH["Self-Hosted<br/>your Kubernetes"]
    end

    Cloud --> Use1["General use<br/>non-EU workloads"]
    EU --> Use2["High-risk EU systems<br/>most common choice"]
    BYOC --> Use3["Regulated industries<br/>finance, healthcare"]
    SH --> Use4["Maximum control<br/>government, defense"]

    classDef opt fill:#cfe6ff,stroke:#1a4d8c,color:#000
    class Cloud,EU,BYOC,SH opt

Choosing between them:

  • Managed Cloud (US): general-purpose, not for EU AI Act high-risk traces
  • LangSmith EU SaaS: trace data stays in-jurisdiction; default for most EU teams subject to the Act
  • BYOC: deploy in your own AWS/GCP/Azure region; common for regulated industries that want their cloud, their region, their controls
  • Self-hosted on Kubernetes: maximum control; airgap-capable for defense, government, classified workloads

Practical guidance: if you're in scope of the Act and your customers are EU-based, default to LangSmith EU SaaS or BYOC. The audit story is dramatically simpler when traces never leave the jurisdiction.


6. The 90-day compliance plan

TLDR. Map your agents → wire up the technical primitives → document for audit.

Days 1–30: Map

  • Identify which agents are high-risk under Article 6 + Annex III
  • Document use case, deployer, sector, decision impact for each
  • Determine your role: GPAI provider or downstream developer
  • Stand up the risk register
  • Pick deployment topology (EU SaaS / BYOC / self-hosted)

Days 31–60: Wire up

  • LangSmith tracing on every high-risk agent path
  • PIIMiddleware on inputs and outputs
  • Bias evaluators per relevant protected characteristic
  • LangGraph interrupt on every state-changing tool call
  • Online evaluators for prompt injection and adversarial inputs
  • Webhooks → incident response system
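
For a sense of how those items compose, a conceptual sketch of one fully wired agent. HumanInTheLoopMiddleware and its interrupt_on parameter reflect my reading of the v1 middleware API; verify against current docs before relying on them:

from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware, PIIMiddleware
from langchain.tools import tool
from langgraph.checkpoint.memory import InMemorySaver

@tool
def send_offer_letter(candidate_id: str) -> str:
    """Send an offer letter (state-changing, gated below)."""
    return f"offer sent to {candidate_id}"

agent = create_agent(
    model="openai:gpt-4o",  # illustrative model string
    tools=[send_offer_letter],
    middleware=[
        PIIMiddleware("email", strategy="redact"),
        # Pause for human approval before the state-changing tool fires
        HumanInTheLoopMiddleware(interrupt_on={"send_offer_letter": True}),
    ],
    checkpointer=InMemorySaver(),  # needed to pause and resume
)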

Days 61–90: Document

  • Technical documentation per Annex IV
  • Risk management documentation per Article 9
  • Logging and retention policies per Article 12
  • Human oversight procedures per Article 14
  • Post-market monitoring plan per Article 72
  • Internal audit-readiness review before the August deadline

Trade-offs to budget for

  • LangSmith EU / BYOC / self-hosted = higher cost than managed US cloud (EU users gain lower latency, but budget for the premium)
  • HITL queues require operational headcount — someone must actually review
  • Documentation burden is real; budget engineering time, not just legal time

7. What this doesn't cover

TLDR. AI Act is one regulation in an emerging stack. GDPR overlap, sector-specific rules, and other jurisdictions all matter — and most converge on the same operational asks.

  • GDPR overlap — Article 10 (data governance) overlaps with GDPR Articles 5 and 32; PII Middleware helps both
  • Sector-specific rules — finance (DORA), healthcare (MDR), employment (national labor law) stack on top
  • Other jurisdictions — Colorado AI Act (enforcement June 2026), UK pro-innovation approach, NYC AEDT, US state-level laws
  • National implementations — EU member states are still finalizing how they'll enforce; expect variation
  • Convergence is real — most regimes share the same operational primitives. Build once for the AI Act, cover most of the rest.

8. Closing

TLDR. AI Act compliance isn't a checklist you pass once — it's a posture you build into the system. The good news: every primitive that makes you compliant also makes you safer, faster to debug, and more trustworthy to deployers.

  • The seven articles, mapped to LangChain v1, give you a clear technical scope of work
  • The 90-day plan is tight but achievable if you start now
  • The audit trail you build for the regulator is the same audit trail that helps your team ship faster
  • Until the next round of regulation lands: map your agents, trace everything, keep humans in the loop, and assume the regulator will eventually ask

Article crosswalk (quick reference)

  • Art. 9 (risk management lifecycle) → online evaluators + custom thresholds + alerting
  • Art. 10 (data governance & bias) → bias/fairness evaluators + PII Middleware
  • Art. 12 (automatic event logging) → LangSmith tracing + retention tiers + EU residency
  • Art. 13 (transparency to deployers) → LangGraph Studio + full reasoning traces
  • Art. 14 (human oversight) → LangGraph interrupt + annotation queues
  • Art. 15 (accuracy & adversarial resilience) → correctness + adversarial evaluators
  • Art. 72 (post-market monitoring) → drift detection + dashboards + webhooks
