LangChain Interrupt 2026 Conference Recap

If Interrupt 2025 was about the arrival of the "agent engineer" (model optionality, context engineering, and eval-driven development), then Interrupt 2026 was about what comes after you can build an agent: industrializing it. The phrase on every stage this year was deep agents, and the tools, launches, and war stories all pointed in the same direction: from building agents to running, observing, and improving them at scale, often in some of the most heavily regulated industries.

A year of better models, a maturing MCP, and a new vocabulary of skills, sandboxes, harnesses, and subagents left its mark everywhere: in LangChain's own product launches, in how teams write evals, and in production deployments now measured in the hundreds of millions of runs.

Photo: A full house for the Day 1 keynote at LangChain Interrupt 2026, San Francisco, May 13–14, seen from the balcony.

Want last year's recap? See the LangChain Interrupt 2025 recap.

I. The Deep Agents Era (Keynote — Harrison Chase & Ankush Gola)

The 2026 keynote reframed the agent from "a model calling tools in a loop" to a deep agent: an agent harness that adds batteries to that loop. A deep agent is a model plus four capabilities:

An execution environment: a spectrum from a virtual file system (a database exposed to the agent as files) to a full code sandbox. Deep Agents 0.6 adds a Code Interpreter built on QuickJS as the lightweight, multi-tenant middle ground.
Context management: skills, memory, summarization, context offloading, and prompt caching, with skills progressively disclosed so they don't blow the context window.
Human-in-the-loop steering: interrupting, editing, and redirecting a running agent.
Delegation to subagents: decomposing work across specialized agents.
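
Progressive disclosure of skills, from the context-management item above, can be illustrated with a minimal sketch. It assumes a hypothetical `Skill` record and `SkillRegistry` class (neither is the Deep Agents API): only a one-line summary of each skill goes into the prompt up front, and the full instructions are loaded when the agent asks for a skill by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: a short summary plus a long body."""
    name: str
    summary: str       # always visible to the model
    instructions: str  # loaded only on demand


class SkillRegistry:
    """Illustrative registry; not part of any real library."""

    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}

    def index(self) -> str:
        """Cheap listing that is placed in the system prompt."""
        return "\n".join(
            f"- {s.name}: {s.summary}" for s in self._skills.values()
        )

    def load(self, name: str) -> str:
        """Full instructions, fetched only when the agent requests them."""
        return self._skills[name].instructions


# Placeholder skills with deliberately long instruction bodies.
registry = SkillRegistry([
    Skill("sql-report", "Build a SQL report.", "Step 1 ... " * 200),
    Skill("pdf-extract", "Extract tables from PDFs.", "Step 1 ... " * 200),
])

system_prompt = "Available skills:\n" + registry.index()
loaded = registry.load("sql-report")  # pulled in only when needed
```

The design choice is that prompt size grows with the number of skills, not with the length of their instructions.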

LangChain shipped Deep Agents 0.6 with native open-source model support (GLM, DeepSeek, Nemotron), inference-partner integrations (Fireworks, Baseten, NVIDIA), and a new streaming protocol with front-end SDKs. A recurring principle: memory and context standards should stay open.

"Deep agents is an agent harness, and it basically adds more batteries-included things that supercharge this loop."
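
To make "batteries around the loop" concrete, here is a toy harness, written under stated assumptions rather than taken from the Deep Agents codebase. The function `run_agent`, the `Action` dictionary shape, and the scripted model are all hypothetical. The sketch covers three of the four capabilities: a virtual file system (a plain dict), human-in-the-loop approval, and delegation to a subagent. Context management is left out to keep it short.

```python
from typing import Callable

# An "action" is what the (stubbed) model asks the harness to do next.
Action = dict[str, str]


def run_agent(
    model: Callable[[list[str]], Action],
    files: dict[str, str],
    approve: Callable[[Action], bool],
    subagents: dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    """Toy harness loop: virtual files, approval, and delegation."""
    history: list[str] = []
    for _ in range(max_steps):
        action = model(history)
        kind = action["kind"]
        if kind == "write_file":
            # Human-in-the-loop: writes need approval before they happen.
            if not approve(action):
                history.append(f"denied write to {action['path']}")
                continue
            files[action["path"]] = action["content"]
            history.append(f"wrote {action['path']}")
        elif kind == "delegate":
            # Delegation: hand a subtask to a named subagent.
            result = subagents[action["to"]](action["task"])
            history.append(f"{action['to']} returned: {result}")
        elif kind == "finish":
            return action["answer"]
    return "step limit reached"


def scripted_model(history: list[str]) -> Action:
    """Stand-in for an LLM: picks the next action from the history length."""
    script: list[Action] = [
        {"kind": "delegate", "to": "researcher", "task": "find the total"},
        {"kind": "write_file", "path": "notes.md", "content": "total = 42"},
        {"kind": "finish", "answer": "done"},
    ]
    return script[len(history)]


vfs: dict[str, str] = {}
answer = run_agent(
    scripted_model,
    files=vfs,
    approve=lambda action: action["path"].endswith(".md"),
    subagents={"researcher": lambda task: "42"},
)
```

A real harness would replace `scripted_model` with a model call and the `approve` lambda with an actual pause for human review.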

II. LangChain's 2026 Launches — the Platform Now Runs and Improves Agents

Where 2025's launches were about control and deployment (LangGraph Platform GA, Studio v2, Open Agent Platform), 2026's launches are about operating agents in production and closing the loop:

Launch	What it is
Managed Deep Agents	A single API for the deep-agents harness with a production runtime, durable checkpointing, Context Hub, and sandboxes (private beta).
LangSmith Engine	An ambient agent that watches production traces, clusters and prioritizes issues with trace-backed evidence, and proposes fixes as one-click GitHub PRs — plus online evaluators and ground-truth datasets. An agent that fixes agents.
LangSmith Sandboxes	Now GA: ~1-second spin-up, egress proxy that hides API keys from the agent, durable pause/resume, snapshot/fork. For running untrusted agent-written code safely.
Context Hub	Versioned `agent.md` files, skills, and LLM wikis as open-standard, shareable memory.
LLM Gateway	Spend limits and PII/secret guardrails in front of model calls (beta).
SmithDB	A purpose-built observability database written in Rust (on Apache DataFusion + Vortex) that made trace workloads 6–15× faster — now serving all US Cloud tracing.
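
The egress-proxy idea in the Sandboxes row can be sketched generically. How LangSmith Sandboxes implement it is not described in these notes, so the `EgressProxy` class, the `Request` type, the host name, and the key below are all hypothetical. The point is only that code inside the sandbox builds requests without credentials, and the proxy attaches them on the way out.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    host: str
    headers: dict[str, str] = field(default_factory=dict)


class EgressProxy:
    """Hypothetical proxy that sits between the sandbox and the network."""

    def __init__(self, secrets: dict[str, str]) -> None:
        # Maps an allowed host to the credential the proxy attaches for it.
        self._secrets = secrets

    def forward(self, request: Request) -> Request:
        if request.host not in self._secrets:
            raise PermissionError(f"egress to {request.host} is not allowed")
        # Copy the headers so the sandbox-side object never holds the key.
        outbound = dict(request.headers)
        outbound["Authorization"] = f"Bearer {self._secrets[request.host]}"
        return Request(request.host, outbound)


proxy = EgressProxy({"api.example.com": "sk-real-key"})

# Code running inside the sandbox builds a request without any credential.
sandbox_request = Request("api.example.com", {"Accept": "application/json"})
outbound_request = proxy.forward(sandbox_request)
```

Because the key lives only in the proxy, agent-written code cannot read or leak it, and requests to hosts outside the allowlist fail.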

The numbers behind those launches set the tone for the whole conference: 100M+ agent runs served, 150M+ traces per week, and P99 trace payloads that grew from kilobytes to megabytes as agents got deeper. Traces, the keynote argued, are now the center of the agent lifecycle.

➡️ Full product breakdown of all seven launches → Product Announcements for 2026

III. Evals Grew Up

In 2025, "eval-driven development" was the mantra. In 2026 it became infrastructure — and the people writing evals changed.

Make Legal Write Your Evals (Chime). Evals became the alignment surface between engineers and legal/compliance. Break vague risk into a taxonomy of domains and concrete risks; legal writes structured risk definitions that bootstrap both datasets and LLM-as-judge evaluators. One annotation feeds a four-way improvement flywheel, and compliance signals arrive in hours, not at the release date.
Evals That Actually Matter (Lyft). Treat AI evals like traditional ML: offline eval as the quality gate, a tau-bench-style simulator with mocked MCP tools, and LLM-as-judge framed around real agent tasks.
Observing and Testing CX Agents (Cisco). Every thumbs-down is a signal: a continuous loop from trace → AI-diagnosed PR → permanent regression test. "Evals are infrastructure, not a side project."
Building AI for Healthcare (Abridge). Reference-free and reference-based judges cut release cycles from 1–2 months to days — "you don't have to trade velocity for quality."
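
The Chime pattern, where one structured risk definition bootstraps both a dataset and a judge, might look roughly like the sketch below. The `RiskDefinition` fields, the prompt wording, and the lending example are made up for illustration and are not Chime's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskDefinition:
    """Hypothetical structured risk entry, as legal might author it."""
    domain: str
    risk: str
    violating_example: str
    compliant_example: str


def judge_prompt(risk: RiskDefinition, response: str) -> str:
    """Render one risk definition into an LLM-as-judge prompt."""
    return (
        f"Domain: {risk.domain}\n"
        f"Risk: {risk.risk}\n"
        f"Violating example: {risk.violating_example}\n"
        f"Compliant example: {risk.compliant_example}\n"
        f"Response to grade: {response}\n"
        "Answer PASS or FAIL, then give a one-sentence reason."
    )


def seed_dataset(risks: list[RiskDefinition]) -> list[dict[str, str]]:
    """Each definition yields two labeled rows for a starter dataset."""
    rows: list[dict[str, str]] = []
    for r in risks:
        rows.append({"risk": r.risk, "text": r.violating_example, "label": "FAIL"})
        rows.append({"risk": r.risk, "text": r.compliant_example, "label": "PASS"})
    return rows


# Illustrative entry only; the content is invented.
risks = [
    RiskDefinition(
        domain="Lending",
        risk="Guarantees loan approval",
        violating_example="You will definitely be approved.",
        compliant_example="Approval depends on a review of your application.",
    )
]
dataset = seed_dataset(risks)
prompt = judge_prompt(risks[0], "We can promise approval today.")
```

The same authored record feeds both artifacts, so a change to a definition updates the judge and the seed data together.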

"Trust is earned in drops but lost in buckets." — Abridge
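
The offline quality gate with mocked tools, described in the Lyft item above, can be sketched as follows. Everything here is an assumption for illustration: the tool name, the toy agent, and the threshold. A substring check stands in for the LLM-as-judge so the example stays self-contained.

```python
from typing import Callable

Tool = Callable[[str], str]


def mock_ride_status(ride_id: str) -> str:
    """Mocked tool: returns canned data instead of calling a live server."""
    return {"r1": "completed", "r2": "cancelled"}.get(ride_id, "unknown")


def toy_agent(question: str, tools: dict[str, Tool]) -> str:
    """Stand-in for the agent under test; a real one would call a model."""
    ride_id = question.split()[-1]
    return f"Ride {ride_id} is {tools['ride_status'](ride_id)}."


def offline_gate(
    cases: list[tuple[str, str]], threshold: float
) -> tuple[float, bool]:
    """Run every case against mocked tools; pass only at or above threshold."""
    tools = {"ride_status": mock_ride_status}
    passed = sum(
        expected in toy_agent(question, tools) for question, expected in cases
    )
    rate = passed / len(cases)
    return rate, rate >= threshold


cases = [
    ("What happened to ride r1", "completed"),
    ("What happened to ride r2", "cancelled"),
    ("What happened to ride r9", "refunded"),  # deliberately failing case
]
pass_rate, ship = offline_gate(cases, threshold=0.9)
```

With two of three cases passing, the gate blocks the release, which is the behavior a quality gate is meant to have.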

IV. Agents in Production — Across the Most Regulated Industries

The case studies this year weren't demos; they were deployments in finance, healthcare, transportation, and the enterprise.

Company	Talk	Headline
Toyota	The Production System for Agents	An internal platform took agent builds from 6 months / 6 engineers to 4 days / 1 engineer; 50,000+ agents in production, mapped to Toyota Production System principles.
MongoDB	Agents in the Enterprise	ElevenLabs ran 14M production agents after migrating to MongoDB; ~70% of the code MongoDB checked in during the week before the talk came from coding agents.
Box	Agents in the Enterprise	Aaron Levie: coding agents win because output is verifiable; knowledge work is the opposite, so expect years of change management. Decade-old single-governance bets now pay off.
Abridge	Building AI for Healthcare	A clinical-intelligence platform across 250 of the largest US health systems, 100M+ conversations/year.
Bridgewater	Building Pat, the AI Pocket Analyst	"The plan really is the analysis" — plan once, generate code in parallel; human-like search inspection lifted accuracy from ~75% to ~90%.
Coinbase	Building Developer Support Agents	A self-improving developer-support system on self-hosted LangSmith tracing + an MCP docs server with RAG fallback.
LinkedIn	60% Faster Time-to-Interview	Hiring agents cut time-to-interview by 60%, evolving from chains to a central LLM planner running plan–act–replan.
Cisco	Frontier CX Agents	From chatbot to delegating teammate via dynamic plan-based supervisor graphs — and the lesson that 95% accuracy wasn't enough for adoption.
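
The plan–act–replan pattern named in the LinkedIn row can be shown as a short loop. This is a generic sketch, not LinkedIn's implementation: the planner and executor are stubs, and the step names are invented.

```python
from typing import Callable

Planner = Callable[[str, list[str]], list[str]]
Executor = Callable[[str], str]


def plan_act_replan(
    goal: str, plan: Planner, act: Executor, max_rounds: int = 5
) -> list[str]:
    """Execute the first planned step, then replan with the new result."""
    results: list[str] = []
    for _ in range(max_rounds):
        steps = plan(goal, results)  # replan on every round
        if not steps:                # an empty plan means the goal is met
            break
        results.append(act(steps[0]))
    return results


def toy_planner(goal: str, results: list[str]) -> list[str]:
    """Stand-in for an LLM planner: drops steps that are already done."""
    all_steps = ["source candidates", "screen resumes", "schedule interview"]
    return all_steps[len(results):]


def toy_executor(step: str) -> str:
    """Stand-in for tool execution."""
    return f"done: {step}"


outcome = plan_act_replan("fill the role", toy_planner, toy_executor)
```

Replanning after every action lets the planner react to what each step actually returned, which a fixed chain cannot do.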

A common thread: MCP as the integration layer (swap backends without touching agents), skills as a shared, reusable primitive, and coding-agent harnesses as the substrate for all knowledge work.
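
"Swap backends without touching agents" comes down to the agent depending on a contract rather than on an implementation. The sketch below uses a plain Python `Protocol` in place of MCP itself (it does not use the MCP SDK), and the `DocsSearch` contract and both backends are hypothetical.

```python
from typing import Protocol


class DocsSearch(Protocol):
    """The tool contract the agent depends on (hypothetical)."""

    def search(self, query: str) -> list[str]: ...


class KeywordBackend:
    """Matches documents that contain the query anywhere."""

    def __init__(self, docs: list[str]) -> None:
        self._docs = docs

    def search(self, query: str) -> list[str]:
        return [d for d in self._docs if query.lower() in d.lower()]


class PrefixBackend:
    """A second backend with different matching rules, same interface."""

    def __init__(self, docs: list[str]) -> None:
        self._docs = docs

    def search(self, query: str) -> list[str]:
        return [d for d in self._docs if d.lower().startswith(query.lower())]


def answer(question: str, tool: DocsSearch) -> str:
    """Agent-side code: it only knows the contract, not the backend."""
    hits = tool.search(question)
    return hits[0] if hits else "no result"


docs = ["Wallet API quickstart", "Using the wallet SDK"]
first = answer("wallet", KeywordBackend(docs))
second = answer("wallet", PrefixBackend(docs))
```

The `answer` function is identical for both calls; only the object passed in changes.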

V. The Human in the Loop Endures

For all the automation, the closing talks pulled back to people and judgment.

Future of AI Agents — Andrew Ng. Build with small, high-context generalist tasks under guardrails. Most AI experiments yield incremental efficiency, not transformation — so redesign the whole workflow top-down. Chase growth over cost savings (growth has almost no ceiling), and preserve optionality with vendor-neutral tools and open-weight models.
The Return of the Data Scientist — Shreya Shankar & Hamel Husain. The data-scientist mindset is back: experiment, measure, improve. The single most-repeated piece of advice of the whole conference:

"Always look at your data."

Photo: A full floor for two days of deep-agents talks in San Francisco, seen from the balcony.

VI. What Changed Since 2025

Theme	Interrupt 2025	Interrupt 2026
Core abstraction	The "agent engineer"; model optionality	Deep agents — the agent harness (skills, memory, sandbox, subagents)
LangChain orchestration	LangGraph (low-level control)	Deep Agents (batteries-included harness on top of LangGraph)
Flagship launches	LangGraph Platform GA, Studio v2, Open Agent Platform	Managed Deep Agents, LangSmith Engine, Sandboxes (GA), Context Hub, LLM Gateway
Observability	New agent metrics + trajectory views	SmithDB (Rust) 6–15× faster; 150M+ weekly traces; traces as the center of the lifecycle
Evals	Eval-driven development; LLM-as-judge	Legal/compliance write the evals; online evals; eval-as-infrastructure; agents that build their own evals
Tools & interop	MCP emerging; AGNTCY interop	MCP as the integration layer; skills as a shared primitive
The frontier	"Deployment is the next hurdle"	Deployment at scale (100M+ runs, 50k+ agents); agents that improve agents
Who builds	"Everyone becomes an agent builder"	The data scientist returns — taste and looking at your data still decide

VII. The Road Ahead

Harnesses over frameworks. The durable engineering is in the agent harness — execution environment, context management, steering, delegation — kept simple and rewritable.
Closed-loop operations. Trace → diagnosis → fix → eval, increasingly run by agents (LangSmith Engine), with humans approving PRs.
Open memory and context. Skills, agent.md, and memory as portable, open standards rather than platform lock-in.
Sandboxed, multi-model, multi-modal. Untrusted code runs in sandboxes; enterprises stay model-agnostic to keep costs in check; computer-use and richer modalities are next.
Customer-facing scale. Most enterprises are still on internal agents; the prize — and the hard part — is reliable customer-facing agents at volume.
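
The closed loop in the second item above (trace, diagnosis, fix, eval) starts with grouping failures and ends with a permanent test. The sketch below shows only those two ends, under stated assumptions: the `Trace` record is hypothetical, and grouping by exact error text is a much cruder stand-in for whatever clustering LangSmith Engine performs.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """Hypothetical trace record; real traces carry far more detail."""
    run_id: str
    ok: bool
    error: str = ""


def prioritize(traces: list[Trace]) -> list[tuple[str, int]]:
    """Group failed traces by error text, most frequent first."""
    counts = Counter(t.error for t in traces if not t.ok)
    return counts.most_common()


def regression_case(trace: Trace) -> dict[str, str]:
    """Turn a diagnosed failure into a permanent eval case."""
    return {"source_run": trace.run_id, "must_not_raise": trace.error}


traces = [
    Trace("a1", ok=True),
    Trace("a2", ok=False, error="tool timeout"),
    Trace("a3", ok=False, error="bad JSON"),
    Trace("a4", ok=False, error="tool timeout"),
]
issues = prioritize(traces)
new_case = regression_case(traces[1])
```

The diagnosis and fix steps in between are where an agent or a human proposes a change, with the new regression case guarding against a repeat.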

Key Takeaways

Deep agents are the organizing idea of 2026 — a harness of skills, memory, sandboxes, and subagents around the model.
The LangChain platform now runs and improves agents, not just observes them (Managed Deep Agents, LangSmith Engine, Sandboxes, SmithDB).
Evals are infrastructure — and increasingly written by legal, compliance, and the agents themselves.
MCP is the integration layer and skills are the new shareable primitive.
Production is real: 100M+ runs, 50k+ agents at one company, 14M at another, 250 health systems.
Stay agnostic on models and frameworks; build for portability.
Always look at your data.

Resources

Event agenda: interrupt.langchain.com/event-agenda
Full session digest: Presentation Directory — every talk, in running order
LangChain: langchain.com · LangSmith: smith.langchain.com · Deep Agents: docs.langchain.com
Last year: LangChain Interrupt 2025 recap

On these transcripts

Talk pages include synthesized summaries plus the complete timestamped transcript (in-person audio, transcribed locally with Whisper large-v3 and lightly cleaned). A few proper nouns and figures may carry minor transcription artifacts; see each page's notes. About six agenda sessions were not captured and are summarized from the public agenda instead.

Conference: May 13–14, 2026 · San Francisco
