LangChain Interrupt 2026 Conference Recap
If Interrupt 2025 was about the arrival of the "agent engineer" — model optionality, context engineering, and eval-driven development — then Interrupt 2026 was about what comes after you can build an agent: industrializing it. The phrase on every stage this year was deep agents, and the tools, launches, and war stories all pointed the same direction — from building agents to running, observing, and improving them at scale, often in the most regulated industries on earth.
A year of better models, maturing MCP, and a new vocabulary of skills, sandboxes, harnesses, and subagents showed up everywhere: in LangChain's own product launches, in how teams write evals, and in production deployments now measured in the hundreds of millions of runs.
A full house for the Day 1 keynote — LangChain Interrupt 2026, San Francisco, May 13–14.
Want last year's recap? See the LangChain Interrupt 2025 recap.
I. The Deep Agents Era (Keynote — Harrison Chase & Ankush Gola)
The 2026 keynote reframed the agent from "a model calling tools in a loop" to a deep agent: an agent harness that adds batteries to that loop. A deep agent is a model plus four capabilities:
- An execution environment — a spectrum from a virtual file system (a database exposed to the agent as files) to a full code sandbox. Deep Agent 0.6 adds a Code Interpreter built on QuickJS as the lightweight, multi-tenant middle ground.
- Context management — skills, memory, summarization, context offloading, and prompt caching, with skills progressively disclosed so they don't blow the context window.
- Human-in-the-loop steering — interrupting, editing, and redirecting a running agent.
- Delegation to subagents — decomposing work across specialized agents.
LangChain shipped Deep Agent 0.6 with native open-source model support (GLM, DeepSeek, Nemotron), inference-partner integrations (Fireworks, Baseten, NVIDIA), and a new streaming protocol with front-end SDKs. A recurring principle: memory and context standards should stay open.
"Deep agents is an agent harness, and it basically adds more batteries-included things that supercharge this loop."
II. LangChain's 2026 Launches — the Platform Now Runs and Improves Agents
Where 2025's launches were about control and deployment (LangGraph Platform GA, Studio v2, Open Agent Platform), 2026's launches are about operating agents in production and closing the loop:
| Launch | What it is |
|---|---|
| Managed Deep Agents | A single API for the deep-agents harness with a production runtime, durable checkpointing, Context Hub, and sandboxes (private beta). |
| LangSmith Engine | An ambient agent that watches production traces, clusters and prioritizes issues with trace-backed evidence, and proposes fixes as one-click GitHub PRs — plus online evaluators and ground-truth datasets. An agent that fixes agents. |
| LangSmith Sandboxes | Now GA: ~1-second spin-up, egress proxy that hides API keys from the agent, durable pause/resume, snapshot/fork. For running untrusted agent-written code safely. |
| Context Hub | Versioned agent.md files, skills, and LLM wikis as open-standard, shareable memory. |
| LLM Gateway | Spend limits and PII/secret guardrails in front of model calls (beta). |
| SmithDB | A purpose-built observability database written in Rust (on Apache DataFusion + Vortex) that made trace workloads 6–15× faster — now serving all US Cloud tracing. |
The numbers behind those launches set the tone for the whole conference: 100M+ agent runs served, 150M+ traces per week, and P99 trace payloads that grew from kilobytes to megabytes as agents got deeper. Traces, the keynote argued, are now the center of the agent lifecycle.
➡️ Full product breakdown of all seven launches → Product Announcements for 2026
III. Evals Grew Up
In 2025, "eval-driven development" was the mantra. In 2026 it became infrastructure — and the people writing evals changed.
- Make Legal Write Your Evals (Chime). Evals became the alignment surface between engineers and legal/compliance. Break vague risk into a taxonomy of domains and concrete risks; legal writes structured risk definitions that bootstrap both datasets and LLM-as-judge evaluators. One annotation feeds a four-way improvement flywheel, and compliance signals arrive in hours, not at the release date.
- Evals That Actually Matter (Lyft). Treat AI evals like traditional ML: offline eval as the quality gate, a tau-bench-style simulator with mocked MCP tools, and LLM-as-judge framed around real agent tasks.
- Observing and Testing CX Agents (Cisco). Every thumbs-down is a signal: a continuous loop from trace → AI-diagnosed PR → permanent regression test. "Evals are infrastructure, not a side project."
- Building AI for Healthcare (Abridge). Reference-free and reference-based judges cut release cycles from 1–2 months to days — "you don't have to trade velocity for quality."
"Trust is earned in drops but lost in buckets." — Abridge
IV. Agents in Production — Across the Most Regulated Industries
The case studies this year weren't demos; they were deployments in finance, healthcare, transportation, and the enterprise.
| Company | Talk | Headline |
|---|---|---|
| Toyota | The Production System for Agents | An internal platform took agent builds from 6 months / 6 engineers to 4 days / 1 engineer; 50,000+ agents in production, mapped to Toyota Production System principles. |
| MongoDB | Agents in the Enterprise | ElevenLabs ran 14M production agents after migrating to MongoDB; ~70% of MongoDB's own checked-in code last week came from coding agents. |
| Box | Agents in the Enterprise | Aaron Levie: coding agents win because output is verifiable; knowledge work is the opposite, so expect years of change management. Decade-old single-governance bets now pay off. |
| Abridge | Building AI for Healthcare | A clinical-intelligence platform across 250 of the largest US health systems, 100M+ conversations/year. |
| Bridgewater | Building Pat, the AI Pocket Analyst | "The plan really is the analysis" — plan once, generate code in parallel; human-like search inspection lifted accuracy from ~75% to ~90%. |
| Coinbase | Building Developer Support Agents | A self-improving developer-support system on self-hosted LangSmith tracing + an MCP docs server with RAG fallback. |
| 60% Faster Time-to-Interview | Hiring agents cut time-to-interview by 60%, evolving from chains to a central LLM planner running plan–act–replan. | |
| Cisco | Frontier CX Agents | From chatbot to delegating teammate via dynamic plan-based supervisor graphs — and the lesson that 95% accuracy wasn't enough for adoption. |
A common thread: MCP as the integration layer (swap backends without touching agents), skills as a shared, reusable primitive, and coding-agent harnesses as the substrate for all knowledge work.
V. The Human in the Loop Endures
For all the automation, the closing talks pulled back to people and judgment.
- Future of AI Agents — Andrew Ng. Build with small, high-context generalist tasks under guardrails. Most AI experiments yield incremental efficiency, not transformation — so redesign the whole workflow top-down. Chase growth over cost savings (growth has almost no ceiling), and preserve optionality with vendor-neutral tools and open-weight models.
- The Return of the Data Scientist — Shreya Shankar & Hamel Husain. The data-scientist mindset is back: experiment, measure, improve. The single most-repeated piece of advice of the whole conference:
"Always look at your data."
A full floor for two days of deep-agents talks in San Francisco.
VI. What Changed Since 2025
| Interrupt 2025 | Interrupt 2026 | |
|---|---|---|
| Core abstraction | The "agent engineer"; model optionality | Deep agents — the agent harness (skills, memory, sandbox, subagents) |
| LangChain orchestration | LangGraph (low-level control) | Deep Agents (batteries-included harness on top of LangGraph) |
| Flagship launches | LangGraph Platform GA, Studio v2, Open Agent Platform | Managed Deep Agents, LangSmith Engine, Sandboxes (GA), Context Hub, LLM Gateway |
| Observability | New agent metrics + trajectory views | SmithDB (Rust) 6–15× faster; 150M+ weekly traces; traces as the center of the lifecycle |
| Evals | Eval-driven development; LLM-as-judge | Legal/compliance write the evals; online evals; eval-as-infrastructure; agents that build their own evals |
| Tools & interop | MCP emerging; AGNTCY interop | MCP as the integration layer; skills as a shared primitive |
| The frontier | "Deployment is the next hurdle" | Deployment at scale (100M+ runs, 50k+ agents); agents that improve agents |
| Who builds | "Everyone becomes an agent builder" | The data scientist returns — taste and looking at your data still decide |
VII. The Road Ahead
- Harnesses over frameworks. The durable engineering is in the agent harness — execution environment, context management, steering, delegation — kept simple and rewritable.
- Closed-loop operations. Trace → diagnosis → fix → eval, increasingly run by agents (LangSmith Engine), with humans approving PRs.
- Open memory and context. Skills,
agent.md, and memory as portable, open standards rather than platform lock-in. - Sandboxed, multi-model, multi-modal. Untrusted code runs in sandboxes; enterprises stay model-agnostic on cost; computer-use and richer modalities are next.
- Customer-facing scale. Most enterprises are still on internal agents; the prize — and the hard part — is reliable customer-facing agents at volume.
Key Takeaways
- Deep agents are the organizing idea of 2026 — a harness of skills, memory, sandboxes, and subagents around the model.
- The LangChain platform now runs and improves agents, not just observes them (Managed Deep Agents, LangSmith Engine, Sandboxes, SmithDB).
- Evals are infrastructure — and increasingly written by legal, compliance, and the agents themselves.
- MCP is the integration layer and skills are the new shareable primitive.
- Production is real: 100M+ runs, 50k+ agents at one company, 14M at another, 250 health systems.
- Stay agnostic on models and frameworks; build for portability.
- Always look at your data.
Resources
- Event agenda: interrupt.langchain.com/event-agenda
- Full session digest: Presentation Directory — every talk, in running order
- LangChain: langchain.com · LangSmith: smith.langchain.com · Deep Agents: docs.langchain.com
- Last year: LangChain Interrupt 2025 recap
Talk pages include synthesized summaries plus the complete timestamped transcript (in-person audio, transcribed locally with Whisper large-v3 and lightly cleaned). A few proper nouns and figures may carry minor transcription artifacts; see each page's notes. ~6 agenda sessions were not captured and are summarized from the public agenda.
Conference: May 13–14, 2026 · San Francisco