Presentation Directory

Every session from LangChain Interrupt 2026, in running order. Talks we recorded link to a full page (synthesis + complete transcript); sessions we did not capture are summarized from the published agenda.

Quick Digest Table

| Ref # | Track | Presentation | Takeaways |
| --- | --- | --- | --- |
| 1.1 | Keynotes | Day 1 Keynote: The Deep Agents Era | Deep agents = batteries-included agent harness |
|  |  |  | Deep Agent 0.6: open models + Code Interpreter + streaming |
|  |  |  | SmithDB (Rust) makes traces 6-15x faster |
|  |  |  | LangSmith Engine: ambient issue-fixing agent |
| 1.2 | Keynotes | MC Opening Remarks & Housekeeping | Last year: do agents work? This year: yes |
|  |  |  | Logistics + sponsors |
|  |  |  | Cisco CX is presenting sponsor |
| 1.3 | Case Studies | Building Frontier CX Agents | Chatbot to delegating teammate |
|  |  |  | Dynamic plan-based supervisor graphs |
|  |  |  | 95% accuracy wasn't enough; adoption plateaued |
|  |  |  | Routing is your first decision |
| 1.4 | Case Studies | Scaling GTM Agents (not recorded) | Not recorded; summary from the public agenda. |
| 1.5 | Case Studies | The Production System for Agents | 6 months/6 engineers became 4 days/1 engineer |
|  |  |  | 50,000+ agents in production |
|  |  |  | Gateball: hours of manual search to ~10s |
|  |  |  | LangChain mapped to TPS principles |
| 1.6 | Engineering & Evaluation | How Lyft Builds Evals That Actually Matter in Production | Treat AI evals like traditional ML |
|  |  |  | Offline eval = quality gate; don't test on users |
|  |  |  | Built a tau-bench-style simulator with mocked MCP |
|  |  |  | Frame LLM-as-judge around agent tasks (talk cuts off) |
| 1.7 | Engineering & Evaluation | Make Legal Write Your Evals | Evals are the alignment surface that removes the language barrier between engineers and legal |
|  |  |  | Break vague risk into a taxonomy of domains, categories, and concrete risks |
|  |  |  | Legal writes structured risk definitions that bootstrap both datasets and LLM-as-judge evaluators |
|  |  |  | Aggregate pass rates at every altitude so engineers, compliance, and executives each get the view they need |
|  |  |  | A feedback flywheel turns one annotation into four improvements; compliance signals arrive in hours, not at the release date |
| 1.8 | Product & Launches | Introducing Managed Deep Agents | An agent = model + harness; the harness delivers the right context at the right time |
|  |  |  | Deep Agents harness has four capabilities: execution environment, context management, delegation, steering |
|  |  |  | Context management uses summarization, offloading, memory, prompt caching, and progressively-disclosed skills |
|  |  |  | Provider-agnostic and customizable via middleware hooks around the core agent loop |
|  |  |  | Managed Deep Agents (private beta) adds production runtime, durable checkpointing, Context Hub, and LangChain sandboxes |
| 1.9 | Product & Launches | How We Built It: LangSmith Engine | LangSmith Engine is an agent that finds, clusters, and fixes agent issues from production traces |
|  |  |  | It proposes prompt/code fixes as one-click GitHub PRs and builds online evaluators plus ground-truth datasets |
|  |  |  | Hardest part is identifying meaningful issues, not generating fixes; Engine was at first 'too good at finding problems' |
|  |  |  | Traces are the window to the agent's soul; Engine ingests condensed traces and runs on deep agents + LangSmith sandboxes |
|  |  |  | Per-team priorities are learned via an 'agent coverture' memory file, and Engine even improves itself by feeding its own traces back in |
| 1.10 | Case Studies | Building Deep Agent Sidekick (not recorded) | Not recorded; summary from the public agenda. |
| 1.11 | Case Studies | Intelligent Agents in Aviation (not recorded) | Not recorded; summary from the public agenda. |
| 1.12 | Fireside Chats | Agents in the Enterprise | Coding agents win because output is verifiable, users are technical, and permissions are open; knowledge work is the opposite |
|  |  |  | Expect years of deployment and change management; doers will be wrong on the take-off due to slow diffusion |
|  |  |  | Decade-old single-file-system/single-governance bets at Box now pay off for agents |
|  |  |  | Build a world-class product AND world-class headless APIs; volume will skew headless (Salesforce went headless) |
|  |  |  | Build on a coding-agent harness for all knowledge work, and expect cost to drive enterprises toward multi-model setups |
| 1.13 | Case Studies | Lessons Learned Building Rippling AI (not recorded) | Not recorded; summary from the public agenda. |
| 2.1 | Keynotes | Day 2 Keynote (Closing) & Introduction of Carlos Pereira | Free open-source model added (powered by Fireworks) |
|  |  |  | Closing keynote hands off to Carlos Pereira of Cisco CX |
| 2.2 | Case Studies | Observing and Testing CX Agents | Every thumbs-down is a signal, not noise |
|  |  |  | Continuous feedback loop: traces to AI-diagnosed PRs to permanent regression tests |
|  |  |  | Evals are infrastructure, not a side project |
|  |  |  | MCP as integration layer lets you swap backends without touching agents |
|  |  |  | Keep humans for decisions only |
| 2.3 | Case Studies | The Etsy Gifting Assistant: From Prototype to Production (not recorded) | Not recorded; summary from the public agenda. |
| 2.4 | Case Studies | 60% Faster Time-to-Interview: Transforming Hiring with AI Agents | 60% faster time-to-interview with hiring agents on LangChain/LangGraph |
|  |  |  | On-stage speakers introduce themselves as 'Grace' and 'Chok' (agenda lists Shang Liu and Tracy He) |
|  |  |  | Evolved from chains to a central LLM planner running plan-act-replan |
|  |  |  | Harness engineering and customization are the durable differentiation |
| 2.5 | Product & Launches | Run Untrusted Agent Code with LangSmith Sandboxes | Agents are writing real code today across coding, data analysis, security, and browser control |
|  |  |  | Untrusted code execution is risky: supply-chain, sandbox-escape, and prompt-injection incidents |
|  |  |  | LangSmith Sandboxes: fast spin-up (~0.98s), egress proxy, durable pause/resume with no time limit, snapshot/fork |
|  |  |  | One line to start, available on all plans, bring-your-own Docker images, full tracing |
| 2.6 | Fireside Chats | Future of AI Agents | Build with small, high-context generalist tasks given guardrails |
|  |  |  | AI experiments often yield incremental efficiency, not transformation; redesign the whole workflow top-down |
|  |  |  | Chase growth over cost savings: growth has almost no practical ceiling |
|  |  |  | Preserve optionality with vendor-neutral tools and open-weight models |
|  |  |  | Unstructured-data re-architecture is the next big enterprise challenge |
| 2.7 | Fireside Chats | Agents in the Enterprise | MongoDB = best DB for unstructured data, 'almost a coincidence' |
|  |  |  | 11 Labs: 14M production agents after migrating to MongoDB |
|  |  |  | ~70% of MongoDB's checked-in code last week came from coding agents |
|  |  |  | Customer-facing agents at scale are the real prize; most enterprises aren't there yet |
| 2.8 | Case Studies | Building AI for Healthcare | Abridge: 250 top US health systems, 100M+ conversations/year |
|  |  |  | 'Trust is earned in drops but lost in buckets' |
|  |  |  | LangGraph + LangSmith + APO judges cut releases from 1-2 months to days |
|  |  |  | Reference-free + reference-based judges; you don't have to trade velocity for quality |
| 2.9 | Case Studies | Building Pat, the AI Pocket Analyst | Pat: hours of research in minutes, hundreds of investors daily |
|  |  |  | 'The plan really is the analysis': plan once, generate code in parallel |
|  |  |  | Human-like search inspection lifted accuracy ~15% to 90% |
|  |  |  | Treat agentic coding as a compiler problem; correctness enforced in architecture (95% identical output) |
| 2.10 | Case Studies | Building Developer Support Agents | Coinbase dev-support: zero to one self-improving agent system |
|  |  |  | Self-hosted LangSmith tracing + MCP docs server with RAG fallback |
|  |  |  | LLM judge on accuracy and risk; intent-based workflow from traces |
|  |  |  | Treat agent engineering as a discipline; build the glass box first; the team is the multiplier |
| 2.11 | Closing | The Return of the Data Scientist | The return of the data scientist: experiment, measure, improve |
|  |  |  | Most important takeaway: 'always look at your data' |
|  |  |  | Skills to audit your evals for correctness |
|  |  |  | (Recording captured only the open/close; middle not picked up) |
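
Illustrative sketch for session 1.7. The takeaways there describe a risk taxonomy of domains, categories, and concrete risks, with pass rates aggregated "at every altitude". The Python below is one minimal way that roll-up might look. It is not the speakers' implementation: the sample records, the risk names, and the `pass_rates` helper are all hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical eval results: (domain, category, risk, passed).
# These names and outcomes are invented for the example.
RESULTS = [
    ("privacy", "pii", "leaks_email", True),
    ("privacy", "pii", "leaks_email", False),
    ("privacy", "pii", "leaks_phone", True),
    ("privacy", "retention", "keeps_deleted_data", True),
    ("fairness", "hiring", "biased_ranking", False),
    ("fairness", "hiring", "biased_ranking", True),
]


def pass_rates(results, depth):
    """Return the pass rate per taxonomy prefix.

    depth=1 groups by domain, depth=2 by (domain, category),
    depth=3 by (domain, category, risk).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [passed, total]
    for *path, passed in results:
        key = tuple(path[:depth])
        totals[key][0] += int(passed)
        totals[key][1] += 1
    return {key: p / n for key, (p, n) in totals.items()}


by_domain = pass_rates(RESULTS, 1)    # coarse view, e.g. for executives
by_category = pass_rates(RESULTS, 2)  # middle view, e.g. for compliance
by_risk = pass_rates(RESULTS, 3)      # fine view, e.g. for engineers
```

Keying every level on a prefix of the same taxonomy path means all three views are computed from one set of records, so the coarse numbers always reconcile with the detailed ones.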