Skip to main content

Introducing Managed Deep Agents – Sydney Runkle & Victor Moreira

Speaker(s): Sydney Runkle (Software Engineer, LangChain); Victor Moreira (Product Manager, LangChain)
Session: Interrupt 2026 · Day 1 (May 13) · ~2:00 PM PT
Source: in-person audio recording, transcribed locally with Whisper large-v3.

Summary

Sydney Runkle (open source engineer) and a LangChain colleague (Victor Moreira, with a co-presenter referred to as Nick) introduce Managed Deep Agents, launching as a private beta. They frame an agent as a model plus a 'harness'—everything around the model (skills, memory, base system prompt, tools, sub-agents) whose job is to give the model the right context at the right time. Deep Agents is a customizable, provider-agnostic harness with four capabilities: an execution environment (file system plus optional sandbox/code interpreter), context management (summarization, context offloading, memory, prompt caching, progressively-disclosed skills), delegation (a planning tool and out-of-the-box sub-agents), and steering (first-class human-in-the-loop with approve/edit/reject/respond patterns). Managed Deep Agents then adds four production pillars on top: a runtime built on LangGraph deployment (durable execution with checkpointing/replay, task queues, SDKs), multi-layer auth and interoperability (A2A protocol, calling agents in your own graph), Context Hub integration for versioning skills and files, and LangChain sandboxes with an auth proxy and snapshot/restore.

Key Points

  • An agent is a model-based loop (a model calling tools until it completes a task); an agent = model + harness, where the harness is everything connecting the model to the real world and its job is the right context at the right time.
  • Deep Agents is a customizable agent harness purpose-built for complex real-world tasks, with four capabilities: execution environment, context management, delegation, and steering (human-in-the-loop).
  • Execution environment is the backbone: a file system for scratch files and persistent memory, optionally augmented with a sandbox or lighter-weight code interpreter for secure code execution.
  • Context management ships with summarization plus context offloading (periodically evicting large messages to the file system), built-in short/long-term memory, provider-agnostic prompt caching, and skills support via progressive disclosure (minimal skill info up front, full resources pulled in on demand).
  • Delegation provides a planning tool and out-of-the-box sub-agents (general or specialized) that run with isolated context, aid context management, parallelize work, and can each use any model in any client to match capability to task complexity.
  • Steering supports four decision patterns: approve (e.g. an email before send), edit (e.g. a tweet before publishing), reject (e.g. a financial transaction), and respond (agent interrupts to ask the user a question).
  • Deep Agents is provider-agnostic (mix and match Anthropic, OpenAI, Google, local via Ollama, or cheaper open-source/inference providers like Fireworks, NVIDIA, OpenRouter) and customizable via middleware—hooks around the core agent loop for business logic, PII redaction, or dynamic control.
  • Managed Deep Agents (private beta as of launch day) adds four production pillars on LangGraph deployment: a scalable runtime with durable execution (checkpoint/resume from any step, e.g. retry from step 49 rather than restarting), multi-layer auth and interoperability (A2A, agents callable in your own graph in one line), Context Hub versioning for skills/files, and LangChain sandboxes with an auth proxy and snapshot/restore.

Notable Quotes

We can define an agent as a model plus a harness. So the harness is everything that connects the model to the real world.

the job of a harness is to give the model the right context at the right time for the given task. A model is only as powerful as the context that it's given

if your agent fails on step 49 or 50, you don't have to restart that whole run again. You can pick up and retry from step 49.

Slides

Slide titled Middleware enables, listing business logic, policy enforcement, dynamic agent control, fault tolerance, context management, and toolsets Middleware — a set of hooks around the core agent loop — enables custom business logic, policy enforcement (e.g. PII redaction), dynamic agent control, fault tolerance, context management, and toolsets.

Slide of the Managed Deep Agents architecture: Harness, Runtime, Context hub, and Sandboxes Managed Deep Agents = the harness + a production runtime (durable execution) + Context Hub integration + safe in-process code execution via sandboxes.

Full Transcript

Show the full timestamped transcript (auto-generated; lightly cleaned)
[00:00] Hi folks, I'm Sydney and I am an open source engineer at LangChain. Hi, I'm Victor, I'm a product
manager. Today we're going to be walking through the pages. Before we talk about that, Sydney is
going to walk us through the pages.

[00:34] So, first off, what is an agent? An agent is a simple model-based loop. A model calling tools in a
loop until it completes a task and returns finals. What is a harness? We can define an agent as a
model plus a harness. So the harness is everything that connects the model to the real world.
Everything around the model that helps it complete tasks.

[01:06] So this is made up of skills, memory, the base system prompt, tools, sub-events, and any additional
content. What's the job of a harness? So the job of a harness is to give the model the right context
at the right time for the given task. A model is only as powerful as the context that it's given,
and so the harness exists to bridge this gap. Why do you need a harness?

[01:38] Well, agents have a lot of jobs. Agents need to work in an environment where they can take actions.
This action taking is what gives them agency guts, what makes agents useful. They need to connect to
your data so that their actions are relevant to the job. So, what is an agent? An agent is a tool.
They need to know that the actions are relevant for your use case. They need to manage growing
context over long runs so that they can avoid context overflows. They need to be able to parallelize
tasks to complete complex tasks efficiently.

[02:12] They need to connect with the human in the loop for sensitive workflows. And finally, ideally, they
improve over time so that they remain relevant in use cases. So, that's it. So, what is Deep Agents?
Deep Agents is a customizable agent harness that's purpose-built for complex real-world tasks.
First, I'm going to cover the four main capabilities that are part of the Deep Agents harness,

[02:42] and then we'll do kind of a deep dive into each one. So, first up, we have the execution
environment. This is the backbone of a Deep Agent, and it all starts with a pilot. Optionally, you
can also augment this with a sandbox or similarly a code intern. Next, we have I think the most
important capability, which is context management. So, there are lots of utilities built into Deep
Agents that help with this all-order.

[03:14] That includes skill support, out-of-the-box support for short and long term evidence, summarization
capabilities, context options, and more. Context offloading and prompt cache. The third capability
is delegation. As agents run for longer amounts of time and take on complex workloads, they need to
be able to plan and organize tasks, and then also use sub-agents to delegate work. Finally, we build
steering into the Deep Agents harness with first-class human-led support.

[03:53] So, now for our deep dive. Starting with the execution environment, which I mentioned is the
backbone of a Deep Agent, this capability powers all of the rest. So, we start with the file system.
An agent uses the file system to read and write scratch files as it tackles work, load and store
persistent memories in the hot path, invoke skills when relevant for a given task, and many more
things. Agents are excellent at using file systems.

[04:25] They are trained in an environment to use file systems, and also trained on lots of code. And that's
why giving an agent a sandbox or lighter weight cousin the code interpreter is very powerful. When
you give an agent these code execution tools, you're given a secure environment to write and run
code, which makes an agent capable of much more than just writing. Much more creative problem
solving, and kind of dynamic runtime.

[04:59] The second capability is context management. Deep Agents ships with built-in summarization and
context uploading. You can see on the graph here, periodically Deep Agents will evict large
messages, so that could be human messages, tool results, tool calls, to the file system, so that we
don't build up context in the window too quickly. Additionally, summarization is triggered less
frequently, but every so often when history starts to approach the model's context limit,

[05:33] both of these are in an effort to avoid context overflow, which is a problem that plagues long-
running agents, or high-contact agents. What else? Deep Agents also ships with built-in memory
support. I would argue that memory is maybe the most important kind of content, because it's the
context that changes from run to run, and allows your agent to improve over time. It also ships with
provider agnostic prompt caching. This is incredibly important for long-running, high-contact
agents,

[06:06] that need to operate cost-effectively. Finally, Deep Agents ships with skills support out of the
box. Skills are part of the context management system, because of the system, called progressive
discipline. So Deep Agents loads some minimalistic information about what skills your agent has
available, upfront, into the system prompt, and then we give the agent the power to dynamically pull
in full skill resources, and invoke those skills and their scripts,

[06:36] when relevant for a given task. All of this umbrella of context management is again catering to that
need for a harness to get the model, the right context, at the right time, for the given task. The
third capability of the Deep Agent's harness is delegation. So the Deep Agent's harness is equipped
with a planning tool that allows the model to organize work for challenging tasks.

[07:07] It's also equipped with sub-agent support out of the box. Sub-agents can be general purpose, or
specialized. So for example, if you're building a coding agent, maybe you would want to attach some
specialized sub-agents for architecture design, for code review and security review, and then test
writing and texting. Why are sub-agents so important? Well first, they operate with isolated
context,

[07:37] and they actually help with overall context management. So when the main agent invokes a sub-agent,
it starts with fresh context, only relevant to its given task, and then it returns just a
streamlined final result back to that main agent, so that it doesn't pollute the main context.
Secondly, they can be used to parallelize work, so your agent can run tasks end-to-end more
efficiently with that parallelization. Finally,

[08:07] sub-agents can use any model in any client, so you can match model capability with task complexity.
The fourth and final capability of the agent's harness is support for steering, the first class
human-in-the-loop premise. So human-in-the-loop is really critical for two things. The first is
getting real-time user feedback on sensitive actions, or tool calls, as we mentioned before. And the
second is getting real-time feedback

[08:39] when feedback is needed from the user to unlock the model. So what does this look like? There's four
common decision patterns that we build into ePages. The first is an approval flow, so maybe you're
approving an email before it's sent. Second is an edit, so maybe editing a tweet before it's
published. Third is a reject decision, rejecting a proposed financial transaction. And the fourth is
the respond pattern, so that's when the agent interrupts and asks the user for a question

[09:10] to unlock the product. So we've done a deep dive into the process and we've gone back into the
capabilities of the ePages harness. Now let's talk about why ePages. ePages is provider agnostic.
You can use any provider, in any model, swap at any time, and even mix and match. Your main agent
can use a different model than your sub-agent. You can use major providers, like Netropic, OpenAI,
Google. You can use local models with Bulama.

[09:42] Or you can use increasingly performant and much cheaper open source models, like Fireworks, Chrome
providers like Fireworks, NVIDIA, OpenRouter, Basecamp. ePages is highly customizable. Here's a
quick recap of what that core agent loop looks like. And here's what the deep-agent loop looks like.
ePages provides a set of hooks around the core agent loop. We call this system middleware.

[10:12] And middleware enables basically any custom logic that you want to add to your agent. That might
look like bespoke business logic, deterministic code at any point, policy enforcement like PII
redaction, or dynamic agent control. For example, changing the bottleneck rules available at runtime
based on the task agent. Even with a capable artist, going through production is really hard. Your
agent needs to run for long periods of time,

[10:43] and recover from unexpected failures, handle human in the loop and unpredictable behavior, support
first traffic, all while maintaining a secure posture and keeping up with ever-changing
interoperability. And now, I'm going to hand it off to Nick. He's going to cover how we make this
easy. Awesome. Thank you, Susan. All right. So we just saw what a deep-agent is. Now, we're going to
talk about what it takes to actually take one of these agents into production. I see a few familiar
faces in the crowd

[11:15] of people that have actually taken these agents, served them to customers in production. So we've
kind of seen this first. Today, we're going to talk about how we're actually managing deep agents in
private beta in order to make this process as easy as possible. So there's kind of four core pillars
to what managing deep agents are. First is the harness, which Cindy just walked through. The second
is the runtime, which is how we can actually use this agent in production. The third is going to be
the integration with context hub. And then lastly, the way that we can execute

[11:45] safe code in the same process. So let's first talk about this runtime. So managing deep agents is
actually built on top of the length of deployment, which means we get a lot of primitives to handle
that real-time kind of interaction with scalability that agents will actually need to conduct. This
means we get endpoints for grading, updating, even invoking our agents wherever we need them. We'll
get a purpose-built task queue, for example, scaling, in order to handle kind of the first few
requests that we'll take to the agent. You can imagine a use case of support agents,

[12:17] your systems go down, and now everyone's hitting that support agent all at once. You need to be able
to handle that traffic. Lastly is SDKs in order to use these agents wherever we need them,
integration with copilot kit, assistive UI, or gen UI, the list goes on. The second kind of core
pillar of this production runtime is durable execution. This is one of those things that is a little
boring to think about, but is very, very reliable. So because we run deep agents on top of the
LaneGraph runtime, we're able to actually checkpoint each step of your agent's production.

[12:49] This means that these are all stored in durable storage, and we can resume and restart from any one
of these checkpoints. So if your agent fails on step 49 or 50, you don't have to restart that whole
run again. You can pick up and retry from step 49. Lastly, we have the ability to replay and report
our agent from any point in time of that state. This enables some advanced user workflows, so
something like, oh, I want to port this conversation from here, but it also enables human approval
for human input. Since everything is checkpointed into this database,

[13:20] we're able to await human input indefinitely. This is what enables ambient agent use cases, such as
lights and engine, which you'll hear about in a second. Security and op are very, very important to
production. You need multiple layers of op in order to serve these production agent use cases. The
first is going to be inbound from your actual application. How do you authenticate that this user is
who you say they are and they're liable to authenticate? The second is going to be outbound from
inside the agent to your external services.

[13:50] You might have tools or MCP tools that you need to be able to reliably authenticate with and assume
the correct permissions that way. The third is actually who has the ability to create, update, and
manage these agents. Whether it is your AI engineer, currently, or maybe your CTO wants to get in
there and make a quick change, you need to have that level of far back and back to be able to do
that. Who actually has the permissions? Agent interoperability is becoming more and more important
as you want to use your agents in a variety of different ways. One thing that's core to Lancet
deployment that we're bringing

[14:22] into main HTTP agents is the ability to use your agents in your own graph. That means that you can
call your agents that are built on main HTTP agents in a custom LanGraph application that's deployed
on the main HTTP server. Just one line. Second is the ability to support A2A protocol. So a lot of
agents are using A2A today, as a way to handle this agent communication layer. We support this out
of the box in both our standard Lancet deployment and also major. The third is being able to bring
this agent when you actually do work. We have a lot of agents that we build internally

[14:53] and deploy on Lancet deployment, and we love to use them where we're good. Whether it's deep agents
code or it's a cloud desktop, we want to be able to bring this agent to the server. We thought of
all of these different production use cases so that you don't have to. Things like double texting,
things like a run digital life are all things that actually take real engineering work and will take
you weeks if not months to build. We have this all out of the box. The third kind of core pillar
here is context hub integration.

[15:24] So context hub is built into ManagedSuite agents. It's how we version and save every one of these
files that you're using. So things like the agent ID and the skills that Harrison mentioned in the
keynote are becoming more and more popular. So the first thing you're going to do is you're going to
get all of the different categories that your agents are taking about your users. These all get
saved in versions inside of context hub so that your team is able to control different levels of
promotion, right, like you can start staging, go to production for different skills, and stuff like
that. You can really democratize the skills process.

[15:56] The next kind of core feature that is part of context hub in this ManagedSuite agent primitive is
our integration with Lancet. You're going to hear more about Lancet and the engine in the next
session, but essentially what it will be able to do is take real production usage of these agents,
make some quality improvements and changes to your prompts, your system, your skills, things of that
nature, and that leads to better behavior over time as this loop is continued. The fourth kind of
key component of ManagedSuite agents is the sandbox. So as Cindy mentioned before, increasingly
nearly every agent

[16:26] is becoming a coding agent. Even a research agent use case, they might want to be able to crunch
some quick stats and have it to report. So this is a really good way to get the most out of the
process. You want to enable your agents to do these things in production because it can lead to a
lot more creative results. That's why we're launching Lancet sandboxes, and we're integrating them
directly with ManagedSuite agents. There's a few core features to this kind of Lancet sandbox
primitive, the first being an off proxy in order to, at run time, inject those credentials that your
sandbox uses securely so that

[16:57] none of your important environment variables are exposed to the actual agent or the actual sandbox
itself. And the second is the ability to snapshot and restore the data so that your agent always has
the correct execution. We'll have a session on this for a deeper dive tomorrow with Gil, but they
are very, very excellent for most users. So this is kind of everything that you need all together in
order to take an idea or a working deep agent and take it into production. That's why we're
launching ManagedSuite betas, and we're starting a private beta as of today. So we encourage you all
to jump on the wait list and thanks for the time. Thank you. Ladies and gentlemen.

[17:41] And it is caramel weather out in diddyаль safety shore, poderia somobok, need person not at you can
put your hands up. United States has followed up handling this success crowd from гораздо over 1,000
locations one after another.