60% Faster Time-to-Interview: Transforming Hiring with AI Agents – LinkedIn
Speaker(s): LinkedIn — Shang Liu & Tracy He (Senior Software Engineers) [on-stage as "Grace" and "Chok"]
Session: Interrupt 2026 · Day 2 (May 14) · ~11:10 AM PT
Source: in-person audio recording, transcribed locally with Whisper large-v3.
Summary
Two LinkedIn engineers who introduce themselves on stage as 'Grace' and 'Chok' (the agenda lists the speakers as Shang Liu and Tracy He) describe how they built hiring agents on LangChain and LangGraph to cut time-to-interview by 60%. They frame hiring as an inherently agentic, looping problem (small-business managers spend roughly 9-5 hours a week reviewing candidates) and walk through an evolution from chains to a central LLM-driven plan-act-replan loop on LangGraph. They cover deep integration with LinkedIn's internal agent platform and memory systems, observability via LangSmith trace IDs, and an evaluation/optimization loop that today is human-driven. They close with lessons on context management, controlling checkpoint size, and 'harness engineering' as the durable source of differentiation for an agent.
Key Points
- Goal stated up front: cut time-to-interview by 60%; small-business hiring managers spend on average 9-5 hours a week reviewing candidates and deciding who to reach out to (about 2 days a week for small teams)
- Hiring is framed as a looping agent problem (plan, act, observe, replan), not a one-shot task
- Architecture evolved from chains/sequential execution to a central agent (single LLM planner) running a plan-act-replan loop using LLM decision-making for robustness and adaptiveness
- LangGraph was chosen because it complements an existing stack and builds on existing primitives (state, tools), rather than requiring a rewrite
- Deep integration with LinkedIn's internal agent platform and multiple memory/network systems (conversational, plus episodic and semantic memory), with skill registration so teams register their own skills
- Observability uses LangSmith trace IDs; not everything can be sent to LangSmith, so they built a mirroring system feeding human annotation; optimization is today human-driven with plans to automate
- Lessons learned, including five principles: product constraints determine the build; manage context and checkpoint size deliberately; 'harness engineering' (customization beyond raw model capability) is the lasting differentiation that makes the agent succeed
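Editor's sketch (not from the talk): the plan-act-replan loop named in the key points above can be illustrated in plain Python. This is a minimal sketch under stated assumptions, not LinkedIn's implementation and not LangGraph code. The rule-based `plan` function stands in for the single LLM planner, and the tool names, the `cs_degree` requirement, and the candidate counts are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    requirements: set[str]
    candidates: int = 0
    # (action, observation) pairs, oldest first
    history: list[tuple[str, str]] = field(default_factory=list)


def source_candidates(state: AgentState) -> str:
    # Hypothetical tool: a strict degree requirement yields fewer matches.
    state.candidates = 1 if "cs_degree" in state.requirements else 5
    return f"found {state.candidates} candidates"


def relax_requirements(state: AgentState) -> str:
    # Hypothetical tool: drop the requirement that is filtering people out.
    state.requirements.discard("cs_degree")
    return "removed cs_degree requirement"


TOOLS = {
    "source_candidates": source_candidates,
    "relax_requirements": relax_requirements,
}


def plan(state: AgentState) -> str:
    """Stand-in for the LLM planner: choose the next action from the state."""
    if state.candidates >= 3:
        return "finish"
    if state.history and state.history[-1][0] == "source_candidates":
        # The last sourcing attempt found too few people, so replan.
        return "relax_requirements"
    return "source_candidates"


def run(state: AgentState, max_steps: int = 6) -> AgentState:
    """Plan, act, record the observation, then plan again from the new state."""
    for _ in range(max_steps):
        action = plan(state)
        if action == "finish":
            break
        observation = TOOLS[action](state)
        state.history.append((action, observation))
    return state
```

Starting from `AgentState(requirements={"cs_degree"})`, the loop sources candidates, sees too few, relaxes the requirement, and sources again before finishing. The point of the design is that the next step is chosen from the latest observation rather than fixed in advance, which is what distinguishes it from a sequential chain.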
Notable Quotes
Hi everyone, I'm Grace. This is Chok.
And this is the exact shape of an agent problem.
That's the customization. And that's differentiation. And that's how you make the agent succeed and shine.
Full Transcript
Timestamped transcript (auto-generated; lightly cleaned)
[00:00] Hi everyone, I'm Grace. This is Chok. We are here to talk about how we build hiring agents using
LangChain and LangGraph. Our headline goal is to cut time-to-interview by 60%. So, to break down the
agenda, we will start by talking about why we need agents for hiring and why that is an agent
problem.
[00:32] Then how we evolved and adopted LangChain and LangGraph. Then also how we evaluate it and what we
learned from the problems. So why is hiring an agent problem? When we talk to small-business hiring
managers, we often hear that the biggest challenge is that hiring takes too much time. On average,
they spend 9-5 hours a week just reviewing candidates and deciding who to reach out to.
[01:06] For small teams, like startups, that's already 2 days a week gone before any interview happens.
So, that's why. And also, hiring is a loop. It's not a one-shot task that holds the job at the
perfect timing. When we start with some job description and receive few applies, we start to
realize how many of our must-haves are actually nice-to-haves. So why is it worth it?
[01:36] Then we have to modify the job description again and source more people, and then we find ways to
work through the list. And again and again we keep on trying to find the perfect candidate. And this is
the exact shape of an agent problem. Plan, act, observe, replan. And that's why we do
hiring. And our process begins with a simple guided intake where our agents collect your hiring
requirements and start the job title.
[02:12] Then our agents generate the job post. And once you've confirmed and posted the job, our agents
will immediately source some strong candidates and bring them into your experience. And you can either
invite them to apply to your job or provide feedback to our agents so that our agents will align
with your expectations when they want. And once you get applies, our agents will evaluate the
application to ask for a portfolio based on the qualifications that you've defined.
[02:43] And you can also spend AI-powered, human-intuitive time. So we're saving time for a meaningful
conversation with the right talent. And last but not least, you can always chat with our agent,
ask any hiring-related questions, or ask the agent to take any action on your behalf. Now we have
just seen how they can automate an entire job.
[03:13] We're going to talk a little bit more about the architecture
behind it and how we evolved. We started with self-talking portables where we could use like narrow
branches, bids, best practices, and all transition to top-hardening. And we worked on two off-
systems to help a little bit more dynamic. So we evolved to LangChain, where we implemented chaining
with sequential execution. While this provides some partial abstraction, it still lacks
dynamic decision making.
[03:51] And our final breakthrough came with LangGraph. This is the two pages of a protocol where we have
a central agent that drives a plan-act-replan loop. And the key difference here is
we use an LLM for decision making, which makes our system significantly more robust and adaptive. We
just talked about how we shift to agent control
[04:21] models in the next class. And last time we talked about the protocol implementation. Outside we have
three key points. First, we use a single agent to centralize the reasoning. And instead of the
bridges and string the process, now we have one LLM planner that's responsible for all
the high-level decisions. And this grid is what's involved with the picture. Make sure there's no
theorem, and the log, complex, and we'll test it out on the copy.
[05:01] Next we have our entire workflow operating on a plan-act-replan pattern. And this is the
dynamic form of that graph. Finally, we ensure continuous improvement for a closed loop, pre-
planned, and un-served loop. Which we will talk a little bit more about later. And overall, our system is
pretty simple, but like the Y is strong, too, and skewed, we're able to view all the requests from
our system.
[05:32] So, Sean, why did we choose LangGraph? Good
question. Actually, we did inspect the graph on page 90 of the performance software with our
research team at major. So, there are two reasons. First of all, some agent frameworks don't provide
an optimal double-crochet. Some agent frameworks focus on areas that are not in the same order. And
those are areas that our existing stack already covers.
[06:02] And LangGraph is a basic framework that complements our existing stack. Also, link-in regions are
already skipped and change. We have variables, tools, codecs. And LangGraph builds on those
primitives rather than replacing them. So, adopting the graph needs theory-wise cross. And that's the knowledge.
And that's me. Nextin has been deeply integrated into our day-to-day cloud-based networking
practices.
[06:36] And we look forward to bringing more and more of the features and new products that were announced
yesterday into our production. And that's amazing. So, choosing LangGraph is not only because of
LangGraph itself, but also because of the whole ecosystem that's built around it. LangChain, LangGraph,
LangSmith, everything. There you go. Next, I'm going to talk about how we integrate LangGraph
currently within LinkedIn,
[07:14] and how specifically our entire team does it. So, internally at LinkedIn, we have an
agent platform, which talks to both LangGraph as well as our internal systems, including our massive
platform. So, there are two types of memory systems that have been maintained. Agent platform-wise,
there is conversational memory, which stores the chat histories. And that also talks to the massive
platform with the deep integration.
[07:45] Also, there's another one called the expression memory, which stores the same chat points. And
there's also some other things. There are further kinds of memory built on top of that. For
example, episodic memory and semantic memory. It empowers the memory experience. In terms of
skill registration, the LinkedIn agent platform allows different teams to register their own skills
based on, for example, their teams or the business team.
[08:16] And this specifically makes use of the hiring intensity, profile, evaluation, and mapping that is
used to achieve our function and our needs. Video warehouse tools. So, LangChain has such stuff.
So, LinkedIn has already integrated to some extent, for example, API detection, contact system
organization, and persistence. I know LinkedIn has already done that as well. And 3.2.1 is supposed
to format tests to ensure some intelligence
[08:46] and that will be very well in our industry. That's a big thing I'll talk about a little
bit later. But I think it's very much less early. So, this is a LangGraph checkpoint schema that
we're actually doing already. You can see the size of the input and output, which also includes the
suggested products as part of the output. We also have a lot of context-related parameters, which
are the key for our agent. So, especially the context parameters themselves.
[09:16] It has tons of parameters. For example, there are two types of parameters. The first type should be
passed through the whole conversation, on all graph, by time intent, by the up intent, at least up
in 90 seconds. There are also a couple of parameters that need to be activated case by
case. Like, for example, pending learning. We formatted pending learning. Once that's done, for the
previous node, for example,
[09:46] it should pass through to the next node, within the context. And this is our
LangSmith experience. Thanks to the LangSmith space, this field that is provided by
LangSmith, we got a chance to distribute to the cloud. That's how we actually use day-by-day our
data that we made to travel through. So, the real experience is a trace ID and a root cluster, which
is very powerful, straightforward, and convenient.
[10:19] And that's the information. So, we just saw how we leveraged LangSmith for observability. This is
actually an essential step for the feedback loop that I talked about earlier. To understand how to
use this data to make our agent better, we'll talk about the evaluation process now. So, starting
with evaluation, the LangSmith capturing, have we faced that? Well, we have to attach to that task
with the agent,
[10:51] not just the final and the . The context retrieval, and also, like, any decision steps that we have.
However, due to LinkedIn policy, we're not able to send everything to LangSmith yet for LangSmith to
pick up. So, we built something similar, mirroring what LangSmith does. And together, we send that,
uh, data into our human annotation
[11:21] and also our LNC directory. And we should look at the end users as that, the . And our final step is
to optimize. And today, what we did is we will manually define the return on our product model to,
uh, based on what the annotation service and have all of this entered which is what we just learned
yesterday as well. And we look forward to integrating
[11:52] and close our continuous learning loop without too much of a . And for today, it's human-driven, and
tomorrow it will be automated. So, we just talked about how our agents evolved and how the architecture is
designed, and along the journey of building this adaptive agent, we have some lessons that we
want to share today. So, let's go. Let me walk through five principles, starting with number one.
[12:23] Product constraints determine the build. So, Shawn just talked a lot about how we
integrated our convention's interpretive, how they were useful and powerful. But, however, not
every one fits our use case. And the story behind this lesson is we have our agent walking
through the computer course. And we want to make some suggestions. For example, it will say, looks
like you are removing candidates
[12:53] with a degree, without a degree in computer science, you can't fit. So, is the education requirement
really a must-have? Do you want me to remove it? So, what's next is the user can say yes or no.
But the interesting part is they could just walk away or just completely change the topic. And
they will not answer. And so, our first instinct was using LangGraph's interrupt.
[13:23] It paused the execution, persisted the graph, and waited until the user responded, and then
resumed from the same spot. And this worked beautifully for the yes-and-no
case. However, it doesn't really fit all of these things. So, we build context with the new
step. So, for every input we run the graph end to end. And at the end of the graph, we build this
general
[13:54] posture of context. In the example that I mentioned, we will be storing something like "suggested
removing education requirement, waiting for user to confirm". And when the next message comes in,
our planner will see if it is relevant to the suggestion. Does the user want to confirm it? Reject it? Or
move on to something else, meaning we let the context expire? With this approach, it
gives us the same as
[14:25] the field of knowledge we need, who we press, face name, and also keeping our conversation
practical. But most important is we control our checkpoint size. It's not about tracking, it's about
shortness of who we are. We are just persisting with even-monthly context. And John is going to talk
about why that is important to our agents, and why it takes so long. Yeah, models are
[14:55] probabilistic, as you know. But agents need to be more deterministic and specific for our use case, based on
harness engineering. And as you all know, with the improvement of model capabilities, a lot of
harness engineering efforts will eventually be replaced. We also believe that there is always some
kind of harness engineering that can never be replaced. And that's the customization. And that's
differentiation. And that's how you make the agent succeed and shine.
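Editor's sketch (not from the talk): the alternative to interrupts described a few segments earlier, running the graph end to end on every input and persisting only a short note about the open suggestion, might look roughly like this in plain Python. Everything named here is hypothetical: `PendingContext`, `Checkpoint`, the keyword-based `classify` (a stand-in for the LLM planner's relevance check), and the two-turn expiry policy are assumptions for illustration, not LinkedIn's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingContext:
    summary: str         # short note about the open suggestion
    turns_left: int = 2  # assumed policy: expire after two unrelated turns


@dataclass
class Checkpoint:
    # Only this small note is persisted between turns, not a paused graph.
    pending: Optional[PendingContext] = None


def classify(message: str) -> str:
    """Stand-in for the planner deciding how a message relates to the suggestion."""
    text = message.strip().lower()
    if text in {"yes", "sure", "ok", "go ahead"}:
        return "confirm"
    if text in {"no", "keep it"}:
        return "reject"
    return "unrelated"


def handle_turn(checkpoint: Checkpoint, message: str) -> str:
    """Process one user message end to end and return what the agent did."""
    if checkpoint.pending is not None:
        decision = classify(message)
        if decision == "confirm":
            summary = checkpoint.pending.summary
            checkpoint.pending = None
            return f"applied: {summary}"
        if decision == "reject":
            checkpoint.pending = None
            return "dropped suggestion"
        # Unrelated message: count down and let the note expire quietly.
        checkpoint.pending.turns_left -= 1
        if checkpoint.pending.turns_left <= 0:
            checkpoint.pending = None
    return f"handled new request: {message}"
```

Because nothing is paused, a user who walks away or changes the topic never leaves a half-finished run behind, and the persisted state stays a few fields wide, which is the checkpoint-size benefit the speakers emphasize.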
[15:25] So, I specifically want to share a few lessons that we learned on how to do the harness engineering
to make our agent better and closer to our user requirements. The UWB station has its own functional
code. And we have a lot of tools that we can use. And we have a lot of functional codes. So the
priorities are very different. For example, coding agents focus on the code graphics. And they need
to have more exploration of those pieces to get the final answer. We try different approaches.
[15:56] For example, shopping agents, they need source programming, safe purchasing action boundaries to
secure the safety. And we, the hiring agent, want to ensure the consistency of career-facing action
tasks, as well as our trust and compliance. So there are a couple of efforts. First, the context
management that we've been talking about all the time. So, checkpoint trimming, history
summarization, as well as context-complex management.
[16:26] Well, if the context providers have confidence in each candidate, then this tells the agent what
needs to be remembered. And output format and customization. So sometimes, an agent's output is out
of the system. So we want to know something consistently across the program. So we have test-width
optimization and fallbacks. Two of the products that have outbound script formulation, programmatic
response, and an open-node switch
[16:56] to help an agent know what the user sees. Plus, node chaining and wisdom. For example, after
one node's execution, I always want to make sure the next node is something certain. So, I can do this
at random. Rather than focusing all the stuff we need to drop, we actually have a few parameters for
this. Like state flag chaining and one-shot tool guards, to make sure one tool can only be called once
[17:26] for a certain amount of time. And certain tools can only be called from another tool rather than by
themselves. So this ensures what needed customers. So, based on all these practices, our agents
get better and better, closer and closer to our customers' privacy. That's something we're
really happy to see. So, all our parents probably are in the hiring region. And thank you for
listening.
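Editor's sketch (not from the talk): the transcript around 16:56 to 17:26 is heavily garbled, but it appears to describe routing guards: forcing a specific next node after another node runs, letting some tools run only once, and allowing some tools to be reached only from another tool. Under that reading, and it is only a reading, a guard could be sketched like this in plain Python. The tool names and the `ONE_SHOT` and `REQUIRES_CALLER` tables are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical policy tables, for illustration only.
ONE_SHOT = {"send_outreach"}                            # at most once per turn
REQUIRES_CALLER = {"update_job_post": "confirm_change"}  # tool -> required caller


@dataclass
class TurnState:
    forced_next: Optional[str] = None  # state flag set by the previous step
    called: set[str] = field(default_factory=set)


def allowed(state: TurnState, tool: str, caller: Optional[str] = None) -> bool:
    """Return True if the planner may route to `tool` right now."""
    if state.forced_next is not None and tool != state.forced_next:
        return False  # the previous step pinned the next step
    if tool in ONE_SHOT and tool in state.called:
        return False  # one-shot tool already used this turn
    required = REQUIRES_CALLER.get(tool)
    if required is not None and caller != required:
        return False  # reachable only from its designated caller
    return True


def record(state: TurnState, tool: str, force_next: Optional[str] = None) -> None:
    """Mark `tool` as executed and optionally pin the step that must follow."""
    state.called.add(tool)
    state.forced_next = force_next
```

The guard sits outside the model: the planner can still propose any step, but deterministic checks decide whether the proposal is allowed, which matches the speakers' point that agents need to be more deterministic than the models underneath them.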