Presentation Directory

Quick Digest Table

| Ref # | Track | Presentation | Takeaways |
| --- | --- | --- | --- |
| - | Keynotes | Opening Keynote – Harrison Chase | • Rise of the agent-engineer role (prompting + product + ML + DevOps). • Real-world agents will fuse many models; reliability starts with the right context (LangGraph). • LangSmith → tracing + evals + prompt hub for team work. |
| 2.1 | Keynotes | Welcome & Product Keynote (Day 2) – Harrison Chase | • Launched LangGraph Studio v2 (browser live-reload, LangSmith-native). • Unveiled open “Agent Platform” (tool-server, RAG-aaS, agent registry). • Horizontal scaling of long-running jobs framed as next frontier. |
| 2.19 | Keynotes | From LLMs to Agents: The Next Leap – Adam D’Angelo | • Shift from code-completion to task-completion UIs. • Voice & multimodal interfaces will mainstream agents. • Fast iteration + human evals beat big-design-up-front. |
| 2.13 | Keynotes | State of Agents (Fireside) – Andrew Ng × Harrison Chase | • Think in degrees of agentic-ness. • Execution speed & deep technical judgment remain the biggest moat. • Good eval pipelines isolate the failing step, not the whole graph. |
| 2.2 | Product & Case-studies | Building Replit Agent v2 – Michele Catasta | • V2 moved from 60-s to 10-15-min autonomous runs via new model mix. • Assembly-level debugging: LangSmith traces are critical. • Next: hour-long jobs, vision-based testing. |
| 2.4 | Product & Case-studies | Transforming CX with Multi-Agent Customer Experience – Cisco | • Supervisor graph routes renewals/support/adoption agents. • 20 % faster renewals; 60 % low-priority tickets auto-resolved. • Define business-value metrics first, tools second. |
| 2.4 | Product & Case-studies | Cisco Demo: AI-Powered Insights from Enterprise Data with LangGraph | • Goal: agentic CX platform for 20 k-person org. • Multi-agent mesh built on LangGraph + Cisco MCP, deploys cloud/on-prem. • Patterns: choose metrics first; keep a separate eval team; hand big SQL joins to DB, not LLM. |
| 2.4 | Product & Case-studies | Ask D.A.V.I.D. (J.P. Morgan) | • Research copilot with routing, RAG & analytics sub-agents. • Personalisation layer tailors depth by user role. • Offline→online eval pipeline with human review for “last mile.” |
| 2.11 | Product & Case-studies | Aladdin Copilot (BlackRock) | • Conversational layer across ~100 Aladdin apps. • Plugin registry lets 60+ domain teams expose APIs or agents. • Daily end-to-end tests in CI ensure routing & answer quality. |
| 2.12 | Product & Case-studies | Agentic Developer Products with LangGraph (Uber) | • “LangFX” wraps LangGraph for internal infra (test-gen, build triage, security). • Deterministic sub-agents feed larger LLM agents for reliability. • Well-chosen abstractions unlock parallel team adoption. |
| - | Product & Case-studies | Agents at Scale (LinkedIn) | • LLM judge for money-transfer approvals: 51 → 79 % accuracy in two weeks. • Empathy & tone added as eval axes. • Dozens of prompt/model experiments per day via tracing tooling. |
| 2.16 | Product & Case-studies | Digital Workforce with LangGraph (Monday.com) | • Handles 1 B work-items/year; agents act as always-on coworkers. • Hierarchical supervisor + specialised sub-agents beat naïve ReAct loop. • AI usage doubling MoM since launch. |
| 2.16 | Product & Case-studies | Building & Scaling an AI Agent During HyperGrowth (11x) | • Re-built “ALICE 2” in 3 months: clean repo, vendor non-core. • Tried ReAct → workflow → multi-agent; the latter balanced quality & flexibility. • Keep architecture uncoupled from UX layer. |
| 2.18 | Product & Case-studies | Unlocking Agent Creation: Architecture Lessons (Box) | • Reliability = clear plans + environment grounding. • Planning templates + model-predictive control keep long runs on track. • “Precision steering” UX is as vital as autonomy. |
| 2.5 | Product & Case-studies | Making Devin – an AI Software Engineer (Cognition) | • DeepWiki + semantic graphs give macro/micro code understanding. • Custom RL post-train on 180 CUDA tasks beats larger models. • Sandboxing vital to avoid reward hacking. |
| 2.15 | Product & Case-studies | Building Reliable Agentic Systems (Factory) | • Delegate whole tasks, not just collaborate, for productivity gains. • Controlled toolsets + output pre-processing reduce hallucinations. • Target: offload 50 %+ engineering tickets to agents. |
| 2.14 | Engineering & Evaluation | Breakthrough Agents: Learnings from Building AI Research Agents – Unify | • Encapsulate repeatable steps as deterministic sub-agents, compose with LangGraph. • Wins: auto test-gen, security-rule injection, build-system actions. • Invest in “Mangrove” state layer so any team can slot in new tools. |
| 2.9 | Engineering & Evaluation | Evaluation Frontiers (Shreya Shankar) | • Model-only metrics miss business failures; need trajectory & cost dashboards. |
| - | Engineering & Evaluation | LLM-as-a-Judge for Money-Transfer (Nubank) | • Prompt-tuning + larger models clawed accuracy 51 → 79 %. • Added empathy/tone checks alongside hallucination filters. |
| 2.6 | Engineering & Evaluation | Building Reliable Agents & Agent Evals (Harrison Chase) | • New LangSmith metrics: per-tool latency/error & trajectory observability. • AI observability must serve agent engineers, not SREs. |
| 1.1 | Workshops & Tutorials | Introduction to LangGraph | Graph fundamentals, node types and event loops; builds a toy agent end-to-end. |
| - | Workshops & Tutorials | Building an Agentic Application | Hands-on notebook: high-level planning → sub-agent orchestration → RAG. |
| - | Workshops & Tutorials | Evaluating Your Agent | Adds offline & online eval harnesses, golden datasets, cost dashboards. |
| - | Workshops & Tutorials | Human-in-the-Loop | Patterns for interruptions, approval gates, safe-fail states. |
| - | Workshops & Tutorials | Memory | Shows episodic vs summarised memory nodes; trade-offs in token budget vs recall. |
| - | Workshops & Tutorials | Deploying Your Agent | CI/CD, containerisation, LangGraph Runner, infra cost controls. |
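
Illustrative sketch: supervisor routing with an approval gate

Several entries above (Cisco, Monday.com, J.P. Morgan) mention a supervisor that routes requests to specialised sub-agents, and the Human-in-the-Loop workshop lists approval gates. The plain-Python sketch below shows one way those two ideas might fit together. It does not use the LangGraph API, and every name in it (`route`, `run_supervisor`, the keyword rules, the three agent functions) is a hypothetical stand-in chosen for illustration, not something taken from the talks.

```python
from typing import Callable, Dict

# Hypothetical sub-agents: each takes a request and returns a draft reply.
def renewals_agent(request: str) -> str:
    return f"[renewals] drafted renewal offer for: {request}"

def support_agent(request: str) -> str:
    return f"[support] opened troubleshooting flow for: {request}"

def adoption_agent(request: str) -> str:
    return f"[adoption] suggested onboarding steps for: {request}"

AGENTS: Dict[str, Callable[[str], str]] = {
    "renewals": renewals_agent,
    "support": support_agent,
    "adoption": adoption_agent,
}

# Assumption: keyword routing keeps the sketch deterministic.
# A production supervisor would more likely ask a model to pick the agent.
ROUTING_KEYWORDS = {
    "renew": "renewals",
    "error": "support",
    "bug": "support",
    "onboard": "adoption",
}

# Assumption: only renewal drafts need a human sign-off before release.
NEEDS_APPROVAL = {"renewals"}

def route(request: str) -> str:
    """Return the name of the sub-agent that should handle the request."""
    text = request.lower()
    for keyword, agent_name in ROUTING_KEYWORDS.items():
        if keyword in text:
            return agent_name
    return "support"  # fallback when no keyword matches

def run_supervisor(request: str, approve: Callable[[str, str], bool]) -> str:
    """Route the request, run the sub-agent, and apply the approval gate."""
    agent_name = route(request)
    draft = AGENTS[agent_name](request)
    if agent_name in NEEDS_APPROVAL and not approve(agent_name, draft):
        # Safe-fail state: the draft is held instead of being sent.
        return f"[{agent_name}] held for human review"
    return draft

if __name__ == "__main__":
    reject_all = lambda agent_name, draft: False  # stand-in for a human reviewer
    print(run_supervisor("Please renew our licence", reject_all))
    print(run_supervisor("Login error on dashboard", reject_all))
```

Passing the reviewer in as a callback keeps the gate separate from the routing logic, so the same supervisor can be exercised in tests with an automatic approver and in production with a real review step.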