Presentation Directory
Quick Digest Table
| Ref # | Track | Presentation | Take-aways |
|---|---|---|---|
| - | Keynotes | Opening Keynote – Harrison Chase | • Rise of the agent-engineer role (prompting + product + ML + DevOps). • Real-world agents will fuse many models; reliability starts with the right context (LangGraph). • LangSmith → tracing + evals + prompt hub for teamwork. |
| 2.1 | Keynotes | Welcome & Product Keynote (Day 2) – Harrison Chase | • Launched LangGraph Studio v2 (browser live-reload, LangSmith-native). • Unveiled the open “Agent Platform” (tool server, RAG-as-a-service, agent registry). • Horizontal scaling of long-running jobs framed as the next frontier. |
| 2.19 | Keynotes | From LLMs to Agents: The Next Leap – Adam D’Angelo | • Shift from code-completion to task-completion UIs. • Voice & multimodal interfaces will mainstream agents. • Fast iteration + human evals beat big design up front. |
| 2.13 | Keynotes | State of Agents (Fireside) – Andrew Ng × Harrison Chase | • Think in degrees of agentic-ness. • Execution speed & deep technical judgment remain the biggest moat. • Good eval pipelines isolate the failing step, not the whole graph. |
| 2.2 | Product & Case-studies | Building Replit Agent v2 – Michele Catasta | • v2 moved from 60-second to 10–15-minute autonomous runs via a new model mix. • Assembly-level debugging: LangSmith traces are critical. • Next: hour-long jobs, vision-based testing. |
| 2.4 | Product & Case-studies | Transforming CX with Multi-Agent Customer Experience – Cisco | • Supervisor graph routes renewals/support/adoption agents. • 20% faster renewals; 60% of low-priority tickets auto-resolved. • Define business-value metrics first, tools second. |
| 2.4 | Product & Case-studies | Cisco Demo: AI-Powered Insights from Enterprise Data with LangGraph | • Goal: agentic CX platform for a 20k-person org. • Multi-agent mesh built on LangGraph + Cisco MCP, deploys cloud/on-prem. • Patterns: choose metrics first; keep a separate eval team; hand big SQL joins to the DB, not the LLM. |
| 2.4 | Product & Case-studies | Ask D.A.V.I.D. (J.P. Morgan) | • Research copilot with routing, RAG & analytics sub-agents. • Personalisation layer tailors depth by user role. • Offline→online eval pipeline with human review for the “last mile.” |
| 2.11 | Product & Case-studies | Aladdin Copilot (BlackRock) | • Conversational layer across ~100 Aladdin apps. • Plugin registry lets 60+ domain teams expose APIs or agents. • Daily end-to-end tests in CI ensure routing & answer quality. |
| 2.12 | Product & Case-studies | Agentic Developer Products with LangGraph (Uber) | • “LangFX” wraps LangGraph for internal infra (test-gen, build triage, security). • Deterministic sub-agents feed larger LLM agents for reliability. • Well-chosen abstractions unlock parallel team adoption. |
| - | Product & Case-studies | Agents at Scale (LinkedIn) | • LLM judge for money-transfer approvals: accuracy 51% → 79% in two weeks. • Empathy & tone added as eval axes. • Dozens of prompt/model experiments per day via tracing tooling. |
| 2.16 | Product & Case-studies | Digital Workforce with LangGraph (Monday.com) | • Handles 1 B work items/year; agents act as always-on coworkers. • Hierarchical supervisor + specialised sub-agents beat a naïve ReAct loop. • AI usage doubling month over month since launch. |
| 2.16 | Product & Case-studies | Building & Scaling an AI Agent During Hypergrowth (11x) | • Rebuilt “ALICE 2” in 3 months: clean repo, vendor out non-core work. • Tried ReAct → workflow → multi-agent; the latter balanced quality & flexibility. • Keep architecture decoupled from the UX layer. |
| 2.18 | Product & Case-studies | Unlocking Agent Creation: Architecture Lessons (Box) | • Reliability = clear plans + environment grounding. • Planning templates + model-predictive control keep long runs on track. • “Precision steering” UX is as vital as autonomy. |
| 2.5 | Product & Case-studies | Making Devin – an AI Software Engineer (Cognition) | • DeepWiki + semantic graphs give macro/micro code understanding. • Custom RL post-training on 180 CUDA tasks beats larger models. • Sandboxing is vital to avoid reward hacking. |
| 2.15 | Product & Case-studies | Building Reliable Agentic Systems (Factory) | • Delegate whole tasks, not just collaborate, for productivity gains. • Controlled toolsets + output pre-processing reduce hallucinations. • Target: offload 50%+ of engineering tickets to agents. |
| 2.14 | Engineering & Evaluation | Breakthrough Agents: Learnings from Building AI Research Agents – Unify | • Encapsulate repeatable steps as deterministic sub-agents, compose with LangGraph. • Wins: auto test-gen, security-rule injection, build-system actions. • Invest in a “Mangrove” state layer so any team can slot in new tools. |
| 2.9 | Engineering & Evaluation | Evaluation Frontiers (Shreya Shankar) | • Model-only metrics miss business failures; need trajectory & cost dashboards. |
| - | Engineering & Evaluation | LLM-as-a-Judge for Money Transfers (Nubank) | • Prompt tuning + larger models clawed accuracy from 51% to 79%. • Added empathy/tone checks alongside hallucination filters. |
| 2.6 | Engineering & Evaluation | Building Reliable Agents & Agent Evals (Harrison Chase) | • New LangSmith metrics: per-tool latency/error & trajectory observability. • AI observability must serve agent engineers, not SREs. |
| 1.1 | Workshops & Tutorials | Introduction to LangGraph | Graph fundamentals, node types and event loops; builds a toy agent end-to-end. |
| - | Workshops & Tutorials | Building an Agentic Application | Hands-on notebook: high-level planning → sub-agent orchestration → RAG. |
| - | Workshops & Tutorials | Evaluating Your Agent | Adds offline & online eval harnesses, golden datasets, cost dashboards. |
| - | Workshops & Tutorials | Human-in-the-Loop | Patterns for interruptions, approval gates, safe-fail states. |
| - | Workshops & Tutorials | Memory | Episodic vs summarised memory nodes; trade-offs in token budget vs recall. |
| - | Workshops & Tutorials | Deploying Your Agent | CI/CD, containerisation, LangGraph Runner, infra cost controls. |
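Several of the case studies above (Cisco, Monday.com, J.P. Morgan) describe the same supervisor pattern: a routing node inspects the request and dispatches it to a specialised sub-agent. A minimal, dependency-free Python sketch of that control flow follows; the agent names and keyword rules are invented for illustration, and a production build would use LangGraph's `StateGraph` with conditional edges and LLM-backed nodes rather than plain functions.

```python
# Supervisor/sub-agent routing sketch (plain Python, not the LangGraph
# API). The supervisor classifies the request, then the matching
# sub-agent handles it; unknown requests fall through to a default.

def renewals_agent(state):
    state["result"] = f"renewal quote prepared for: {state['request']}"
    return state

def support_agent(state):
    state["result"] = f"support ticket opened for: {state['request']}"
    return state

def fallback_agent(state):
    state["result"] = "routed to a human"
    return state

SUB_AGENTS = {"renewals": renewals_agent, "support": support_agent}

def supervisor(state):
    # Keyword routing stands in for the LLM classification step.
    text = state["request"].lower()
    if "renew" in text:
        state["route"] = "renewals"
    elif "error" in text or "broken" in text:
        state["route"] = "support"
    else:
        state["route"] = "fallback"
    return state

def run(request):
    state = supervisor({"request": request})
    agent = SUB_AGENTS.get(state["route"], fallback_agent)
    return agent(state)

if __name__ == "__main__":
    print(run("please renew my contract")["route"])  # renewals
    print(run("the dashboard is broken")["route"])   # support
```

The state dict is the key design choice: because each sub-agent reads and writes the same shared state, new agents can be slotted into the routing map without touching the supervisor, which is the "parallel team adoption" point from the Uber talk.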
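The LinkedIn and Nubank rows both report judge accuracy measured against a golden dataset (51% → 79% over successive prompt/model experiments). A toy harness for that loop is sketched below; the dataset and the rule-based `judge` are invented stand-ins for a real labelled set and an LLM call.

```python
# Toy eval harness for an LLM-as-judge: score the judge's verdicts
# against golden human labels and report accuracy, so prompt/model
# experiments can be compared run over run.

GOLDEN = [  # (transcript, human label: approve the transfer?)
    ("verified payee, amount within daily limit", True),
    ("payee flagged by fraud model last week", False),
    ("first transfer to a new account, large amount", False),
    ("recurring rent payment, same amount as prior months", True),
]

def judge(transcript):
    # Stand-in for the LLM judge: a keyword rule, illustration only.
    return not any(w in transcript for w in ("flagged", "new account"))

def accuracy(judge_fn, dataset):
    hits = sum(judge_fn(text) == label for text, label in dataset)
    return hits / len(dataset)

if __name__ == "__main__":
    print(f"judge accuracy: {accuracy(judge, GOLDEN):.0%}")
```

In practice the extra eval axes the talks mention (empathy, tone, hallucination) would each be a separate `judge_fn` scored the same way, giving one dashboard row per axis.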
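The Human-in-the-Loop workshop covers approval gates: the agent pauses before a risky action and resumes only on explicit human approval. The sketch below shows the shape of that gate with a plain pending queue; the action names are invented, and LangGraph implements the same idea with interrupts and checkpointed state rather than an in-memory list.

```python
# Approval-gate sketch: risky actions are held in a pending queue
# until a human approves or rejects them; safe actions run directly.

RISKY_ACTIONS = {"send_email", "delete_record", "transfer_funds"}

def execute(action, pending):
    if action in RISKY_ACTIONS:
        pending.append(action)      # pause here: wait for a human
        return "pending approval"
    return f"executed {action}"

def resolve(action, approved, pending):
    # Called when the human decides; the action leaves the queue
    # whether it was approved or rejected (the safe-fail state).
    pending.remove(action)
    return f"executed {action}" if approved else f"rejected {action}"

if __name__ == "__main__":
    pending = []
    print(execute("summarise_thread", pending))  # runs immediately
    print(execute("send_email", pending))        # held for approval
    print(resolve("send_email", True, pending))  # human approves
```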