Presentation Directory

Quick Digest Table

Ref #	Track	Presentation	Takeaways
-	Keynotes	Opening Keynote – Harrison Chase	• Rise of the agent-engineer role (prompting + product + ML + DevOps). • Real-world agents will fuse many models; reliability starts with the right context (LangGraph). • LangSmith → tracing + evals + prompt hub for team work.
2.1		Welcome & Product Keynote (Day 2) – Harrison Chase	• Launched LangGraph Studio v2 (browser live-reload, LangSmith-native). • Unveiled open "Agent Platform" (tool-server, RAG-aaS, agent registry). • Horizontal scaling of long-running jobs framed as next frontier.
2.19		From LLMs to Agents: The Next Leap – Adam D'Angelo	• Shift from code-completion to task-completion UIs. • Voice & multimodal interfaces will mainstream agents. • Fast iteration + human evals beat big-design-up-front.
2.13		State of Agents (Fireside) – Andrew Ng × Harrison Chase	• Think in degrees of agentic-ness. • Execution speed & deep technical judgment remain the biggest moat. • Good eval pipelines isolate the failing step, not the whole graph.
2.2	Product & Case-studies	Building Replit Agent v2 – Michele Catasta	• V2 moved from 60-s to 10-15-min autonomous runs via new model mix. • Assembly-level debugging: LangSmith traces are critical. • Next: hour-long jobs, vision-based testing.
2.4		Transforming CX with Multi-Agent Customer Experience – Cisco	• Supervisor graph routes renewals/support/adoption agents. • 20 % faster renewals; 60 % low-priority tickets auto-resolved. • Define business-value metrics first, tools second.
2.4		Cisco Demo: AI-Powered Insights from Enterprise Data with LangGraph	• Goal: agentic CX platform for 20 k-person org. • Multi-agent mesh built on LangGraph + Cisco MCP, deploys cloud/on-prem. • Patterns: choose metrics first; keep a separate eval team; hand big SQL joins to DB, not LLM.
2.4		Ask D.A.V.I.D. (J.P. Morgan)	• Research copilot with routing, RAG & analytics sub-agents. • Personalisation layer tailors depth by user role. • Offline→online eval pipeline with human review for "last mile."
2.11		Aladdin Copilot (BlackRock)	• Conversational layer across ~100 Aladdin apps. • Plugin registry lets 60+ domain teams expose APIs or agents. • Daily end-to-end tests in CI ensure routing & answer quality.
2.12		Agentic Developer Products with LangGraph (Uber)	• "LangFX" wraps LangGraph for internal infra (test-gen, build triage, security). • Deterministic sub-agents feed larger LLM agents for reliability. • Well-chosen abstractions unlock parallel team adoption.
-		Agents at Scale (LinkedIn)	• LLM judge for money-transfer approvals: 51 → 79 % accuracy in two weeks. • Empathy & tone added as eval axes. • Dozens of prompt/model experiments per day via tracing tooling.
2.16		Digital Workforce with LangGraph (Monday.com)	• Handles 1 B work-items/year; agents act as always-on coworkers. • Hierarchical supervisor + specialised sub-agents beat naïve ReAct loop. • AI usage doubling MoM since launch.
2.16		Building & Scaling an AI Agent During HyperGrowth (11x)	• Rebuilt "ALICE 2" in 3 months: clean repo, vendor non-core. • Tried ReAct → workflow → multi-agent; the last balanced quality & flexibility. • Keep architecture uncoupled from UX layer.
2.18		Unlocking Agent Creation: Architecture Lessons (Box)	• Reliability = clear plans + environment grounding. • Planning templates + model-predictive control keep long-runs on track. • "Precision steering" UX is as vital as autonomy.
2.5		Making Devin – an AI Software Engineer (Cognition)	• DeepWiki + semantic graphs give macro/micro code understanding. • Custom RL post-train on 180 CUDA tasks beats larger models. • Sandboxing vital to avoid reward hacking.
2.15		Building Reliable Agentic Systems (Factory)	• Delegate whole tasks, not just collaborate, for productivity gains. • Controlled toolsets + output pre-processing reduce hallucinations. • Target: offload 50 %+ engineering tickets to agents.
2.14	Engineering & Evaluation	Breakthrough Agents: Learnings from Building AI Research Agents – Unify	• Encapsulate repeatable steps as deterministic sub-agents, compose with LangGraph. • Wins: auto test-gen, security-rule injection, build-system actions. • Invest in "Mangrove" state layer so any team can slot in new tools.
2.9		Evaluation Frontiers (Shreya Shankar)	• Model-only metrics miss business failures; need trajectory & cost dashboards.
-		LLM-as-a-Judge for Money-Transfer (Nubank)	• Prompt-tuning + larger models clawed accuracy 51 → 79 %. • Added empathy/tone checks alongside hallucination filters.
2.6		Building Reliable Agents & Agent Evals (Harrison Chase)	• New LangSmith metrics: per-tool latency/error & trajectory observability. • AI observability must serve agent engineers, not SREs.
1.1	Workshops & Tutorials	Introduction to LangGraph	Graph fundamentals, node types and event loops; builds a toy agent end-to-end.
-		Building an Agentic Application	Hands-on notebook: high-level planning → sub-agent orchestration → RAG.
-		Evaluating Your Agent	Adds offline & online eval harnesses, golden datasets, cost dashboards.
-		Human-in-the-Loop	Patterns for interruptions, approval gates, safe-fail states.
-		Memory	Shows episodic vs summarised memory nodes; trade-offs in token budget vs recall.
-		Deploying Your Agent	CI/CD, containerisation, LangGraph Runner, infra cost controls.
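
Several rows above (Cisco, Monday.com, J.P. Morgan) describe a supervisor that routes each request to a specialised sub-agent. A minimal sketch of that routing idea in plain Python follows. It does not use the LangGraph API. The names `Supervisor`, `keyword_classifier`, the agent labels and the `support` fallback are all hypothetical, and the keyword match stands in for what would presumably be an LLM routing call in the talks' real systems.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Supervisor:
    """Routes each request to one specialised sub-agent (all names hypothetical)."""
    agents: Dict[str, Callable[[str], str]]
    classify: Callable[[str], str]
    fallback: str = "support"
    trace: List[str] = field(default_factory=list)

    def handle(self, request: str) -> str:
        label = self.classify(request)
        if label not in self.agents:
            # Unknown label: fall back rather than fail the whole run.
            label = self.fallback
        # Record the routing decision so an eval can check this step on its own.
        self.trace.append(label)
        return self.agents[label](request)


def keyword_classifier(request: str) -> str:
    # Stand-in for an LLM routing call; keyword matching keeps the sketch deterministic.
    text = request.lower()
    if "renew" in text:
        return "renewals"
    if "onboard" in text or "adopt" in text:
        return "adoption"
    return "support"


supervisor = Supervisor(
    agents={
        "renewals": lambda r: f"[renewals] {r}",
        "support": lambda r: f"[support] {r}",
        "adoption": lambda r: f"[adoption] {r}",
    },
    classify=keyword_classifier,
)

print(supervisor.handle("Please renew our contract"))
print(supervisor.trace)
```

Keeping the routing decision in `trace`, separate from the sub-agent's answer, is one way to evaluate the failing step rather than the whole graph, as the fireside row suggests.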
