Production Agents at Scale – Diamond Bishop

Speaker: Diamond Bishop, Director of Engineering AI, Datadog

Background: 15+ years building AI agents; "Been through at least one AI winter"; founder of a startup that was acquired by Datadog

Products in Production

Product          | Purpose                                | Status
Bits AI SRE      | Automated incident investigation       | GA
Bits AI Dev      | Code generation for errors/latency     | GA
Security Analyst | Automated security investigations      | GA

Core Philosophy: "Intelligence is not going to be the bottleneck anymore. The bottleneck is building agents that can operate autonomously."

Five Key Lessons

1. Code Agent-First

Every interface should be agent-friendly. "API mandate for agents: Every team must expose agent-friendly interfaces."
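
One way to picture an agent-friendly interface is a capability that ships with a machine-readable description and returns structured results and structured errors. The sketch below is an assumption for illustration only, not Datadog's API: the `Tool` class, `count_errors` function, and the hard-coded counts are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    """Hypothetical agent-facing wrapper around one team capability."""
    name: str
    description: str
    parameters: dict[str, str]  # parameter name -> expected JSON type name
    handler: Callable[..., Any]

    def spec(self) -> str:
        """Machine-readable description an agent can read before calling."""
        return json.dumps(
            {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
            sort_keys=True,
        )

    def call(self, **kwargs: Any) -> dict[str, Any]:
        """Check argument names and return a structured result or error."""
        missing = sorted(set(self.parameters) - set(kwargs))
        if missing:
            # A structured error lets the agent repair its own call.
            return {"ok": False, "error": "missing_parameters", "missing": missing}
        unknown = sorted(set(kwargs) - set(self.parameters))
        if unknown:
            return {"ok": False, "error": "unknown_parameters", "unknown": unknown}
        return {"ok": True, "result": self.handler(**kwargs)}


def count_errors(service: str, minutes: int) -> int:
    # Stand-in data; a real tool would query a backend for the time window.
    fake_counts = {"checkout": 12, "search": 0}
    return fake_counts.get(service, 0)


error_tool = Tool(
    name="count_errors",
    description="Number of errors for a service over the last N minutes.",
    parameters={"service": "string", "minutes": "integer"},
    handler=count_errors,
)

print(error_tool.spec())
print(error_tool.call(service="checkout", minutes=15))
print(error_tool.call(service="checkout"))
```

The design choice worth noting is that a bad call produces data the agent can act on, not free-form text meant for a human reader.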

2. Proactive Over Reactive

Background, long-running, event-driven agents are better than chat-based interactions.

  • Use durable execution (Temporal)
  • Run in containers/sandboxes
  • Contained storage and files
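
The idea behind durable execution can be sketched without any framework: record each completed step so that a restarted run replays recorded results and does not redo the work. This is a toy illustration under that assumption. It does not use Temporal's API, and `DurableRun`, `investigate`, and the step names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


class DurableRun:
    """Toy checkpointing runner: finished steps are persisted and skipped on retry."""

    def __init__(self, checkpoint: Path) -> None:
        self.checkpoint = checkpoint
        self.done: dict[str, str] = (
            json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
        )

    def step(self, name: str, fn: Callable[[], str]) -> str:
        if name in self.done:
            return self.done[name]  # replay the recorded result, do not re-run
        result = fn()
        self.done[name] = result
        self.checkpoint.write_text(json.dumps(self.done))  # persist before moving on
        return result


calls: list[str] = []  # records which steps actually executed


def record(name: str) -> Callable[[], str]:
    def run() -> str:
        calls.append(name)
        return f"{name}-ok"
    return run


def investigate(run: DurableRun, fail_at_triage: bool) -> str:
    run.step("fetch_alert", record("fetch_alert"))
    if fail_at_triage:
        raise RuntimeError("simulated crash")
    return run.step("triage", record("triage"))


ckpt = Path(tempfile.mkdtemp()) / "run.json"
try:
    investigate(DurableRun(ckpt), fail_at_triage=True)  # first attempt crashes
except RuntimeError:
    pass
final = investigate(DurableRun(ckpt), fail_at_triage=False)  # resumes from checkpoint

print(calls)  # fetch_alert ran once, even though the workflow ran twice
print(final)
```

A production system would also need idempotent side effects, timeouts, and retries, which is the kind of work a durable execution engine takes on.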

3. Eval and Monitoring

"You can't improve what you don't measure."

  • Living, breathing eval system
  • Online and offline eval tracking
  • Make eval system available via MCP so agents can help improve themselves
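
A minimal sketch of what measuring an agent might look like: a fixed set of cases, a pass rate, and a list of failures to inspect. Everything here is an assumption for illustration (`EvalCase`, `run_eval`, `toy_agent`, and the exact-match scoring); the talk does not describe a specific harness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> dict[str, object]:
    """Score an agent by case-insensitive exact match against expected answers."""
    failures = [
        case.prompt
        for case in cases
        if agent(case.prompt).strip().lower() != case.expected.lower()
    ]
    passed = len(cases) - len(failures)
    return {
        "total": len(cases),
        "passed": passed,
        "pass_rate": passed / len(cases) if cases else 0.0,
        "failures": failures,
    }


def toy_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real agent call.
    return "database" if "timeout" in prompt else "unknown"


cases = [
    EvalCase("checkout timeout spike", "database"),
    EvalCase("disk full on host-7", "storage"),
]
report = run_eval(toy_agent, cases)
print(report)
```

Keeping the case list as plain data makes it easy to append new cases as failures are found, which is one reading of a "living" eval system.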

4. Stay Agnostic

"Whoever is best today will probably not be best tomorrow."

  • Don't build for one model or framework
  • Keep the agent harness simple and plan to rewrite it
  • Use good memory systems for learning transfer
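
Staying agnostic can be as simple as having the harness depend on a narrow interface and not on any one vendor's client. The sketch below assumes a single `complete` method; `ChatModel`, `EchoModel`, `UpperModel`, and `Harness` are hypothetical names, and the two models are stand-ins, not real providers.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only thing the harness is allowed to know about a model."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    # Stand-in for one provider's client.
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperModel:
    # Stand-in for a different provider's client.
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class Harness:
    """Deliberately small: build a prompt, call the model, return the text."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model

    def run(self, task: str) -> str:
        return self.model.complete(f"Task: {task}")


# Swapping models requires no change to the harness itself.
print(Harness(EchoModel()).run("triage"))
print(Harness(UpperModel()).run("triage"))
```

Because the harness holds so little logic, rewriting it later costs little, and the eval suite can be rerun unchanged against each candidate model.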

5. Multiplayer is Changing

Collaboration is no longer only human-human. It now spans human-human, human-agent, and agent-agent interaction. Design appropriate communication channels for each.

Looking Ahead

  • More learning on the job
  • Longer running, more independent agents
  • Build for multi-modal models (computer use coming)
  • Agents with eyes, not just text

Key Quotes

"If you're launching an agent and you don't know how you're doing eval, please don't launch that agent."

"Your agent harnesses are the important thing. Keep them simple. You're going to rewrite them."