Production Agents at Scale – Diamond Bishop
Speaker: Diamond Bishop, Director of Engineering AI, Datadog
Background: 15+ years building AI agents ("been through at least one AI winter"); founder of a startup acquired by Datadog
Products in Production
| Product | Purpose | Status |
|---|---|---|
| Bits AI SRE | Automated incident investigation | GA |
| Bits AI Dev | Code generation for errors/latency | GA |
| Security Analyst | Automated security investigations | GA |
Core Philosophy: "Intelligence is not going to be the bottleneck anymore. The bottleneck is building agents that can operate autonomously."
Five Key Lessons
1. Code Agent-First
Every interface should be agent-friendly. "API mandate for agents: Every team must expose agent-friendly interfaces."
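One way to read "agent-friendly" is an interface that ships with a machine-readable contract and returns structured errors instead of raising. The sketch below assumes that reading. `AgentTool`, `get_error_rate`, and the schema layout are hypothetical names invented for illustration, not Datadog APIs.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AgentTool:
    """Hypothetical descriptor pairing a callable with a machine-readable contract."""
    name: str
    description: str
    parameters: dict[str, Any]  # JSON-Schema-style description of the arguments
    handler: Callable[..., Any]

    def describe(self) -> str:
        """Return the contract an agent would read before calling the tool."""
        return json.dumps(
            {"name": self.name, "description": self.description,
             "parameters": self.parameters},
            sort_keys=True,
        )

    def call(self, **kwargs: Any) -> dict[str, Any]:
        """Check required arguments, then return a structured result or error."""
        missing = [p for p in self.parameters.get("required", []) if p not in kwargs]
        if missing:
            return {"ok": False, "error": f"missing required arguments: {missing}"}
        return {"ok": True, "result": self.handler(**kwargs)}


def get_error_rate(service: str, window_minutes: int = 5) -> float:
    # Hypothetical stand-in for a real metrics query.
    return 0.02 if service == "checkout" else 0.0


error_rate_tool = AgentTool(
    name="get_error_rate",
    description="Return the error rate (0-1) for a service over a recent window.",
    parameters={
        "type": "object",
        "properties": {
            "service": {"type": "string"},
            "window_minutes": {"type": "integer", "default": 5},
        },
        "required": ["service"],
    },
    handler=get_error_rate,
)
```

An agent can call `error_rate_tool.describe()` to discover the contract, and a call with a missing argument comes back as `{"ok": False, ...}` that it can read and retry rather than an exception it cannot see.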
2. Proactive Over Reactive
Favor background, long-running, event-driven agents over chat-based interactions.
- Use durable execution (Temporal)
- Run in containers/sandboxes
- Give each agent contained (isolated) storage and files
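The idea behind durable execution can be sketched with the standard library alone: persist state after every step so that a restarted worker resumes instead of starting over. This is a simplified stand-in, not Temporal's API. `run_durably` and the step functions are hypothetical names used only for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

Step = tuple[str, Callable[[dict], dict]]


def run_durably(steps: list[Step], checkpoint: Path) -> dict:
    """Run named steps in order, saving state after each so a restart resumes."""
    if checkpoint.exists():
        saved = json.loads(checkpoint.read_text())
    else:
        saved = {"done": [], "state": {}}
    for name, step in steps:
        if name in saved["done"]:
            continue  # finished in an earlier run, so skip it
        saved["state"] = step(saved["state"])
        saved["done"].append(name)
        checkpoint.write_text(json.dumps(saved))
    return saved["state"]


calls: list[str] = []  # records which steps actually executed


def collect_logs(state: dict) -> dict:
    calls.append("collect_logs")
    return {**state, "logs": ["timeout in checkout"]}


def summarize(state: dict) -> dict:
    calls.append("summarize")
    return {**state, "summary": f"{len(state['logs'])} suspicious log line(s)"}


def crash(state: dict) -> dict:
    raise RuntimeError("simulated worker crash")


checkpoint_path = Path(tempfile.mkdtemp()) / "investigation.json"

# First attempt: the worker dies after the first step has been checkpointed.
try:
    run_durably([("collect_logs", collect_logs), ("summarize", crash)], checkpoint_path)
except RuntimeError:
    pass

# Second attempt: collect_logs is skipped and only summarize runs.
result = run_durably(
    [("collect_logs", collect_logs), ("summarize", summarize)], checkpoint_path
)
```

After the second run, `calls` contains each step exactly once. A production system would also need retries, timeouts, and idempotent steps, which is the work a durable execution engine takes on.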
3. Eval and Monitoring
"You can't improve what you don't measure."
- Living, breathing eval system
- Online and offline eval tracking
- Make eval system available via MCP so agents can help improve themselves
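A minimal sketch of what measuring could look like: run the agent over a fixed set of cases, record the pass rate, and compare it with a stored baseline before shipping a change. The cases, `toy_agent`, and the exact-match scoring rule are assumptions made for illustration; real evals typically need richer scoring.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


def run_eval(agent: Callable[[str], str], cases: list[EvalCase]) -> dict:
    """Score the agent on every case with case-insensitive exact match."""
    if not cases:
        raise ValueError("eval set must not be empty")
    failures = [
        c.prompt for c in cases
        if agent(c.prompt).strip().lower() != c.expected.lower()
    ]
    return {"pass_rate": (len(cases) - len(failures)) / len(cases),
            "failures": failures}


def regressed(current: dict, baseline: dict, tolerance: float = 0.0) -> bool:
    """True when the current pass rate falls below the baseline by more than tolerance."""
    return current["pass_rate"] < baseline["pass_rate"] - tolerance


def toy_agent(prompt: str) -> str:
    # Hypothetical keyword-based "agent" standing in for a real model call.
    text = prompt.lower()
    if "disk" in text:
        return "disk_full"
    if "oom" in text:
        return "out_of_memory"
    return "unknown"


cases = [
    EvalCase("disk usage at 100% on db-1", "disk_full"),
    EvalCase("OOMKilled in pod api-7", "out_of_memory"),
    EvalCase("TLS handshake failed: certificate expired", "expired_certificate"),
]

report = run_eval(toy_agent, cases)
```

Here `report["failures"]` lists the certificate case, which gives a concrete target for the next iteration, and `regressed(report, baseline)` can gate a release.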
4. Stay Agnostic
"Whoever is best today will probably not be best tomorrow."
- Don't build for one model or framework
- Keep the agent harness simple and plan to rewrite it
- Use good memory systems for learning transfer
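One common way to avoid building for a single model is to have the harness depend on a small interface rather than a vendor SDK. The sketch below assumes that approach. `ChatModel`, `EchoModel`, `ShoutModel`, and `Harness` are hypothetical names, and the two model classes are toy stand-ins for real provider clients.

```python
from typing import Protocol


class ChatModel(Protocol):
    """The only thing the harness assumes about a model."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    # Toy stand-in for one provider's client.
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutModel:
    # Toy stand-in for a different provider's client.
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class Harness:
    """Deliberately small: builds a prompt and delegates to whichever model it was given."""

    def __init__(self, model: ChatModel) -> None:
        self.model = model

    def run(self, task: str) -> str:
        return self.model.complete(f"Task: {task}")


first = Harness(EchoModel()).run("triage alert")
second = Harness(ShoutModel()).run("triage alert")
```

Swapping models is a one-line change at construction time, and because the harness holds so little logic, rewriting it later is cheap.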
5. Multiplayer is Changing
Collaboration is no longer only human-human. It now spans human-human, human-agent, and agent-agent interaction. Design appropriate communication channels for each.
Looking Ahead
- More learning on the job
- Longer running, more independent agents
- Build for multi-modal models (computer use coming)
- Agents with eyes, not just text
Key Quotes
"If you're launching an agent and you don't know how you're doing eval, please don't launch that agent."
"Your agent harnesses are the important thing. Keep them simple. You're going to rewrite them."