2 docs tagged with "failure modes"

Building Reliable Agents - Evaluation Challenges

Exploring the challenges of evaluating agent reliability and LLM performance.

Shreya Shankar from UC Berkeley discusses building reliable agents for data processing, focusing on gaps in data understanding and intent specification.