Building Reliable Agents - Evaluation Challenges
Exploring the challenges of evaluating agent reliability and LLM performance.
Shreya Shankar of UC Berkeley discusses building reliable agents for data processing, focusing on two gaps: understanding the data and specifying intent.