
As you build agent systems, you will see that evaluations are probably the most important part of everything you do. They are how you ensure that a change to the system introduces no regressions.

LLMs checking Agentic Workflows

LLM as a Judge

Evaluate the workflow against a fixed set of reference inputs, so that results stay comparable from one run to the next.
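As a rough illustration of evaluating against reference inputs, here is a minimal sketch in Python. Every name in it (ReferenceCase, run_eval, toy_workflow, toy_judge) is hypothetical and made up for this example. The judge is a plain function standing in for an LLM call, on the assumption that a real judge would send the case and the output to a model and parse its verdict.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ReferenceCase:
    prompt: str    # reference input fed to the workflow
    expected: str  # what a good answer is expected to mention


# Hypothetical type aliases: a workflow maps a prompt to an output,
# a judge maps (case, output) to a verdict.
Workflow = Callable[[str], str]
Judge = Callable[[ReferenceCase, str], bool]


def run_eval(
    workflow: Workflow, judge: Judge, cases: List[ReferenceCase]
) -> List[Tuple[ReferenceCase, str, bool]]:
    """Run every reference case through the workflow and judge each output."""
    results = []
    for case in cases:
        output = workflow(case.prompt)
        results.append((case, output, judge(case, output)))
    return results


def toy_workflow(prompt: str) -> str:
    # Stand-in for the agent workflow under test.
    return f"Answer about {prompt}"


def toy_judge(case: ReferenceCase, output: str) -> bool:
    # Stand-in for an LLM judge (assumption): a real judge would prompt a
    # model with the case and the output instead of doing a substring check.
    return case.expected.lower() in output.lower()


cases = [
    ReferenceCase(prompt="refund policy", expected="refund"),
    ReferenceCase(prompt="shipping time", expected="days"),
]
results = run_eval(toy_workflow, toy_judge, cases)
pass_rate = sum(verdict for _, _, verdict in results) / len(results)
```

Keeping the case list fixed and recording pass_rate for each version of the workflow is what makes a regression visible: a drop between versions points at the change that caused it.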

Graders

Pass or Fail with re
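
One way to picture a pass/fail grader is as a function that takes a workflow output and returns a verdict plus a short note on why. The sketch below assumes that shape. Grade, contains, max_length, and grade_all are hypothetical names introduced for illustration, not part of any particular library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Grade:
    passed: bool  # the pass/fail verdict
    detail: str   # short note explaining the verdict


# Hypothetical alias: a grader maps a workflow output to a Grade.
Grader = Callable[[str], Grade]


def contains(substring: str) -> Grader:
    """Build a grader that passes when the output mentions `substring`."""
    def grade(output: str) -> Grade:
        ok = substring.lower() in output.lower()
        return Grade(ok, f"{'found' if ok else 'missing'} {substring!r}")
    return grade


def max_length(limit: int) -> Grader:
    """Build a grader that passes when the output has at most `limit` characters."""
    def grade(output: str) -> Grade:
        ok = len(output) <= limit
        return Grade(ok, f"length {len(output)} vs limit {limit}")
    return grade


def grade_all(output: str, graders: List[Grader]) -> Grade:
    """Pass only if every grader passes; collect the notes of the failures."""
    grades = [grader(output) for grader in graders]
    failures = [g.detail for g in grades if not g.passed]
    return Grade(not failures, "; ".join(failures) or "all graders passed")


good = grade_all("Refunds take 5 days", [contains("refund"), max_length(50)])
bad = grade_all("Refunds take 5 days", [contains("invoice"), max_length(50)])
```

Returning a note alongside the boolean is a design choice: when a run fails, the note says which check failed without rerunning the workflow.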