Searching protocol for "judge-patterns"
Automated and human evaluation for LLMs.
Best practices for goal-driven agent design.
Evaluate work with an AI judge.
Production-grade context engineering for AI agents.
Pattern-driven design for goal-driven agents.
Measure and improve LLM performance.