Searching protocol for "python-assertions"
Configure, run, and judge LLM evaluations.
Benchmark prompts with structured tests.