Search results for "judge calibration"
Calibrate LLM judges against human labels.
Optimize AI image prompts.
Well-calibrated forecasts for uncertain questions
Improve CYNIC judgments with targeted feedback.
Make LLM judgments reliable with proven methods.
Master LLM prompting patterns and safety.
Scale LLM evaluation with bias-aware automation.
Build and run AI evaluators with Phoenix.
Rigorous agent testing and validation.
Audit skills with expert-quality scoring.
Elevate your Agent Skills.