Searching protocol for "classification-metrics"
Unified model evaluation across tasks.
Ensure ML claims are statistically valid.
Ensure model quality and fairness.
LLM evaluation with automated metrics.