Search results for "skill evaluation"
Build and run LangSmith evaluations.
Orchestrate end-to-end LLM app evaluations.
Build and run AI evaluators with Phoenix.
Auto re-evaluate attempts after changes.
Build and run robust AI evaluations.
LLM-based evaluation patterns for scale.
Evaluate and optimize LLM agents.
End-to-end GenAI evaluation with MLflow.
Evaluate model predictions against ground truth.
MLflow GenAI evaluation workflows for agents.
Standardize book evaluation protocols.
Define program evaluation via rewrite rules.