Search results for "code evaluation"
Build and run AI evaluators with Phoenix.
Build scalable, code-driven LangSmith evaluators.
Evaluate code quality with local Codex CLI.
Build and run LangSmith evaluations.
Codex evaluation templates
Formal evaluation framework for Claude Code.
Benchmark code generation models.
Build and run robust AI evaluations.
Audit all evaluations against one quality standard.
Improve AI outputs with self-critique loops.
Evaluate output against Law constraints.