Skill Explorer

Searching protocol for "agent benchmarking"

af-skill-write-agent-benchmarks

Community

Benchmark AI agents with evidence-based tests.

Advanced

by korchasa

tbench

Community

Benchmark AI agents with Terminal-Bench.

Advanced

by neilmovva

flow-skill-write-agent-benchmarks

Community

Evidence-based benchmarks for AI agents.

Advanced

by korchasa

eval-recipes Runner Skill

Community

Benchmark amplihack improvements with eval-recipes.

Advanced

by rysweet

tbench

Official

Benchmark Mux agents with Terminal-Bench.

Advanced

by coder

agentifind-benchmark

Community

Benchmark AI agent CODEBASE.md effectiveness.

Advanced

by AvivK5498

worker-benchmarks

Community

Benchmark agentic worker performance.

Advanced

by frankxai

tbench

Community

Benchmark Unix agents with Terminal-Bench.

Advanced

by onchainengineer

benchmark-loop

Community

Self-optimizing AI agent benchmark runner.

Advanced

by 0x0funky

V3 Performance Optimization

Community

Supercharge v3 performance with benchmarks.

Few Config

by ruvnet

Evaluate Benchmark Traces

Official

Analyze AI agent benchmark run traces.

Advanced

by sourcegraph

V3 Performance Optimization

Official

Boost v3 performance with benchmarking.

Advanced

by LLM-Dev-Ops