eval-check

Community

Verify agent quality automatically.

Author: rosinbum
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill guards the AI agent's quality and correctness by automatically running evaluations after code changes, so that regressions are caught before they degrade performance.

Core Features & Use Cases

  • Automated Code Change Detection: Identifies modified agent files across core, tools, and services.
  • Tiered Evaluation Execution: Runs fast, deterministic tests first, then prompts for confirmation before executing expensive LLM-based evaluations.
  • Selective Evals: Runs only the LLM evaluations relevant to the type of code change detected.
  • Use Case: After a developer modifies the agent's prompt templates, this skill runs evaluations to confirm that the agent's responses remain grounded, correct, and properly cited.
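
The features above can be pictured as a small planning step. The following is a minimal sketch rather than the Skill's actual implementation: the path prefixes (src/core/ and so on), the evaluation names, and the helpers detectAreas, selectLlmEvals, and planEvals are all hypothetical, chosen only to illustrate change detection, tiered execution, and selective evals.

```typescript
// Hypothetical sketch: directory layout and eval names are assumptions,
// not taken from the eval-check source.

type Area = "core" | "tools" | "services";

// Assumed mapping from an agent area to the path prefix that identifies it.
const AREA_PREFIXES: Record<Area, string> = {
  core: "src/core/",
  tools: "src/tools/",
  services: "src/services/",
};

// Assumed mapping from an agent area to the LLM evaluations worth re-running.
const LLM_EVALS_BY_AREA: Record<Area, string[]> = {
  core: ["groundedness", "correctness", "citations"],
  tools: ["correctness"],
  services: ["citations"],
};

// Change detection: which agent areas do the modified files touch?
function detectAreas(changedFiles: string[]): Area[] {
  const areas = Object.keys(AREA_PREFIXES) as Area[];
  return areas.filter((area) =>
    changedFiles.some((file) => file.indexOf(AREA_PREFIXES[area]) === 0)
  );
}

// Selective evals: collect the relevant LLM evaluations without duplicates.
function selectLlmEvals(areas: Area[]): string[] {
  const selected: string[] = [];
  for (const area of areas) {
    for (const name of LLM_EVALS_BY_AREA[area]) {
      if (selected.indexOf(name) === -1) selected.push(name);
    }
  }
  return selected.sort();
}

interface EvalPlan {
  runDeterministic: boolean; // cheap tier, runs whenever agent code changed
  llmEvals: string[];        // expensive tier, runs only after confirmation
}

// Tiered execution: deterministic tests first, LLM evals only if confirmed.
function planEvals(changedFiles: string[], confirmedExpensive: boolean): EvalPlan {
  const areas = detectAreas(changedFiles);
  if (areas.length === 0) {
    return { runDeterministic: false, llmEvals: [] };
  }
  return {
    runDeterministic: true,
    llmEvals: confirmedExpensive ? selectLlmEvals(areas) : [],
  };
}
```

Under these assumptions, planEvals(["src/tools/search.ts", "README.md"], true) schedules the deterministic tests plus the single "correctness" evaluation, while passing false as the second argument leaves the LLM tier empty until the user confirms.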

Quick Start

Run agent quality evaluations after core agent code changes.

Dependency Matrix

Required Modules

pnpm

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: eval-check
Download link: https://github.com/rosinbum/usopc-athlete-support-agent/archive/main.zip#eval-check

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
