Name: Benchmark Manager
Author: sunholo-data

System Documentation

What problem does it solve?

This Skill reduces benchmark creation errors and debugging time by providing guidance on AILANG's evaluation system, particularly the distinction between prompt types.

Core Features & Use Cases

Benchmark Validation: Automatically check YAML files for common issues like incorrect prompt usage.
Debugging Tools: Show exactly what prompts models receive and test benchmarks efficiently.
Use Case: When a benchmark shows a 0% pass rate despite language support, use this Skill to identify and fix the underlying prompt configuration problem.
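
To give a feel for the kind of check that benchmark validation implies, here is a minimal sketch in Python. It is not the Skill's actual implementation. The field names `prompt` and `task_prompt` are hypothetical placeholders, as is the function name `find_prompt_issues`, and the real AILANG benchmark schema may differ. The sketch assumes the YAML file has already been parsed into a dictionary.

```python
from typing import Any

# Hypothetical field names, used only for illustration.
# The real AILANG benchmark schema may use different keys.
PROMPT_KEYS = ("prompt", "task_prompt")


def find_prompt_issues(benchmark: dict[str, Any]) -> list[str]:
    """Return warnings about the prompt fields of one parsed benchmark."""
    issues: list[str] = []

    # A prompt field counts as set only if it holds non-blank text.
    present = [k for k in PROMPT_KEYS if str(benchmark.get(k) or "").strip()]
    if not present:
        issues.append("no prompt field set")

    # Flag keys that look like prompt fields but are not recognised (e.g. typos).
    for key in benchmark:
        if "prompt" in str(key) and key not in PROMPT_KEYS:
            issues.append(f"unrecognised prompt-like key: {key}")

    return issues


if __name__ == "__main__":
    # Hypothetical benchmark entry with a misspelled prompt key.
    print(find_prompt_issues({"id": "json_parse", "prompts": "Parse this JSON"}))
```

A check like this only catches structural mistakes. Confirming that the right kind of prompt reaches the model still requires inspecting the full prompt, which is what the debugging feature above is for.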

Quick Start

Use the Benchmark Manager skill to debug the failing json_parse benchmark by showing the full prompt and testing with a cheap model.

Please help me install this Skill:
Name: Benchmark Manager
Download link: https://github.com/sunholo-data/ailang/archive/main.zip#benchmark-manager
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
