Name: calibrate
Author: Borda

System Documentation

What problem does it solve?

This Skill rigorously tests AI agents and skills against synthetic problems to measure their performance, identify systematic gaps, and ensure their self-reported confidence aligns with actual accuracy.

Core Features & Use Cases

Performance Benchmarking: Quantifies recall, precision, and F1 scores for agents and skills.
Calibration Analysis: Detects overconfidence and underconfidence by comparing each agent's self-reported confidence with its actual recall.
Gap Identification: Pinpoints recurring issues and anti-patterns in agent outputs.
Automated Improvement: Generates proposals to update agent instructions based on benchmark results.
Use Case: Run /calibrate sw-engineer full to test the software engineer agent on 10 synthetic coding problems, analyze its performance, and automatically generate updated instructions if needed.
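
To make the metrics above concrete, here is a minimal sketch of how recall, precision, F1, and a calibration gap could be computed from per-problem results. It is an illustration only, not the skill's actual implementation: the `ProblemResult` structure, the `summarize` function, and the definition of the gap as mean reported confidence minus recall are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class ProblemResult:
    """Outcome of one synthetic problem (hypothetical structure)."""
    true_positives: int   # planted issues the agent found
    false_positives: int  # reported issues that were not planted
    false_negatives: int  # planted issues the agent missed
    confidence: float     # agent's self-reported confidence in [0, 1]


def summarize(results: list[ProblemResult]) -> dict[str, float]:
    """Aggregate counts across problems and derive the benchmark metrics."""
    tp = sum(r.true_positives for r in results)
    fp = sum(r.false_positives for r in results)
    fn = sum(r.false_negatives for r in results)

    # Guard every division so an empty or degenerate run yields 0.0.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    mean_confidence = (
        sum(r.confidence for r in results) / len(results) if results else 0.0
    )

    return {
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "mean_confidence": mean_confidence,
        # Assumed convention: positive means overconfident,
        # negative means underconfident.
        "calibration_gap": mean_confidence - recall,
    }


if __name__ == "__main__":
    demo = [
        ProblemResult(true_positives=3, false_positives=1,
                      false_negatives=1, confidence=0.9),
        ProblemResult(true_positives=1, false_positives=0,
                      false_negatives=3, confidence=0.7),
    ]
    for name, value in summarize(demo).items():
        print(f"{name}: {value:.3f}")
```

Under this assumed convention, an agent that reports high confidence while missing many planted issues shows a large positive gap, which is the kind of signal the Calibration Analysis feature is described as detecting.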

Quick Start

Use the calibrate skill to benchmark all agents and skills with full problem sets and apply any necessary changes.

Please help me install this Skill:
Name: calibrate
Download link: https://github.com/Borda/.home/archive/main.zip#calibrate
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
