debug-stuck-eval
OfficialUnstick AI evaluations fast.
AuthorMETR
Version1.0.0
Installs0
System Documentation
What problem does it solve?
This Skill helps diagnose and resolve issues when AI evaluations are not progressing as expected, preventing wasted time and resources.
Core Features & Use Cases
- Status Checking: Quickly assess the state of running evaluations.
- Log Analysis: Pinpoint errors and patterns causing evaluations to hang.
- API Testing: Directly test external API connections to isolate issues.
- Use Case: An evaluation has been running for hours with no new samples completed. Use this Skill to check its status, review logs for errors like "Retrying request to /responses", and test the API directly to find the root cause.
Quick Start
Use the debug-stuck-eval skill to check the status of eval-set-id '12345'.
Dependency Matrix
Required Modules
None requiredComponents
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: debug-stuck-eval Download link: https://github.com/METR/inspect-action/archive/main.zip#debug-stuck-eval Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 223,000+ vetted skills library on demand.