debug-stuck-eval

Official

Unstick AI evaluations fast.

Author: METR
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps diagnose and resolve issues when AI evaluations are not progressing as expected, preventing wasted time and resources.

Core Features & Use Cases

  • Status Checking: Quickly assess the state of running evaluations.
  • Log Analysis: Pinpoint errors and patterns causing evaluations to hang.
  • API Testing: Directly test external API connections to isolate issues.
  • Use Case: An evaluation has been running for hours with no new samples completed. Use this Skill to check its status, review logs for errors like "Retrying request to /responses", and test the API directly to find the root cause.
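The "test the API directly" step above can be sketched with a plain `curl` probe that bypasses the eval harness entirely. This is a hedged sketch, not part of the Skill itself: the endpoint (`/v1/responses`, matching the "Retrying request to /responses" log line), the model name, and the `OPENAI_API_KEY` variable are assumptions; substitute your provider's values.

```shell
# Hedged sketch: probe the model API the eval depends on, bypassing the
# eval harness. Endpoint and model are assumptions; adjust to your provider.
if [ -z "${OPENAI_API_KEY:-}" ]; then
  echo "OPENAI_API_KEY not set; skipping live API check"
else
  curl -sS --max-time 30 https://api.openai.com/v1/responses \
    -H "Authorization: Bearer ${OPENAI_API_KEY}" \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-4o-mini", "input": "ping"}'
fi
```

If this direct call also hangs, times out, or returns 429/5xx errors, the stuck eval is likely blocked on the provider rather than on the harness itself.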

Quick Start

Use the debug-stuck-eval skill to check the status of eval-set-id '12345'.

Dependency Matrix

Required Modules

None required

Components

  • references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: debug-stuck-eval
Download link: https://github.com/METR/inspect-action/archive/main.zip#debug-stuck-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
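As a manual fallback, the same download/extract/install steps can be sketched in shell. This is a hedged sketch under two assumptions: the skill's exact subdirectory inside the repository is not stated by this listing, so the script locates it by name; and `CLAUDE_SKILLS_DIR` is a hypothetical override, defaulting to the `.claude/skills/` directory named above.

```shell
set -eu
# Hedged manual-install sketch; assumes network access. The skill's path
# inside the archive is not stated here, so we find it by directory name.
ZIP_URL="https://github.com/METR/inspect-action/archive/main.zip"
SKILLS_DIR="${CLAUDE_SKILLS_DIR:-.claude/skills}"
TMP="$(mktemp -d)"
if ! curl -fsSL -o "$TMP/skill.zip" "$ZIP_URL"; then
  echo "download failed (offline?); nothing installed"
  exit 0
fi
unzip -q "$TMP/skill.zip" -d "$TMP"
mkdir -p "$SKILLS_DIR"
# Copy the debug-stuck-eval directory (wherever it sits in the archive)
# into the skills folder; a no-op if the directory is not found.
find "$TMP" -type d -name debug-stuck-eval \
  -exec cp -r {} "$SKILLS_DIR/" \;
echo "done"
```

After copying, restart Claude Code (or start a new session) so it picks up the newly installed skill.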
