experiment-analysis

Community

Analyze GRPO runs for learning & performance.

Author: bglick13
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps diagnose GRPO training runs by extracting Elo metrics, tracking learning dynamics, and surfacing issues in the GRPO pipeline.

Core Features & Use Cases

  • Elo trajectory analysis: Retrieve and interpret Elo progression across checkpoints.
  • Metrics extraction: Pull training metrics from WandB and Axiom logs for diagnostics.
  • Reports & dashboards: Generate summaries and highlight actionable insights for experiment trackers.
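As an illustration of what Elo trajectory analysis can involve, the following is a minimal sketch, not the Skill's actual implementation. It assumes the Elo values per checkpoint have already been extracted (for example from WandB) into plain numbers. The names EloPoint, summarize_elo, and drop_threshold are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EloPoint:
    """One Elo measurement at a training checkpoint (hypothetical structure)."""
    step: int
    elo: float


def summarize_elo(points: List[EloPoint], drop_threshold: float = 25.0) -> Dict[str, object]:
    """Summarize an Elo trajectory across checkpoints.

    Reports the net gain from first to last checkpoint, the best checkpoint,
    and any checkpoint where Elo fell by at least `drop_threshold` relative to
    the previous one. The threshold is an arbitrary illustrative default.
    """
    if len(points) < 2:
        raise ValueError("need at least two checkpoints to analyze a trajectory")

    # Sort by training step so deltas are computed between consecutive checkpoints.
    ordered = sorted(points, key=lambda p: p.step)

    # Change in Elo at each checkpoint relative to the one before it.
    deltas = [(curr.step, curr.elo - prev.elo) for prev, curr in zip(ordered, ordered[1:])]

    # Steps whose drop meets or exceeds the threshold are flagged as regressions.
    regressions = [step for step, delta in deltas if delta <= -drop_threshold]

    best = max(ordered, key=lambda p: p.elo)
    return {
        "net_gain": ordered[-1].elo - ordered[0].elo,
        "best_step": best.step,
        "best_elo": best.elo,
        "regressions": regressions,
    }
```

A summary like this could feed the reports mentioned above: a positive net gain with no flagged regressions suggests steady learning, while flagged steps point to checkpoints worth inspecting more closely.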

Quick Start

Use the provided tool to fetch Elo metrics for a run, for example:

  uv run python .claude/skills/experiment-analysis/analyze_elo.py <run-name>
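To run the Quick Start command over several runs, a small wrapper along these lines may help. This is a sketch under assumptions: it takes the script path and the single <run-name> argument from the command above, and it assumes uv is on the PATH. The helper names build_command and analyze_runs are hypothetical and not part of the Skill.

```python
import subprocess
from typing import Iterable, List

# Script path taken from the Quick Start command; adjust if the Skill is installed elsewhere.
SCRIPT = ".claude/skills/experiment-analysis/analyze_elo.py"


def build_command(run_name: str) -> List[str]:
    """Build the argument list for analyzing one run (hypothetical helper)."""
    if not run_name.strip():
        raise ValueError("run_name must be non-empty")
    return ["uv", "run", "python", SCRIPT, run_name]


def analyze_runs(run_names: Iterable[str], dry_run: bool = True) -> List[List[str]]:
    """Build, and optionally execute, the analysis command for each run.

    With dry_run=True (the default) nothing is executed and the commands are
    only returned, which makes it easy to check them before launching.
    """
    commands = [build_command(name) for name in run_names]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(cmd, check=True)
    return commands
```

Calling analyze_runs with a list of run names and dry_run=False would invoke the script once per run, stopping at the first failure.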

Dependency Matrix

Required Modules

wandb

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: experiment-analysis
Download link: https://github.com/bglick13/diplomacy-v2/archive/main.zip#experiment-analysis

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
