uv-verl-rl-training
Community
Scale LLM RL training with verl.
Author: uv-xiao
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill provides a framework for training Large Language Models (LLMs) with Reinforcement Learning (RL) at scale. It handles the complexity of distributed training and supports several RL algorithms.
Core Features & Use Cases
- Reinforcement Learning: Supports RL algorithms such as GRPO and PPO for LLM post-training.
- Scalable Infrastructure: Designed for large-scale training with interchangeable backends (FSDP and Megatron-LM for training, vLLM for rollout generation).
- Use Case: Train a chatbot to follow complex instructions more accurately by using GRPO on a dataset of user prompts and desired responses, leveraging a distributed GPU cluster.
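To make the GRPO use case above more concrete, here is a minimal sketch of the group-relative advantage that gives the algorithm its name: several responses are sampled for the same prompt, and each response's reward is normalized against the group's mean and standard deviation. The function name, the `eps` constant, and the reward values are illustrative assumptions, not part of verl's API, and verl's own implementation may differ in detail (for example, in which standard deviation estimator it uses).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the other samples for the same prompt.

    Hypothetical helper for illustration only; not a verl function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std of the group (an assumption)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one prompt, scored 1.0 if correct, else 0.0
# (made-up values for illustration).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above the group average receive a positive advantage and are reinforced; those below it receive a negative one. When all responses in a group score the same, every advantage is zero and that prompt contributes no learning signal.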
Quick Start
Launch GRPO training for math reasoning using the verl skill.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: uv-verl-rl-training
Download link: https://github.com/uv-xiao/pkbllm/archive/main.zip#uv-verl-rl-training
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.