uv-pytorch-fsdp2
Master PyTorch FSDP2 for large models.
Author: uv-xiao
Version: 1.0.0
Installs: 0
Category: Community
System Documentation
What problem does it solve?
This Skill enables coding agents to correctly integrate PyTorch FSDP2 (Fully Sharded Data Parallel 2.0) into training scripts, addressing the complexities of distributed training for large models that exceed single-GPU memory.
Core Features & Use Cases
- Correct FSDP2 Initialization: Ensures proper setup of distributed training environments.
- Sharding Configuration: Guides the application of fully_shard bottom-up for optimal memory and performance.
- Mixed Precision & Offload: Configures mixed precision and CPU offloading for memory-bound scenarios.
- Distributed Checkpointing: Integrates PyTorch Distributed Checkpoint (DCP) for robust saving and loading.
- Use Case: Training a large language model whose parameters, gradients, and optimizer states exceed the memory of a single GPU, by sharding those states across multiple GPUs and nodes.
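The features above can be sketched in code. The snippet below is a minimal, hypothetical illustration of bottom-up fully_shard application with a mixed precision policy; it assumes PyTorch 2.6+ (where fully_shard and MixedPrecisionPolicy are exported from torch.distributed.fsdp; earlier releases expose them under torch.distributed._composable.fsdp), and it assumes the model keeps its transformer blocks in a model.layers container.

```python
import torch
import torch.nn as nn


def shard_model(model: nn.Module) -> nn.Module:
    """Apply FSDP2's fully_shard bottom-up: blocks first, root last.

    Bottom-up ordering gives each block its own parameter group, so
    only one block's full parameters are gathered at a time during
    forward/backward, keeping peak memory low.
    """
    # Imported lazily so this sketch also loads on older PyTorch builds.
    from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

    mp = MixedPrecisionPolicy(
        param_dtype=torch.bfloat16,   # compute/communicate params in bf16
        reduce_dtype=torch.float32,   # reduce gradients in fp32 for stability
    )
    for block in model.layers:        # assumption: blocks live in model.layers
        fully_shard(block, mp_policy=mp)
    fully_shard(model, mp_policy=mp)  # wrap the root module last
    return model
```

A process group and device mesh must already be initialized (e.g. via torchrun) before calling this on real hardware.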
Quick Start
Integrate PyTorch FSDP2 into your existing training script by following the step-by-step procedure outlined in the skill's documentation.
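As a rough orientation before reading the full procedure, the skeleton below shows the typical shape of an FSDP2 training script launched with torchrun (e.g. `torchrun --nproc_per_node=8 train.py`). It is a hedged sketch, not the skill's own script: build_model and model.layers are placeholder assumptions, and it assumes PyTorch 2.6+ for the fully_shard import path.

```python
import os

import torch
import torch.distributed as dist


def main() -> None:
    # 1. Initialize the process group from torchrun's environment variables.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    from torch.distributed.fsdp import fully_shard

    model = build_model()            # placeholder: construct your nn.Module
    for block in model.layers:       # 2. shard bottom-up, root last
        fully_shard(block)
    fully_shard(model)

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # ... training loop: forward, backward, optim.step() ...

    # 3. Save with PyTorch Distributed Checkpoint (DCP): each rank writes
    # only its own shards, avoiding a full gather on rank 0.
    import torch.distributed.checkpoint as dcp
    dcp.save({"model": model.state_dict()}, checkpoint_id="ckpt/step_0")

    dist.destroy_process_group()


# Only run when launched under torchrun (which sets RANK).
if __name__ == "__main__" and "RANK" in os.environ:
    main()
```

Loading mirrors saving via dcp.load into a state dict obtained from the sharded model, so resume works at any world size DCP supports.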
Dependency Matrix
Required Modules: None required
Components: scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: uv-pytorch-fsdp2
Download link: https://github.com/uv-xiao/pkbllm/archive/main.zip#uv-pytorch-fsdp2
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.