uv-pytorch-fsdp2
Master PyTorch FSDP2 for large models.
Author: uv-xiao
Version: 1.0.0
Installs: 0
Category: Community
System Documentation
What problem does it solve?
This Skill enables coding agents to correctly integrate PyTorch FSDP2 (Fully Sharded Data Parallel 2.0) into training scripts, addressing the complexities of distributed training for large models that exceed single-GPU memory.
Core Features & Use Cases
- Correct FSDP2 Initialization: Ensures proper setup of distributed training environments.
- Sharding Configuration: Guides the application of fully_shard bottom-up for optimal memory and performance.
- Mixed Precision & Offload: Configures mixed precision and CPU offloading for memory-bound scenarios.
- Distributed Checkpointing: Integrates PyTorch Distributed Checkpoint (DCP) for robust saving and loading.
- Use Case: Training a large language model whose parameters, gradients, and optimizer states exceed the memory of a single GPU, by sharding those states across multiple GPUs and nodes.
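The features above can be sketched in code. The snippet below is a minimal, hypothetical illustration of bottom-up fully_shard application with a mixed precision policy; it assumes PyTorch 2.6+ (where fully_shard and MixedPrecisionPolicy are exported from torch.distributed.fsdp; earlier releases expose them under torch.distributed._composable.fsdp), and it assumes the model keeps its transformer blocks in a model.layers container.

```python
import torch
import torch.nn as nn


def shard_model(model: nn.Module) -> nn.Module:
    """Apply FSDP2's fully_shard bottom-up: blocks first, root last.

    Bottom-up ordering gives each block its own parameter group, so
    only one block's full parameters are gathered at a time during
    forward/backward, keeping peak memory low.
    """
    # Imported lazily so this sketch also loads on older PyTorch builds.
    from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

    mp = MixedPrecisionPolicy(
        param_dtype=torch.bfloat16,   # compute/communicate params in bf16
        reduce_dtype=torch.float32,   # reduce gradients in fp32 for stability
    )
    for block in model.layers:        # assumption: blocks live in model.layers
        fully_shard(block, mp_policy=mp)
    fully_shard(model, mp_policy=mp)  # wrap the root module last
    return model
```

A process group and device mesh must already be initialized (e.g. via torchrun) before calling this on real hardware.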
Quick Start
Integrate PyTorch FSDP2 into your existing training script by following the step-by-step procedure outlined in the skill's documentation.
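As a rough orientation before reading the full procedure, the skeleton below shows the typical shape of an FSDP2 training script launched with torchrun (e.g. `torchrun --nproc_per_node=8 train.py`). It is a hedged sketch, not the skill's own script: build_model and model.layers are placeholder assumptions, and it assumes PyTorch 2.6+ for the fully_shard import path.

```python
import os

import torch
import torch.distributed as dist


def main() -> None:
    # 1. Initialize the process group from torchrun's environment variables.
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    from torch.distributed.fsdp import fully_shard

    model = build_model()            # placeholder: construct your nn.Module
    for block in model.layers:       # 2. shard bottom-up, root last
        fully_shard(block)
    fully_shard(model)

    optim = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # ... training loop: forward, backward, optim.step() ...

    # 3. Save with PyTorch Distributed Checkpoint (DCP): each rank writes
    # only its own shards, avoiding a full gather on rank 0.
    import torch.distributed.checkpoint as dcp
    dcp.save({"model": model.state_dict()}, checkpoint_id="ckpt/step_0")

    dist.destroy_process_group()


# Only run when launched under torchrun (which sets RANK).
if __name__ == "__main__" and "RANK" in os.environ:
    main()
```

Loading mirrors saving via dcp.load into a state dict obtained from the sharded model, so resume works at any world size DCP supports.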
Dependency Matrix
Required Modules: None required
Components: scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: uv-pytorch-fsdp2
Download link: https://github.com/uv-xiao/pkbllm/archive/main.zip#uv-pytorch-fsdp2
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.