pytorch-fsdp
Master PyTorch FSDP for efficient training.
Author: Aum08Desai
Version: 1.0.0
System Documentation
What problem does it solve?
This Skill provides expert guidance and practical examples for implementing and optimizing Fully Sharded Data Parallel (FSDP) training with PyTorch, addressing challenges in large-scale model training.
Core Features & Use Cases
- Parameter Sharding: Understand and implement strategies for sharding model parameters across multiple devices.
- Mixed Precision & CPU Offloading: Learn how to leverage mixed precision training and CPU offloading to reduce memory footprint.
- FSDP2: Get guidance on the newer `fully_shard` (FSDP2) API and current best practices.
- Use Case: When training a massive language model that exceeds the memory of a single GPU, this Skill helps you configure PyTorch FSDP to distribute the model and data efficiently across your cluster.
Quick Start
Use the pytorch-fsdp skill to learn about the generic `Join` context manager in PyTorch distributed training.
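The `Join` context manager handles the case where ranks have uneven numbers of input batches: ranks that finish early "shadow" the collective calls of ranks still training, so the job does not deadlock. A minimal single-process sketch (the model, batch count, and port are illustrative assumptions; with multiple ranks each rank would simply iterate its own dataset):

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.algorithms.join import Join
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process CPU init for demonstration purposes only.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29502")
if not dist.is_initialized():
    dist.init_process_group("gloo", rank=0, world_size=1)

model = DDP(torch.nn.Linear(8, 1))

# Pretend this rank has 3 batches; in a real job other ranks may have more,
# and Join keeps their allreduces from hanging once this rank runs out.
num_batches = 3
steps = 0
with Join([model]):
    for _ in range(num_batches):
        loss = model(torch.randn(4, 8)).sum()
        loss.backward()
        model.zero_grad()
        steps += 1

print("finished", steps, "steps")
```

Any class implementing the `Joinable` interface (DDP and ZeroRedundancyOptimizer, for example) can be passed in the list given to `Join`.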
Dependency Matrix
Required Modules: None required
Components: references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: pytorch-fsdp
Download link: https://github.com/Aum08Desai/hermes-research-agent/archive/main.zip#pytorch-fsdp
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.