Name: pytorch-fsdp
Availability: InStock
Author: Aum08Desai

System Documentation

What problem does it solve?

This Skill provides expert guidance and practical examples for implementing and optimizing Fully Sharded Data Parallel (FSDP) training with PyTorch, addressing challenges in large-scale model training.

Core Features & Use Cases

Parameter Sharding: Understand and implement strategies for sharding model parameters across multiple devices.
Mixed Precision & CPU Offloading: Learn how to leverage mixed precision training and CPU offloading to reduce memory footprint.
FSDP2: Get up-to-date information on the latest FSDP features and best practices.
Use Case: When training a massive language model that exceeds the memory of a single GPU, this Skill helps you configure PyTorch FSDP to distribute the model and data efficiently across your cluster.

Quick Start

Use the pytorch-fsdp skill to learn about the generic join context manager in PyTorch distributed training.

Please help me install this Skill: Name: pytorch-fsdp Download link: https://github.com/Aum08Desai/hermes-research-agent/archive/main.zip#pytorch-fsdp Please download this .zip file, extract it, and install it in the .claude/skills/ directory.

pytorch-fsdp

System Documentation

What problem does it solve?

Core Features & Use Cases

Quick Start

Dependency Matrix

Required Modules

Components

💻 Claude Code Installation

Agent Skills Search Helper