deepep-installer

Community

Install and troubleshoot DeepEP on NVIDIA GPUs.

Author: yangwhale
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps users install, configure, and troubleshoot DeepEP (DeepSeek Expert Parallelism) on NVIDIA GPU systems. It covers end-to-end setup of CUDA, DOCA-OFED, NVSHMEM with IBGDA support, and the DeepEP library itself, along with debugging workflows for common failures.

Core Features & Use Cases

  • End-to-end installation: Guides users through installing the CUDA Toolkit, DOCA-OFED, NVSHMEM with IBGDA support, and DeepEP.
  • GPU compatibility and networking: Optimized for B200/H100/A100 GPUs with RoCE/InfiniBand networking, enabling high-performance MoE communication.
  • Troubleshooting: Provides structured steps to diagnose and fix common installation failures, including NIC mapping issues.
  • Use Case: A data center engineer can deploy a production DeepEP environment on a cluster with 8 GPUs per node and 2 nodes, ensuring proper NIC-to-GPU mapping.
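
To make the use case concrete, here is a minimal sketch of the bookkeeping behind a 2-node, 8-GPU-per-node deployment. It only illustrates the idea of a one-to-one GPU-to-NIC assignment. The `mlx5_N` device names, the one-NIC-per-GPU layout, and the helper names are assumptions for illustration, not how the Skill or DeepEP actually configures the mapping. On a real system the correct pairing depends on the PCIe topology.

```python
# Illustrative only: NIC names and the 1:1 layout are assumptions.
GPUS_PER_NODE = 8
NUM_NODES = 2


def gpu_to_nic_map(gpus_per_node: int, nic_prefix: str = "mlx5_") -> dict:
    """Pair each local GPU index with a hypothetical NIC of the same index."""
    return {gpu: f"{nic_prefix}{gpu}" for gpu in range(gpus_per_node)}


def global_rank(node: int, local_gpu: int, gpus_per_node: int = GPUS_PER_NODE) -> int:
    """Global rank of a GPU when ranks are numbered node by node."""
    return node * gpus_per_node + local_gpu


mapping = gpu_to_nic_map(GPUS_PER_NODE)
world_size = GPUS_PER_NODE * NUM_NODES

print(f"world size: {world_size}")
for gpu, nic in mapping.items():
    print(f"local GPU {gpu} -> {nic}")
```

A mapping like this is worth checking against the actual hardware topology before relying on it, since a GPU paired with a distant NIC is the kind of NIC mapping issue the troubleshooting steps target.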

Quick Start

Run the installer script to begin Phase 1 (CUDA + DOCA + PeerMappingOverride) and reboot as prompted. After reboot, run Phase 2 to install NVSHMEM (without GDRCopy), PyTorch, and DeepEP with PR #466 GPU-to-NIC mapping. Source the generated environment script and verify that importing DeepEP succeeds.
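
The final verification step can be sketched as a small Python check. This assumes DeepEP's Python package is importable under the name `deep_ep`. The helper `module_available` is hypothetical and not part of the Skill. It only confirms that the module can be located in the current environment, not that the GPUs or NICs are configured correctly.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: the DeepEP package is exposed as `deep_ep`.
    name = "deep_ep"
    if module_available(name):
        print(f"{name}: found")
    else:
        print(f"{name}: not found; was the generated environment script sourced?")
```

Run it from the same shell in which the environment script was sourced, so that the check sees the same Python environment Phase 2 installed into.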

Dependency Matrix

Required Modules

  • cuda-toolkit-12-9
  • doca-ofed
  • cmake
  • ninja-build
  • python3-venv
  • python3-pip
  • python3.12-dev
  • build-essential
  • git
  • curl
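
As a rough illustration, the required modules can be kept as a list and turned into a single install command. The sketch assumes a Debian/Ubuntu system where `apt-get` is the package manager and the repositories that provide `cuda-toolkit-12-9` and `doca-ofed` are already configured. The helper `apt_install_command` is hypothetical. It only builds and prints the command and does not run it.

```python
import shlex

# Package names taken from the Required Modules list.
REQUIRED_PACKAGES = [
    "cuda-toolkit-12-9",
    "doca-ofed",
    "cmake",
    "ninja-build",
    "python3-venv",
    "python3-pip",
    "python3.12-dev",
    "build-essential",
    "git",
    "curl",
]


def apt_install_command(packages):
    """Build (but do not execute) an apt-get install command as an argument list."""
    return ["sudo", "apt-get", "install", "-y", *packages]


# Print a shell-safe rendering for review before running it by hand.
print(shlex.join(apt_install_command(REQUIRED_PACKAGES)))
```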

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install it automatically by copying and pasting the text below into Claude Code.

Please help me install this Skill:
Name: deepep-installer
Download link: https://github.com/yangwhale/gpu-tpu-pedia/archive/main.zip#deepep-installer

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.