dgx-spark-vllm

Community

Tune vLLM on DGX Spark for faster AI.

Authorosoleve
Version1.0.0
Installs0

System Documentation

What problem does it solve?

This Skill provides structured guidance for deploying and optimizing vLLM on NVIDIA DGX Spark, reducing setup friction and improving inference throughput.

Core Features & Use Cases

  • Deployment Guidance: Best practices for configuring vLLM on DGX Spark clusters, including model selection, server settings, and workload management.
  • Performance Tuning: Recommendations for context size, batching, memory usage, and parallelism to maximize throughput and minimize latency.
  • Use Case: Run chat, coding assistance, and reasoning workloads on DGX Spark with optimized vLLM settings and real-time monitoring.

Quick Start

Configure the vLLM server on a DGX Spark cluster and run a basic chat prompt to validate setup and performance.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: dgx-spark-vllm
Download link: https://github.com/osoleve/the-fold/archive/main.zip#dgx-spark-vllm

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
View Source Repository

Agent Skills Search Helper

Install a tiny helper to your Agent, search and equip skill from 223,000+ vetted skills library on demand.