Name: dgx-spark-vllm
Availability: InStock
Author: osoleve

System Documentation

What problem does it solve?

This Skill provides structured guidance for deploying and optimizing vLLM on NVIDIA DGX Spark, reducing setup friction and improving inference throughput.

Core Features & Use Cases

Deployment Guidance: Best practices for configuring vLLM on DGX Spark clusters, including model selection, server settings, and workload management.
Performance Tuning: Recommendations for context size, batching, memory usage, and parallelism to maximize throughput and minimize latency.
Use Case: Run chat, coding assistance, and reasoning workloads on DGX Spark with optimized vLLM settings and real-time monitoring.

Quick Start

Configure the vLLM server on a DGX Spark cluster and run a basic chat prompt to validate setup and performance.

Please help me install this Skill: Name: dgx-spark-vllm Download link: https://github.com/osoleve/the-fold/archive/main.zip#dgx-spark-vllm Please download this .zip file, extract it, and install it in the .claude/skills/ directory.

dgx-spark-vllm

System Documentation

What problem does it solve?

Core Features & Use Cases

Quick Start

Dependency Matrix

Required Modules

Components

💻 Claude Code Installation

Agent Skills Search Helper