dgx-spark-vllm
CommunityTune vLLM on DGX Spark for faster AI.
Authorosoleve
Version1.0.0
Installs0
System Documentation
What problem does it solve?
This Skill provides structured guidance for deploying and optimizing vLLM on NVIDIA DGX Spark, reducing setup friction and improving inference throughput.
Core Features & Use Cases
- Deployment Guidance: Best practices for configuring vLLM on DGX Spark clusters, including model selection, server settings, and workload management.
- Performance Tuning: Recommendations for context size, batching, memory usage, and parallelism to maximize throughput and minimize latency.
- Use Case: Run chat, coding assistance, and reasoning workloads on DGX Spark with optimized vLLM settings and real-time monitoring.
Quick Start
Configure the vLLM server on a DGX Spark cluster and run a basic chat prompt to validate setup and performance.
Dependency Matrix
Required Modules
None requiredComponents
Standard package💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: dgx-spark-vllm Download link: https://github.com/osoleve/the-fold/archive/main.zip#dgx-spark-vllm Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 223,000+ vetted skills library on demand.