llm-deploy
OfficialDeploy LLMs with GPU inference servers.
Authortruefoundry
Version1.0.0
Installs0
System Documentation
What problem does it solve?
This Skill streamlines the deployment of Large Language Models (LLMs) and Machine Learning (ML) inference servers, enabling users to serve models efficiently on TrueFoundry with GPU acceleration.
Core Features & Use Cases
- Model Serving: Deploy models using frameworks like vLLM, TGI, or NVIDIA NIM.
- GPU Acceleration: Leverages GPU resources for high-performance inference.
- YAML Manifests: Uses
tfy applywith YAML manifests for declarative deployment. - Use Case: Deploying a Hugging Face model like Llama 3 for real-time text generation or using vLLM for an OpenAI-compatible inference endpoint.
Quick Start
Use the llm-deploy skill to deploy the model google/gemma-2b-it using vLLM.
Dependency Matrix
Required Modules
None requiredComponents
scriptsreferences
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: llm-deploy Download link: https://github.com/truefoundry/tfy-agent-skills/archive/main.zip#llm-deploy Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
Agent Skills Search Helper
Install a tiny helper to your Agent, search and equip skill from 223,000+ vetted skills library on demand.