model-deployment

Community

Deploy fine-tuned models to production with ease.

Author: ScientiaCapital
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Export and deploy fine-tuned models to production.

Core Features & Use Cases

  • GGUF export: export fine-tuned models for local or edge inference with llama.cpp/Ollama.
  • Production deployment: set up and run high-throughput serving with vLLM, Ollama, or Docker-based deployments.
  • Hub sharing & versioning: publish and version models on HuggingFace Hub for collaboration and reuse.
  • Use Case: A medical domain team finishes fine-tuning a model and deploys it to a vLLM server behind a load balancer for 24/7 API access.
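The export and Hub-sharing features above can be sketched as one small helper. `save_pretrained_gguf` appears in this skill's own Quick Start; `push_to_hub_gguf` follows the same Unsloth naming convention but is an assumption here, and the repo id is a hypothetical example — verify both against your installed Unsloth version:

```python
# Sketch of the export-and-share flow (Unsloth-style API; push_to_hub_gguf
# is an assumed method name -- check your Unsloth version before relying on it).

def export_and_share(model, tokenizer, out_dir="./gguf_output",
                     repo_id=None, quant="q4_k_m"):
    """Export a fine-tuned model to GGUF and optionally push it to the Hub."""
    # q4_k_m is a common 4-bit quantization: a good quality/size trade-off
    # for local llama.cpp/Ollama inference.
    model.save_pretrained_gguf(out_dir, tokenizer, quantization_method=quant)
    if repo_id:  # e.g. "your-org/medical-model-gguf" (hypothetical)
        model.push_to_hub_gguf(repo_id, tokenizer, quantization_method=quant)
    return out_dir
```

Keeping quantization and Hub publishing behind one call makes it easy to version each export consistently.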

Quick Start

  1. Export the fine-tuned model to GGUF, e.g., model.save_pretrained_gguf('./gguf_output', tokenizer, quantization_method='q4_k_m').
  2. Deploy with Ollama or vLLM: for Ollama, ollama create my-model -f Modelfile; for vLLM, start the server with the appropriate model path.
  3. Optionally push the model to HuggingFace Hub for sharing and versioning.
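The steps above can be sketched as a short script. The GGUF filename, the merged-checkpoint directory, and the Modelfile parameters below are hypothetical examples, not values the skill guarantees:

```python
from pathlib import Path

# Hypothetical paths -- substitute whatever step 1 actually produced.
gguf_path = "./gguf_output/model.Q4_K_M.gguf"   # GGUF export, for Ollama
hf_model_dir = "./merged_model"                 # merged fp16 checkpoint, for vLLM

# Step 2a: write a minimal Ollama Modelfile pointing at the GGUF file.
modelfile = (
    f"FROM {gguf_path}\n"
    "PARAMETER temperature 0.2\n"
    "SYSTEM You are a helpful domain assistant.\n"
)
Path("Modelfile").write_text(modelfile)
print("ollama create my-model -f Modelfile")

# Step 2b: vLLM serves the HF-format checkpoint over an OpenAI-compatible API;
# it expects the original (merged) checkpoint rather than the GGUF export.
print(f"vllm serve {hf_model_dir} --port 8000")
```

Note the split: Ollama consumes the quantized GGUF for local/edge inference, while vLLM serves the full-precision checkpoint for high-throughput API traffic.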

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install the skill automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: model-deployment
Download link: https://github.com/ScientiaCapital/unsloth-mcp-server/archive/main.zip#model-deployment

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
