llama-cpp

Community

CPU-first LLM inference on non-NVIDIA hardware.

Author: ovachiever
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

llama.cpp runs LLM inference in pure C/C++ on CPUs, Apple Silicon, and non-NVIDIA GPUs, using GGUF quantization to reduce memory use and speed up generation on commodity hardware.

Core Features & Use Cases

  • Inference without CUDA: CPUs, Apple Silicon (Metal), and AMD/Intel GPUs
  • GGUF quantization (1.5-8 bit) for a small memory footprint (see the sketch after this list)
  • Edge and lightweight deployment scenarios
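To give a sense of why quantization matters, here is a rough back-of-envelope sketch of weight memory at different bit widths. The bits-per-weight figures and the 7B parameter count are illustrative assumptions, not measurements; real GGUF files add overhead for scales and metadata.

```python
# Rough GGUF memory estimate: parameters x bits-per-weight / 8.
# Figures are illustrative approximations; actual GGUF files are somewhat
# larger because quantization blocks store scales and other metadata.

def approx_size_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB for a quantized model."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / (1024 ** 3)

for label, bpw in [("FP16", 16), ("Q8_0", 8), ("~4.5 bpw (4-bit K-quant)", 4.5), ("~2.5 bpw (2-bit)", 2.5)]:
    print(f"7B model @ {label}: ~{approx_size_gib(7, bpw):.1f} GiB")
```

Dropping from FP16 to a 4-bit quant cuts a 7B model from roughly 13 GiB to under 4 GiB, which is what makes laptop and edge deployment practical.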

Quick Start

Install llama.cpp (or the llama-cpp-python binding listed below), download a GGUF model, and run it from the CLI for offline use or behind a local server. A minimal Python example follows.
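A minimal sketch using the llama-cpp-python binding from the Required Modules list. The model path and prompt are placeholders; substitute any GGUF checkpoint you have downloaded.

```python
# Sketch: local inference with llama-cpp-python (pip install llama-cpp-python).
# "model.gguf" is a placeholder for a downloaded GGUF checkpoint.
from llama_cpp import Llama

llm = Llama(
    model_path="model.gguf",  # path to a quantized GGUF model
    n_ctx=2048,               # context window size
    n_gpu_layers=-1,          # offload all layers to Metal/GPU if available; 0 = CPU only
)

result = llm(
    "Q: What is GGUF quantization? A:",
    max_tokens=128,
    stop=["Q:"],
)
print(result["choices"][0]["text"])
```

For the server-based path, llama-cpp-python also ships an OpenAI-compatible server (started with `python -m llama_cpp.server --model model.gguf`), which exposes the same model over HTTP.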

Dependency Matrix

Required Modules

llama-cpp-python

Components

references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: llama-cpp
Download link: https://github.com/ovachiever/droid-tings/archive/main.zip#llama-cpp

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.