Searching protocols for "fp16"
Compress LLMs for efficient deployment.
Scale training efficiently with DeepSpeed (config sketch after this list).
4-bit quantization for large LLMs on consumer GPUs (see the sketch after this list).
Compress LLMs for efficiency.
Compress LLMs for consumer GPUs.
Tune vector indexes for speed and recall (example after this list).
Maximize GPU throughput and prevent OOMs.
Lean, fast model quantization for inference.
Compress LLMs for faster inference.
Compress LLMs with minimal accuracy loss.
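The fp16 query and the 4-bit entry above both point at the same lever: lower-precision weights. A minimal sketch of both, assuming the Hugging Face transformers and bitsandbytes stack (my choice; the entries name neither library) and an illustrative model id:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "facebook/opt-1.3b"  # illustrative model id, not taken from the entries above

# fp16 load: halves memory versus fp32 without introducing quantization error.
fp16_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)

# 4-bit NF4 quantization: weights are stored in 4 bits and dequantized to fp16
# for the matmuls, which is what lets large models fit on consumer GPUs.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
int4_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=quant_config,
    device_map="auto",
)
```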
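For the DeepSpeed entry, a minimal sketch of fp16 training with ZeRO stage 2, assuming a stand-in torch module and a run under the deepspeed launcher; all hyperparameter values are illustrative:

```python
import torch
import deepspeed

model = torch.nn.Linear(1024, 1024)  # stand-in module; a real run wraps your own model

ds_config = {
    "train_micro_batch_size_per_gpu": 8,                     # illustrative value
    "fp16": {"enabled": True},                               # mixed-precision training in fp16
    "zero_optimization": {"stage": 2},                       # shard optimizer states and gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# Launch with `deepspeed train.py`; initialize() builds the fp16 engine and ZeRO partitioning.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```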
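For the vector-index entry, a minimal sketch using FAISS (my choice; the entry names no library): with an HNSW index, efConstruction and efSearch are the main knobs trading speed against recall.

```python
import numpy as np
import faiss

d = 128                                            # vector dimensionality (illustrative)
xb = np.random.rand(10_000, d).astype("float32")   # database vectors
xq = np.random.rand(5, d).astype("float32")        # query vectors

index = faiss.IndexHNSWFlat(d, 32)   # M=32 links per node: build-time quality knob
index.hnsw.efConstruction = 200      # higher -> better graph, slower to build
index.add(xb)

index.hnsw.efSearch = 64             # higher -> better recall, slower queries
distances, ids = index.search(xq, 10)  # top-10 neighbors per query
```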