Searching protocol for "inference acceleration"
Accelerate LLM inference.
Shrink LLMs, boost inference speed.
Deploy GPU-accelerated AI with NVIDIA NIM.
Compress LLMs, accelerate inference.
Accelerate LLM fine-tuning & inference.
Deploy LLMs with GPU inference servers.
Fast, memory-efficient LLM inference with vLLM (see the usage sketch after this list).
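
Of the entries above, vLLM has a public Python API. A minimal sketch of its offline batch-generation interface follows, assuming a GPU machine with vllm installed; the model name, prompt, and sampling values are illustrative choices, not part of the listing.

# Minimal vLLM offline-inference sketch. The model and sampling
# values below are illustrative assumptions, not from the listing.
from vllm import LLM, SamplingParams

# Load a small model; vLLM's paged KV-cache management is what
# gives it the "fast, memory-efficient" behavior described above.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

# generate() batches prompts internally and returns one
# RequestOutput per prompt.
outputs = llm.generate(["What makes LLM inference slow?"], params)

for out in outputs:
    print(out.outputs[0].text)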