Skill Explorer

Searching protocol for "RLHF"

grpo

Community

Robust RLHF with group-relative policy training.

Advanced

by atrawog

verl

Community

Scale RLHF for LLMs with Verl.

Advanced

by tylertitsworth

reward

Community

Train reward models for RLHF pipelines.

Advanced

by atrawog

openrlhf-training

Community

Accelerate RLHF training for large language models.

Advanced

by ovachiever

openrlhf-training

Community

Accelerate RLHF training for large models.

Advanced

by zhuangbiaowei

openrlhf-training

Community

Accelerate RLHF training for LLMs.

Advanced

by MesferAli

openrlhf-training

Community

Accelerate RLHF training for LLMs.

Advanced

by DoanNgocCuong

openrlhf-training

Community

Accelerate RLHF training for LLMs.

Advanced

by choice5346

openrlhf-training

Community

Accelerate LLM RLHF training.

Advanced

by informatico-madrid

openrlhf-training

Official

Accelerate RLHF for LLMs.

Advanced

by Orchestra-Research

openrlhf-training

Community

Accelerate RLHF training for large models.

Advanced

by gagan114662

openrlhf-training

Community

Accelerate RLHF training with Ray & vLLM.

Advanced

by tianhao909