Skill Explorer

Searching protocol for "grpo"

grpo-rl-training

Community

GRPO/RL training patterns

Advanced

by ovachiever

grpo

Community

Robust RLHF with group-relative policy training.

Advanced

by atrawog

grpo-rl-training

Community

Fine-tune models with GRPO/RL

Advanced

by kwasi-cpu

grpo-rl-training

Community

Master GRPO/RL for advanced model fine-tuning.

Advanced

by DoanNgocCuong

grpo-rl-training

Community

Master GRPO/RL fine-tuning with TRL.

Advanced

by choice5346

grpo-rl-training

Community

Master GRPO/RL fine-tuning with TRL.

Advanced

by gagan114662

grpo-rl-training

Community

Master GRPO/RL for advanced model fine-tuning.

Advanced

by informatico-madrid

grpo-finetuning

Official

GRPO fine-tuning for vision-language models

Advanced

by aws-solutions-library-samples

grpo-rl-training

Community

Fine-tune models with custom rewards.

Advanced

by Aum08Desai

grpo-rl-training

Community

Fine-tune LLMs with custom rewards for complex tasks.

Advanced

by zhuangbiaowei

unsloth-grpo

Community

Optimize reasoning models with GRPO.

Advanced

by cuba6112

verl-rl-training

Community

Scale LLM RL training with verl.

Advanced

by gagan114662