Searching protocol for "preference-data"
Optimize preferences with implicit reward learning.
Efficient LLM alignment without a reference model.
Set a key/value in cloud KV.
Align language models with human feedback.
Design and analyze blind audio tests with statistics.
Personalized nutrition coaching that remembers.
Create newsletters automatically from stored events.
Reference-free preference optimization for LLM alignment.
Master modern Python idioms for clean code.
Optimize LLMs with SimPO, no reference needed.
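Several of the matching results (the SimPO and reference-free entries) point at the same idea: ranking responses with a length-normalized log-probability of the policy itself instead of a separate reference model. A minimal sketch of that objective follows, assuming standard SimPO-style notation; the function name and the beta/gamma values are illustrative, not taken from any listed entry.

```python
import torch.nn.functional as F

def simpo_style_loss(chosen_logps, rejected_logps, chosen_lens, rejected_lens,
                     beta=2.0, gamma=0.5):
    """Reference-free preference loss (sketch; beta/gamma are illustrative).

    chosen_logps / rejected_logps: summed token log-probs of each response (tensors)
    chosen_lens / rejected_lens:   response lengths in tokens (tensors)
    """
    # Implicit reward: length-normalized average log-probability, scaled by beta.
    r_chosen = beta * chosen_logps / chosen_lens
    r_rejected = beta * rejected_logps / rejected_lens
    # Bradley-Terry-style loss with a target reward margin gamma;
    # note that no reference-model log-probs appear anywhere.
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```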