Searching protocol for "rl-loop"
RL training for LLMs with Megatron+SGLang.
Turn tweet data into a Notion-driven RL loop.