Searching protocol for "ultrathink"
Train and evaluate RL agents with Stable-Baselines3 (SB3).
Quantization and HNSW to maximize search speed and minimize memory use.