Searching protocol for "Inspect Evals"
Analyze evaluation logs for samples and metrics.
Evaluate and inspect Nix expressions.
View and analyze eval logs with Python.
Port AI evals to Inspect with feasibility
Analyze Hawk evaluation results.
Automate real-browser testing locally.
Interact with live Smalltalk images via MCP.
Streamline Hugging Face model evaluation outputs.
Automate AI evaluation task creation with guided workflows.
Debug Nix evaluation and build issues.
Discover Emacs functions, keys, and modes.
Track model-card evaluations with ease and reliability.