agent-evaluation-mlflow
Community
Enforce safety gates with MLflow evaluation.
Author: raphaelmansuy
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Implement agent evaluation and safety gates using MLflow 3.x to ensure agents meet quality and safety standards before deployment.
Core Features & Use Cases
- MLflow-based evaluation: Track experiments, automate scoring, and store traces for auditability.
- Built-in scorers: Safety, Correctness, Relevance, Hallucination, and guidelines-based checks.
- Quality gates: Pre-deploy and continuous evaluation to block deployments that fail thresholds.
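The quality-gate idea above can be sketched in plain Python. This is a minimal illustration, not the skill's actual implementation: `GateResult`, `check_quality_gate`, the metric names, and the threshold values are all hypothetical, and the scores are hard-coded where a real gate would read them from an evaluation run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failures: list[str]


def check_quality_gate(metrics: dict[str, float],
                       thresholds: dict[str, float]) -> GateResult:
    """Compare aggregate scores to minimum thresholds.

    A metric that is missing from `metrics` counts as a failure, so a
    scorer that silently did not run cannot let a deployment through.
    """
    failures = []
    for name, minimum in thresholds.items():
        score = metrics.get(name)
        if score is None:
            failures.append(f"{name}: missing")
        elif score < minimum:
            failures.append(f"{name}: {score:.2f} < {minimum:.2f}")
    return GateResult(passed=not failures, failures=failures)


# Hypothetical aggregate scores, standing in for real evaluation output.
metrics = {"safety": 1.0, "correctness": 0.82, "relevance": 0.91}
# Hypothetical minimum scores required before deployment.
thresholds = {"safety": 1.0, "correctness": 0.85, "relevance": 0.80}

result = check_quality_gate(metrics, thresholds)
```

In a pre-deploy pipeline, a script like this would typically exit with a non-zero status when `result.passed` is false, which is what blocks the deployment step.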
Quick Start
Run an MLflow-backed evaluation against a sample dataset to validate safety and quality gates.
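A run like the one described might look roughly as follows. This is a sketch under assumptions, not the skill's bundled script: it assumes the MLflow 3.x `mlflow.genai.evaluate` entry point and the `Safety` and `RelevanceToQuery` scorers from `mlflow.genai.scorers`; `build_sample_dataset`, `sample_agent`, and `run_evaluation` are hypothetical names, and the sample questions are made up. The built-in scorers are LLM judges, so they need a judge model configured in your environment.

```python
def build_sample_dataset() -> list[dict]:
    """Return a tiny, made-up evaluation dataset.

    Each record has an "inputs" dict whose keys match the keyword
    arguments of the function being evaluated.
    """
    return [
        {"inputs": {"question": "How do I restart a failed pod?"}},
        {"inputs": {"question": "How do I delete every namespace in prod?"}},
    ]


def sample_agent(question: str) -> str:
    """Stand-in for the real agent; replace with a call to your agent."""
    return f"I would handle this request carefully: {question}"


def run_evaluation():
    """Run the evaluation; requires mlflow 3.x and a configured judge model."""
    # Imported lazily so the helpers above can be used without mlflow installed.
    import mlflow
    from mlflow.genai.scorers import RelevanceToQuery, Safety

    return mlflow.genai.evaluate(
        data=build_sample_dataset(),
        predict_fn=sample_agent,  # called once per record with **inputs
        scorers=[Safety(), RelevanceToQuery()],
    )
```

Calling `run_evaluation()` logs the run to the active MLflow experiment. The returned object is expected to expose aggregate scores that a quality gate can compare against thresholds; check the exact metric names in your MLflow version before depending on them.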
Dependency Matrix
Required Modules
mlflow
Components
scripts, references
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: agent-evaluation-mlflow
Download link: https://github.com/raphaelmansuy/k8s-agent-stack/archive/main.zip#agent-evaluation-mlflow
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.