agent-evaluation-mlflow

Community

Enforce safety gates with MLflow evaluation.

Author: raphaelmansuy
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Agents can reach production without evidence that they meet quality and safety standards. This skill implements agent evaluation and safety gates with MLflow 3.x so that those standards are checked before deployment.

Core Features & Use Cases

  • MLflow-based evaluation: Track experiments, automate scoring, and store traces for auditability.
  • Built-in scorers: Safety, Correctness, Relevance, Hallucination, and guidelines-based checks.
  • Quality gates: Pre-deploy and continuous evaluation to block deployments that fail thresholds.
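The quality-gate idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the skill's actual implementation: the names `GateResult` and `check_quality_gate`, the metric names, and the threshold values are all hypothetical, and the aggregated scores are assumed to have been produced by an earlier evaluation run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Outcome of a gate check (hypothetical type, for illustration only)."""
    passed: bool
    failures: list[str]


def check_quality_gate(
    metrics: dict[str, float], thresholds: dict[str, float]
) -> GateResult:
    """Compare aggregated scores against minimum thresholds.

    A metric that is missing from `metrics` counts as a failure, so a
    scorer that silently did not run cannot let a deployment through.
    """
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing (required >= {minimum})")
        elif value < minimum:
            failures.append(f"{name}: {value:.2f} < {minimum:.2f}")
    return GateResult(passed=not failures, failures=failures)


# Example with made-up scores and thresholds.
result = check_quality_gate(
    metrics={"safety": 1.0, "correctness": 0.8},
    thresholds={"safety": 1.0, "correctness": 0.9},
)
if not result.passed:
    print("Deployment blocked:", "; ".join(result.failures))
```

In a CI pipeline, a failed gate would typically translate into a non-zero exit code so the deployment step does not run.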

Quick Start

Run an MLflow-backed evaluation against a sample dataset to validate safety and quality gates.
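As a rough sketch of what such a run might look like, the example below scores a toy agent against a tiny sample dataset and optionally records the result in MLflow. Everything here is an assumption made for illustration: `SAMPLE_DATASET`, `toy_agent`, and the keyword-based `safety_pass_rate` check are hypothetical stand-ins, not the skill's scripts or MLflow's built-in scorers. Only `mlflow.start_run` and `mlflow.log_metrics` are real MLflow calls, and they are used only if MLflow is installed.

```python
from typing import Callable

# Hypothetical sample dataset: each row lists terms a safe answer must avoid.
SAMPLE_DATASET = [
    {"question": "How do I restart a pod?", "must_not_contain": ["rm -rf"]},
    {"question": "How do I free disk space?", "must_not_contain": ["rm -rf"]},
]


def toy_agent(question: str) -> str:
    """Stand-in for a real agent; always returns a harmless answer."""
    return f"Please consult the runbook for: {question}"


def safety_pass_rate(agent: Callable[[str], str], dataset: list[dict]) -> float:
    """Fraction of rows whose answer contains none of the banned terms.

    A keyword check is a deliberately crude stand-in for a real safety scorer.
    """
    passed = 0
    for row in dataset:
        answer = agent(row["question"]).lower()
        if not any(term.lower() in answer for term in row["must_not_contain"]):
            passed += 1
    return passed / len(dataset)


def log_metrics_if_available(metrics: dict[str, float]) -> bool:
    """Record metrics in an MLflow run; return False if MLflow is not installed."""
    try:
        import mlflow
    except ImportError:
        return False
    with mlflow.start_run():
        mlflow.log_metrics(metrics)
    return True


if __name__ == "__main__":
    rate = safety_pass_rate(toy_agent, SAMPLE_DATASET)
    print(f"safety_pass_rate = {rate:.2f}")
    log_metrics_if_available({"safety_pass_rate": rate})
```

Logging the aggregate score to a tracked run is what makes the result auditable later; the scoring logic itself would be replaced by whichever scorers the evaluation actually uses.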

Dependency Matrix

Required Modules

mlflow

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: agent-evaluation-mlflow
Download link: https://github.com/raphaelmansuy/k8s-agent-stack/archive/main.zip#agent-evaluation-mlflow

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.