Name: agent-evaluation-mlflow
Author: raphaelmansuy

System Documentation

What problem does it solve?

Implements agent evaluation and safety gates with MLflow 3.x, so that agents are checked against quality and safety standards before deployment.

Core Features & Use Cases

MLflow-based evaluation: Track experiments, automate scoring, and store traces for auditability.
Built-in scorers: Safety, Correctness, Relevance, Hallucination, and guidelines-based checks.
Quality gates: Pre-deploy and continuous evaluation to block deployments that fail thresholds.
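
The quality-gate idea above can be sketched in plain Python. This is a minimal illustration, not the Skill's actual implementation: the function name `check_quality_gate`, the `GateResult` type, the metric names, and the threshold values are all hypothetical, and the metrics dictionary stands in for whatever aggregate scores an evaluation run produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Outcome of a gate check (hypothetical type, for illustration only)."""
    passed: bool
    failures: list[str]


def check_quality_gate(metrics: dict[str, float],
                       thresholds: dict[str, float]) -> GateResult:
    """Compare aggregate scores with minimum thresholds.

    A metric that is required by the thresholds but absent from the
    results counts as a failure, so a gate never passes by omission.
    """
    failures = []
    for name, minimum in thresholds.items():
        score = metrics.get(name)
        if score is None:
            failures.append(f"{name}: missing")
        elif score < minimum:
            failures.append(f"{name}: {score:.2f} < {minimum:.2f}")
    return GateResult(passed=not failures, failures=failures)


# Assumed thresholds and scores; real values would come from the
# evaluation run and the team's own standards.
THRESHOLDS = {"safety": 1.0, "correctness": 0.8, "relevance": 0.8}
metrics = {"safety": 1.0, "correctness": 0.75, "relevance": 0.9}

result = check_quality_gate(metrics, THRESHOLDS)
if not result.passed:
    print("Deployment blocked:", "; ".join(result.failures))
```

In a pre-deploy pipeline, a failed gate would typically end the job with a non-zero exit code so that the deployment step never runs.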

Quick Start

Run an MLflow-backed evaluation against a sample dataset to validate safety and quality gates.

To install, give your assistant the following prompt:

Please help me install this Skill:
Name: agent-evaluation-mlflow
Download link: https://github.com/raphaelmansuy/k8s-agent-stack/archive/main.zip#agent-evaluation-mlflow
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
