langfuse-agent-eval

Community

Evaluate and improve agent performance.

Author: mberto10
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill streamlines the process of evaluating AI agent performance by automating experiment execution, failure analysis, and documentation of findings.

Core Features & Use Cases

  • Automated Experimentation: Runs evaluation cycles using Langfuse experiments.
  • Failure Analysis: Groups and analyzes agent failures to identify root causes.
  • Recommendation Generation: Provides specific, actionable recommendations for improvement.
  • Use Case: Improve the accuracy and reliability of a customer support agent by running an evaluation cycle, identifying common failure patterns, and implementing targeted fixes.
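The failure-analysis step listed above can be illustrated with a minimal sketch. It is not the Skill's actual implementation and does not call the Langfuse API: the `EvalResult` record, its fields, the `group_failures` helper, and the 0.5 pass threshold are all hypothetical, chosen only to show how failed runs might be grouped so the most common pattern surfaces first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class EvalResult:
    """One evaluated agent run (hypothetical shape, not a Langfuse type)."""
    item_id: str
    score: float          # assumed to lie in [0, 1]
    failure_reason: str   # assumed evaluator label; "" when none was given


def group_failures(results, threshold=0.5):
    """Group runs scoring below `threshold` by their failure reason.

    Returns (reason, [item_ids]) pairs, largest group first.
    """
    groups = defaultdict(list)
    for result in results:
        if result.score < threshold:
            groups[result.failure_reason or "unlabeled"].append(result.item_id)
    # sorted() is stable, so groups of equal size keep first-seen order.
    return sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)


if __name__ == "__main__":
    sample = [
        EvalResult("a", 0.2, "wrong_tool"),
        EvalResult("b", 0.9, ""),
        EvalResult("c", 0.1, "hallucination"),
        EvalResult("d", 0.3, "wrong_tool"),
    ]
    for reason, item_ids in group_failures(sample):
        print(f"{reason}: {len(item_ids)} failure(s) -> {item_ids}")
```

Sorting by group size is one simple way to decide which failure pattern to fix first; a real cycle would likely weigh severity as well.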

Quick Start

Use the langfuse agent eval skill to run an evaluation cycle for the 'customer-support-agent'.

Dependency Matrix

Required Modules

  • langfuse-experiment-runner
  • langfuse-trace-analysis
  • langfuse-data-retrieval
  • langfuse-score-analytics

Components

  • scripts
  • references
  • assets

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: langfuse-agent-eval
Download link: https://github.com/mberto10/mberto-compound/archive/main.zip#langfuse-agent-eval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
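For a manual install, the extract-and-install steps in the request above could be sketched as follows. This is a sketch under assumptions, not the official installer: the listing does not say where the skill folder sits inside the archive, so the hypothetical `install_skill` helper simply looks for a folder named after the skill anywhere in the zip, and it expects the archive to have been downloaded already.

```python
import zipfile
from pathlib import Path, PurePosixPath


def install_skill(zip_path, skill_name, skills_dir=".claude/skills"):
    """Copy the folder named `skill_name` out of a zip into `skills_dir`.

    Assumption: the archive contains a folder with exactly that name.
    """
    target_root = Path(skills_dir) / skill_name
    extracted = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            relative = parts[parts.index(skill_name) + 1:]
            if not relative or ".." in relative:
                continue  # skip empty or unsafe paths
            destination = target_root.joinpath(*relative)
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            extracted += 1
    if extracted == 0:
        raise FileNotFoundError(f"no folder named {skill_name!r} in {zip_path}")
    return target_root
```

A call such as `install_skill("main.zip", "langfuse-agent-eval")` would then place the files under `.claude/skills/langfuse-agent-eval/` relative to the current directory.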
