langfuse-agent-eval
Community
Evaluate and improve agent performance.
Software Engineering #ai development #failure analysis #performance improvement #experimentation #langfuse #agent evaluation
Author: mberto10
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the evaluation of AI agent performance: it runs experiments, analyzes failures, and documents the findings.
Core Features & Use Cases
- Automated Experimentation: Runs evaluation cycles using Langfuse experiments.
- Failure Analysis: Groups and analyzes agent failures to identify root causes.
- Recommendation Generation: Provides specific, actionable recommendations for improvement.
- Use Case: Improve the accuracy and reliability of a customer support agent by running an evaluation cycle, identifying common failure patterns, and implementing targeted fixes.
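The failure-analysis step above can be pictured with a minimal sketch. It is not the Skill's actual implementation and does not call the Langfuse SDK. It assumes evaluation results have already been exported as plain dictionaries. The field names (`item_id`, `score`, `category`) and the 0.5 pass threshold are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def group_failures(results, threshold=0.5):
    """Group results scoring below `threshold` by category, largest group first.

    `results` is a list of dicts with hypothetical keys:
    item_id (str), score (float), category (str or None).
    """
    groups = defaultdict(list)
    for r in results:
        if r["score"] < threshold:
            # Failures without a label fall into a catch-all bucket.
            label = r.get("category") or "uncategorized"
            groups[label].append(r["item_id"])
    # Sort so the most frequent failure pattern comes first.
    return sorted(groups.items(), key=lambda kv: len(kv[1]), reverse=True)

# Made-up sample data, for illustration only.
results = [
    {"item_id": "t1", "score": 0.2, "category": "wrong_tool"},
    {"item_id": "t2", "score": 0.9, "category": None},
    {"item_id": "t3", "score": 0.1, "category": "wrong_tool"},
    {"item_id": "t4", "score": 0.3, "category": "hallucination"},
]

ranked = group_failures(results)
for label, item_ids in ranked:
    print(f"{label}: {len(item_ids)} failure(s) -> {item_ids}")
```

Ranking groups by size gives a simple way to decide which failure pattern to address first. A real evaluation cycle would likely also weigh severity, not just frequency.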
Quick Start
Use the langfuse agent eval skill to run an evaluation cycle for the 'customer-support-agent'.
Dependency Matrix
Required Modules
- langfuse-experiment-runner
- langfuse-trace-analysis
- langfuse-data-retrieval
- langfuse-score-analytics
Components
- scripts
- references
- assets
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: langfuse-agent-eval
Download link: https://github.com/mberto10/mberto-compound/archive/main.zip#langfuse-agent-eval
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.