evaluate-rag
Community · Evaluate RAG pipeline quality.
Author: hamelsmu
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill addresses the challenge of evaluating Retrieval-Augmented Generation (RAG) systems by providing a structured approach to assessing both the retrieval and generation components independently.
Core Features & Use Cases
- Component-wise Evaluation: Separates the assessment of retrieval quality (e.g., Recall@k, MRR) from generation quality (faithfulness, relevance).
- Dataset Generation: Offers methods for creating evaluation datasets, including manual curation and synthetic QA pair generation.
- Chunking Optimization: Guides users on tuning chunking strategies (size, overlap, content-awareness) to improve retrieval.
- Use Case: Debugging a RAG system that returns irrelevant information or generates factually incorrect answers by pinpointing whether the issue lies in document retrieval or the LLM's response generation.
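The retrieval metrics named above (Recall@k and MRR) are standard and easy to compute once you have, for each query, the ranked list of retrieved document IDs and the set of known-relevant IDs. A minimal sketch (the document-ID values are illustrative, not from the skill itself):

```python
def recall_at_k(retrieved_ids, relevant_ids, k=5):
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & set(relevant_ids)) / len(relevant_ids)


def mrr(retrieved_ids, relevant_ids):
    """Reciprocal rank of the first relevant document; 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


# Example: both relevant docs appear in the top 5, the first at rank 2.
retrieved = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2"}
print(recall_at_k(retrieved, relevant, k=5))  # → 1.0
print(mrr(retrieved, relevant))               # → 0.5
```

Averaging these per-query scores over the whole evaluation set gives the pipeline-level retrieval numbers.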
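For the synthetic QA-pair generation mentioned above, one common pattern is to build a prompt per document chunk and send each to an LLM. The skill does not specify a prompt template, so the one below is a hypothetical example; the actual LLM call is left to whatever client you use:

```python
# Hypothetical prompt template for synthetic QA generation (not from the skill).
QA_PROMPT = """Read the passage below and write one factual question that it answers,
followed by the answer, in the format:
Q: <question>
A: <answer>

Passage:
{passage}"""


def build_qa_prompts(chunks):
    """Return one QA-generation prompt per document chunk.

    Each resulting (prompt, chunk) pair can be sent to an LLM; the source chunk
    then serves as the gold retrieval target for the generated question.
    """
    return [QA_PROMPT.format(passage=chunk) for chunk in chunks]


prompts = build_qa_prompts(["Paris is the capital of France."])
print(len(prompts))  # → 1
```

Keeping the source chunk alongside each generated question is what lets the same dataset score retrieval (was the source chunk retrieved?) and generation (does the answer match?).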
Quick Start
Use the evaluate-rag skill to assess the retrieval quality of the RAG pipeline using the Recall@5 metric.
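The chunking parameters the skill helps you tune (size and overlap) can be illustrated with a minimal fixed-size character chunker; this is a sketch for experimentation, not the skill's own implementation, and it ignores the content-aware strategies the skill also covers:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks, with `overlap` characters
    shared between consecutive chunks so context is not cut mid-thought."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Re-running the Recall@5 evaluation after each change to `chunk_size` or `overlap` is the feedback loop the skill structures for you.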
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: let Claude install it automatically. Simply copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: evaluate-rag
Download link: https://github.com/hamelsmu/evals-skills/archive/main.zip#evaluate-rag
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.