evaluate-rag

Community

Evaluate RAG pipeline quality.

Author: hamelsmu
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill provides a structured approach to evaluating Retrieval-Augmented Generation (RAG) systems, assessing the retrieval and generation components separately.

Core Features & Use Cases

  • Component-wise Evaluation: Separates the assessment of retrieval quality (e.g., Recall@k, MRR) from generation quality (faithfulness, relevance).
  • Dataset Generation: Offers methods for creating evaluation datasets, including manual curation and synthetic QA pair generation.
  • Chunking Optimization: Guides users on tuning chunking strategies (size, overlap, content-awareness) to improve retrieval.
  • Use Case: Debugging a RAG system that returns irrelevant information or generates factually incorrect answers by pinpointing whether the issue lies in document retrieval or the LLM's response generation.
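The retrieval metrics named in the Component-wise Evaluation feature can be sketched as follows. This is a minimal illustration, assuming retrieved results and relevance judgments are plain lists and sets of document IDs; it is not the skill's actual interface.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant document (0 if none found)."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


# Example: a query whose relevant documents are d2 and d7
retrieved = ["d1", "d2", "d5", "d7", "d9"]
relevant = {"d2", "d7"}
print(recall_at_k(retrieved, relevant, k=5))  # 1.0 -- both found in the top 5
print(mrr(retrieved, relevant))               # 0.5 -- first hit at rank 2
```

Averaging these per-query scores over an evaluation set gives the aggregate figures used to compare retriever configurations.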

Quick Start

Use the evaluate-rag skill to assess the retrieval quality of the RAG pipeline using the Recall@5 metric.
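The chunk size and overlap parameters tuned by the Chunking Optimization feature can be sketched with a simple fixed-size chunker. This is an illustrative assumption about how such a splitter works; real content-aware strategies (sentence or section boundaries) are more sophisticated.

```python
def chunk_text(text, size=500, overlap=50):
    """Split text into overlapping fixed-size character chunks."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
    return chunks


doc = "x" * 1200
print([len(c) for c in chunk_text(doc, size=500, overlap=50)])
# [500, 500, 300]
```

Re-running the retrieval evaluation after varying `size` and `overlap` shows which configuration maximizes metrics such as Recall@5 on your dataset.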

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: evaluate-rag
Download link: https://github.com/hamelsmu/evals-skills/archive/main.zip#evaluate-rag

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
