eval-rag

Community

Evaluate RAG retrieval and generation precisely

Author: breethomas
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Diagnose and quantify where a retrieval-augmented generation (RAG) pipeline fails by separating retrieval quality from generation faithfulness, so teams can prioritize the highest-impact fixes instead of optimizing the wrong component.

Core Features & Use Cases

  • Retrieval metrics: Compute Recall@k, Precision@k, MRR, and NDCG@k to measure whether the system finds the right document chunks for a query.
  • Generation evaluation: Assess faithfulness, omissions, misinterpretations, and relevance of model outputs given retrieved context.
  • Optimization guidance: Build retrieval evaluation datasets, tune chunking and overlap, run grid searches, diagnose multi-hop failures, and produce prioritized engineering recommendations.
  • Use Case: A PM wants to know whether customer-support answers are failing because the vector store misses FAQ paragraphs or because the LLM hallucinates; this skill yields metrics, diagnostic tables, and next-step fixes.
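The retrieval metrics listed above can be sketched in plain Python, assuming binary relevance labels (a chunk is either relevant to the query or not). The function names and sample data are illustrative, not part of this skill's interface:

```python
import math

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant chunks that appear in the top-k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k results that are relevant."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / k

def mrr(retrieved, relevant):
    """Reciprocal rank of the first relevant result (0 if none found)."""
    for rank, doc in enumerate(retrieved, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(retrieved, relevant, k):
    """Normalized discounted cumulative gain with binary gains."""
    dcg = sum(1.0 / math.log2(i + 2)
              for i, doc in enumerate(retrieved[:k]) if doc in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg else 0.0

# Hypothetical ranked retrieval for one query, with two labeled-relevant chunks.
retrieved = ["faq_3", "pricing_1", "faq_7", "blog_2"]
relevant = {"faq_3", "faq_7"}
```

In practice these are averaged over a labeled evaluation set of (query, relevant-chunks) pairs; a low Recall@k with high faithfulness points at the retriever, while the reverse points at the generator.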

Quick Start

Evaluate the RAG pipeline for the customer knowledge base and return retrieval metrics (Recall@k, Precision@k, MRR, NDCG@k), a faithfulness/relevance summary of generation failures, and recommended fixes.

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: eval-rag
Download link: https://github.com/breethomas/bette-think/archive/main.zip#eval-rag

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
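For a manual install, the steps above translate to roughly the following shell commands. The URL comes from the listing; the archive's top-level folder name (`bette-think-main`) and the skill's subdirectory are assumptions about the repository layout:

```shell
# Sketch of a manual install; folder names inside the archive are assumed.
REPO_ZIP="https://github.com/breethomas/bette-think/archive/main.zip"
SKILL_DIR=".claude/skills/eval-rag"

mkdir -p "$(dirname "$SKILL_DIR")"
curl -fsSL -o /tmp/eval-rag.zip "$REPO_ZIP" \
  && unzip -qo /tmp/eval-rag.zip -d /tmp/eval-rag-src \
  && cp -r /tmp/eval-rag-src/bette-think-main/eval-rag "$SKILL_DIR"
```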
