advanced-evaluation


Build reliable LLM evaluation systems.

Author: 466852675
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill addresses the challenge of reliably evaluating LLM outputs, ensuring quality and mitigating biases inherent in automated assessment.

Core Features & Use Cases

  • LLM-as-a-Judge: Implement advanced techniques for using LLMs to evaluate other LLM responses.
  • Bias Mitigation: Actively counter position bias, length bias, and other systematic errors in evaluation.
  • Rubric Generation: Create structured, domain-specific rubrics for consistent scoring.
  • Use Case: You need to compare two AI-generated summaries of a news article. This Skill helps you set up a robust evaluation process to determine which summary is superior based on criteria like accuracy, conciseness, and clarity, while accounting for potential biases.
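The LLM-as-a-Judge and bias-mitigation features above can be sketched as a small scaffold. This is a minimal illustration, not the Skill's actual implementation: `judge_fn` is a hypothetical callable standing in for a real LLM call, and position bias is countered by judging twice with the candidate order swapped and only accepting agreeing verdicts.

```python
def debiased_compare(prompt, answer_a, answer_b, judge_fn):
    """Return 'A', 'B', or 'tie' after order-swapped double judging.

    judge_fn(prompt, first, second) -> 'A' (prefers first) or 'B'
    (prefers second); in practice this would wrap an LLM call.
    """
    first = judge_fn(prompt, answer_a, answer_b)    # original order
    second = judge_fn(prompt, answer_b, answer_a)   # swapped order
    # Map the swapped verdict back onto the original labels.
    second_unswapped = {"A": "B", "B": "A"}[second]
    if first == second_unswapped:
        return first
    return "tie"  # disagreement signals position bias


# Deterministic stand-in judge for demonstration: prefers the longer
# answer (a known length bias, but position-invariant, so the two
# passes agree).
def length_judge(prompt, first_answer, second_answer):
    return "A" if len(first_answer) >= len(second_answer) else "B"


print(debiased_compare("Summarize.", "short", "a much longer answer",
                       length_judge))  # → B
```

A judge that always prefers whichever answer appears first would disagree with itself across the two passes and correctly yield `tie`.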

Quick Start

Use the advanced-evaluation skill to compare two model responses based on accuracy and clarity.
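One way to combine per-criterion judgments like accuracy and clarity is a weighted rubric. The sketch below assumes 1–5 scores per criterion have already been collected from a judge model; the criteria names and weights are illustrative, not part of the Skill's interface.

```python
# Illustrative rubric: weights sum to 1.0.
RUBRIC = {"accuracy": 0.6, "clarity": 0.4}


def weighted_score(scores, rubric=RUBRIC):
    """Combine per-criterion 1-5 scores into one weighted score."""
    return sum(rubric[criterion] * scores[criterion] for criterion in rubric)


summary_a = {"accuracy": 5, "clarity": 3}
summary_b = {"accuracy": 4, "clarity": 5}
print(weighted_score(summary_a))  # 0.6*5 + 0.4*3 = 4.2
print(weighted_score(summary_b))  # 0.6*4 + 0.4*5 = 4.4
```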

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: advanced-evaluation
Download link: https://github.com/466852675/TISHICIKU-2025/archive/main.zip#advanced-evaluation

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
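For reference, the manual equivalent of those steps might look like the sketch below. It assumes `curl` and `unzip` are available, and that the GitHub archive extracts to a `TISHICIKU-2025-main/` directory containing an `advanced-evaluation` subdirectory (standard GitHub zipball naming; the subdirectory layout is an assumption).

```shell
# Manual install sketch; Claude Code performs these steps for you
# when given the prompt above.
curl -L -o skill.zip \
  "https://github.com/466852675/TISHICIKU-2025/archive/main.zip"
unzip skill.zip
mkdir -p ~/.claude/skills
# Assumed layout: the skill lives in a subdirectory of the archive.
cp -r TISHICIKU-2025-main/advanced-evaluation ~/.claude/skills/
```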
