ref-hallucination-arena

Official

Benchmark LLM reference accuracy

Author: agentscope-ai
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill addresses a critical failure mode of Large Language Models (LLMs): fabricating academic references. By checking citations against authoritative databases, it helps ensure the reliability of AI-generated literature reviews and bibliographies.

Core Features & Use Cases

  • Comprehensive Verification: Validates cited papers against major academic databases (Crossref, PubMed, arXiv, DBLP).
  • Detailed Metrics: Measures hallucination rate, per-field accuracy (title, author, year, DOI), and discipline-specific performance.
  • Use Case: When evaluating an LLM's ability to generate a literature review, use this Skill to automatically check if all the provided citations are real and accurate, quantifying its tendency to hallucinate.
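As a rough illustration of how the metrics above fit together, here is a minimal sketch of computing a hallucination rate and per-field accuracy from per-citation verification results. The function name and result schema are assumptions for illustration, not the Skill's actual API.

```python
# Illustrative sketch (not the Skill's real interface): each verification
# result records whether the cited paper was found in any database, and
# which cited fields matched the authoritative record.

def score_citations(results):
    """results: list of dicts like
    {"found": bool, "fields": {"title": bool, "author": bool, "year": bool, "doi": bool}}
    Returns the hallucination rate and per-field accuracy over found papers."""
    n = len(results)
    hallucinated = sum(1 for r in results if not r["found"])
    rate = hallucinated / n if n else 0.0

    # Per-field accuracy is computed only over citations that resolved to
    # a real paper; fully fabricated references count toward the rate above.
    field_hits, field_totals = {}, {}
    for r in results:
        if not r["found"]:
            continue
        for field, ok in r["fields"].items():
            field_totals[field] = field_totals.get(field, 0) + 1
            field_hits[field] = field_hits.get(field, 0) + (1 if ok else 0)
    per_field = {f: field_hits[f] / field_totals[f] for f in field_totals}

    return {"hallucination_rate": rate, "per_field_accuracy": per_field}
```

For example, one real citation with a wrong year plus one fabricated citation yields a 50% hallucination rate and 0% year accuracy.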

Quick Start

Evaluate an LLM's reference-recommendation accuracy by running the benchmark with the provided configuration file.
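To make the verification step concrete, here is a hedged sketch of checking one cited DOI against Crossref's public REST API (`https://api.crossref.org/works/<DOI>`). The helper names and the loose title-matching rule are illustrative assumptions; the Skill's actual matching logic may differ.

```python
# Sketch only: a DOI that does not resolve in Crossref is a strong signal
# the reference is fabricated; a resolved record lets us compare fields.
import json
import urllib.error
import urllib.request


def fetch_crossref(doi):
    """Return Crossref metadata for a DOI, or None if it does not resolve."""
    url = f"https://api.crossref.org/works/{doi}"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return json.load(resp)["message"]
    except urllib.error.HTTPError:
        return None


def titles_match(cited_title, record):
    """Loose comparison of a cited title against the Crossref record:
    case- and whitespace-insensitive exact match."""
    norm = lambda s: " ".join(s.lower().split())
    return any(norm(cited_title) == norm(t) for t in record.get("title", []))
```

Usage would look like `record = fetch_crossref("10.1000/example")` followed by `titles_match(cited_title, record)` when the record is not `None`; analogous lookups would hit PubMed, arXiv, and DBLP for their respective disciplines.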

Dependency Matrix

Required Modules

matplotlib

Components

scripts/references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Simply copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: ref-hallucination-arena
Download link: https://github.com/agentscope-ai/OpenJudge/archive/main.zip#ref-hallucination-arena

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
