deepeval

Official

Evaluate LLM performance and compliance.

Author: DTMC-marketplace
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill addresses the need to rigorously evaluate the performance and compliance of Large Language Models (LLMs) against regulatory standards and to identify potential risks before deployment.

Core Features & Use Cases

  • LLM Evaluation: Test LLMs for hallucination, toxicity, bias, and answer relevancy.
  • Compliance Assessment: Evaluate AI systems against EU AI Act Article 15 requirements.
  • Risk Mitigation: Implement controls and monitoring for AI performance risks.
  • Use Case: A development team is building an AI chatbot for customer service. They use this Skill to ensure the chatbot's responses are accurate, unbiased, and do not violate any regulatory guidelines before deployment.
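To make the evaluation idea concrete, here is a minimal, self-contained sketch of a metric-style check in the spirit of what this Skill automates. The class and function names (`LLMTestCase`, `keyword_relevancy`) are illustrative assumptions, not deepeval's actual API, and a keyword-overlap score is a deliberately crude stand-in for an LLM-judged relevancy metric:

```python
# Hypothetical sketch of an answer-relevancy check. The names below are
# illustrative; real deepeval metrics use an evaluator model, not keywords.
from dataclasses import dataclass


@dataclass
class LLMTestCase:
    input: str
    actual_output: str
    expected_keywords: list


def keyword_relevancy(case: LLMTestCase) -> float:
    """Fraction of expected keywords that appear in the model's answer."""
    answer = case.actual_output.lower()
    if not case.expected_keywords:
        return 1.0
    hits = sum(1 for kw in case.expected_keywords if kw.lower() in answer)
    return hits / len(case.expected_keywords)


case = LLMTestCase(
    input="What is our refund policy?",
    actual_output="Refunds are issued within 30 days of purchase.",
    expected_keywords=["refund", "30 days"],
)
print(keyword_relevancy(case))  # 1.0
```

In a real run, the Skill would drive deepeval's own metrics (hallucination, toxicity, bias, answer relevancy) against a suite of such test cases rather than this toy scorer.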

Quick Start

Use the deepeval skill to evaluate the LLM for compliance with EU AI Act Article 15 requirements.
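A compliance assessment of this kind typically reduces to gating deployment on per-metric thresholds. The sketch below shows that pattern; the metric names, threshold values, and their mapping to Article 15 concerns are illustrative assumptions, not requirements taken from the regulation or from deepeval:

```python
# Hypothetical deployment gate: every metric must meet its threshold.
# Metric names and threshold values are illustrative assumptions.
THRESHOLDS = {
    "answer_relevancy": 0.8,  # higher is better (accuracy concerns)
    "hallucination": 0.1,     # lower is better
    "toxicity": 0.05,         # lower is better
}
LOWER_IS_BETTER = {"hallucination", "toxicity"}


def passes_compliance(scores: dict) -> bool:
    """Return True only if every metric satisfies its threshold."""
    for metric, threshold in THRESHOLDS.items():
        score = scores[metric]
        ok = score <= threshold if metric in LOWER_IS_BETTER else score >= threshold
        if not ok:
            return False
    return True


print(passes_compliance(
    {"answer_relevancy": 0.9, "hallucination": 0.02, "toxicity": 0.0}
))  # True
```

Wiring the actual deepeval metric scores into a gate like this is what turns a one-off evaluation into a repeatable pre-deployment control.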

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: deepeval
Download link: https://github.com/DTMC-marketplace/governance/archive/main.zip#deepeval

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
