Model Evaluation Patterns
Ensure model quality and fairness.
Category: Data & Analytics
Tags: calibration, fairness, model evaluation, LLM evaluation, classification metrics, sliced evaluation
Author: HermeticOrmus
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill evaluates machine learning models beyond simple accuracy, checking that they are robust, fair, and performant across a range of scenarios.
Core Features & Use Cases
- Comprehensive Metrics: Generates a full suite of classification metrics with confidence intervals.
- Calibration Analysis: Assesses if model probabilities accurately reflect true likelihoods.
- Sliced Evaluation: Identifies performance disparities across different data subgroups.
- Fairness Metrics: Quantifies bias related to sensitive attributes.
- LLM Evaluation: Integrates BERTScore and RAGAS for generative model assessment.
- Use Case: After training a churn prediction model, use this Skill to generate a complete evaluation report, check its calibration, and verify it doesn't unfairly penalize specific customer segments.
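The sliced-evaluation idea above can be sketched in a few lines: compute a metric per subgroup and report the worst-case gap. This is an illustrative sketch using only NumPy, not the Skill's actual implementation; the function name and toy data are made up for the example.

```python
import numpy as np

def sliced_accuracy(y_true, y_pred, groups):
    """Accuracy per subgroup, plus the worst-case gap between groups."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    per_group = {str(g): float(np.mean(y_pred[groups == g] == y_true[groups == g]))
                 for g in np.unique(groups)}
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap

# Toy data: group "b" is predicted far less accurately than group "a".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
per_group, gap = sliced_accuracy(y_true, y_pred, groups)
print(per_group)  # {'a': 1.0, 'b': 0.25}
print(gap)        # 0.75
```

A large gap like this is exactly the kind of disparity the churn-model use case above is meant to surface; the listed `fairlearn` dependency offers richer group metrics along the same lines.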
Quick Start
Ask the Skill to generate a full evaluation report for your test data, supplying the true labels, predicted labels, and predicted probabilities.
Dependency Matrix
Required Modules
scikit-learn, fairlearn, evaluate, ragas, datasets, numpy, pandas, matplotlib, scipy
Components
scripts, references
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill: Name: Model Evaluation Patterns Download link: https://github.com/HermeticOrmus/LibreMLOps-Claude-Code/archive/main.zip#model-evaluation-patterns Please download this .zip file, extract it, and install it in the .claude/skills/ directory.