llm-debug-test-failures (Official)
Debug LLM test failures
Author: Arm-Examples
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps developers efficiently diagnose and resolve failing integration tests for Large Language Models (LLMs), pinpointing issues related to model output, configuration, or backend regressions.
Core Features & Use Cases
- Reproduce Failing Tests: Easily re-run specific failing tests with verbose output.
- Inspect Model Responses: Capture detailed logs of prompts, responses, and runtime parameters for analysis.
- Validate Configurations: Verify model configuration files, paths, and runtime settings like context size and batch size.
- Trace Issues: Step through backend integrations (llama.cpp, ONNX Runtime GenAI, MediaPipe, MNN) and upstream framework sources to identify bugs.
- Use Case: When an llm-cpp ctest fails due to unexpected model output, use this Skill to rerun the test, capture the exact prompt and response, and inspect the configuration to understand why the output drifted.
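To capture the exact prompt, response, and runtime parameters described above, a thin logging wrapper around the model call is often enough. The sketch below is a minimal illustration, not the Skill's actual implementation; `generate` stands in for whatever callable your backend exposes, and the field names in the log record are assumptions.

```python
import json
import logging
import time

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("llm-debug")

def logged_generate(generate, prompt, **params):
    """Call a hypothetical `generate(prompt, **params)` function and log
    the prompt, runtime parameters, latency, and response as one JSON
    record, so a failing test's exact inputs and outputs can be replayed."""
    start = time.perf_counter()
    response = generate(prompt, **params)
    log.debug(json.dumps({
        "prompt": prompt,
        "params": params,
        "latency_s": round(time.perf_counter() - start, 3),
        "response": response,
    }))
    return response

# Usage with a stand-in "model" that just echoes in upper case:
echo = lambda p, **kw: p.upper()
result = logged_generate(echo, "hello", temperature=0.0)
```

Emitting one JSON object per call keeps the log machine-parseable, so drifted outputs can be diffed against a known-good run.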
Quick Start
Rerun the failing LLM integration tests verbosely from your build directory.
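Assuming the project uses CTest (the repo's llama.cpp backend suggests a CMake build), re-running the failing tests verbosely comes down to `ctest --rerun-failed --output-on-failure`, optionally filtering by name with `-R`. The helper below just assembles that command; the default `build` directory and the `llm` name pattern are assumptions to adjust for your tree.

```python
import subprocess

def rerun_failed_llm_tests(build_dir="build", pattern=None):
    """Build a ctest command line that re-runs tests with verbose output.

    With no pattern, only the previously failed tests are re-run; with a
    pattern, tests matching the regex are selected and run very verbosely.
    """
    cmd = ["ctest", "--test-dir", build_dir, "--output-on-failure"]
    if pattern:
        cmd += ["-R", pattern, "-VV"]   # select by name regex, very verbose
    else:
        cmd += ["--rerun-failed"]       # only tests that failed last run
    return cmd

# Usage (uncomment to actually run from your source tree):
# subprocess.run(rerun_failed_llm_tests(pattern="llm"), check=True)
```

Equivalently, from inside the build directory you can run `ctest --rerun-failed --output-on-failure` directly.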
Dependency Matrix
Required Modules: None required
Components: scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: llm-debug-test-failures
Download link: https://github.com/Arm-Examples/LLM-Runner/archive/main.zip#llm-debug-test-failures
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.