llm-debug-test-failures

Official

Debug LLM test failures

Author: Arm-Examples
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill helps developers efficiently diagnose and resolve failing integration tests for Large Language Models (LLMs), pinpointing whether a failure stems from model output, configuration, or a backend regression.

Core Features & Use Cases

  • Reproduce Failing Tests: Re-run specific failing tests with verbose output.
  • Inspect Model Responses: Capture detailed logs of prompts, responses, and runtime parameters for analysis.
  • Validate Configurations: Verify model configuration files, paths, and runtime settings like context size and batch size.
  • Trace Issues: Step through backend integrations (llama.cpp, ONNX Runtime GenAI, MediaPipe, MNN) and upstream framework sources to identify bugs.
  • Use Case: When an llm-cpp-ctest fails due to unexpected model output, use this Skill to rerun the test, capture the exact prompt and response, and inspect the configuration to understand why the output drifted.

Quick Start

Rerun the failing LLM integration tests verbosely from your build directory.

Dependency Matrix

Required Modules

None required

Components

  • scripts
  • references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: llm-debug-test-failures
Download link: https://github.com/Arm-Examples/LLM-Runner/archive/main.zip#llm-debug-test-failures

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
