generate-test-data
Community
Generate diverse synthetic test datasets
Author: breethomas
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill produces diverse, realistic synthetic inputs that surface failure modes in LLM pipelines when real data is sparse or unrepresentative. It replaces naive random generation with structured coverage of hard cases.
Core Features & Use Cases
- Dimension-based Tuple Generation: Define axes of variation (dimensions) and produce combinatorial tuples that target anticipated failures rather than arbitrary variation.
- PM Validation and Iteration: Collaborate with a product manager to draft and refine an initial set of tuples so generated cases reflect real-world scenarios.
- LLM-driven Expansion, Conversion, and Filtering: Expand validated tuples with an LLM, convert tuples into naturalistic user queries, filter for realism, and execute queries through the full pipeline to capture traces for analysis.
- Use Case Example: For a customer support chatbot, define dimensions like query type, user expertise, and complexity; draft 20 tuples, expand and convert them into realistic queries, filter out low-quality prompts, and run the resulting set through the system to produce ~100 diverse traces for error analysis.
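The dimension-and-tuple idea from the list above can be sketched in a few lines of Python. This is only an illustration, not the skill's own implementation: the dimension values, `DIMENSIONS`, `all_tuples`, and `sample_tuples` are hypothetical names chosen for the customer support example. Random sampling here only produces a first draft; the draft is meant to be reviewed and reshaped toward anticipated failures.

```python
import itertools
import random

# Hypothetical dimensions for a customer support chatbot (illustrative values only).
DIMENSIONS = {
    "query_type": ["billing", "technical issue", "account access"],
    "user_expertise": ["novice", "intermediate", "expert"],
    "complexity": ["simple", "multi-step", "ambiguous"],
}


def all_tuples(dimensions):
    """Return every combination of dimension values as a list of dicts."""
    names = list(dimensions)
    return [
        dict(zip(names, combo))
        for combo in itertools.product(*dimensions.values())
    ]


def sample_tuples(dimensions, n, seed=0):
    """Draw up to n distinct tuples as a starting draft for human review."""
    pool = all_tuples(dimensions)
    rng = random.Random(seed)  # seeded so the draft is reproducible
    return rng.sample(pool, min(n, len(pool)))


# Three dimensions with three values each give 27 combinations; draft 20 of them.
draft = sample_tuples(DIMENSIONS, 20)
```

A reviewer would then drop implausible combinations from `draft` and add hand-written tuples for the failure cases the team already suspects.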
Quick Start
Generate 20 dimension-based tuples for a customer support chatbot, convert each tuple into a realistic user query, filter for realism, and run them through the full LLM pipeline.
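As a rough sketch of the convert, filter, and run steps, the loop below takes the LLM, the realism filter, and the pipeline as plain callables, so no particular model API is assumed. All names here (`build_conversion_prompt`, `generate_traces`, and the stub functions) are hypothetical and exist only for illustration.

```python
from typing import Callable, Dict, List


def build_conversion_prompt(t: Dict[str, str]) -> str:
    """Turn one dimension tuple into an instruction for a query-writing LLM."""
    return (
        "Write one realistic message a user might send to a customer "
        "support chatbot.\n"
        f"Query type: {t['query_type']}\n"
        f"User expertise: {t['user_expertise']}\n"
        f"Complexity: {t['complexity']}\n"
    )


def generate_traces(
    tuples: List[Dict[str, str]],
    llm: Callable[[str], str],
    is_realistic: Callable[[str], bool],
    pipeline: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Convert tuples to queries, drop unrealistic ones, and record traces."""
    traces = []
    for t in tuples:
        query = llm(build_conversion_prompt(t)).strip()
        if not is_realistic(query):
            continue  # filtered out before reaching the pipeline
        traces.append({"tuple": t, "query": query, "response": pipeline(query)})
    return traces


# Stub callables standing in for a real model, filter, and pipeline.
def fake_llm(prompt: str) -> str:
    return "  my invoice is wrong, help  " if "billing" in prompt else ""


def fake_filter(query: str) -> bool:
    return len(query.split()) >= 3


def fake_pipeline(query: str) -> str:
    return f"[stub response to: {query}]"


demo = generate_traces(
    [
        {"query_type": "billing", "user_expertise": "novice", "complexity": "simple"},
        {"query_type": "technical issue", "user_expertise": "expert", "complexity": "ambiguous"},
    ],
    fake_llm,
    fake_filter,
    fake_pipeline,
)
```

Keeping the source tuple next to each query and response makes it easier to see, during error analysis, which dimension values tend to co-occur with failures.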
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install the skill automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill: Name: generate-test-data Download link: https://github.com/breethomas/bette-think/archive/main.zip#generate-test-data Please download this .zip file, extract it, and install it in the .claude/skills/ directory.