sft-data-format
Community
Clarify data formats and pipeline metadata.
Author: HsunGong
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps teams validate and document data formats and metadata structures used in the SFT pipeline, reducing integration errors and improving reproducibility.
Core Features & Use Cases
- Data format validation: Ensures input/output data follow the JSON Lines convention and the stage metadata structure.
- Resume key handling: Verifies that idx is used consistently as the resume key across pipeline steps.
- Metadata documentation: Produces clear metadata schemas and examples for downstream components.
- Use Case: When ingesting datasets into the SFT pipeline, run this skill to confirm formatting, keys, and metadata are consistent before processing.
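The checks listed above can be sketched roughly as follows. This is a minimal illustration and not the Skill's actual implementation: the function names `validate_jsonl` and `pending_records` are hypothetical, and it assumes each record is a JSON object whose `idx` is a unique integer or string, and that resuming means skipping any `idx` already present in a step's output.

```python
import json


def validate_jsonl(lines, resume_key="idx"):
    """Return a list of (line_number, message) problems found in JSON Lines input."""
    problems = []
    seen = set()
    for lineno, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            # Blank lines are skipped here; a stricter check could reject them.
            continue
        try:
            record = json.loads(text)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        if resume_key not in record:
            problems.append((lineno, f"missing '{resume_key}'"))
            continue
        key = record[resume_key]
        # Assumption: resume keys are ints or strings (bool is excluded explicitly).
        if isinstance(key, bool) or not isinstance(key, (int, str)):
            problems.append((lineno, f"'{resume_key}' is not an int or string"))
            continue
        if key in seen:
            problems.append((lineno, f"duplicate '{resume_key}': {key!r}"))
            continue
        seen.add(key)
    return problems


def pending_records(input_lines, output_lines, resume_key="idx"):
    """Return input records whose resume key does not yet appear in the output."""
    done = {json.loads(line)[resume_key] for line in output_lines if line.strip()}
    records = (json.loads(line) for line in input_lines if line.strip())
    return [record for record in records if record[resume_key] not in done]
```

Running `validate_jsonl` over an open file handle before a stage starts gives a list of line-level problems to fix. `pending_records` shows how a step could pick up where it left off once the input is known to be valid.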
Quick Start
Use the sft-data-format skill to validate your dataset's JSON Lines formatting, idx resume keys, and think tags.
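The listing does not define what a "think tag" is. As a rough sketch, assuming it means literal `<think>` and `</think>` markers inside a text field, a balance check might look like the following. The function name `think_tags_balanced` and the no-nesting rule are assumptions made for illustration.

```python
import re

# Assumption: think tags are the literal strings <think> and </think>.
_THINK_TAG = re.compile(r"</?think>")


def think_tags_balanced(text):
    """Return True if every <think> is closed by a later </think>, without nesting."""
    inside = False
    for match in _THINK_TAG.finditer(text):
        if match.group() == "<think>":
            if inside:
                return False  # nested opening tag
            inside = True
        else:
            if not inside:
                return False  # closing tag with no matching opener
            inside = False
    return not inside  # an unclosed <think> at the end is also an error
```

Text with no think tags at all passes this check. Whether that is acceptable depends on the pipeline stage, which the listing does not specify.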
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: sft-data-format
Download link: https://github.com/HsunGong/prep/archive/main.zip#sft-data-format
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.