reshard-c4-data
Community
Reversible data resharding with constraints.
Category: Software Engineering
Tags: constraint management, data compression, data sharding, data decompression, dataset reorganization, hierarchical structure
Author: Zurybr
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill reorganizes large datasets into smaller shards under hard constraints, such as a maximum number of files per directory and a maximum file size, while keeping the transformation reversible so that the original data can be reconstructed exactly.
Core Features & Use Cases
- Hierarchical Sharding: Implements nested directory structures to meet item count limits at every level.
- Size Constraint Handling: Splits large files into manageable chunks, tracking them via metadata.
- Reversible Transformation: Guarantees that compressed data can be perfectly decompressed back to its original state.
- Use Case: When preparing a massive dataset for distributed training, you need to ensure no single directory exceeds 1000 files and no individual file is larger than 100MB. This Skill provides the logic to achieve this while maintaining a full audit trail for reconstruction.
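The features above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Skill's actual scripts: the function names `shard_path`, `split_file`, and `join_file`, the `.bin` chunk naming, and the manifest fields are all hypothetical. It splits one file into size-limited chunks, places them in a nested layout that respects a per-directory item limit, and verifies reconstruction with a SHA-256 checksum.

```python
import hashlib
from pathlib import Path


def shard_path(index: int, max_items: int, depth: int) -> Path:
    """Map a flat chunk index to a nested relative path (hypothetical layout).

    Every directory level holds at most `max_items` entries, so `depth`
    directory levels can address max_items ** (depth + 1) chunks.
    """
    parts = []
    for _ in range(depth + 1):
        index, rem = divmod(index, max_items)
        parts.append(f"{rem:04d}")
    if index:
        raise ValueError("index too large for the given depth")
    *dirs, leaf = reversed(parts)
    return Path(*dirs, f"{leaf}.bin")


def split_file(src: Path, out_root: Path, max_bytes: int,
               max_items: int, depth: int = 1) -> dict:
    """Split `src` into chunks of at most `max_bytes` and return a manifest."""
    digest = hashlib.sha256()
    chunks = []
    with src.open("rb") as f:
        index = 0
        while True:
            block = f.read(max_bytes)
            if not block:
                break
            digest.update(block)
            rel = shard_path(index, max_items, depth)
            dest = out_root / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(block)
            chunks.append(rel.as_posix())  # chunk order is the audit trail
            index += 1
    return {"name": src.name, "sha256": digest.hexdigest(), "chunks": chunks}


def join_file(manifest: dict, out_root: Path, dest: Path) -> None:
    """Rebuild the original file from its chunks and verify the checksum."""
    digest = hashlib.sha256()
    with dest.open("wb") as out:
        for rel in manifest["chunks"]:
            block = (out_root / rel).read_bytes()
            digest.update(block)
            out.write(block)
    if digest.hexdigest() != manifest["sha256"]:
        raise ValueError("reconstructed file does not match original checksum")
```

With `max_items=30` and `depth=1`, this layout can address 30 ** 2 = 900 chunks; a larger dataset would need a greater `depth`. The sketch returns the manifest instead of writing it to disk, because a manifest file stored inside the output tree would itself count against the per-directory limit.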
Quick Start
Use the reshard-c4-data skill to reorganize the dataset located at '/data/raw' into a new structure at '/data/processed' ensuring no directory has more than 30 files and no file exceeds 15MB.
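After a run like the one requested above, the output tree can be checked independently. The following is a hedged sketch, not part of the Skill itself: `find_violations` is a hypothetical helper, and it assumes that subdirectories count toward the per-directory item limit and that "15MB" means 15 * 1024 * 1024 bytes.

```python
import os
from pathlib import Path


def find_violations(root: Path, max_items: int, max_bytes: int) -> list:
    """Return a description of every directory or file that breaks a limit."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: files and subdirectories both count as items.
        count = len(dirnames) + len(filenames)
        if count > max_items:
            problems.append(f"{dirpath}: {count} entries (limit {max_items})")
        for name in filenames:
            path = Path(dirpath) / name
            size = path.stat().st_size
            if size > max_bytes:
                problems.append(f"{path}: {size} bytes (limit {max_bytes})")
    return problems


if __name__ == "__main__":
    # Limits taken from the Quick Start prompt; the byte interpretation is assumed.
    for line in find_violations(Path("/data/processed"), 30, 15 * 1024 * 1024):
        print(line)
```

An empty result means both constraints hold at every level of the tree.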
Dependency Matrix
Required Modules
None required
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: reshard-c4-data
Download link: https://github.com/Zurybr/lefarma-skills/archive/main.zip#reshard-c4-data
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.