reshard-c4-data

Community

Reversible data resharding with constraints.

Author: Zurybr
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill addresses the challenge of reorganizing large datasets into smaller, manageable shards while strictly adhering to constraints like maximum files per directory and maximum file size, ensuring perfect data reconstruction.

Core Features & Use Cases

  • Hierarchical Sharding: Implements nested directory structures to meet item count limits at every level.
  • Size Constraint Handling: Splits large files into manageable chunks, tracking them via metadata.
  • Reversible Transformation: Guarantees that the resharded data can be reconstructed exactly into its original state.
  • Use Case: When preparing a massive dataset for distributed training, you need to ensure no single directory exceeds 1000 files and no individual file is larger than 100MB. This Skill provides the logic to achieve this while maintaining a full audit trail for reconstruction.
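The features above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Skill's bundled scripts: the function names shard_path, reshard, and reconstruct, the manifest.json filename and its fields, and the d<N>/chunk_<N>.bin layout are all hypothetical. The sketch splits every file into chunks of at most max_bytes, places the chunks in a nested tree whose directories never hold more than max_items entries, and records enough metadata to rebuild the originals.

```python
import hashlib
import json
from pathlib import Path


def shard_path(index: int, total: int, fanout: int) -> Path:
    """Map chunk number `index` (out of `total`) to a nested relative path
    in which every directory holds at most `fanout` entries."""
    # Smallest depth such that fanout ** depth can address every chunk.
    depth = 1
    while fanout ** depth < total:
        depth += 1
    # Write `index` in base `fanout`, most significant digit first.
    digits = []
    n = index
    for _ in range(depth):
        digits.append(n % fanout)
        n //= fanout
    digits.reverse()
    *dirs, leaf = digits
    return Path(*[f"d{d}" for d in dirs]) / f"chunk_{leaf}.bin"


def reshard(src: Path, dst: Path, max_items: int, max_bytes: int) -> None:
    """Copy every file under `src` into `dst` as size-limited chunks."""
    if max_items < 3 or max_bytes < 1:
        raise ValueError("need max_items >= 3 and max_bytes >= 1")
    fanout = max_items - 1  # reserve one slot in the root for the manifest
    dst.mkdir(parents=True, exist_ok=True)

    # Plan every chunk first so the tree depth is known up front.
    plan = []
    for f in sorted(p for p in src.rglob("*") if p.is_file()):
        size = f.stat().st_size
        offsets = range(0, size, max_bytes) if size else [0]
        for off in offsets:
            plan.append((f, off, min(max_bytes, size - off)))

    manifest = []
    for i, (f, off, length) in enumerate(plan):
        rel = shard_path(i, len(plan), fanout)
        out = dst / rel
        out.parent.mkdir(parents=True, exist_ok=True)
        with f.open("rb") as fh:
            fh.seek(off)
            data = fh.read(length)
        out.write_bytes(data)
        manifest.append({
            "source": f.relative_to(src).as_posix(),
            "offset": off,
            "chunk": rel.as_posix(),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    # Assumption: the manifest itself is small enough to stay under max_bytes.
    (dst / "manifest.json").write_text(json.dumps(manifest))


def reconstruct(dst: Path, restored: Path) -> None:
    """Rebuild the original files from the chunks listed in the manifest."""
    manifest = json.loads((dst / "manifest.json").read_text())
    for entry in manifest:
        data = (dst / entry["chunk"]).read_bytes()
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            raise ValueError(f"corrupt chunk: {entry['chunk']}")
        target = restored / entry["source"]
        target.parent.mkdir(parents=True, exist_ok=True)
        # Entries are stored in ascending offset order, so appending is safe.
        mode = "wb" if entry["offset"] == 0 else "ab"
        with target.open(mode) as fh:
            fh.write(data)
```

Because all chunks sit at the same depth, each directory contains either subdirectories or chunk files, never both, which keeps the entry count bounded at every level. The sketch does not record empty directories, and a very large dataset would need the manifest itself to be sharded.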

Quick Start

Use the reshard-c4-data skill to reorganize the dataset located at '/data/raw' into a new structure at '/data/processed', ensuring that no directory contains more than 30 files and no file exceeds 15MB.
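
After a run like the one requested above, the output tree can be checked independently. The following is a small sketch of such a check, not part of the Skill: the function name find_violations is hypothetical, "15MB" is assumed to mean 15 * 1024 * 1024 bytes, and subdirectories are assumed to count toward the per-directory limit.

```python
import os


def find_violations(root, max_items=30, max_bytes=15 * 1024 * 1024):
    """Return a list of messages for every directory or file under `root`
    that breaks the item-count or file-size limit."""
    violations = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: subdirectories count as items alongside files.
        count = len(dirnames) + len(filenames)
        if count > max_items:
            violations.append(f"{dirpath}: {count} entries (limit {max_items})")
        for name in filenames:
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size > max_bytes:
                violations.append(f"{path}: {size} bytes (limit {max_bytes})")
    return violations
```

An empty list returned by find_violations('/data/processed') would indicate that both limits hold for the whole tree.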

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: reshard-c4-data
Download link: https://github.com/Zurybr/lefarma-skills/archive/main.zip#reshard-c4-data

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.