unsloth-long-context
Community · Train models on extreme context lengths.
Software Engineering · memory efficiency · unsloth · long context · llm training · rope scaling · triton kernels
Author: cuba6112
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill addresses the challenge of training large language models on extremely long text sequences, where memory use grows with sequence length and often causes Out-of-Memory (OOM) errors and degraded throughput.
Core Features & Use Cases
- Extended Context Training: Enables training models with context lengths of 89K+ tokens on high-VRAM GPUs.
- Memory Optimization: Achieves significant memory savings (30%+) compared to a standard Flash Attention 2 setup.
- Use Case: Training a model to summarize entire books or analyze lengthy codebases where standard context windows are insufficient.
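To see why long-context training exhausts VRAM in the first place, a back-of-the-envelope estimate of the attention KV cache helps: it grows linearly with sequence length. The sketch below is illustrative only (not Unsloth's internals), and the model dimensions are assumptions for a Llama-3-8B-class model, not figures from this Skill.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128, dtype_bytes=2):
    """Rough bytes needed to hold K and V tensors for one sequence across
    all layers, in fp16/bf16. Dimensions here are illustrative assumptions."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

short = kv_cache_bytes(4_096)    # a typical default context window
long = kv_cache_bytes(89_000)    # the extended context this Skill targets

print(f"4K context KV cache:  {short / 2**30:.2f} GiB")   # ~0.5 GiB
print(f"89K context KV cache: {long / 2**30:.2f} GiB")    # ~10.9 GiB
```

Even before counting activations and optimizer state, the cache alone grows by a factor of ~22 between the two contexts, which is why kernel-level memory optimizations matter at these lengths.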
Quick Start
Initialize a model for training with a maximum sequence length of 65536 tokens.
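A minimal sketch of what that initialization might look like using Unsloth's `FastLanguageModel` API. The model id and 4-bit setting are assumptions for illustration, not taken from this Skill's scripts.

```python
MAX_SEQ_LENGTH = 65536  # 2**16 tokens, as stated in the Quick Start

def load_long_context_model():
    # Import deferred into the function: unsloth expects a CUDA GPU
    # to be available, so importing at module scope may fail on CPU-only hosts.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/llama-3-8b-bnb-4bit",  # assumed model id
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,  # 4-bit weights reduce base memory footprint
    )
    return model, tokenizer

if __name__ == "__main__":
    model, tokenizer = load_long_context_model()
```

From here the model would be passed to a trainer (e.g. TRL's `SFTTrainer`) with long-sequence data; gradient checkpointing is typically also enabled at these lengths.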
Dependency Matrix
Required Modules
unsloth, triton, torch
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: unsloth-long-context Download link: https://github.com/cuba6112/skillfactory/archive/main.zip#unsloth-long-context Please download this .zip file, extract it, and install it in the .claude/skills/ directory.