mhc
Community
Stabilize deep transformers with mHC.
Category: Software Engineering
Tags: #transformers #mhc #manifold-constrained #doubly-stochastic #sinkhorn-knopp #residual-connections
Author: yonesuke
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Residual connections in very deep Transformers can become unstable during training. mHC constrains the residual mapping to the manifold of doubly stochastic matrices, which preserves signal flow across layers and improves training stability.
Core Features & Use Cases
- Applies a learned pre- and post-mapping to mix streams while preserving stable residuals.
- Uses Sinkhorn normalization to constrain the residual mapping to be (approximately) doubly stochastic, so that it stays well behaved as depth grows.
- Use cases include training very deep Transformer models where vanishing gradients and representation collapse are concerns.
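The Sinkhorn normalization mentioned above can be sketched in JAX as follows. This is a minimal illustration, not the skill's own script: the function name `sinkhorn_knopp`, the iteration count, and the example logits are assumptions chosen for the demo. It alternately rescales rows and columns of a positive matrix until both sum to approximately one.

```python
import jax
import jax.numpy as jnp


def sinkhorn_knopp(logits, n_iters=50):
    """Approximately map an (n, n) matrix of logits to a doubly stochastic matrix.

    Hypothetical helper for illustration; name and defaults are assumptions.
    """
    # Exponentiate so every entry is strictly positive
    # (subtracting the max only improves numerical stability).
    m = jnp.exp(logits - jnp.max(logits))

    def body(_, m):
        m = m / jnp.sum(m, axis=1, keepdims=True)  # make each row sum to 1
        m = m / jnp.sum(m, axis=0, keepdims=True)  # make each column sum to 1
        return m

    return jax.lax.fori_loop(0, n_iters, body, m)


# Example logits for a 4-stream residual mapping (values are arbitrary).
key = jax.random.PRNGKey(0)
logits = jax.random.uniform(key, (4, 4), minval=-1.0, maxval=1.0)

h_res = sinkhorn_knopp(logits)
row_sums = h_res.sum(axis=1)
col_sums = h_res.sum(axis=0)
```

After enough iterations, `row_sums` and `col_sums` are both close to one, so multiplying the streams by `h_res` redistributes signal between streams without scaling their total up or down.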
Quick Start
Apply mHC to stabilize very deep transformer networks by projecting the residual mapping onto the doubly stochastic manifold and using pre/post streams for controlled mixing.
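As a rough sketch of how these pieces might fit together, the JAX snippet below wraps an arbitrary layer function with a pre-mapping, a post-mapping, and a Sinkhorn-projected residual mapping. Everything named here (`mhc_block`, the parameter keys, the softmax and sigmoid squashing of the pre/post weights, the stream count) is an assumption made for illustration and may differ from the skill's actual scripts.

```python
import jax
import jax.numpy as jnp


def sinkhorn_knopp(logits, n_iters=50):
    # Alternate row/column normalization of a positive matrix (illustrative helper).
    m = jnp.exp(logits - jnp.max(logits))

    def body(_, m):
        m = m / jnp.sum(m, axis=1, keepdims=True)
        return m / jnp.sum(m, axis=0, keepdims=True)

    return jax.lax.fori_loop(0, n_iters, body, m)


def mhc_block(params, streams, layer_fn):
    """Apply one block to `streams` of shape (n_streams, d_model).

    Hypothetical structure: read from streams, run the layer, write back,
    and mix the residual streams with a doubly stochastic matrix.
    """
    h_res = sinkhorn_knopp(params["res_logits"])    # (n, n), approx. doubly stochastic
    h_pre = jax.nn.softmax(params["pre_logits"])    # (n,), assumed read weights
    h_post = jax.nn.sigmoid(params["post_logits"])  # (n,), assumed write gates

    layer_in = h_pre @ streams                      # (d_model,) weighted mix of streams
    layer_out = layer_fn(layer_in)                  # (d_model,)
    # Residual path: mixed streams plus the layer output written to each stream.
    return h_res @ streams + jnp.outer(h_post, layer_out)


n_streams, d_model = 4, 8
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = {
    "res_logits": jax.random.uniform(k1, (n_streams, n_streams), minval=-1.0, maxval=1.0),
    "pre_logits": jnp.zeros(n_streams),
    "post_logits": jnp.zeros(n_streams),
}
streams = jax.random.normal(k2, (n_streams, d_model))

out = mhc_block(params, streams, layer_fn=jnp.tanh)
# With the layer switched off, only the residual mixing remains.
out_identity = mhc_block(params, streams, layer_fn=jnp.zeros_like)
```

Because the columns of `h_res` sum to one, `out_identity.sum(axis=0)` matches `streams.sum(axis=0)` up to numerical tolerance: the residual path mixes streams but leaves their total unchanged, which is the stability property the constraint is meant to provide.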
Dependency Matrix
Required Modules
jax
jaxlib
Components
scripts
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill: Name: mhc Download link: https://github.com/yonesuke/skills/archive/main.zip#mhc Please download this .zip file, extract it, and install it in the .claude/skills/ directory.