mhc

Community

Stabilize deep transformers with mHC.

Author: yonesuke
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Residual connections can become unstable in very deep Transformer architectures; mHC constrains the residual mapping to the doubly stochastic manifold to preserve signal flow and improve training stability.

Core Features & Use Cases

  • Applies learned pre- and post-mappings to mix streams while keeping the residual path stable.
  • Uses Sinkhorn normalization to constrain the residual mapping, enabling scalable deep models.
  • Use cases include training very deep Transformer models where vanishing gradients and representation collapse are concerns.
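The Sinkhorn normalization mentioned above can be sketched in JAX as follows. This is a minimal illustration, not the skill's own script: the function name `sinkhorn`, the `n_iters` default, and the log-space formulation are assumptions made for the example. It alternately normalizes rows and columns so that the resulting matrix is approximately doubly stochastic.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp


def sinkhorn(logits, n_iters=20):
    """Map an (n, n) matrix of logits to an approximately doubly stochastic matrix."""

    def step(log_p, _):
        # Normalize rows, then columns, in log space for numerical stability.
        log_p = log_p - logsumexp(log_p, axis=1, keepdims=True)
        log_p = log_p - logsumexp(log_p, axis=0, keepdims=True)
        return log_p, None

    log_p, _ = jax.lax.scan(step, logits, None, length=n_iters)
    return jnp.exp(log_p)


# Hypothetical usage: logits for 4 residual streams (shape chosen for illustration).
logits = jax.random.normal(jax.random.PRNGKey(0), (4, 4))
h_res = sinkhorn(logits)
```

Because the column step runs last, column sums are one up to floating-point error, while row sums approach one as `n_iters` grows.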

Quick Start

Apply mHC to stabilize very deep Transformer networks by projecting the residual mapping onto the doubly stochastic manifold and using learned pre- and post-mappings for controlled stream mixing.
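One way to picture the update described here is the JAX sketch below. It is a sketch under stated assumptions, not the skill's implementation: `mhc_block`, the parameter keys (`res_logits`, `pre_logits`, `post_logits`), and the softmax/sigmoid parameterizations of the pre- and post-mappings are all hypothetical choices for illustration. Only the Sinkhorn-constrained residual mapping and the pre/post mixing come from the description.

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp


def sinkhorn(logits, n_iters=20):
    # Alternate row/column normalization in log space (Sinkhorn iterations).
    for _ in range(n_iters):
        logits = logits - logsumexp(logits, axis=1, keepdims=True)
        logits = logits - logsumexp(logits, axis=0, keepdims=True)
    return jnp.exp(logits)


def mhc_block(params, streams, layer_fn):
    """One residual update over `streams` of shape (n_streams, d_model)."""
    h_res = sinkhorn(params["res_logits"])          # (n, n), approx. doubly stochastic
    h_pre = jax.nn.softmax(params["pre_logits"])    # (n,), assumed parameterization
    h_post = jax.nn.sigmoid(params["post_logits"])  # (n,), assumed parameterization

    layer_in = h_pre @ streams                      # (d,), mix streams into one input
    layer_out = layer_fn(layer_in)                  # (d,), e.g. attention or an MLP
    # Mix the residual streams, then write the layer output back to each stream.
    return h_res @ streams + h_post[:, None] * layer_out[None, :]


# Hypothetical usage: 4 streams of width 8, with tanh standing in for a real layer.
n_streams, d_model = 4, 8
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
params = {
    "res_logits": jax.random.normal(k1, (n_streams, n_streams)),
    "pre_logits": jnp.zeros(n_streams),
    "post_logits": jnp.zeros(n_streams),
}
streams = jax.random.normal(k2, (n_streams, d_model))
out = mhc_block(params, streams, jnp.tanh)
```

Since the columns of `h_res` sum to one, the residual path alone leaves the sum over streams unchanged, which is the sense in which the constraint preserves signal flow in this sketch.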

Dependency Matrix

Required Modules

jax, jaxlib

Components

scripts

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: mhc
Download link: https://github.com/yonesuke/skills/archive/main.zip#mhc

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
