sparse-autoencoder-training
Community
Decompose activations into interpretable features.
Category: Education & Research
Tags: transformer models, feature discovery, sparse autoencoders, mechanistic interpretability, neural network analysis, saelens
Author: AXGZ21
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill addresses the challenge of polysemanticity in neural networks, where individual neurons represent multiple concepts, making interpretation difficult. It provides tools to train and analyze Sparse Autoencoders (SAEs) that decompose these dense activations into sparse, monosemantic features.
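As a minimal sketch of the underlying idea (the dimensions, initialization scale, and L1 coefficient below are illustrative assumptions, not this Skill's actual defaults): an SAE maps a dense d_model-dimensional activation to a much wider, mostly-zero feature vector, and is trained to reconstruct the input while an L1 penalty keeps the features sparse.

```python
import torch
import torch.nn as nn

class ToySparseAutoencoder(nn.Module):
    """Decompose a d_model-dim activation into d_sae sparse features."""
    def __init__(self, d_model: int, d_sae: int):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_model, d_sae) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(d_sae))
        self.W_dec = nn.Parameter(torch.randn(d_sae, d_model) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU keeps feature activations non-negative, so most are zero
        return torch.relu((x - self.b_dec) @ self.W_enc + self.b_enc)

    def decode(self, f: torch.Tensor) -> torch.Tensor:
        return f @ self.W_dec + self.b_dec

    def forward(self, x: torch.Tensor):
        f = self.encode(x)
        return self.decode(f), f

# Training objective: reconstruction error plus an L1 penalty on the
# feature activations, which pushes most features to exactly zero.
sae = ToySparseAutoencoder(d_model=768, d_sae=768 * 16)
x = torch.randn(32, 768)           # a batch of model activations
x_hat, f = sae(x)
l1_coeff = 1e-3                    # assumed value; tuned in practice
loss = ((x - x_hat) ** 2).mean() + l1_coeff * f.abs().sum(-1).mean()
loss.backward()
```

Because the feature dictionary is much wider than the activation space, the model can assign separate directions to concepts that were superposed in individual neurons.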
Core Features & Use Cases
- Feature Discovery: Identify interpretable concepts learned by language models.
- Superposition Analysis: Study how models represent multiple features within single neurons.
- Mechanistic Interpretability: Understand the internal workings of neural networks.
- Use Case: When analyzing a language model's response to a specific prompt, use this Skill to discover which learned features (e.g., sentiment, topic, grammatical structure) are most active and how they contribute to the output (see the sketch after this list).
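To make that use case concrete, here is a hypothetical helper (the function name is an assumption, though sae_lens SAE objects do expose an `encode` method) that ranks features by how strongly they fire at the final token of a prompt:

```python
import torch

def top_active_features(sae, activations: torch.Tensor, k: int = 10):
    """Rank SAE features by activation strength at the last token.

    `activations` is a cached [batch, seq, d_model] tensor from the
    hook point the SAE was trained on; `sae.encode` maps it to a
    [batch, seq, d_sae] tensor of non-negative feature activations.
    """
    feature_acts = sae.encode(activations)
    values, indices = feature_acts[0, -1].topk(k)
    return list(zip(indices.tolist(), values.tolist()))
```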
Quick Start
Use the saelens skill to load a pre-trained SAE for GPT-2 small and encode model activations.
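A hedged end-to-end sketch of that quick start: the release name `gpt2-small-res-jb`, the hook point, and the 3-tuple return of `SAE.from_pretrained` match recent sae_lens versions, but verify them against the version you install.

```python
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load GPT-2 small and a community-released SAE trained on its
# layer-8 residual stream. Recent sae_lens versions return a
# (sae, cfg_dict, sparsity) tuple; other versions may differ.
model = HookedTransformer.from_pretrained("gpt2", device=device)
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device=device,
)

# Cache all activations for a prompt, then encode the hook point
# the SAE was trained on into sparse feature activations.
prompt = "The quick brown fox jumps over the lazy dog."
logits, cache = model.run_with_cache(prompt)
acts = cache["blocks.8.hook_resid_pre"]   # [batch, seq, d_model]
feature_acts = sae.encode(acts)           # [batch, seq, d_sae]

n_active = (feature_acts[0, -1] > 0).sum().item()
print(f"features active at the final token: {n_active}")
```

Of the thousands of learned features, typically only a small fraction fire on any given token, which is what makes per-feature inspection tractable.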
Dependency Matrix
Required Modules
- sae-lens
- transformer-lens
- torch
Components
- scripts
- references
- assets
💻 Claude Code Installation
Recommended: Let Claude install automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: sparse-autoencoder-training
Download link: https://github.com/AXGZ21/hermes-agent-railway/archive/main.zip#sparse-autoencoder-training
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.