Name: cjk-aware-text-metrics
Author: shimo4228

System Documentation

What problem does it solve?

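Fixed chars-per-token constants fail for Japanese, Chinese, and Korean text. They are typically tuned for English, so they underestimate token counts for CJK content, which in turn understates costs and can cause unexpected rate-limit hits downstream.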

Core Features & Use Cases

Detect CJK characters using Unicode ranges and compute weighted token counts that reflect the mix of scripts in the text.
Apply it to multilingual LLM preprocessing, chunking, and cost estimation for mixed-language documents.
Real-world use: process a Japanese document that contains embedded Latin text to produce more accurate token counts for pricing and chunking.
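
The detection and weighting described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the names `CJK_RANGES`, `is_cjk`, and `estimate_tokens` are hypothetical, the list of Unicode blocks is deliberately partial, and both weights are assumed values that should be calibrated against the tokenizer you actually use.

```python
import math

# Hypothetical sketch: the block list is partial and the weights are
# illustrative assumptions, not values taken from the skill itself.
CJK_RANGES = (
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0xF900, 0xFAFF),  # CJK Compatibility Ideographs
)

TOKENS_PER_CJK_CHAR = 1.5     # assumed weight; calibrate per tokenizer
TOKENS_PER_OTHER_CHAR = 0.25  # assumed weight, roughly 4 chars per token


def is_cjk(ch: str) -> bool:
    """Return True if the character's code point falls in a listed block."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def estimate_tokens(text: str) -> int:
    """Estimate tokens by weighting CJK and non-CJK characters separately."""
    cjk = sum(1 for ch in text if is_cjk(ch))
    other = len(text) - cjk
    return math.ceil(cjk * TOKENS_PER_CJK_CHAR + other * TOKENS_PER_OTHER_CHAR)


print(estimate_tokens("hello"))            # Latin only
print(estimate_tokens("こんにちは"))        # Japanese only
print(estimate_tokens("日本語 and English"))  # mixed
```

With these assumed weights, five hiragana characters are estimated at several times the token count of five Latin letters, which is the gap a single fixed constant hides.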

Quick Start

Estimate token counts accurately for multilingual text by weighting CJK and Latin characters.

Please help me install this Skill:
Name: cjk-aware-text-metrics
Download link: https://github.com/shimo4228/claude-code-learned-skills/archive/main.zip#cjk-aware-text-metrics
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
