text-extraction-and-preservation
Official
Preserve text whitespace from HTML
Author: kreuzberg-dev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill extracts text content from HTML while preserving the original whitespace, so that spacing-sensitive content is not lost or misread.
Core Features & Use Cases
- Whitespace Preservation: Maintains original spacing, tabs, and newlines from HTML text nodes.
- HTML Entity Decoding: Correctly converts HTML entities into their corresponding characters.
- Control Character Handling: Manages special characters within text content.
- Use Case: Extracting code snippets or formatted text from web pages where precise spacing is critical for readability and functionality.
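The whitespace and entity behaviour listed above can be illustrated with a minimal sketch. It is not the Skill's own implementation, which this page does not show. It assumes Python's standard-library `html.parser` as a stand-in, and the names `TextExtractor` and `extract_text` are hypothetical. Control character handling is not covered here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes verbatim, skipping script/style."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities (e.g. &lt;) before handle_data runs.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Append text exactly as received: no stripping, no whitespace collapsing.
        if not self._skip_depth:
            self.parts.append(data)


def extract_text(html: str) -> str:
    """Return the concatenated text nodes of `html` with whitespace untouched."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered at end of input
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = "<pre>def f():\n\treturn 1 &lt; 2</pre>"
    print(repr(extract_text(sample)))
```

Joining the pieces with an empty string, rather than a space or newline, is what keeps tabs, runs of spaces, and line breaks exactly as they appear in the source markup.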
Quick Start
Use the text-extraction-and-preservation skill to extract all text content from the provided HTML, ensuring original whitespace is kept.
Dependency Matrix
Required Modules
None required
Components
- scripts
- references
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: text-extraction-and-preservation
Download link: https://github.com/kreuzberg-dev/html-to-markdown/archive/main.zip#text-extraction-and-preservation
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.