metadata-extraction
Official
Extract structured data from HTML.
Author: kreuzberg-dev
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the extraction of valuable metadata from HTML content, eliminating the need for manual parsing or multiple processing passes.
Core Features & Use Cases
- Comprehensive Metadata Extraction: Gathers document details (title, description, author), headers (with hierarchy), links (classified), images (with attributes), and structured data (JSON-LD, Microdata, RDFa).
- Single-Pass Efficiency: Extracts metadata during the HTML-to-Markdown conversion process, minimizing overhead.
- Configurable: Allows selective extraction of metadata types to optimize performance.
- Use Case: Automatically extract all article headers, links, and images from a blog post's HTML to generate a sitemap, analyze SEO, or build a content index.
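To make the single-pass and configurable ideas above concrete, here is a minimal sketch using only Python's standard-library `html.parser`. It is not the Skill's own implementation or API: the names `MetadataCollector`, `extract_metadata`, `base_host`, and the `headers`/`links`/`images` flags are hypothetical, chosen for illustration. Link classification is simplified to internal vs. external by host, and structured data (JSON-LD, Microdata, RDFa) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class MetadataCollector(HTMLParser):
    """Hypothetical single-pass collector (illustrative, not the Skill's code)."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, base_host="", headers=True, links=True, images=True):
        super().__init__()
        self.base_host = base_host
        # Flags let the caller skip metadata types they do not need.
        self.want = {"headers": headers, "links": links, "images": images}
        self.result = {"title": None, "description": None,
                       "headers": [], "links": [], "images": []}
        self._capture = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or (tag in self.HEADER_TAGS and self.want["headers"]):
            self._capture, self._buffer = tag, []
        elif tag == "meta" and attrs.get("name") == "description":
            self.result["description"] = attrs.get("content")
        elif tag == "a" and self.want["links"] and attrs.get("href"):
            href = attrs["href"]
            host = urlparse(href).netloc
            # Rough classification: no host or the same host counts as internal.
            kind = "internal" if host in ("", self.base_host) else "external"
            self.result["links"].append({"href": href, "type": kind})
        elif tag == "img" and self.want["images"]:
            self.result["images"].append(
                {"src": attrs.get("src"), "alt": attrs.get("alt")})

    def handle_data(self, data):
        if self._capture is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        text = "".join(self._buffer).strip()
        if tag == "title":
            self.result["title"] = text
        else:
            # "h2" -> level 2, preserving the header hierarchy.
            self.result["headers"].append({"level": int(tag[1]), "text": text})
        self._capture = None


def extract_metadata(html, **options):
    """Parse the HTML once and return the collected metadata as a dict."""
    parser = MetadataCollector(**options)
    parser.feed(html)
    parser.close()
    return parser.result


if __name__ == "__main__":
    sample = ('<title>My Post</title><h1>Intro</h1>'
              '<a href="/about">About</a><img src="/a.png" alt="A">')
    print(extract_metadata(sample, base_host="blog.example", images=False))
```

Everything is gathered in one walk over the document, and a disabled flag simply skips that branch, which is where the performance saving from selective extraction would come from in a design like this.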
Quick Start
Use the metadata-extraction skill to extract all document metadata, headers, links, and images from the provided HTML content.
Dependency Matrix
Required Modules
None required
Components
references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: metadata-extraction
Download link: https://github.com/kreuzberg-dev/html-to-markdown/archive/main.zip#metadata-extraction
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.