metadata-extraction

Official

Extract structured data from HTML.

Author: kreuzberg-dev
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill extracts metadata from HTML content automatically, so you do not need to write a manual parser or run multiple processing passes over the same document.

Core Features & Use Cases

  • Comprehensive Metadata Extraction: Gathers document details (title, description, author), headers (with hierarchy), links (classified), images (with attributes), and structured data (JSON-LD, Microdata, RDFa).
  • Single-Pass Efficiency: Extracts metadata during the HTML-to-Markdown conversion process, minimizing overhead.
  • Configurable: Lets you enable only the metadata types you need, so unneeded extraction work is skipped.
  • Use Case: Automatically extract all article headers, links, and images from a blog post's HTML to generate a sitemap, analyze SEO, or build a content index.
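
To make the single-pass and selective-extraction ideas above concrete, here is a minimal sketch that uses only Python's standard library. It is not the API of html-to-markdown or of this Skill: the class name MetadataSketch, the want_* flags, and the classify_link helper are hypothetical names chosen for illustration, and the link classification is deliberately simplified (any URL with a host counts as external, even one pointing at your own site).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def classify_link(href):
    """Hypothetical, simplified classifier: anchor, external, or internal."""
    if href.startswith("#"):
        return "anchor"
    if urlparse(href).netloc:
        return "external"
    return "internal"


class MetadataSketch(HTMLParser):
    """Illustrative only: collects title, headers, links, and images in one pass."""

    HEADER_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self, want_headers=True, want_links=True, want_images=True):
        super().__init__()
        # Flags mimic "selective extraction": disabled types are never collected.
        self.want_headers = want_headers
        self.want_links = want_links
        self.want_images = want_images
        self.title = ""
        self.headers = []     # list of (level, text), in document order
        self.links = []       # list of (kind, href)
        self.images = []      # list of {"src": ..., "alt": ...}
        self._capture = None  # tag whose text is currently being buffered
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title" or (self.want_headers and tag in self.HEADER_TAGS):
            self._capture, self._buffer = tag, []
        elif tag == "a" and self.want_links and attrs.get("href"):
            self.links.append((classify_link(attrs["href"]), attrs["href"]))
        elif tag == "img" and self.want_images:
            self.images.append({"src": attrs.get("src"), "alt": attrs.get("alt")})

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._capture:
            return
        text = "".join(self._buffer).strip()
        if tag == "title":
            self.title = text
        else:
            # "h2" -> level 2; the level records the header hierarchy.
            self.headers.append((int(tag[1]), text))
        self._capture = None


sample = """<html><head><title>Post</title></head><body>
<h1>Intro</h1><p><a href="/about">About</a> <a href="https://example.com">Ext</a></p>
<h2>Details</h2><img src="a.png" alt="A"><a href="#top">Top</a></body></html>"""

parser = MetadataSketch()
parser.feed(sample)
print(parser.title, parser.headers, parser.links, parser.images)
```

Every collection is filled during a single feed() call, which is the point of the sketch: the document is walked once, and passing want_links=False (for example) skips that collection entirely. Structured data such as JSON-LD, Microdata, and RDFa is left out here to keep the example short.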

Quick Start

Use the metadata-extraction skill to extract all document metadata, headers, links, and images from the provided HTML content.

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: metadata-extraction
Download link: https://github.com/kreuzberg-dev/html-to-markdown/archive/main.zip#metadata-extraction

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.