web-content-scraper

Community

Scrape clean web content with image attribution.

Author: sekka1
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill enables reliable extraction of the main article content from web pages while filtering noise like ads, headers, footers, and navigation. It also downloads relevant images and preserves their source URLs for copyright attribution, delivering clean, markdown-ready content for AI contexts.

Core Features & Use Cases

  • Main content extraction: Retrieve the primary article or page content and convert it to markdown.
  • Image attribution: Download images with alt text and preserve source attribution metadata.
  • Robust to site variations: Works across blogs, documentation pages, and care guides by targeting common content regions and removing boilerplate.
  • Use Case: Feed collected web content into your moss wall knowledge base to answer questions with both text and referenced images.
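The filtering idea behind the features above can be sketched with the Python standard library alone. This is a minimal illustration, not the Skill's actual implementation (which depends on playwright): the `NOISE_TAGS` set, the `MainContentParser` class, and the `extract_main_content` function are all hypothetical names chosen for this example, and the sketch assumes the page HTML has already been fetched as a string.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as boilerplate (an assumption for this sketch).
NOISE_TAGS = {"nav", "header", "footer", "aside", "script", "style"}


class MainContentParser(HTMLParser):
    """Collects visible text and image metadata found outside noisy regions."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0   # greater than 0 while inside a boilerplate element
        self.text_parts = []   # text fragments kept as main content
        self.images = []       # one dict per kept image: source URL and alt text

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self.noise_depth += 1
        elif tag == "img" and self.noise_depth == 0:
            attr = dict(attrs)
            if attr.get("src"):
                # Keep the original URL so the image can be attributed later.
                self.images.append(
                    {"src": attr["src"], "alt": attr.get("alt") or ""}
                )

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        if self.noise_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_main_content(html):
    """Return (text, images) with boilerplate regions skipped."""
    parser = MainContentParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.text_parts), parser.images
```

A real page usually needs more than tag-based filtering (for example, class-name heuristics or a rendered DOM), so treat this as a starting point for understanding the approach rather than a replacement for the Skill.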

Quick Start

  • Provide a URL to scrape (e.g., https://example.com/article). The Skill returns the cleaned main content as markdown, including image captions and attribution URLs.
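To make the shape of the output more concrete, here is a hedged sketch of how cleaned content and image attribution might be assembled into markdown. The `to_markdown` function and the `local_path` and `source_url` keys are assumptions made for illustration; the Skill's real output layout may differ.

```python
def to_markdown(title, paragraphs, images):
    """Build a markdown document with one attribution line per image.

    Each image is a dict with hypothetical keys: 'alt', 'local_path'
    (where the downloaded file was saved) and 'source_url' (where it
    originally came from).
    """
    lines = [f"# {title}", ""]
    for paragraph in paragraphs:
        lines.append(paragraph)
        lines.append("")
    for image in images:
        alt = image.get("alt") or "image"
        lines.append(f"![{alt}]({image['local_path']})")
        # Preserve the original URL so the image can be credited.
        lines.append(f"*Source: {image['source_url']}*")
        lines.append("")
    # Normalize to exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"
```

Keeping the source URL next to each image, rather than in a separate list, means the attribution travels with the image when the markdown is split into chunks for a knowledge base.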

Dependency Matrix

Required Modules

playwright

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: web-content-scraper
Download link: https://github.com/sekka1/mosswall/archive/main.zip#web-content-scraper

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.