Name: extraction-skill
Author: Sheshiyer

System Documentation

What problem does it solve?

Extracting readable text from PDFs and EPUBs is often slow or unreliable, or it requires manual processing. This skill provides a fast, deterministic way to obtain representative text samples for downstream classification and analysis, and it handles errors and timeouts gracefully.

Core Features & Use Cases

Efficient text sampling: the first 5 pages plus the last page of a PDF, and the first chapter plus the table of contents of an EPUB, capped at 10,000 characters for quick analysis.
Robust fallback: uses system utilities first (pdftotext, pdfinfo) and falls back to Python libraries (pdfplumber, ebooklib) when needed.
Timeouts and error logging to prevent pipeline hangs and enable manual review of failures.
Integrates with discovery-skill and downstream analysis-skill for end-to-end PARA/Enneagram classification.
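
The PDF side of the features above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the function names, the 30-second timeout, and the exact order of checks are assumptions. It relies only on the documented pdfinfo "Pages:" field, the pdftotext -f/-l page flags, and pdfplumber's open() and extract_text() calls. EPUB handling is left out.

```python
import shutil
import subprocess

MAX_CHARS = 10_000  # sample cap named in the feature list
HEAD_PAGES = 5      # "first 5 pages" from the feature list
TIMEOUT_S = 30      # assumed value; the source does not state a timeout


def sample_page_numbers(page_count, head=HEAD_PAGES):
    """Return 1-based page numbers: the first `head` pages plus the last page."""
    pages = list(range(1, min(head, page_count) + 1))
    if page_count > head:
        pages.append(page_count)
    return pages


def _run(cmd):
    """Run a command with a timeout and return its stdout; raise on failure."""
    done = subprocess.run(cmd, capture_output=True, text=True,
                          timeout=TIMEOUT_S, check=True)
    return done.stdout


def _sample_with_system_tools(path):
    """Use pdfinfo for the page count and pdftotext for the page text."""
    count = None
    for line in _run(["pdfinfo", path]).splitlines():
        if line.startswith("Pages:"):
            count = int(line.split(":", 1)[1])
            break
    if count is None:
        raise ValueError("pdfinfo did not report a page count")
    chunks = [_run(["pdftotext", "-f", str(p), "-l", str(p), path, "-"])
              for p in sample_page_numbers(count)]
    return "\n".join(chunks)


def _sample_with_pdfplumber(path):
    """Fallback path; pdfplumber is imported lazily so it stays optional."""
    import pdfplumber
    with pdfplumber.open(path) as pdf:
        pages = sample_page_numbers(len(pdf.pages))
        # extract_text() can return None for pages without a text layer.
        return "\n".join(pdf.pages[p - 1].extract_text() or "" for p in pages)


def sample_pdf(path):
    """Try system utilities first, then the Python library; cap the result."""
    path = str(path)
    if shutil.which("pdfinfo") and shutil.which("pdftotext"):
        try:
            return _sample_with_system_tools(path)[:MAX_CHARS]
        except (subprocess.SubprocessError, OSError, ValueError):
            pass  # timeout, non-zero exit, or unparsable output: fall back
    return _sample_with_pdfplumber(path)[:MAX_CHARS]
```

For a 12-page PDF, sample_page_numbers(12) selects pages 1 to 5 and page 12; for a 3-page PDF it selects pages 1 to 3 only, so the last page is never read twice.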

Quick Start

Run the extraction-skill on a batch of PDFs/EPUBs discovered by the discovery-skill to produce text samples in extractions/ for downstream analysis.
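
A batch run of this kind might look like the sketch below. Everything here except the extractions/ output directory is an assumption made for illustration: the run_batch name, the failures.log file, and the "source filename plus .txt" naming are hypothetical, and the extractor is passed in as a plain function so the sketch does not presume how the skill is invoked.

```python
from pathlib import Path


def run_batch(paths, extract, out_dir="extractions", log_name="failures.log"):
    """Extract a text sample per file, write it to out_dir, and log failures.

    `extract` is any callable that takes a Path and returns a string.
    Returns a list of (path, error) pairs for files that could not be read.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    failures = []
    for src in map(Path, paths):
        try:
            text = extract(src)
        except Exception as exc:  # one bad file must not stop the batch
            failures.append((str(src), repr(exc)))
            continue
        # Keep the original suffix in the name so a.pdf and a.epub do not collide.
        (out / (src.name + ".txt")).write_text(text, encoding="utf-8")
    if failures:
        # Append so repeated runs accumulate a record for manual review.
        with open(out / log_name, "a", encoding="utf-8") as fh:
            for name, err in failures:
                fh.write(f"{name}\t{err}\n")
    return failures
```

Files that raise are recorded and skipped, so the remaining files in the batch still produce samples for the downstream analysis step.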

To install this skill, download https://github.com/Sheshiyer/14113-X-vault/archive/main.zip#extraction-skill, extract the .zip file, and install extraction-skill in the .claude/skills/ directory.
