Name: pdf-harvester
Author: mindmorass

System Documentation

What problem does it solve?

This Skill streamlines the extraction of text, tables, and metadata from PDF documents, enabling fast ingestion into RAG pipelines and searchable archives.

Core Features & Use Cases

Text and layout-preserving extraction from PDFs, including support for tables and conversion to Markdown.
OCR with pytesseract to recover content from image-based or scanned documents.
Academic paper parsing with structure detection for abstracts, sections, and references, plus metadata extraction.
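
As a rough illustration of the table-to-Markdown conversion mentioned above, the sketch below renders already-extracted rows as a Markdown table. It assumes the rows arrive as a list of lists of cell strings, with None for empty cells and the first row acting as the header. The function name table_to_markdown is hypothetical and is not taken from the Skill's code.

```python
from typing import Optional, Sequence


def table_to_markdown(rows: Sequence[Sequence[Optional[str]]]) -> str:
    """Render extracted table rows as a Markdown table (hypothetical helper).

    Assumption: the first row is the header and empty cells may be None.
    """
    if not rows:
        return ""
    # Pad short rows so every line has the same number of columns.
    width = max(len(row) for row in rows)

    def clean(cell: Optional[str]) -> str:
        # Collapse internal whitespace and escape pipes, which would
        # otherwise be read as column separators.
        text = "" if cell is None else str(cell)
        return " ".join(text.split()).replace("|", "\\|")

    def render(row: Sequence[Optional[str]]) -> str:
        cells = [clean(c) for c in row] + [""] * (width - len(row))
        return "| " + " | ".join(cells) + " |"

    header, *body = rows
    lines = [render(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(render(r) for r in body)
    return "\n".join(lines)


if __name__ == "__main__":
    print(table_to_markdown([["File", "Pages"], ["report.pdf", "12"]]))
```

Escaping pipes and collapsing line breaks inside cells keeps each extracted row on a single Markdown line, which tends to matter for tables recovered from multi-line PDF cells.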

Quick Start

Run a sample PDF through the harvest process to extract text, tables, and metadata, then inspect the resulting data structure.
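
The listing does not show what the resulting data structure looks like, so the following is only a sketch of one plausible shape for inspection purposes. The class names HarvestResult and PageResult, their fields, and the summary method are all assumptions made for illustration and may differ from what the Skill actually returns.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PageResult:
    """One page of extracted content (hypothetical shape)."""
    number: int
    text: str
    tables: list = field(default_factory=list)  # each table: list of rows


@dataclass
class HarvestResult:
    """Container for text, tables, and metadata (hypothetical shape)."""
    source: str
    metadata: dict = field(default_factory=dict)
    pages: list = field(default_factory=list)

    def full_text(self) -> str:
        # Join non-empty page texts with blank lines between pages.
        return "\n\n".join(p.text for p in self.pages if p.text)

    def summary(self) -> dict:
        # A compact overview that is convenient to print while inspecting.
        return {
            "source": self.source,
            "pages": len(self.pages),
            "tables": sum(len(p.tables) for p in self.pages),
            "characters": len(self.full_text()),
        }


if __name__ == "__main__":
    # Hand-built sample data standing in for a real harvest run.
    result = HarvestResult(
        source="sample.pdf",
        metadata={"title": "Sample"},
        pages=[
            PageResult(1, "Intro", tables=[[["a", "b"], ["1", "2"]]]),
            PageResult(2, "Body"),
        ],
    )
    print(json.dumps(result.summary(), indent=2))
    print(json.dumps(asdict(result), indent=2))
```

Keeping the result as plain dataclasses means it serializes to JSON with asdict, which is a convenient form for feeding a RAG pipeline or a searchable archive.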

Please help me install this Skill:
Name: pdf-harvester
Download link: https://github.com/mindmorass/reflex/archive/main.zip#pdf-harvester
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
