pdf-extractor
Community
Extract data from PDFs, including scanned ones.
Author: Greenmamba29
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the extraction of structured information, text, tables, images, and form data from PDF documents, including scanned ones requiring OCR.
Core Features & Use Cases
- Content Extraction: Extracts text, tables, images, and form field data.
- OCR Support: Processes scanned PDFs using OCR engines such as Tesseract or the Google Cloud Vision API.
- Batch Processing: Handles large sets of documents efficiently.
- Use Case: Extracting invoice data from a batch of supplier PDFs for accounting.
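The batch-processing feature above could be driven by a loop like the following. This is a minimal sketch, not the Skill's actual implementation: `extract_batch` and the `extract_one` callback are hypothetical names, and the callback stands in for whatever per-file extraction (text, tables, OCR) the Skill performs.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Union

# Hypothetical signature: takes one PDF path, returns whatever was extracted.
Extractor = Callable[[Path], dict]


def extract_batch(
    folder: Union[str, Path],
    extract_one: Extractor,
    max_workers: int = 4,
) -> Dict[str, dict]:
    """Run extract_one over every *.pdf in folder, keyed by file name.

    A failure on one file is recorded and does not stop the batch.
    """
    pdfs = sorted(Path(folder).glob("*.pdf"))

    def safe(pdf: Path) -> dict:
        try:
            return {"ok": True, "data": extract_one(pdf)}
        except Exception as exc:  # keep going; report the error per file
            return {"ok": False, "error": str(exc)}

    results: Dict[str, dict] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each PDF with its outcome.
        for pdf, outcome in zip(pdfs, pool.map(safe, pdfs)):
            results[pdf.name] = outcome
    return results
```

Recording errors per file instead of raising lets a large invoice batch finish even when a few documents are corrupt or unreadable.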
Quick Start
Use the pdf-extractor skill to extract tables from the file './invoices/lithium_supplier_inv_2026.pdf' and save the output as a CSV file.
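For the "save the output as a CSV file" step of that request, a minimal sketch using only the Python standard library might look like this. The helper name `write_table_csv` and the sample rows are illustrative assumptions; the Skill's real output format is not documented here.

```python
import csv
from pathlib import Path
from typing import Iterable, Sequence, Union


def write_table_csv(
    rows: Iterable[Sequence[object]],
    out_path: Union[str, Path],
) -> Path:
    """Write table rows (first row is the header) to a UTF-8 CSV file."""
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # newline="" lets the csv module control line endings itself.
    with out.open("w", newline="", encoding="utf-8") as fh:
        # csv.writer emits None cells as empty fields.
        csv.writer(fh).writerows(rows)
    return out


if __name__ == "__main__":
    # Made-up rows standing in for a table pulled from an invoice PDF.
    sample = [
        ["item", "quantity", "unit_price"],
        ["Example item A", 2, "10.00"],
        ["Example item B", None, "4.50"],
    ]
    print(write_table_csv(sample, "output/invoice_table.csv"))
```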
Dependency Matrix
Required Modules
tesseract, google-cloud-vision
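Before running the Skill, it may help to confirm that both modules are available. The sketch below assumes that `tesseract` refers to the Tesseract command-line binary on `PATH` and that `google-cloud-vision` refers to the Python package importable as `google.cloud.vision`; `check_dependencies` is a hypothetical helper, not part of the Skill.

```python
import importlib.util
import shutil
from typing import Dict


def check_dependencies() -> Dict[str, bool]:
    """Report which of the listed modules appear to be installed."""

    def has_module(name: str) -> bool:
        # find_spec raises if a parent package (e.g. "google") is missing.
        try:
            return importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            return False

    return {
        "tesseract": shutil.which("tesseract") is not None,
        "google-cloud-vision": has_module("google.cloud.vision"),
    }


if __name__ == "__main__":
    for name, present in check_dependencies().items():
        print(f"{name}: {'found' if present else 'missing'}")
```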
Components
scripts, references
💻 Claude Code Installation
Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: pdf-extractor
Download link: https://github.com/Greenmamba29/skillsdotmd_web/archive/main.zip#pdf-extractor
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.