math-extractor

Name: math-extractor
Author: Develata

Community

Extracts math content from documents.

Category: Education & Research
Tags: #nlp #definitions #document-processing #tex #math-extraction #theorems #pdf-to-markdown

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

Extracts strictly mathematical content (definitions, theorems, lemmas, propositions, proofs) from documents, handling PDF conversion and AI-based cleaning along the way. Use it when the user wants to extract math content from a file.

Core Features & Use Cases

Robust PDF Conversion: Uses MinerU for high-quality PDF to Markdown conversion.
Smart Chunking: Splits text by paragraphs to avoid breaking math formulas.
Cost Optimization: Heuristically filters out non-math chunks to save tokens.
Math Protection: Strips only a whitelist of known HTML tags, so that math inequalities (e.g., a < b) are not mistaken for markup and accidentally deleted.
Encoding Fallback: Automatically tries UTF-8, GBK, and Latin-1 encodings.
Retry Logic: Built-in retries for API calls to handle network instability.
Use Case: Imagine you have a scanned PDF thesis or a collection of lecture notes; run this skill to extract all of its mathematical content and compile it into a clean Markdown file.
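
The features above can be illustrated with a minimal Python sketch. This is not the skill's actual source: every name here (`decode_with_fallback`, `chunk_by_paragraph`, `looks_mathematical`, `strip_safe_tags`, `with_retries`), the contents of the tag whitelist, the keyword heuristic, the chunk size, and the exponential backoff are assumptions chosen for illustration only.

```python
import re
import time

# Hypothetical names and values throughout; none are taken from the skill's code.
ENCODINGS = ("utf-8", "gbk", "latin-1")
SAFE_TAGS = ("div", "span", "p", "br", "table", "tr", "td", "th", "img")  # assumed whitelist
TAG_RE = re.compile(r"</?(?:%s)\b[^<>]*>" % "|".join(SAFE_TAGS), re.IGNORECASE)
MATH_HINTS = re.compile(
    r"\$|\\\(|\\\[|\\begin\{|\b(Definition|Theorem|Lemma|Proposition|Proof)\b",
    re.IGNORECASE,
)


def decode_with_fallback(raw: bytes) -> str:
    """Try UTF-8, then GBK, then Latin-1 (which accepts any byte sequence)."""
    for enc in ENCODINGS:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("no encoding matched")  # not reached while latin-1 is listed


def chunk_by_paragraph(text: str, max_chars: int = 2000) -> list[str]:
    """Group whole paragraphs into chunks so that no paragraph is cut in the middle."""
    chunks, current = [], ""
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def looks_mathematical(chunk: str) -> bool:
    """Cheap pre-filter: keep chunks with TeX delimiters or theorem-like keywords."""
    return MATH_HINTS.search(chunk) is not None


def strip_safe_tags(text: str) -> str:
    """Remove only whitelisted tags; a bare '<' as in 'a < b' is left untouched."""
    return TAG_RE.sub("", text)


def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Run call() up to `attempts` times, backing off exponentially between failures."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

Matching tags by name, rather than removing everything between `<` and `>`, is what keeps an expression such as `a < b and c > d` intact. The keyword filter is deliberately crude (a `$` in a price would also pass), which is acceptable for a pre-filter whose only job is to skip chunks that are obviously not math.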

Quick Start

Run the Python script with a document path and an output directory to produce a file named <filename>_extracted.md.
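
As a rough illustration of the output naming, the sketch below derives the result path from the input document. The helper name `output_path` is hypothetical, and it assumes that `<filename>` means the input file's name without its extension, which the listing does not state explicitly.

```python
from pathlib import Path


def output_path(document: str, output_dir: str) -> Path:
    """Build <output_dir>/<filename>_extracted.md, assuming <filename> is the stem."""
    return Path(output_dir) / f"{Path(document).stem}_extracted.md"


# Example: a PDF in notes/ written to the out/ directory.
print(output_path("notes/thesis.pdf", "out"))
```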
