math-extractor


Extracts math content from documents.

Author: Develata
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Extracts only the mathematical statements (Definitions, Theorems, Lemmas, Propositions, Proofs) from a document, handling PDF conversion and AI-based cleaning along the way. Use it when the user wants to pull the math content out of a file.

Core Features & Use Cases

  • Robust PDF Conversion: Uses MinerU for high-quality PDF to Markdown conversion.
  • Smart Chunking: Splits text by paragraphs to avoid breaking math formulas.
  • Cost Optimization: Heuristically filters out non-math chunks to save tokens.
  • Math Protection: Removes only whitelisted, known-safe HTML tags, so math inequalities (e.g., a < b) are not mistaken for markup and deleted.
  • Encoding Fallback: Automatically tries UTF-8, GBK, and Latin-1 encodings.
  • Retry Logic: Built-in retries for API calls to handle network instability.
  • Use Case: Imagine you have a scanned thesis in PDF or a collection of lecture notes; run this skill to extract all mathematical terms and compile them into a clean Markdown file.
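
The chunking, filtering, and math-protection features above can be illustrated with a minimal sketch. This is not the skill's actual code: the function names, the `SAFE_TAGS` whitelist, the keyword list, and the `max_chars` limit are all assumptions made for illustration.

```python
import re

# Hypothetical whitelist: only these tags are ever removed. Single-letter tags
# such as <b> or <p> are left out on purpose, because "a<b and c>d" would
# otherwise look like a <b ...> tag.
SAFE_TAGS = {"br", "div", "span", "em", "strong", "sub", "sup"}
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)\b[^<>]*>")

# Hypothetical heuristic: statement keywords from the description, or LaTeX markers.
MATH_HINT_RE = re.compile(
    r"\b(definition|theorem|lemma|proposition|proof)s?\b|\$|\\\(|\\\[|\\begin\{",
    re.IGNORECASE,
)


def strip_safe_tags(text: str) -> str:
    """Remove whitelisted tags only; any other '<...>' span is kept verbatim."""
    def repl(match: re.Match) -> str:
        return "" if match.group(1).lower() in SAFE_TAGS else match.group(0)
    return TAG_RE.sub(repl, text)


def chunk_by_paragraph(text: str, max_chars: int = 2000) -> list[str]:
    """Pack whole paragraphs into chunks; a paragraph is never cut in the middle."""
    chunks, current = [], ""
    for para in re.split(r"\n\s*\n", text):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)  # adding this paragraph would exceed the limit
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def looks_mathematical(chunk: str) -> bool:
    """Cheap pre-filter so non-math chunks need not be sent to the model."""
    return bool(MATH_HINT_RE.search(chunk))


def math_chunks(text: str, max_chars: int = 2000) -> list[str]:
    """Clean the text, chunk it, and keep only chunks that look mathematical."""
    cleaned = strip_safe_tags(text)
    return [c for c in chunk_by_paragraph(cleaned, max_chars) if looks_mathematical(c)]
```

In this sketch a paragraph longer than `max_chars` is kept whole rather than split, which trades a hard size limit for intact formulas. Display math that contains blank lines would still be split, so a real implementation may need extra handling there.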

Quick Start

Run the Python script with a document path and an output directory to produce a file named <filename>_extracted.md.
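
The listing does not name the script or its arguments, so the following is only a sketch of the I/O around such a run: reading the input with the UTF-8, GBK, Latin-1 fallback listed under Core Features, retrying a flaky call, and deriving the output file name. All function names and parameters are hypothetical, and the assumption that `<filename>` means the input name without its extension is mine.

```python
import time
from pathlib import Path

# Encodings named in the feature list, tried in this (assumed) order.
ENCODINGS = ("utf-8", "gbk", "latin-1")


def read_text_with_fallback(path) -> str:
    """Decode a file with the first encoding that does not raise."""
    data = Path(path).read_bytes()
    for encoding in ENCODINGS:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Not reachable with the list above (Latin-1 accepts any byte sequence),
    # but kept in case the encoding list is changed.
    raise ValueError(f"could not decode {path} with any of {ENCODINGS}")


def call_with_retries(func, attempts: int = 3, delay: float = 1.0):
    """Call func() up to `attempts` times, sleeping longer after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            time.sleep(delay * attempt)


def output_path_for(document, output_dir) -> Path:
    """Build <output_dir>/<filename>_extracted.md from the input document path."""
    return Path(output_dir) / f"{Path(document).stem}_extracted.md"
```

For example, `output_path_for("notes/thesis.pdf", "out")` yields a path whose file name is `thesis_extracted.md`. Because Latin-1 can decode any byte sequence, it works as a last resort but may produce wrong characters for files in other encodings.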

Dependency Matrix

Required Modules

requests

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: math-extractor
Download link: https://github.com/Develata/Deve-Skills/archive/main.zip#math-extractor

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.