Docling Chunking

Community

Structure-aware chunking for reliable RAG.

Author: orbruno
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Docling Chunking splits long documents into meaningful, navigable chunks while preserving document structure, provenance, and metadata, so that retrieval and analysis remain reliable.

Core Features & Use Cases

  • HybridChunker provides structure-aware chunking that also respects token limits, making it well suited to embeddings and RAG workflows.
  • HierarchicalChunker preserves exact document structure for precise citations and hierarchical knowledge representations.
  • Rich per-chunk metadata (pages, section headings, provenance) supports high-quality search, auditing, and traceability.
  • Export options enable JSONL and Markdown outputs for ingestion by vector stores, databases, and knowledge bases.
  • Use cases include building document QA pipelines, knowledge bases, and compliant citation trails across multi-document corpora.
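The per-chunk metadata and export options listed above can be pictured with a small sketch. It uses only the Python standard library and does not reproduce the skill's own export code: the `ChunkRecord` class, its field names, and the `to_jsonl` and `to_markdown` helpers are hypothetical names chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ChunkRecord:
    """Hypothetical per-chunk record; field names are illustrative only."""
    text: str
    headings: List[str] = field(default_factory=list)  # section path
    pages: List[int] = field(default_factory=list)     # page numbers
    source: str = ""                                   # provenance (file name)


def to_jsonl(records: List[ChunkRecord]) -> str:
    """Serialize one JSON object per line, ready for a vector-store loader."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


def to_markdown(records: List[ChunkRecord]) -> str:
    """Render each chunk under its heading path with a provenance footer."""
    parts = []
    for r in records:
        title = " > ".join(r.headings) or "(no heading)"
        pages = ", ".join(str(p) for p in r.pages)
        parts.append(f"## {title}\n\n{r.text}\n\n_Source: {r.source}, pages: {pages}_")
    return "\n\n".join(parts)


records = [
    ChunkRecord(
        text="Revenue grew in Q3.",
        headings=["Report", "Results"],
        pages=[4],
        source="report.pdf",
    )
]
print(to_jsonl(records))
print(to_markdown(records))
```

Keeping headings, pages, and source on every record is what allows a retrieved chunk to be cited back to its place in the original document.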

Quick Start

Run the example workflow to convert a document and generate hierarchical chunks ready for RAG pipelines.
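The skill's own example workflow is not reproduced on this page. As a rough sketch of what a convert-then-chunk step might look like, the following assumes the skill wraps the open-source Docling Python library and that `docling` is installed; the `chunk_document` function and its parameters are hypothetical, and the input path in the usage note is a placeholder.

```python
from typing import List, Tuple


def chunk_document(source: str, hierarchical: bool = True) -> List[Tuple[str, List[str]]]:
    """Convert `source` with Docling and return (text, headings) per chunk.

    Assumption: this mirrors a typical Docling usage pattern, not
    necessarily the workflow shipped with the skill.
    """
    # Imported lazily so this module still loads where docling is absent.
    from docling.chunking import HierarchicalChunker, HybridChunker
    from docling.document_converter import DocumentConverter

    # Parse the file (PDF, DOCX, HTML, ...) into a structured document.
    doc = DocumentConverter().convert(source).document

    # HierarchicalChunker follows document structure; HybridChunker
    # additionally splits and merges chunks to fit a token budget.
    chunker = HierarchicalChunker() if hierarchical else HybridChunker()

    return [
        (chunk.text, list(chunk.meta.headings or []))
        for chunk in chunker.chunk(dl_doc=doc)
    ]
```

A call such as `chunk_document("report.pdf")` would then yield chunk texts paired with their section headings, which can be passed to an embedding step.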

Dependency Matrix

Required Modules

None required

Components

references

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: Docling Chunking
Download link: https://github.com/orbruno/docling-ccplugin/archive/main.zip#docling-chunking

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
