databricks-parsing
Community
Parse documents into structured text.
Author: Aradhya0510
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the extraction of text and structured data from various document types (PDF, DOCX, PPTX, images), enabling efficient document processing and the creation of custom RAG pipelines.
Core Features & Use Cases
- Document Parsing: Utilizes the ai_parse_document SQL function to convert binary documents into structured text.
- RAG Pipeline Foundation: Serves as the initial step for building custom Retrieval Augmented Generation pipelines by parsing and chunking documents.
- Use Case: Ingesting a collection of research papers from a Databricks Volume, parsing them into text, and preparing them for a custom RAG system to enable semantic search.
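The chunking step mentioned above can be sketched in plain Python. This is a minimal illustration, not the Skill's own implementation: the helper name `chunk_text`, the character-based splitting, and the default sizes are all assumptions chosen for the example.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split parsed text into fixed-size character chunks that overlap.

    Hypothetical helper: the sizes are illustrative defaults, not values
    taken from the Skill.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```

Overlapping chunks keep some shared context across chunk boundaries, which can help retrieval when a relevant passage straddles two chunks. A production pipeline might split on sentences or document elements instead of raw characters.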
Quick Start
Parse all documents in the '/Volumes/catalog/schema/volume/docs/' directory using the ai_parse_document function.
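As a rough sketch of what such a request might translate to, the Python snippet below builds a SQL statement that reads the directory as binary files and applies ai_parse_document to each file's bytes. The helper name `build_parse_query` is hypothetical, and the query assumes a workspace where `read_files` with the `binaryFile` format and `ai_parse_document` are both available; the statement the Skill actually generates may differ.

```python
def build_parse_query(volume_path: str) -> str:
    """Return a SQL statement that parses every file under volume_path.

    Hypothetical helper for illustration. The binaryFile format exposes
    `path` and `content` columns; `content` holds the raw file bytes.
    """
    if "'" in volume_path:
        # Keep the sketch simple: refuse paths that would break the SQL literal.
        raise ValueError("volume_path must not contain single quotes")
    return (
        "SELECT path, ai_parse_document(content) AS parsed\n"
        f"FROM read_files('{volume_path}', format => 'binaryFile')"
    )


query = build_parse_query("/Volumes/catalog/schema/volume/docs/")
print(query)
```

In a Databricks notebook the resulting string could be passed to `spark.sql(query)`; that call is left out here so the snippet runs without a Spark session.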
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: databricks-parsing
Download link: https://github.com/Aradhya0510/databricks-cv-accelerator/archive/main.zip#databricks-parsing
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.