databricks-unstructured-pdf-generation
Community
Generate synthetic PDFs for RAG
Author: Aradhya0510
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill automates the creation of realistic synthetic PDF documents, complete with LLM-generated content and RAG evaluation metadata, streamlining the process of building and testing retrieval systems.
Core Features & Use Cases
- Synthetic PDF Generation: Creates professional PDF documents based on detailed descriptions.
- RAG Evaluation Data: Generates accompanying JSON files with questions and guidelines for testing retrieval systems.
- Unity Catalog Integration: Automatically uploads generated documents to specified Unity Catalog Volumes.
- Use Case: Generate a set of 20 technical documentation PDFs for a new SaaS platform to populate a vector database and test a RAG pipeline's ability to answer user queries accurately.
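As a rough illustration of how the RAG evaluation metadata could be consumed downstream, the sketch below parses one evaluation record and checks that it carries a question and its guidelines. The field names (`source_pdf`, `question`, `guidelines`) and the sample values are assumptions made for this example; the Skill's actual JSON schema is not specified here and may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class EvalCase:
    """One evaluation record paired with a generated PDF (hypothetical schema)."""
    source_pdf: str
    question: str
    guidelines: list


def parse_eval_case(raw: str) -> EvalCase:
    """Parse a JSON string into an EvalCase, rejecting records with missing fields."""
    data = json.loads(raw)
    required = {"source_pdf", "question", "guidelines"}
    missing = required - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return EvalCase(data["source_pdf"], data["question"], list(data["guidelines"]))


# Illustrative sample only; real files produced by the Skill may look different.
SAMPLE = json.dumps({
    "source_pdf": "doc_001.pdf",
    "question": "How do I rotate an API key?",
    "guidelines": [
        "Mentions the key management page",
        "States that old keys are revoked",
    ],
})

case = parse_eval_case(SAMPLE)
print(case.question)
```

A retrieval test harness could then send `case.question` to the RAG pipeline and grade the answer against each entry in `case.guidelines`.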
Quick Start
Use the generate_pdf_documents MCP tool to create 10 PDF documents for a cloud infrastructure platform in the 'my_catalog' catalog and 'my_schema' schema.
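Files in a Unity Catalog Volume are addressed with paths of the form `/Volumes/<catalog>/<schema>/<volume>/<path>`. The helper below is a minimal sketch of where documents from the prompt above might land. The volume name `raw_data`, the folder `pdf_documents`, and the file name are hypothetical; the tool's actual defaults are not documented here.

```python
from pathlib import PurePosixPath


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog Volume path: /Volumes/<catalog>/<schema>/<volume>/<parts...>."""
    for name in (catalog, schema, volume):
        if not name or "/" in name:
            raise ValueError(f"invalid name component: {name!r}")
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))


# 'raw_data', 'pdf_documents', and 'doc_001.pdf' are placeholder names (assumptions).
target = volume_path("my_catalog", "my_schema", "raw_data", "pdf_documents", "doc_001.pdf")
print(target)
```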
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install the Skill automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: databricks-unstructured-pdf-generation
Download link: https://github.com/Aradhya0510/databricks-cv-accelerator/archive/main.zip#databricks-unstructured-pdf-generation
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
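For a manual installation, the extraction step could be sketched as follows. This assumes the downloaded archive contains a directory named after the Skill somewhere in its tree, which is not confirmed here; `extract_skill` is a hypothetical helper, not part of the Skill or of Claude Code.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(archive: Path, skill_name: str, skills_dir: Path) -> list:
    """Copy files under the first directory named `skill_name` in the zip
    into skills_dir/skill_name/, preserving the layout below that directory."""
    written = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directory entries and files outside the skill's directory.
            if info.is_dir() or skill_name not in parts[:-1]:
                continue
            rel = parts[parts.index(skill_name):]
            if ".." in rel:
                continue  # refuse paths that try to escape the target directory
            target = skills_dir.joinpath(*rel)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
            written.append(target)
    return written
```

Calling `extract_skill(Path("main.zip"), "databricks-unstructured-pdf-generation", Path(".claude/skills"))` after downloading the archive would place the Skill's files under `.claude/skills/databricks-unstructured-pdf-generation/`, provided the assumption about the archive layout holds.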