provenance-engine

Community

Ensure traceable, citeable evidence for every data point.

Author: JustinChaney2023
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Makes every extracted value traceable to concrete source evidence (transcript segments and OCR regions), so that auditors and staff can answer “Where did this come from?” quickly and reliably.

Core Features & Use Cases

  • Stable IDs: Source spans must have stable identifiers that survive UI rendering, exports, and reprocessing (when possible).
  • Minimal ambiguity: A citation should point to a specific segment/region, not an entire document.
  • Enforceability: The system must be able to reject outputs that lack required citations.
  • Outputs: Generates source_index.json, citations.json, citation_validation_report.json, and UI highlight payloads (span -> offsets/bbox mapping).
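
The stable-ID and enforceability requirements above can be illustrated with a minimal sketch. The listing does not say how provenance-engine derives IDs or validates citations, so everything here is an assumption for illustration: the function names, the hashing scheme, and the field names (start_ms, end_ms, page, bbox, citations) are hypothetical and are not the skill's actual schema.

```python
import hashlib
import json


def stable_span_id(source_id: str, kind: str, locator: dict) -> str:
    """Derive an ID from the source and the span's location (hypothetical scheme).

    The ID does not depend on list order or UI state, so it survives
    re-rendering and exports. It changes if reprocessing moves the span.
    """
    canonical = json.dumps(
        {"source": source_id, "kind": kind, "locator": locator},
        sort_keys=True,
        separators=(",", ":"),
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{kind}-{digest}"


def validate_citations(values: list, known_span_ids: set) -> dict:
    """Flag every value that has no citation or cites an unknown span."""
    errors = []
    for value in values:
        cited = value.get("citations", [])
        if not cited:
            errors.append({"field": value["field"], "error": "missing_citation"})
        for span_id in cited:
            if span_id not in known_span_ids:
                errors.append(
                    {"field": value["field"], "error": "unknown_span", "span_id": span_id}
                )
    return {"valid": not errors, "errors": errors}


# Example: one transcript segment, one OCR region, and two extracted values.
seg = stable_span_id("transcript.json", "segment", {"start_ms": 1200, "end_ms": 4800})
region = stable_span_id("ocr_output.json", "region", {"page": 1, "bbox": [40, 100, 300, 128]})
report = validate_citations(
    [
        {"field": "dob", "value": "1990-01-01", "citations": [seg]},
        {"field": "name", "value": "Jane", "citations": []},  # no evidence cited
    ],
    {seg, region},
)
```

Here report marks the output as invalid because the name value cites nothing, which is the kind of result a caller could use to reject the output. Hashing the locator is only one possible way to get stability; it holds as long as reprocessing reproduces the same segment boundaries or bounding box, which matches the "when possible" caveat above.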

Quick Start

Run provenance-engine to generate traceable citations for your transcripts and OCR outputs. Provide transcript.json and ocr_output.json as inputs. You can also provide optional pasted_text, which is chunked into spans.
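
As a rough illustration of what chunking pasted_text into spans could look like, the sketch below splits text into paragraphs and records character offsets into the original string. The chunking rule (blank-line-separated paragraphs), the function name chunk_pasted_text, and the start/end/text fields are assumptions; the listing does not describe the skill's actual chunking strategy.

```python
import re


def chunk_pasted_text(pasted_text: str) -> list:
    """Split pasted text into paragraph spans with character offsets.

    Assumes newlines are already normalized to "\n". Offsets index into
    the original string, so pasted_text[start:end] == text for each span.
    """
    spans = []
    # Each match is a run of consecutive non-empty lines (one paragraph).
    for match in re.finditer(r"[^\n]+(?:\n[^\n]+)*", pasted_text):
        raw = match.group()
        text = raw.strip()
        if not text:
            continue  # paragraph contained only whitespace
        # Shift the start past any leading whitespace that was stripped.
        start = match.start() + (len(raw) - len(raw.lstrip()))
        spans.append({"start": start, "end": start + len(text), "text": text})
    return spans


sample = "First paragraph.\n\nSecond line one\nline two\n\n\n  Third  "
spans = chunk_pasted_text(sample)
```

Keeping offsets relative to the original string means a UI highlight payload can map a span straight back to the pasted text without re-running the chunker.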

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: provenance-engine
Download link: https://github.com/JustinChaney2023/orate/archive/main.zip#provenance-engine

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
