Name: dataset-redaction
Author: JustinChaney2023

System Documentation

What problem does it solve?

This Skill creates safe evaluation datasets by redacting protected health information (PHI) and, optionally, generating synthetic equivalents. It preserves the document structure needed for OCR, STT, and LLM benchmarking.

Core Features & Use Cases

Redaction: apply deterministic pseudonymization to patient identifiers, so that the same patient maps to the same pseudonym across visits.
Synthetic generation: produce realistic, test-ready data with controlled deltas for benchmarking.
Deliverables: provide ready-to-use artifacts such as redaction policies, schemas, and tooling specs for reproducibility.
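
The deterministic pseudonymization named above can be illustrated with a keyed hash. This is a minimal sketch, not the Skill's actual implementation: the function names, the `PT-` prefix, the record layout, and the use of HMAC-SHA256 are all assumptions chosen for illustration.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes, prefix: str = "PT") -> str:
    """Map a patient identifier to a stable pseudonym (hypothetical helper).

    The same identifier and key always yield the same pseudonym, so records
    for one patient stay linked across visits without exposing the original ID.
    """
    # Normalize so that trivial formatting differences do not split one patient.
    normalized = patient_id.strip().upper()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate to 12 hex characters for readability (an assumed length).
    return f"{prefix}-{digest[:12]}"


def redact_visits(visits, secret_key):
    """Return copies of the visit records with patient_id pseudonymized."""
    return [{**v, "patient_id": pseudonymize(v["patient_id"], secret_key)} for v in visits]


# Assumed record layout: one dict per visit with a "patient_id" field.
visits = [
    {"patient_id": "MRN-001", "visit": 1},
    {"patient_id": "mrn-001 ", "visit": 2},  # same patient, different formatting
    {"patient_id": "MRN-002", "visit": 1},
]
redacted = redact_visits(visits, b"example-key-not-for-production")
```

A keyed hash is used here rather than a plain hash so that someone without the key cannot rebuild the mapping by hashing candidate identifiers. The key itself would need to be stored outside the dataset.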

Quick Start

Run the redaction pipeline on a sample dataset to generate redacted_documents.json and gold_facts.json, then validate the dataset with the provided manifests.
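
The validation step could look roughly like the sketch below. The manifest format is not described in this listing, so everything about it is an assumption: a hypothetical `manifest.json` mapping each output file name to a SHA-256 checksum, and a hypothetical `validate_dataset` helper. Only the file names `redacted_documents.json` and `gold_facts.json` come from the text above.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def validate_dataset(dataset_dir: Path, manifest_name: str = "manifest.json") -> list:
    """Check dataset files against an assumed checksum manifest.

    Returns a list of problem descriptions; an empty list means every file
    listed in the manifest exists, matches its checksum, and parses as JSON.
    """
    manifest = json.loads((dataset_dir / manifest_name).read_text(encoding="utf-8"))
    problems = []
    for filename, expected in manifest["files"].items():
        path = dataset_dir / filename
        if not path.exists():
            problems.append(f"missing: {filename}")
        elif sha256_of(path) != expected:
            problems.append(f"checksum mismatch: {filename}")
        else:
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError:
                problems.append(f"invalid JSON: {filename}")
    return problems


# Demo in a temporary directory with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "redacted_documents.json").write_text(
        '[{"doc_id": "d1", "text": "[NAME] seen on [DATE]"}]', encoding="utf-8"
    )
    (root / "gold_facts.json").write_text('[{"doc_id": "d1", "facts": []}]', encoding="utf-8")
    names = ("redacted_documents.json", "gold_facts.json")
    manifest = {"files": {name: sha256_of(root / name) for name in names}}
    (root / "manifest.json").write_text(json.dumps(manifest), encoding="utf-8")

    clean_result = validate_dataset(root)

    # Modify one file after the manifest was written to simulate drift.
    (root / "gold_facts.json").write_text("[]", encoding="utf-8")
    tampered_result = validate_dataset(root)
```

In this demo, `clean_result` is empty and `tampered_result` reports the changed file. This shows how a manifest can make a benchmark dataset reproducible.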

Please help me install this Skill:
Name: dataset-redaction
Download link: https://github.com/JustinChaney2023/orate/archive/main.zip#dataset-redaction
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
