schema-reference

Name: schema-reference
Author: linus-mcmanamey

Community

Validate schemas, generate accurate PySpark ETL.

Data & Analytics

Version: 1.0.0

Installs: 0

System Documentation

What problem does it solve?

This Skill helps prevent common schema-related errors and keeps generated PySpark ETL code consistent with documented business logic. It automates three steps: querying the actual schemas, extracting business rules from data dictionaries, and comparing schemas between data layers, so that generated code is written against real column names and data types rather than assumed ones.

Core Features & Use Cases

Dynamic Schema Querying: Retrieve exact column names, data types, and constraints from the DuckDB warehouse.
Business Logic Extraction: Parse data dictionary files to understand relationships, default values, and data quality rules.
Cross-Layer Schema Comparison: Identify differences and required transformations between Bronze, Silver, and Gold layer schemas.
Use Case: Before writing a new PySpark transformation for a Silver layer table, use this skill to query the Bronze source schema, extract the relevant business rules from the data dictionary, and compare the source schema against the target Silver schema so that every required transformation is identified.
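
The cross-layer comparison described above can be sketched in plain Python. This is a minimal illustration, not the Skill's own implementation: the `SchemaDiff` class, the `compare_schemas` function, and all column names and types below are hypothetical. In practice the two dictionaries would be filled from the warehouse, for example from DuckDB's `DESCRIBE` output or its `information_schema.columns` view.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Differences between a source schema and a target schema (hypothetical helper)."""
    missing_in_target: list = field(default_factory=list)  # source columns dropped in target
    added_in_target: list = field(default_factory=list)    # target columns with no source
    type_changes: dict = field(default_factory=dict)       # name -> (source_type, target_type)


def compare_schemas(source: dict, target: dict) -> SchemaDiff:
    """Compare two {column_name: data_type} mappings, ignoring type-name case."""
    diff = SchemaDiff()
    for name, source_type in source.items():
        if name not in target:
            diff.missing_in_target.append(name)
        elif target[name].upper() != source_type.upper():
            diff.type_changes[name] = (source_type, target[name])
    diff.added_in_target = [name for name in target if name not in source]
    return diff


# Illustrative schemas only; real ones would come from the warehouse.
bronze = {"case_id": "VARCHAR", "opened_at": "VARCHAR", "raw_payload": "VARCHAR"}
silver = {"case_id": "BIGINT", "opened_at": "TIMESTAMP", "is_open": "BOOLEAN"}

diff = compare_schemas(bronze, silver)
print("Dropped in Silver:", diff.missing_in_target)
print("New in Silver:", diff.added_in_target)
print("Needs a cast:", diff.type_changes)
```

Each entry in `type_changes` points at a cast the transformation must perform, and each entry in `added_in_target` points at a column that needs a derivation or a default taken from the data dictionary.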

Quick Start

Explain the steps to create a new Silver layer table named 's_customer_case' from 'bronze_cms.b_customer_case', ensuring all schema and business logic are correctly applied.
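
As a rough idea of what the outcome of such a prompt might feed into, the sketch below turns a source schema, a target schema, and a set of default values into Spark SQL select expressions. It is an assumption-laden illustration: `build_select_exprs` is a hypothetical helper, and the columns, types, and the `is_active` default are invented for the example rather than taken from the real `b_customer_case` table or its data dictionary.

```python
def build_select_exprs(source_types, target_types, defaults=None):
    """Build one Spark SQL expression per target column (hypothetical helper).

    source_types / target_types: {column_name: data_type}
    defaults: {column_name: SQL literal} for target columns with no source column
    """
    defaults = defaults or {}
    exprs = []
    for name, target_type in target_types.items():
        if name in source_types:
            if source_types[name].upper() == target_type.upper():
                exprs.append(name)  # same type: pass the column through
            else:
                exprs.append(f"CAST({name} AS {target_type}) AS {name}")
        elif name in defaults:
            exprs.append(f"CAST({defaults[name]} AS {target_type}) AS {name}")
        else:
            # Fail loudly instead of silently producing an incomplete table.
            raise KeyError(f"no source column or default for {name!r}")
    return exprs


# Invented schemas for illustration; query the real ones before generating code.
bronze_types = {"case_id": "STRING", "status": "STRING", "opened_at": "STRING"}
silver_types = {
    "case_id": "BIGINT",
    "status": "STRING",
    "opened_at": "TIMESTAMP",
    "is_active": "BOOLEAN",
}
defaults = {"is_active": "true"}  # assumed business rule, not from the real dictionary

exprs = build_select_exprs(bronze_types, silver_types, defaults)
for expr in exprs:
    print(expr)
```

With a Spark session available, a list like this could be applied as `spark.table("bronze_cms.b_customer_case").selectExpr(*exprs)` before writing the result to the Silver table.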
