pyspark-databricks

Community

Build and optimize PySpark on Databricks.

Author: Awish021
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

This Skill streamlines the development and optimization of PySpark ETL pipelines for the Databricks environment, helping keep data processing efficient and cloud costs low.

Core Features & Use Cases

  • ETL Pipeline Development: Author robust PySpark ETL pipelines for data ingestion and transformation.
  • Performance Optimization: Tune Spark jobs for maximum performance and minimal cost (see the tuning sketch after this list).
  • Delta Lake Integration: Implement Delta Lake patterns for enhanced data reliability and ACID transactions.
  • Use Case: Optimize a large-scale PySpark job that processes terabytes of raw event data on Databricks, reducing runtime by 30% and associated cloud costs.
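As a rough illustration of the performance-optimization feature, here is a minimal sketch of a few common Databricks tuning levers: enabling adaptive query execution, broadcasting a small dimension table to avoid a large shuffle, and repartitioning on the write key. The table names (raw.events, ref.countries, silver.events_enriched), the join key country_code, and the partition layout are hypothetical placeholders, not part of the Skill.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# On Databricks, `spark` is already provided; getOrCreate() reuses that session elsewhere.
spark = SparkSession.builder.getOrCreate()

# Let adaptive query execution choose shuffle partition counts and join strategies at runtime.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")

events = spark.table("raw.events")        # hypothetical large fact table
countries = spark.table("ref.countries")  # hypothetical small dimension table

# Broadcast the small dimension table so the join avoids shuffling the large side.
enriched = events.join(F.broadcast(countries), on="country_code", how="left")

# Repartition on the write key so output files line up with the table's partitioning.
(
    enriched.repartition("country_code")
    .write.format("delta")
    .mode("overwrite")
    .saveAsTable("silver.events_enriched")
)

The broadcast hint and repartition step are typical starting points; the Skill's actual tuning choices depend on the data volumes and cluster configuration of the job being optimized.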

Quick Start

Use the pyspark-databricks skill to build an ETL pipeline that reads Parquet events, joins them with CSV users, and saves the result as a Delta table partitioned by country.
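A minimal sketch of what the resulting pipeline could look like, assuming hypothetical input paths /mnt/raw/events (Parquet) and /mnt/raw/users.csv, a shared join key user_id, a country column, and an output table named events_by_country; none of these names come from the Skill itself.

from pyspark.sql import SparkSession

# Already available as `spark` in Databricks notebooks; getOrCreate() works in standalone jobs too.
spark = SparkSession.builder.getOrCreate()

# Read the raw Parquet events and the CSV user dimension (paths are placeholders).
events = spark.read.parquet("/mnt/raw/events")
users = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("/mnt/raw/users.csv")
)

# Join events to users on the shared key.
result = events.join(users, on="user_id", how="inner")

# Write the joined result as a Delta table partitioned by country.
(
    result.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("country")
    .saveAsTable("events_by_country")
)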

Dependency Matrix

Required Modules

None required

Components

scripts, references

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: pyspark-databricks
Download link: https://github.com/Awish021/opencode/archive/main.zip#pyspark-databricks

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
