trino-to-hive-migration
Official
Migrate Trino to Hive, solve memory errors.
Author: treasure-data
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill provides expert guidance for migrating queries from Trino to Hive when encountering memory errors, timeouts, or performance issues with very large datasets. It helps users leverage Hive's scalability for batch processing, ensuring complex queries complete successfully and reliably.
Core Features & Use Cases
- Problem Identification: Helps determine when a Trino query is failing due to memory limits and when Hive is a more suitable engine for large-scale batch processing.
- Syntax Conversion Guide: Provides a comprehensive mapping of Trino functions and syntax to their Hive equivalents, including TD_TIME_STRING to TD_TIME_FORMAT and APPROX_PERCENTILE to PERCENTILE.
- Optimization for Hive: Offers performance tips for Hive, such as using MAPJOIN hints and dynamic partitioning, to ensure efficient execution of migrated queries.
- Use Case: A data scientist's Trino query for a year's worth of event data consistently fails with "Query exceeded per-node memory limit." This skill guides them through converting the query to Hive, replacing Trino-specific functions, and adding Hive optimization hints, allowing the large-scale batch job to complete successfully.
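The conversion described in the features above can be sketched as follows. This is a minimal illustration, not the Skill's own output: the tables events and users and the columns user_id, country, and duration_ms are hypothetical, and the time filter is borrowed from the Quick Start. Hive's PERCENTILE computes an exact percentile and accepts only integer columns, hence the CAST; PERCENTILE_APPROX is the usual alternative for floating-point columns.

```sql
-- Trino (original, for comparison; hypothetical table and column names):
-- SELECT u.country, APPROX_PERCENTILE(e.duration_ms, 0.95) AS p95_duration_ms
-- FROM events e JOIN users u ON e.user_id = u.user_id
-- WHERE TD_INTERVAL(e.time, '-1d', 'JST')
-- GROUP BY u.country

-- Hive: hint that the small table (u) should be loaded into memory for a map-side join,
-- and swap APPROX_PERCENTILE for PERCENTILE on an integer-cast column.
SELECT /*+ MAPJOIN(u) */
  u.country,
  PERCENTILE(CAST(e.duration_ms AS BIGINT), 0.95) AS p95_duration_ms
FROM events e
JOIN users u
  ON e.user_id = u.user_id
WHERE TD_INTERVAL(e.time, '-1d', 'JST')
GROUP BY u.country
```

Depending on the Hive configuration, an explicit MAPJOIN hint may be ignored in favor of automatic map-join conversion, so treat the hint as a suggestion to the engine and confirm the join strategy in the query plan.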
Quick Start
Convert Trino TD_TIME_STRING to Hive TD_TIME_FORMAT
-- Trino:
SELECT TD_TIME_STRING(time, 'd!', 'JST') as date

-- Hive:
SELECT TD_TIME_FORMAT(time, 'yyyy-MM-dd', 'JST') as date
FROM your_table
WHERE TD_INTERVAL(time, '-1d', 'JST')
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill:
Name: trino-to-hive-migration
Download link: https://github.com/treasure-data/td-skills/archive/main.zip#trino-to-hive-migration
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.