gastrohem-media-processor

Community

Transcribe audio, OCR images, fast and smart.

Author: asapMaki
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Manually transcribing audio and extracting information from images in daily WhatsApp folders is slow and labor-intensive. This skill automates that media processing, saving time and reducing the chance that important information in your daily communications is overlooked.

Core Features & Use Cases

  • Parallel Audio Transcription: Transcribes audio files (.mp3, .ogg, etc.) in parallel using insanely-fast-whisper, generating .json outputs for easy integration.
  • Intelligent Image OCR: Uses Claude's vision capabilities to perform OCR on images, creating natural-language .md summaries focused on Gastrohem-relevant information.
  • Use Case: At the end of the day, run this skill to automatically process all voice notes and images from your team's WhatsApp conversations, getting structured text summaries ready for review and further action.
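The parallel transcription step could look roughly like the sketch below. It is not the skill's actual script: the helper names, the worker count, and the choice to write each .json next to its audio file are assumptions made for illustration. The --file-name and --transcript-path options are taken from the insanely-fast-whisper CLI; check them against the version you have installed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# The listing mentions ".mp3, .ogg, etc."; extend this set as needed (assumption).
AUDIO_EXTENSIONS = {".mp3", ".ogg"}


def find_audio_files(folder: Path) -> list[Path]:
    """Return the audio files directly inside `folder`, sorted by name."""
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def build_command(audio: Path) -> list[str]:
    """Build the CLI call that writes <audio name>.json next to the audio file."""
    return [
        "insanely-fast-whisper",
        "--file-name", str(audio),
        "--transcript-path", str(audio.with_suffix(".json")),
    ]


def transcribe_all(folder: Path, max_workers: int = 2, run=subprocess.run) -> list[Path]:
    """Transcribe every audio file in `folder`, a few at a time.

    `run` is injectable so the function can be exercised without the CLI installed.
    """
    def transcribe(audio: Path) -> Path:
        run(build_command(audio), check=True)
        return audio.with_suffix(".json")

    # Threads are enough here: each worker only waits on an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transcribe, find_audio_files(folder)))
```

Keeping max_workers small is a deliberate choice in this sketch, since several Whisper processes sharing one GPU may compete for memory.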

Quick Start

To process all media for today, simply ask: "Process media".
To process media for a specific date (e.g., 24.10), ask: "Process media for 24.10".
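How the date in such a request gets resolved is not documented here. The sketch below shows one plausible reading, assuming the date is written as DD.MM and refers to the current year; the function name resolve_target_date is hypothetical.

```python
import re
from datetime import date


def resolve_target_date(request: str, today: date) -> date:
    """Pick the date to process from a request such as "Process media for 24.10".

    Assumptions: dates are written DD.MM and belong to the current year;
    a request without a date means "today".
    """
    match = re.search(r"\b(\d{1,2})\.(\d{1,2})\b", request)
    if match is None:
        return today
    day, month = int(match.group(1)), int(match.group(2))
    # date() raises ValueError for impossible dates such as 31.02.
    return date(today.year, month, day)
```

For example, calling resolve_target_date("Process media for 24.10", date.today()) would return 24 October of the current year under these assumptions.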

Dependency Matrix

Required Modules

insanely-fast-whisper

Components

scripts

💻 Claude Code Installation

Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.

Please help me install this Skill:
Name: gastrohem-media-processor
Download link: https://github.com/asapMaki/vozzy-whatsapp/archive/main.zip#gastrohem-media-processor

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
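If you would rather install by hand, the extraction step can be sketched as below. It assumes the skill lives in a folder named gastrohem-media-processor somewhere inside the downloaded archive, which the #gastrohem-media-processor fragment of the link suggests but does not confirm. The function name extract_skill is hypothetical.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(archive, skill_name: str, skills_dir: Path) -> Path:
    """Copy the folder named `skill_name` from a zip archive into `skills_dir`.

    `archive` may be a path or a file-like object. Assumes the skill folder
    appears somewhere inside the archive under exactly that name.
    """
    target = skills_dir / skill_name
    found = False
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the part of the path below the skill folder.
            relative = parts[parts.index(skill_name) + 1:]
            if not relative or ".." in relative:
                continue
            out_path = target.joinpath(*relative)
            out_path.parent.mkdir(parents=True, exist_ok=True)
            out_path.write_bytes(zf.read(info))
            found = True
    if not found:
        raise FileNotFoundError(f"{skill_name!r} not found in archive")
    return target
```

After downloading the .zip from the link above (for instance with urllib.request.urlretrieve), calling extract_skill("main.zip", "gastrohem-media-processor", Path(".claude/skills")) would place the skill in the directory the prompt mentions.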