gemini-audio
Community
Transcribe, summarize, analyze, and synthesize audio with Gemini.
Author: alex-tgk
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Gemini Audio provides transcription, analysis, and summarization of audio, plus text-to-speech generation. It streamlines workflows for podcasts, interviews, meetings, and multimedia content by turning audio into searchable text and actionable insights.
Core Features & Use Cases
- Transcription with timestamps and multi-speaker support
- Audio summarization and key-point extraction
- Non-speech audio analysis (music, ambient sounds)
- Text-to-speech (TTS) generation with controllable voice styles
- File management via a Files API workflow for reuse across tasks
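To illustrate how transcription and the Files API workflow listed above might fit together, here is a minimal sketch using the google-genai SDK. It is not the skill's actual transcribe.py: the helper names (build_transcription_prompt, transcribe), the prompt wording, and the model name "gemini-2.5-flash" are all assumptions chosen for illustration.

```python
import os


def build_transcription_prompt(timestamps: bool = True, speakers: bool = True) -> str:
    """Compose a plain-language instruction for the model (hypothetical helper)."""
    parts = ["Transcribe this audio."]
    if timestamps:
        parts.append("Prefix each segment with a [MM:SS] timestamp.")
    if speakers:
        parts.append("Label each speaker (Speaker 1, Speaker 2, ...).")
    return " ".join(parts)


def transcribe(path: str, model: str = "gemini-2.5-flash") -> str:
    """Upload an audio file once, then ask the model for a transcript.

    The model name is an assumption; substitute whichever Gemini model you use.
    """
    # Imported lazily so the prompt helper works without the SDK installed.
    from google import genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    # The uploaded file handle can be passed to later requests
    # (for example a summary) without uploading the audio again.
    audio = client.files.upload(file=path)
    response = client.models.generate_content(
        model=model,
        contents=[build_transcription_prompt(), audio],
    )
    return response.text
```

Changing the prompt text (for example, "Summarize the key points of this audio.") while passing the same uploaded file is one plausible way to cover the summarization and analysis use cases.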
Quick Start
Configure GEMINI_API_KEY, then run transcribe.py to process audio files or generate-speech.py to synthesize speech.
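For the speech-synthesis half of the Quick Start, the following sketch shows roughly what a text-to-speech call could look like with google-genai. It is not the contents of generate-speech.py. The function names (write_wav, synthesize), the model name "gemini-2.5-flash-preview-tts", the voice "Kore", and the assumption that the response carries raw 16-bit mono PCM at 24 kHz are illustrative and should be checked against the current Gemini documentation.

```python
import os
import wave


def write_wav(path: str, pcm: bytes, rate: int = 24000) -> None:
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        wf.writeframes(pcm)


def synthesize(text: str, out_path: str, voice: str = "Kore") -> None:
    """Request speech for `text` and save it as a WAV file.

    Model and voice names are assumptions, not taken from the skill.
    """
    # Imported lazily so write_wav can be used without the SDK installed.
    from google import genai
    from google.genai import types

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model="gemini-2.5-flash-preview-tts",
        contents=text,
        config=types.GenerateContentConfig(
            response_modalities=["AUDIO"],
            speech_config=types.SpeechConfig(
                voice_config=types.VoiceConfig(
                    prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name=voice)
                )
            ),
        ),
    )
    # The audio is assumed to arrive as inline bytes on the first part.
    pcm = response.candidates[0].content.parts[0].inline_data.data
    write_wav(out_path, pcm)
```

Style instructions such as "Say cheerfully:" can be placed in the text itself, which is one way the "controllable voice styles" feature might be exercised.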
Dependency Matrix
Required Modules
google-genai
Components
scripts, references
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill: Name: gemini-audio Download link: https://github.com/alex-tgk/saasaas/archive/main.zip#gemini-audio Please download this .zip file, extract it, and install it in the .claude/skills/ directory.