cartesia-tts

Official

Real-time voice synthesis with ultra-low latency.

Author: Shakudo-io
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Provides natural-sounding text-to-speech with latency low enough for live voice applications, so that conversational AI, telephony, and embedded assistants can respond without noticeable delay in multiple languages.

Core Features & Use Cases

  • Ultra-low-latency TTS with Sonic-3 (about 90 ms to first byte) for real-time conversations, telephony, and Pipecat pipelines.
  • Streaming and batch endpoints over WebSocket, SSE, and HTTP.
  • A multilingual voice library covering 40+ languages, with 60+ emotion controls, suited to customer service, IVR, voice assistants, and live narration.
  • Seamless integration with Pipecat workflows and telephony platforms, enabling real-time voice in AI pipelines.
  • Secure API usage with API keys and versioning; flexible output formats (raw PCM, WAV, MP3) and sample rates.

Quick Start

Send text to the Cartesia TTS API with the sonic-3 model, select a voice, and request streaming audio in the output format you need.
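The request described above might look like the following minimal Python sketch. The endpoint URL, header names, version date, payload field names, and the `CARTESIA_API_KEY` environment variable are assumptions that do not come from this page; check them against the Cartesia API reference before use. The voice ID is a placeholder you supply.

```python
import json
import os
import urllib.request

# Assumed values (not stated on this page): endpoint, header names, version date.
API_URL = "https://api.cartesia.ai/tts/bytes"
API_VERSION = "2025-04-16"


def build_tts_request(text, voice_id, api_key, sample_rate=44100):
    """Build (but do not send) a POST request asking for raw PCM speech."""
    payload = {
        "model_id": "sonic-3",                    # model named in Quick Start
        "transcript": text,                       # text to speak
        "voice": {"mode": "id", "id": voice_id},  # assumed voice selector shape
        "output_format": {                        # assumed output format shape
            "container": "raw",
            "encoding": "pcm_s16le",
            "sample_rate": sample_rate,
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Cartesia-Version": API_VERSION,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text, voice_id, out_path, chunk_size=4096):
    """Send the request and write audio bytes to disk as they arrive."""
    req = build_tts_request(text, voice_id, os.environ["CARTESIA_API_KEY"])
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        while chunk := resp.read(chunk_size):
            f.write(chunk)
```

Calling `synthesize_to_file("Hello there", "<your-voice-id>", "hello.pcm")` would write headerless PCM audio under these assumptions. For conversational use, the WebSocket or SSE endpoints are likely a better fit than chunked HTTP, since they are the ones listed for streaming.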

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: cartesia-tts
Download link: https://github.com/Shakudo-io/opencode-skills/archive/main.zip#cartesia-tts

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
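For a manual install, the same download, extract, and copy steps could be scripted roughly as follows. This is a sketch under assumptions: it presumes the archive contains a folder named after the skill somewhere in its tree, and `extract_skill` is a hypothetical helper, not part of any official tooling.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def extract_skill(zip_bytes, skill_name, dest_root):
    """Copy every file under a folder named `skill_name` into dest_root/skill_name."""
    target = Path(dest_root) / skill_name
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            # Skip directories and anything outside the skill folder.
            if info.is_dir() or skill_name not in parts:
                continue
            # Keep only the path below the skill folder.
            rel = parts[parts.index(skill_name) + 1:]
            # Skip a file that merely shares the name, and unsafe paths.
            if not rel or ".." in rel:
                continue
            out = target.joinpath(*rel)
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(archive.read(info))
            written.append(out)
    return written
```

Pass it the bytes of the downloaded .zip file, for example `extract_skill(data, "cartesia-tts", ".claude/skills")`. Matching the folder by name rather than by a fixed prefix avoids depending on the top-level directory name inside the archive.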
