Text to Speech

18K+ downloads | Requires API Key | Content & Media

The Text to Speech skill converts written content into high-quality audio using either the ElevenLabs or the OpenAI TTS engine. Choose from dozens of realistic voices across multiple languages and accents, or, with ElevenLabs, clone a custom voice from a short audio sample. When using ElevenLabs, multilingual voices can switch between languages mid-sentence, which suits localized content production.

The skill supports streaming output for real-time playback, batch processing for long documents, and SSML tags for fine-grained control over pauses, emphasis, and pronunciation. You can adjust speaking rate, pitch, and volume through simple natural language commands. Audio can be exported as MP3, WAV, or OGG.

Practical use cases include accessibility tools that read website content aloud, automated audiobook production from manuscripts, voiceover generation for video scripts, and dynamic IVR system prompts. The skill integrates directly with the Podcast Generator and Summarize skills, so you can summarize an article and immediately convert the result to audio.
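Batch processing of a long document usually comes down to splitting the text into pieces small enough for one TTS request each, then synthesizing them in order. The sketch below illustrates that idea only; it is not the skill's actual implementation. The function name `split_for_tts` and the default `max_chars=4000` are assumptions chosen for illustration, and the real per-request limit depends on the provider and model.

```python
import re


def split_for_tts(text: str, max_chars: int = 4000) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries.

    Hypothetical helper: max_chars is an assumed limit, not a documented one.
    """
    # Break after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Otherwise, pack sentences together until the limit would be exceeded.
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be sent to the TTS engine in sequence and the resulting audio segments concatenated. Splitting at sentence boundaries keeps pauses and intonation natural at the seams between segments.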

Installation

clawhub install tts

API key required. Get a free key from ElevenLabs or OpenAI and add it to your OpenClaw configuration.

Tags: voice, TTS, audio, speech
