Speech & Transcription

64 skills

addis-assistant-stt - Provides Speech-to-Text (STT) and text
agent-voice - Command-line blogging platform for AI agents.
announcer - Announce text throughout the house via AirPlay speakers using Airfoil +.
assemblyai-transcribe - Transcribe audio/video with AssemblyAI
audio-gen - Generate audiobooks, podcasts, or educational audio content
audio-reply - Generate audio replies using TTS.
chichi-speech - A RESTful service for high-quality text-to-speech using Qwen3
claw-voice - You are connected to a live user session via voice.
clonev - Clone any voice and generate speech using Coqui XTTS v2.
critical-article-writer - Generate draft articles, outlines
cult-of-carcinization - Give your agent a voice — and ears.
deepdub-tts - Generate speech audio using Deepdub and attach it as a MEDIA
deepgram - Command-line interface for Deepgram speech-to-text.
doubao-api-open-tts - Text-to-Speech service using Doubao (Volcano Engine)
duby - Convert text to speech using Duby.so API.
eachlabs-voice-audio - TTS, STT, voice conversion using ElevenLabs, Whisper, RVC.
easyverein-api - Work with the easyVerein v2.0 REST API
edge-tts
elevenlabs-agents - Create, manage, and deploy ElevenLabs
elevenlabs-media - ElevenLabs music generation and speech-to-text...
elevenlabs-transcribe - Transcribe audio to text using ElevenLabs
elevenlabs-tts - ElevenLabs TTS - the best ElevenLabs integration for OpenClaw.
elevenlabs-voices - High-quality voice synthesis with 18 personas, 32
faster-whisper - Local speech-to-text using faster-whisper.
feishu-minutes - Fetch info, stats, transcript, and media from Feishu
freshbooks-cli - FreshBooks CLI for managing invoices, clients, and billing.
gettr-transcribe-summarize - Download audio from a GETTR post
inworld-tts - Text-to-speech via Inworld.ai API.
jarvis-voice - Metallic AI voice persona with TTS and visual transcript styling.
kokoro-tts - Generate spoken audio from text using the local Kokoro TTS engine.
llmwhisperer - Extract text and layout from images and PDFs using LLMWhisperer
local-stt - Local STT with selectable backends - Parakeet (best accuracy) or Whisper.
local-whisper - Local speech-to-text using OpenAI Whisper.
minimax-tts
mlx-whisper - Local speech-to-text with MLX Whisper
moodcast - Transform any text into emotionally expressive audio with ambient
openai-whisper - Local speech-to-text with the Whisper CLI (no API key).
openai-whisper-api - Transcribe audio via OpenAI Audio Transcriptions API
parakeet-mlx - Local speech-to-text with Parakeet MLX (ASR) for Apple Silicon
parakeet-stt
phone-voice - Connect ElevenLabs Agents to your OpenClaw via phone with Twilio.
plaud-unofficial - Use when accessing Plaud voice recorder data
pocket-transcripts - Read transcripts and summaries from Pocket AI
pocket-tts
qwen-tts - Local text-to-speech using Qwen3-TTS-12Hz-1.7B-CustomVoice.
ringg-voice-agent - Integrate Ringg AI voice agents with OpenClaw
routstr-balance-management - Manage Routstr balance by checking
sapi-tts - Windows SAPI5 text-to-speech with Neural voices.
sound-fx - Generate short sound effects via ElevenLabs SFX (text-to-sound).
spaces - Voice-first social spaces where Moltbook agents hang out.
transcribe - Transcribe audio files to text using local Whisper (Docker).
tts - Text-to-speech using Hume AI or OpenAI API.
tts-whatsapp - Send high-quality text-to-speech voice messages on WhatsApp in 40+
video-subtitles - Generate SRT subtitles from video/audio with translation
voice-agent - Local Voice Input/Output for Agents using the AI Voice Agent
voice-ai-agent - Create, manage, and deploy Voice.ai conversational AI
voice-ai-tts - High-quality voice synthesis with 9 personas, 11 languages
voice-ai-voices - High-quality voice synthesis with 9 personas, 11
voice-transcribe - Transcribe audio files using OpenAI's
voice-ui - Self-evolving voice assistant UI.
webchat-audio-notifications - Add browser audio notifications
whatsapp-voice-chat-integration-open-source - Real-time WhatsApp
whisper-mlx-local - Free local speech-to-text for Telegram and WhatsApp
x-voice-match - Analyze a Twitter/X account's posting style and generate
