# Audio & Voice

- [Overview](/guides/audio-and-voice/audio-voice.md)
- [Whisper Transcription](/guides/audio-and-voice/whisper-transcription.md): Transcribe audio and video with OpenAI Whisper on Clore.ai GPUs
- [WhisperX with Diarization](/guides/audio-and-voice/whisperx.md): Run WhisperX for fast speech transcription with word-level timestamps and speaker diarization on Clore.ai GPUs.
- [Bark TTS](/guides/audio-and-voice/bark-tts.md): Generate realistic speech and audio with Bark AI on Clore.ai
- [XTTS (Coqui)](/guides/audio-and-voice/xtts-coqui.md): Natural speech generation with voice cloning using Coqui XTTS
- [F5-TTS](/guides/audio-and-voice/f5-tts.md): Fast and fluent text-to-speech with F5-TTS on Clore.ai GPUs
- [Zonos TTS Voice Cloning](/guides/audio-and-voice/zonos-tts.md): Run Zonos TTS by Zyphra for voice cloning with emotion and pitch control on Clore.ai GPUs.
- [OpenVoice](/guides/audio-and-voice/openvoice-clone.md): Clone any voice from a few seconds of audio using OpenVoice on Clore.ai
- [RVC Voice Clone](/guides/audio-and-voice/rvc-voice-clone.md): Clone and convert voices with RVC on Clore.ai GPUs
- [Demucs Separation](/guides/audio-and-voice/demucs-separation.md): Separate music into vocals, drums, bass, and more with Demucs
- [AudioCraft Music](/guides/audio-and-voice/audiocraft-music.md): Generate music and audio with Meta's AudioCraft on Clore.ai
- [Stable Audio](/guides/audio-and-voice/stable-audio.md): Generate music and sound effects with Stable Audio on Clore.ai
- [Dia TTS (Nari Labs)](/guides/audio-and-voice/dia-tts.md): Generate multi-speaker dialog with emotion using Dia TTS by Nari Labs
- [Qwen3-TTS Voice Cloning](/guides/audio-and-voice/qwen3-tts.md): Multilingual voice cloning and TTS with Qwen3-TTS — 10+ languages, streaming, emotion control
- [Kokoro TTS](/guides/audio-and-voice/kokoro-tts.md): Run Kokoro TTS — an ultra-lightweight 82M-parameter text-to-speech model on Clore.ai GPUs.
- [ChatTTS Conversational Speech](/guides/audio-and-voice/chattts.md): Run ChatTTS conversational text-to-speech with fine-grained prosody control on Clore.ai GPUs.
- [Chatterbox Voice Cloning](/guides/audio-and-voice/chatterbox-tts.md): Run Chatterbox TTS by Resemble AI for zero-shot voice cloning and multilingual speech synthesis on Clore.ai GPUs.
- [Kani-TTS-2 Voice Cloning](/guides/audio-and-voice/kani-tts.md): Run Kani-TTS-2, an ultra-efficient 400M-parameter text-to-speech model with voice cloning, on Clore.ai GPUs
- [MiniMax Speech 2.6](/guides/audio-and-voice/minimax-speech.md): Deploy MiniMax Speech 2.6, an ultra-low-latency TTS for voice agents, on Clore.ai GPU servers
- [Fish Speech](/guides/audio-and-voice/fish-speech.md): Run Fish Speech multilingual TTS and zero-shot voice cloning on Clore.ai GPUs
- [StyleTTS2](/guides/audio-and-voice/styletss2.md): Run StyleTTS2 human-level text-to-speech via style diffusion on Clore.ai GPUs
- [MeloTTS](/guides/audio-and-voice/melotts.md): Run MeloTTS high-quality multilingual TTS with fast inference on Clore.ai GPUs
- [Voxtral TTS](/guides/audio-and-voice/voxtral-tts.md)
