# Audio & Voice

- [Overview](https://docs.clore.ai/guides/audio-and-voice/audio-voice.md)
- [Whisper Transcription](https://docs.clore.ai/guides/audio-and-voice/whisper-transcription.md): Transcribe audio and video with OpenAI Whisper on Clore.ai GPUs
- [WhisperX with Diarization](https://docs.clore.ai/guides/audio-and-voice/whisperx.md): Run WhisperX for fast speech transcription with word-level timestamps and speaker diarization on Clore.ai GPUs.
- [Bark TTS](https://docs.clore.ai/guides/audio-and-voice/bark-tts.md): Generate realistic speech and audio with Bark AI on Clore.ai
- [XTTS (Coqui)](https://docs.clore.ai/guides/audio-and-voice/xtts-coqui.md): Natural speech generation with voice cloning using Coqui XTTS
- [F5-TTS](https://docs.clore.ai/guides/audio-and-voice/f5-tts.md): Fast and fluent text-to-speech with F5-TTS on Clore.ai GPUs
- [Zonos TTS Voice Cloning](https://docs.clore.ai/guides/audio-and-voice/zonos-tts.md): Run Zonos TTS by Zyphra for voice cloning with emotion and pitch control on Clore.ai GPUs.
- [OpenVoice](https://docs.clore.ai/guides/audio-and-voice/openvoice-clone.md): Clone any voice from a few seconds of audio using OpenVoice on Clore.ai
- [RVC Voice Clone](https://docs.clore.ai/guides/audio-and-voice/rvc-voice-clone.md): Clone and convert voices with RVC on Clore.ai GPUs
- [Demucs Separation](https://docs.clore.ai/guides/audio-and-voice/demucs-separation.md): Separate music into vocals, drums, bass, and more with Demucs
- [AudioCraft Music](https://docs.clore.ai/guides/audio-and-voice/audiocraft-music.md): Generate music and audio with Meta's AudioCraft on Clore.ai
- [Stable Audio](https://docs.clore.ai/guides/audio-and-voice/stable-audio.md): Generate music and sound effects with Stable Audio on Clore.ai
- [Dia TTS (Nari Labs)](https://docs.clore.ai/guides/audio-and-voice/dia-tts.md): Generate multi-speaker dialog with emotion using Dia TTS by Nari Labs
- [Qwen3-TTS Voice Cloning](https://docs.clore.ai/guides/audio-and-voice/qwen3-tts.md): Multilingual voice cloning and TTS with Qwen3-TTS — 10+ languages, streaming, emotion control
- [Kokoro TTS](https://docs.clore.ai/guides/audio-and-voice/kokoro-tts.md): Run Kokoro TTS — an ultra-lightweight 82M-parameter text-to-speech model on Clore.ai GPUs.
- [ChatTTS Conversational Speech](https://docs.clore.ai/guides/audio-and-voice/chattts.md): Run ChatTTS conversational text-to-speech with fine-grained prosody control on Clore.ai GPUs.
- [Chatterbox Voice Cloning](https://docs.clore.ai/guides/audio-and-voice/chatterbox-tts.md): Run Chatterbox TTS by Resemble AI for zero-shot voice cloning and multilingual speech synthesis on Clore.ai GPUs.
- [Kani-TTS-2 Voice Cloning](https://docs.clore.ai/guides/audio-and-voice/kani-tts.md): Run Kani-TTS-2, an ultra-efficient 400M-parameter text-to-speech model with voice cloning, on Clore.ai GPUs
- [MiniMax Speech 2.6](https://docs.clore.ai/guides/audio-and-voice/minimax-speech.md): Deploy MiniMax Speech 2.6, an ultra-low-latency TTS for voice agents, on Clore.ai GPU servers
- [Fish Speech](https://docs.clore.ai/guides/audio-and-voice/fish-speech.md): Run Fish Speech multilingual TTS and zero-shot voice cloning on Clore.ai GPUs
- [StyleTTS2](https://docs.clore.ai/guides/audio-and-voice/styletss2.md): Run StyleTTS2 human-level text-to-speech via style diffusion on Clore.ai GPUs
- [MeloTTS](https://docs.clore.ai/guides/audio-and-voice/melotts.md): Run MeloTTS high-quality multilingual TTS with fast inference on Clore.ai GPUs
- [Voxtral TTS](https://docs.clore.ai/guides/audio-and-voice/voxtral-tts.md)
- [MOSS-TTS (CPU-only, 100M)](https://docs.clore.ai/guides/audio-and-voice/moss-tts.md): Run MOSS-TTS — ultra-lightweight 100M-parameter CPU-first multilingual text-to-speech from OpenMOSS (MOSI.AI + Fudan NLP) on Clore.ai.

