TTS Engine Comparison
Quick Decision Matrix
XTTS v2
Bark
Kokoro
Fish Speech
MeloTTS
Overview
XTTS v2
Bark
Kokoro
Fish Speech
MeloTTS
Quality Comparison
Naturalness Scores (MOS — Mean Opinion Score, 1-5)
Model | English MOS | Multilingual MOS | Expressiveness
What Each Model Does Best
Model | Standout Quality Feature
Speed Benchmarks
Characters Per Second (CPU vs GPU)
Model | CPU Speed | GPU Speed (RTX 3080) | Real-time Factor
Time to Generate 1 Minute of Audio
Model | CPU | RTX 3080 | A100
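The speed columns above are related through the real-time factor (RTF). The sketch below assumes the common convention that RTF is synthesis time divided by audio duration, so values below 1 mean faster than real time. The function names and sample numbers are illustrative only and are not measurements from this comparison.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def seconds_to_generate(audio_seconds: float, rtf: float) -> float:
    """Wall-clock time needed to produce `audio_seconds` of speech at a given RTF."""
    return audio_seconds * rtf


def chars_per_second(num_chars: int, synthesis_seconds: float) -> float:
    """Throughput in input characters processed per second of synthesis."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return num_chars / synthesis_seconds


# Hypothetical run: 900 characters became 60 s of audio after 12 s of compute.
rtf = real_time_factor(synthesis_seconds=12.0, audio_seconds=60.0)
minute_cost = seconds_to_generate(audio_seconds=60.0, rtf=rtf)
throughput = chars_per_second(num_chars=900, synthesis_seconds=12.0)
print(f"RTF={rtf:.2f}, 1 min of audio in {minute_cost:.1f}s, {throughput:.0f} chars/s")
```

Measuring all three values from the same run keeps the tables consistent with each other: the "time to generate 1 minute" figure is simply 60 multiplied by the RTF.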
Language Support
Supported Languages
Model | Languages | Notable
Language Quality Notes
Model | English | Chinese | Japanese | European
Voice Cloning Comparison
Cloning Capabilities
Model | Reference Length | Cloning Quality | Zero-Shot
XTTS v2 Voice Cloning
Fish Speech Voice Cloning
Bark Voice Presets
XTTS v2: Deep Dive
Architecture
Installation on Clore.ai
Docker Deployment
Bark: Deep Dive
Architecture
What Makes Bark Unique
Markup Language
Installation
Kokoro: Deep Dive
Architecture
Voices Available
Streaming Support
Fish Speech: Deep Dive
Architecture
Installation
Python API
Voice Cloning
MeloTTS: Deep Dive
Architecture
Accents and Languages
Batch Processing (Very Fast)
Deployment on Clore.ai
All-in-One TTS Server
VRAM Requirements Summary
Model | CPU | 4GB GPU | 8GB GPU | 16GB GPU
Integration Examples
OpenAI-Compatible API (for drop-in replacement)
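As a rough sketch of what "drop-in replacement" means in practice: a server that mimics OpenAI's speech endpoint accepts a POST to `/v1/audio/speech` with `model`, `input`, `voice`, and `response_format` fields and returns audio bytes. The base URL, model name, and voice name below are hypothetical placeholders, and which values a given self-hosted TTS server accepts is an assumption to check against that server's documentation.

```python
import json
import urllib.request


def build_speech_request(base_url: str, text: str, voice: str,
                         model: str = "tts-1",
                         response_format: str = "mp3",
                         api_key: str = "not-needed") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style text-to-speech request."""
    payload = {
        "model": model,                    # placeholder; server-specific
        "input": text,                     # text to synthesize
        "voice": voice,                    # placeholder; server-specific
        "response_format": response_format,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def synthesize(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())


# Hypothetical local server; nothing is sent until synthesize() is called.
req = build_speech_request("http://localhost:8000", "Hello from a self-hosted TTS.", "default")
print(req.get_method(), req.full_url)
# synthesize(req, "hello.mp3")  # uncomment once a compatible server is running
```

Because only the base URL changes, existing client code written against the hosted API can usually be pointed at the self-hosted server without other modifications, provided the server implements the same fields.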
LangChain Integration
When to Use Which
Decision Guide
By Application Type
Application | Best Choice | Why
License Summary
Model | License | Commercial? | Notes
Cost on Clore.ai
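Rental cost per unit of output follows directly from throughput. The sketch below assumes billing is purely per GPU-hour and that the GPU is fully utilized while rented. The prices and speeds are made-up inputs for illustration, not Clore.ai quotes or benchmark results.

```python
def cost_per_audio_hour(gpu_price_per_hour: float, rtf: float) -> float:
    """Cost to produce one hour of audio.

    One hour of audio takes `rtf` hours of GPU time, so cost = price * rtf.
    """
    return gpu_price_per_hour * rtf


def cost_per_million_chars(gpu_price_per_hour: float, chars_per_second: float) -> float:
    """Cost to synthesize one million input characters."""
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    gpu_hours = 1_000_000 / chars_per_second / 3600
    return gpu_hours * gpu_price_per_hour


# Hypothetical inputs: a GPU rented at $0.20/hour, RTF 0.25, 100 chars/s.
print(f"${cost_per_audio_hour(0.20, 0.25):.3f} per hour of audio")
print(f"${cost_per_million_chars(0.20, 100):.3f} per million characters")
```

Idle time between requests is not modeled here, so real costs for a lightly loaded server will be higher than these estimates.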
Useful Links
Summary
Model | Use When
Clore.ai GPU Recommendations
Use Case | Recommended GPU | Est. Cost on Clore.ai