Milvus

The most scalable open-source vector database for AI applications — built for billions of vectors

Milvus is an open-source vector database purpose-built for scalable similarity search and AI applications. Originally created by Zilliz and donated to the LF AI & Data Foundation, Milvus powers production AI workloads at companies including NVIDIA, AT&T, IBM, and Salesforce. It's the go-to choice when you need to scale to billions of vectors.

GitHub: milvus-io/milvus (32K+ ⭐)


Milvus vs Qdrant — When to Choose Which

| Criteria | Milvus | Qdrant |
|---|---|---|
| Scale | Billions of vectors | Hundreds of millions |
| Architecture | Distributed (multiple services) | Single binary |
| Setup complexity | Higher | Lower |
| GPU index support | ✅ Native GPU FAISS | Limited |
| Multi-tenancy | ✅ Partitions + aliases | Collection-based |
| Streaming ingestion | ✅ Kafka/Pulsar | Limited |
| Hybrid search | ✅ Dense + sparse | |
| Cloud-managed option | Zilliz Cloud | Qdrant Cloud |

Milvus Architecture

Milvus in standalone mode (single server) includes:

  • milvus — the main service (proxy, query, data, index coordinators)

  • etcd — metadata storage and service discovery

  • MinIO — object storage for segment data

In distributed mode (cluster), each component scales independently.


Prerequisites

  • Clore.ai account with GPU rental

  • Docker Compose (usually pre-installed)

  • Basic Python knowledge

  • 16GB+ RAM (32GB recommended for production)


Step 1 — Rent a GPU Server on Clore.ai

  1. Go to clore.ai → Marketplace

  2. Recommended GPU: RTX 4090 or A100 for GPU-accelerated indexing

  3. CPU alternative: Any server with 32GB+ RAM for CPU-based indexing

Minimum Requirements:

  • CPU: 8 cores

  • RAM: 16GB (32GB recommended)

  • Disk: 50GB SSD/NVMe

  • GPU: Optional (required only for GPU index types)
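To relate the RAM figures above to a concrete collection, the following is a rough sizing sketch. It assumes float32 embeddings (4 bytes per dimension) and counts only the raw vectors; index structures, scalar fields, and Milvus's own overhead come on top. The 768-dimension workloads are hypothetical.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Size of the raw vectors alone, assuming float32 (4 bytes per dimension)."""
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / (1024 ** 3)


# Hypothetical workloads: 768-dimensional embeddings at three collection sizes.
for n in (1_000_000, 10_000_000, 100_000_000):
    size = to_gib(raw_vector_bytes(n, 768))
    print(f"{n:>11,} vectors x 768 dims = {size:.1f} GiB of raw float32 data")
```

If the estimate exceeds the RAM of the server you plan to rent, a compressed or disk-based index type is likely the safer choice.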

GPU index types in Milvus (for example GPU_IVF_FLAT and GPU_IVF_PQ) require CUDA-capable GPUs and substantially speed up index building for large collections. If you frequently index 10M+ vectors, the shorter build times can offset the cost of renting a GPU.


Step 2 — Deploy Milvus Standalone

Docker Image:

Milvus standalone requires etcd and MinIO. Use Docker Compose for the easiest setup.

Ports:

  • Port 19530: Milvus SDK/gRPC port (primary)

  • Port 9091: Milvus REST API and health check (internal)
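Once the containers are running, a quick way to confirm both ports respond is a small standard-library check. This is a sketch under two assumptions: the server is reachable as `localhost`, and the health check on port 9091 is served at the `/healthz` path (the path used by the healthcheck in the official Compose file; verify against your Milvus version).

```python
import socket
import urllib.error
import urllib.request


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def milvus_healthy(host: str = "localhost", port: int = 9091, timeout: float = 2.0) -> bool:
    """Return True if GET /healthz answers with HTTP 200 (path is an assumption)."""
    url = f"http://{host}:{port}/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    print("gRPC port 19530 reachable:", port_open("localhost", 19530))
    print("health endpoint OK:", milvus_healthy())
```

Both functions return `False` rather than raising, so they can be dropped into a retry loop while the services start.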

Environment Variables:


Step 3 — Set Up with Docker Compose

SSH into your Clore.ai server and create the compose file:

Customize docker-compose.yml

Start Milvus


Step 4 — Install Python Client


Step 5 — Create a Collection

In Milvus, a collection is similar to a database table. It has a schema with typed fields including vector fields.
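As an illustration of such a schema, here is a minimal sketch using the pymilvus ORM-style API (`connections`, `FieldSchema`, `CollectionSchema`, `Collection`). The collection name `documents`, the field names, and the 768 dimension are placeholders chosen for this example, not values from the guide. The pymilvus import sits inside the function so the field layout can be inspected without a running server.

```python
# Field layout as plain data; names and the 768 dimension are illustrative.
FIELDS = [
    {"name": "id", "dtype": "INT64", "is_primary": True, "auto_id": True},
    {"name": "text", "dtype": "VARCHAR", "max_length": 2048},
    {"name": "embedding", "dtype": "FLOAT_VECTOR", "dim": 768},
]


def create_collection(name: str = "documents", host: str = "localhost", port: str = "19530"):
    """Create the collection on a running Milvus server (needs `pip install pymilvus`)."""
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

    connections.connect(alias="default", host=host, port=port)
    fields = []
    for spec in FIELDS:
        # Everything except name/dtype is passed through as a field option.
        options = {k: v for k, v in spec.items() if k not in ("name", "dtype")}
        fields.append(
            FieldSchema(name=spec["name"], dtype=getattr(DataType, spec["dtype"]), **options)
        )
    schema = CollectionSchema(fields=fields, description="Text chunks with embeddings")
    return Collection(name=name, schema=schema)
```

Calling `create_collection()` against a live server returns a `Collection` handle that the later steps can index, fill, and search.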


Step 6 — Create Index

Before loading data for search, create an appropriate index:
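A sketch of that step, assuming a pymilvus `Collection` object with a vector field named `embedding` (a placeholder name). The build parameters shown are common starting points rather than tuned values, and the COSINE metric assumes a Milvus release that supports it.

```python
def index_params(index_type: str = "HNSW", metric_type: str = "COSINE") -> dict:
    """Build the dict that pymilvus expects for create_index."""
    build_params = {
        "HNSW": {"M": 16, "efConstruction": 200},  # graph degree / build-time search width
        "IVF_FLAT": {"nlist": 1024},               # number of clusters
        "FLAT": {},                                # brute force, nothing to tune
    }
    if index_type not in build_params:
        raise ValueError(f"no defaults defined for {index_type!r}")
    return {
        "index_type": index_type,
        "metric_type": metric_type,
        "params": build_params[index_type],
    }


def build_index(collection, field_name: str = "embedding", **kwargs) -> None:
    """Create the index, then load the collection so it can serve searches."""
    collection.create_index(field_name=field_name, index_params=index_params(**kwargs))
    collection.load()
```

For example, `build_index(collection, index_type="IVF_FLAT")` would create an IVF_FLAT index with 1024 clusters and load the collection into memory.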


Step 7 — Insert Data
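One way to insert data is in fixed-size batches, which keeps each request small. The sketch below assumes a pymilvus `Collection` whose `insert` accepts a list of row dicts (row-based inserts, available in recent pymilvus releases) and a schema with `text` and `embedding` fields; those field names, the batch size, and the random vectors are placeholders.

```python
import random


def batched(rows: list, size: int):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_rows(collection, rows: list, batch_size: int = 1000) -> int:
    """Insert row dicts batch by batch and flush once at the end."""
    inserted = 0
    for chunk in batched(rows, batch_size):
        result = collection.insert(chunk)
        inserted += result.insert_count
    collection.flush()
    return inserted


# Placeholder data: random vectors stand in for real embeddings.
rows = [
    {"text": f"document {i}", "embedding": [random.random() for _ in range(768)]}
    for i in range(2500)
]
```

With a live collection, `insert_rows(collection, rows)` would send three batches (1000, 1000, and 500 rows) and return the total number of inserted rows.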


Step 8 — Search and Query
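A minimal vector search sketch, assuming a loaded pymilvus `Collection` with an `embedding` vector field and a `text` output field (both placeholder names). The `ef` and `nprobe` values are illustrative search-time settings; higher values generally trade speed for recall.

```python
from typing import Optional


def search_params(index_type: str = "HNSW", metric_type: str = "COSINE") -> dict:
    """Search-time parameters matching the index type used at build time."""
    tuning = {"HNSW": {"ef": 64}, "IVF_FLAT": {"nprobe": 16}}
    return {"metric_type": metric_type, "params": tuning.get(index_type, {})}


def top_k(collection, query_vector: list, k: int = 5, expr: Optional[str] = None) -> list:
    """Return (id, distance, text) tuples for the k nearest neighbours of one query."""
    results = collection.search(
        data=[query_vector],        # one query vector per inner list
        anns_field="embedding",
        param=search_params(),
        limit=k,
        expr=expr,                  # optional scalar filter expression
        output_fields=["text"],
    )
    return [(hit.id, hit.distance, hit.entity.get("text")) for hit in results[0]]
```

Passing a filter such as `expr="id > 100"` restricts the search to rows that match the scalar condition.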

Hybrid Search (Dense + Sparse)
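Hybrid search runs a dense and a sparse query and then merges the two ranked lists. The sketch below illustrates one common merging rule, reciprocal rank fusion (RRF), in plain Python. It is meant to show the idea, not Milvus's internal implementation; recent pymilvus releases are understood to offer server-side fusion through `hybrid_search` with a ranker object, so check the client documentation for your version. The document IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list, k: int = 60, limit: int = 10) -> list:
    """Fuse ranked ID lists: score(id) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc_id for doc_id, _ in ordered[:limit]]


dense_hits = ["d3", "d1", "d7"]   # hypothetical IDs ranked by dense similarity
sparse_hits = ["d1", "d9", "d3"]  # hypothetical IDs ranked by sparse (keyword) score
print(rrf_fuse([dense_hits, sparse_hits]))
```

Documents that appear in both lists accumulate score from each, so they tend to rise above documents found by only one retriever.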


Step 9 — Build a RAG Service
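The core of a RAG service is a short loop: embed the question, retrieve the nearest passages from Milvus, and pass them to a language model as context. The sketch below shows that flow with caller-supplied callables; `embed`, `retrieve`, and `generate` are hypothetical stand-ins for your embedding model, a Milvus search function, and an LLM client, and the prompt wording is only an example.

```python
def build_prompt(question: str, passages: list, max_chars: int = 4000) -> str:
    """Number the retrieved passages and place them ahead of the question."""
    context, used = [], 0
    for i, passage in enumerate(passages, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before the context budget is exceeded
        context.append(entry)
        used += len(entry)
    joined = "\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question: str, embed, retrieve, generate, k: int = 5) -> str:
    """embed, retrieve and generate are caller-supplied callables (hypothetical)."""
    passages = retrieve(embed(question), k)
    return generate(build_prompt(question, passages))
```

Keeping the three stages as separate callables makes it straightforward to swap the embedding model or the generator without touching the retrieval code.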


Step 10 — Monitor and Manage


Troubleshooting

Services Not Starting

Connection Refused on 19530

Index Build Timeout for Large Collections

High Memory Usage


Index Type Selection Guide

| Index Type | Best For | Memory | Speed | GPU Required |
|---|---|---|---|---|
| FLAT | Small (<1M), exact search | High | Slow | No |
| IVF_FLAT | Medium (1M–10M) | Medium | Good | No |
| HNSW | Low latency, <100M | High | Excellent | No |
| IVF_SQ8 | Compressed, large | Low | Good | No |
| GPU_IVF_FLAT | Fast batch queries | GPU+RAM | Best | Yes |
| DISKANN | Billion-scale | Low (disk) | Good | No |
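The guide above can be turned into a simple decision helper. This is one possible encoding of the table, not an official rule: the thresholds come from the "Best For" column, and the order in which the conditions are checked is an assumption made for this sketch.

```python
def suggest_index(
    num_vectors: int,
    gpu: bool = False,
    low_latency: bool = False,
    memory_constrained: bool = False,
) -> str:
    """Suggest an index type from collection size and constraints."""
    if num_vectors < 1_000_000:
        return "FLAT"          # small collections: exact search is affordable
    if gpu:
        return "GPU_IVF_FLAT"  # fast batch queries when a CUDA GPU is available
    if num_vectors >= 100_000_000:
        return "DISKANN"       # beyond the HNSW range: keep most data on disk
    if memory_constrained:
        return "IVF_SQ8"       # compressed vectors, lower memory
    if low_latency:
        return "HNSW"          # best latency below 100M vectors
    if num_vectors <= 10_000_000:
        return "IVF_FLAT"      # medium collections
    return "HNSW"


print(suggest_index(5_000_000))
print(suggest_index(50_000_000, memory_constrained=True))
```

Treat the result as a starting point and benchmark with your own data before committing to an index type.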


Performance Benchmarks

| Collection Size | Index | GPU | QPS |
|---|---|---|---|
| 1M vectors | HNSW | RTX 3090 | ~8,000 |
| 10M vectors | IVF_FLAT | RTX 4090 | ~2,500 |
| 10M vectors | GPU_IVF_FLAT | A100 | ~12,000 |
| 100M vectors | DISKANN | A100 | ~1,200 |


Additional Resources


Milvus on Clore.ai is a strong fit for AI applications that need to scale beyond hundreds of millions of vectors. Combined with GPU-accelerated embedding generation, it lets you build semantic search and RAG systems at a fraction of managed cloud costs.


Clore.ai GPU Recommendations

| Use Case | Recommended GPU | Est. Cost on Clore.ai |
|---|---|---|
| Development/Testing | RTX 3090 (24GB) | ~$0.12/gpu/hr |
| Production Vector Search | RTX 3090 (24GB) | ~$0.12/gpu/hr |
| High-throughput Embedding | RTX 4090 (24GB) | ~$0.70/gpu/hr |

💡 All examples in this guide can be deployed on Clore.ai GPU servers. Browse available GPUs and rent by the hour, with no commitments and full root access.
