# Vector Database Comparison

Choose the right vector database for your AI applications on Clore.ai GPU servers.

{% hint style="info" %}
**Vector databases** store and retrieve high-dimensional embeddings efficiently — the core infrastructure for RAG systems, semantic search, and recommendation engines. This guide compares the four most popular open-source options.
{% endhint %}
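
At its core, a vector database ranks stored embeddings by similarity to a query embedding. A purely illustrative stdlib sketch of cosine-similarity search over toy 3-dimensional vectors (real databases replace this exhaustive scan with ANN indexes such as HNSW):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, docs, k=2):
    # Rank stored (id, vector) pairs by similarity to the query vector
    scored = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

docs = [
    ("doc1", [1.0, 0.0, 0.0]),
    ("doc2", [0.0, 1.0, 0.0]),
    ("doc3", [0.9, 0.1, 0.0]),
]
print(top_k([1.0, 0.0, 0.0], docs))  # ['doc1', 'doc3']
```

Every database below does this same ranking, just at scale: with persistent storage, approximate indexes, and metadata filtering on top.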

***

## Quick Decision Matrix

|                   | ChromaDB               | Qdrant             | Milvus               | Weaviate             |
| ----------------- | ---------------------- | ------------------ | -------------------- | -------------------- |
| **Best for**      | Prototyping, local dev | Production RAG     | Billion-scale search | Knowledge graphs     |
| **Deployment**    | Embedded/Server        | Server/Cloud       | Server/Cloud         | Server/Cloud         |
| **Scalability**   | Single-node            | Multi-node         | Distributed          | Distributed          |
| **GitHub stars**  | 17K+                   | 21K+               | 31K+                 | 12K+                 |
| **License**       | Apache 2.0             | Apache 2.0         | Apache 2.0           | BSD 3-Clause         |
| **Managed cloud** | No                     | Yes (Qdrant Cloud) | Yes (Zilliz)         | Yes (Weaviate Cloud) |
| **Language**      | Python                 | Rust               | Go                   | Go                   |

***

## Overview

### ChromaDB

ChromaDB is the simplest vector database — designed for rapid prototyping and small-to-medium scale applications. It can run entirely in-memory or persist to disk.

**Philosophy**: Zero configuration, maximum developer experience.

```python
import chromadb

client = chromadb.PersistentClient(path="/data/chroma")
collection = client.create_collection("my_docs")

collection.add(
    documents=["Machine learning is great", "Deep learning uses neural networks"],
    ids=["doc1", "doc2"]
)

results = collection.query(
    query_texts=["What is AI?"],
    n_results=2
)
```

### Qdrant

Qdrant is a production-ready vector search engine written in Rust. It focuses on performance, filtering, and operational simplicity.

**Philosophy**: Production performance without operational complexity.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct, Filter, FieldCondition, MatchValue

client = QdrantClient("localhost", port=6333)
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(id=1, vector=[...], payload={"text": "document 1"}),
    ]
)

results = client.search(
    collection_name="my_collection",
    query_vector=[...],
    limit=10,
    query_filter=Filter(must=[FieldCondition(key="category", match=MatchValue(value="tech"))])
)
```

### Milvus

Milvus is the most scalable open-source vector database, designed for billion-scale deployments. It has a distributed architecture with Kubernetes support.

**Philosophy**: Massive scale, cloud-native.

```python
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

connections.connect("default", host="localhost", port=19530)

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=1536),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65535),
]
schema = CollectionSchema(fields)
collection = Collection("my_collection", schema)

# Insert data column-wise: ids, then embeddings (1536-dim lists), then texts
collection.insert([[1, 2], embeddings, texts])
collection.create_index("embedding", {"metric_type": "COSINE", "index_type": "IVF_FLAT"})
collection.load()

results = collection.search(
    data=[query_embedding],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"nprobe": 10}},
    limit=10
)
```

### Weaviate

Weaviate combines vector search with knowledge graphs and a GraphQL API. It supports multi-modal search (text, images, audio) out of the box.

**Philosophy**: Schema-rich, multi-modal, knowledge graph capabilities.

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # weaviate-client v3 API

# Define schema with classes
client.schema.create_class({
    "class": "Document",
    "vectorizer": "text2vec-transformers",
    "properties": [
        {"name": "content", "dataType": ["text"]},
        {"name": "category", "dataType": ["string"]}
    ]
})

# Insert with auto-vectorization
client.data_object.create(
    {"content": "Machine learning tutorial", "category": "tech"},
    "Document"
)

# Semantic search
result = client.query.get("Document", ["content", "category"])\
    .with_near_text({"concepts": ["artificial intelligence"]})\
    .with_limit(5)\
    .do()
```

***

## Performance Benchmarks

### ANN Benchmarks (ann-benchmarks.com, 2024)

#### 1M vectors, 768 dimensions, Cosine similarity

| Database        | QPS (1 thread) | Recall\@10 | Build Time | Index Size |
| --------------- | -------------- | ---------- | ---------- | ---------- |
| ChromaDB (HNSW) | \~2,000        | 98.5%      | 45s        | 2.1GB      |
| Qdrant (HNSW)   | \~8,500        | 99.1%      | 32s        | 1.8GB      |
| Milvus (HNSW)   | \~12,000       | 98.9%      | 28s        | 1.9GB      |
| Weaviate (HNSW) | \~6,000        | 98.7%      | 38s        | 2.0GB      |

#### 10M vectors (scalability test)

| Database | QPS     | RAM Usage      | Notes                  |
| -------- | ------- | -------------- | ---------------------- |
| ChromaDB | \~800   | 22GB           | Struggles at scale     |
| Qdrant   | \~5,200 | 18GB           | Good with quantization |
| Milvus   | \~9,800 | 15GB (indexed) | Best at scale          |
| Weaviate | \~3,500 | 21GB           | Moderate               |

{% hint style="info" %}
**Benchmarks are guides, not gospel.** Performance varies greatly based on index type, hardware, vector dimensions, and query patterns. Always benchmark with your own data.
{% endhint %}

### Filtering Performance (Filtered ANN search)

Filtered search (vector similarity + metadata filter) is crucial for production RAG:

| Database | Filtered QPS | Pre-filter               | Post-filter |
| -------- | ------------ | ------------------------ | ----------- |
| ChromaDB | \~500        | ❌                        | ✅           |
| Qdrant   | \~6,000      | ✅ (HNSW + payload index) | ✅           |
| Milvus   | \~8,000      | ✅                        | ✅           |
| Weaviate | \~3,000      | ✅ (inverted index)       | ✅           |

**Winner for filtered search**: Qdrant and Milvus — both apply the filter during index traversal (true pre-filtering), avoiding the recall loss and dropped results of post-filtering.
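
The distinction matters in practice: post-filtering fetches the top-k nearest neighbors first and only then drops non-matching results, so it can return fewer than k hits, while pre-filtering restricts the candidate set before the similarity search. A toy stdlib sketch (exhaustive scan, illustrative only — function and variable names are mine):

```python
def search(query, points, k, pred=None, post_filter=False):
    """points: list of (id, vector, metadata); pred: metadata predicate."""
    dist = lambda v: sum((a - b) ** 2 for a, b in zip(query, v))  # squared L2
    if pred and not post_filter:
        points = [p for p in points if pred(p[2])]       # pre-filter: shrink candidates first
    hits = sorted(points, key=lambda p: dist(p[1]))[:k]  # nearest k candidates
    if pred and post_filter:
        hits = [p for p in hits if pred(p[2])]           # post-filter: may drop below k
    return [p[0] for p in hits]

points = [
    (1, [0.0], {"cat": "tech"}), (2, [0.1], {"cat": "news"}),
    (3, [0.2], {"cat": "news"}), (4, [0.9], {"cat": "tech"}),
]
is_tech = lambda m: m["cat"] == "tech"
print(search([0.0], points, k=2, pred=is_tech))                    # [1, 4]
print(search([0.0], points, k=2, pred=is_tech, post_filter=True))  # [1] — lost a hit
```

With a selective filter, post-filtering silently returns fewer results; pre-filtering always fills k when enough matching points exist.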

***

## Feature Comparison

### Storage and Indexing

| Feature              | ChromaDB | Qdrant | Milvus | Weaviate |
| -------------------- | -------- | ------ | ------ | -------- |
| HNSW index           | ✅        | ✅      | ✅      | ✅        |
| IVF index            | ❌        | ❌      | ✅      | ❌        |
| DiskANN              | ❌        | ❌      | ✅      | ❌        |
| Scalar quantization  | ❌        | ✅      | ✅      | ✅        |
| Product quantization | ❌        | ✅      | ✅      | ❌        |
| Binary quantization  | ❌        | ✅      | ✅      | ✅        |
| On-disk storage      | ✅        | ✅      | ✅      | ✅        |
| Mmap                 | ❌        | ✅      | ✅      | ✅        |

### Query Capabilities

| Feature                     | ChromaDB  | Qdrant   | Milvus   | Weaviate    |
| --------------------------- | --------- | -------- | -------- | ----------- |
| Vector similarity           | ✅         | ✅        | ✅        | ✅           |
| Hybrid search (BM25+vector) | ❌         | ✅        | ✅        | ✅           |
| Metadata filtering          | ✅ (basic) | ✅ (rich) | ✅ (rich) | ✅ (GraphQL) |
| Keyword search              | ❌         | ✅        | ✅        | ✅           |
| Multi-vector search         | ❌         | ✅        | ✅        | ✅           |
| Sparse vectors (SPLADE)     | ❌         | ✅        | ✅        | ✅           |
| Named vectors               | ❌         | ✅        | ✅        | ✅           |

### Operational Features

| Feature                 | ChromaDB | Qdrant | Milvus | Weaviate |
| ----------------------- | -------- | ------ | ------ | -------- |
| REST API                | ✅        | ✅      | ✅      | ✅        |
| gRPC API                | ❌        | ✅      | ✅      | ✅ (1.23+) |
| GraphQL API             | ❌        | ❌      | ❌      | ✅        |
| Authentication          | Basic    | ✅      | ✅      | ✅        |
| RBAC                    | ❌        | ✅      | ✅      | ✅        |
| Horizontal scaling      | ❌        | ✅      | ✅      | ✅        |
| Kubernetes support      | ❌        | ✅      | ✅      | ✅        |
| Snapshots/Backup        | ❌        | ✅      | ✅      | ✅        |
| Monitoring (Prometheus) | ❌        | ✅      | ✅      | ✅        |

***

## ChromaDB: Deep Dive

### Strengths

✅ **Simplest setup** — `pip install chromadb` and you're done\
✅ **Embedded mode** — no separate server process\
✅ **Auto-embedding** — built-in embedding models\
✅ **LangChain/LlamaIndex** native integration\
✅ **Zero config** — great for prototyping

### Weaknesses

❌ **Limited scale** — struggles beyond 1-2M vectors\
❌ **No distributed mode** — single node only\
❌ **Limited filtering** — no pre-filtering\
❌ **No quantization** — higher memory usage\
❌ **Slow at scale** — Python-based operations

### Deployment on Clore.ai

```bash
# Client/server mode
docker run -d \
  --name chromadb \
  -p 8000:8000 \
  -v $(pwd)/chroma-data:/chroma/chroma \
  chromadb/chroma:latest

# Test (recent images serve the v2 API; older ones use /api/v1/heartbeat)
curl http://localhost:8000/api/v2/heartbeat
```

**Best for**: Jupyter notebooks, rapid RAG prototypes, <1M vectors

***

## Qdrant: Deep Dive

### Strengths

✅ **Best filtering** — true pre-filtered vector search\
✅ **Rust performance** — extremely fast, low latency\
✅ **Quantization** — binary/scalar reduces memory 4-32×\
✅ **Sparse vectors** — hybrid dense+sparse search\
✅ **Simple ops** — single binary, no dependencies\
✅ **Good documentation** — excellent guides and examples
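
The 32× figure for binary quantization comes from replacing each 32-bit float with a single bit (its sign), after which distance becomes a cheap Hamming count. A stdlib sketch of the idea (illustrative only — Qdrant's real implementation also keeps original vectors for rescoring):

```python
def binarize(vec):
    # 1 bit per dimension: keep only the sign of each float32 component (32x smaller)
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    # Number of differing sign bits — a cheap proxy for angular distance
    return bin(a ^ b).count("1")

q  = binarize([0.3, -0.1, 0.8, -0.5])
d1 = binarize([0.4, -0.2, 0.9, -0.4])  # same sign pattern as q
d2 = binarize([-0.3, 0.1, -0.8, 0.5])  # opposite sign pattern
print(hamming(q, d1), hamming(q, d2))  # 0 4
```

Hamming distance on packed bits is a few CPU instructions per vector, which is why quantized search is both smaller and faster than full-precision scans.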

### Weaknesses

❌ **Single-writer** in free tier (no distributed writes)\
❌ **Smaller ecosystem** than Milvus\
❌ **No GraphQL** — REST/gRPC only

### Deployment on Clore.ai

```bash
# Simple deployment
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v $(pwd)/qdrant-storage:/qdrant/storage \
  qdrant/qdrant:latest

# With authentication
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -e QDRANT__SERVICE__API_KEY=your-secret-key \
  -v $(pwd)/qdrant-storage:/qdrant/storage \
  qdrant/qdrant:latest

# Test (Kubernetes-style health endpoint)
curl http://localhost:6333/healthz
```

**Best for**: Production RAG, filtered search, 1-100M vectors

***

## Milvus: Deep Dive

### Strengths

✅ **Massive scale** — tested to 10B+ vectors\
✅ **Distributed** — cloud-native Kubernetes architecture\
✅ **Most index types** — IVF, HNSW, DiskANN, ScaNN\
✅ **GPU acceleration** — GPU-powered index building\
✅ **Enterprise features** — RBAC, audit logs, encryption\
✅ **Zilliz Cloud** — fully managed option

### Weaknesses

❌ **Complex deployment** — requires etcd and MinIO (plus Pulsar/Kafka in cluster mode)\
❌ **Resource heavy** — minimum 3 nodes recommended\
❌ **Steeper learning curve** — more concepts to understand\
❌ **Overkill for small scale** — don't use for <1M vectors

### Deployment on Clore.ai (Standalone)

```yaml
# docker-compose.yml for Milvus standalone
version: "3.8"
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
    command: etcd -advertise-client-urls=http://etcd:2379 -listen-client-urls=http://0.0.0.0:2379

  minio:
    image: minio/minio:RELEASE.2023-03-13T19-46-17Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    command: minio server /minio_data --console-address ":9001"

  milvus:
    image: milvusdb/milvus:v2.4.0
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - etcd
      - minio
```

```bash
docker compose up -d
# Takes ~60 seconds to fully start
```

**Best for**: Large-scale production, 100M+ vectors, enterprise deployments

***

## Weaviate: Deep Dive

### Strengths

✅ **Multi-modal** — text, images, audio, video\
✅ **Auto-vectorization** — built-in model integrations\
✅ **GraphQL API** — rich querying with graph traversal\
✅ **Module system** — pluggable vectorizers and readers\
✅ **Hybrid search** — BM25 + vector out of the box\
✅ **Generative search** — built-in RAG with generate module

### Weaknesses

❌ **Higher memory** — schema-aware storage is larger\
❌ **GraphQL-first API** — gRPC support is recent (v1.23+); the GraphQL layer adds latency at high QPS\
❌ **Complex schema** — requires upfront class definition\
❌ **Slower at extreme scale** than Milvus

### Deployment on Clore.ai

```bash
# Simple deployment
docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -p 50051:50051 \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
  -e DEFAULT_VECTORIZER_MODULE=none \
  -e CLUSTER_HOSTNAME=node1 \
  -v $(pwd)/weaviate-data:/var/lib/weaviate \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0

# With transformer vectorizer
docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -e DEFAULT_VECTORIZER_MODULE=text2vec-transformers \
  -e TRANSFORMERS_INFERENCE_API=http://t2v-transformers:8080 \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0
```

**Best for**: Multi-modal search, knowledge graphs, generative search

***

## When to Use Which

### Scale-Based Decision

```
< 100K vectors    → ChromaDB (embedded)
100K - 10M        → Qdrant (best balance)
10M - 1B          → Milvus or Qdrant (clustered)
1B+               → Milvus (distributed)
```
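
The thresholds above translate into a trivial helper (function name is mine; the limits are rules of thumb from the chart, not hard boundaries):

```python
def pick_db(num_vectors: int) -> str:
    # Thresholds mirror the scale chart above — rough guidance only
    if num_vectors < 100_000:
        return "ChromaDB (embedded)"
    if num_vectors < 10_000_000:
        return "Qdrant"
    if num_vectors < 1_000_000_000:
        return "Milvus or Qdrant (clustered)"
    return "Milvus (distributed)"

print(pick_db(50_000))     # ChromaDB (embedded)
print(pick_db(5_000_000))  # Qdrant
```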

### Use-Case-Based Decision

| Use Case        | Best Choice        | Why                          |
| --------------- | ------------------ | ---------------------------- |
| RAG prototype   | ChromaDB           | Zero setup, simple API       |
| Production RAG  | Qdrant             | Fast filtering, simple ops   |
| Semantic search | Qdrant or Milvus   | Best performance             |
| Multi-modal     | Weaviate           | Built-in image/audio support |
| Knowledge graph | Weaviate           | Graph traversal queries      |
| Billion-scale   | Milvus             | Distributed architecture     |
| Hybrid search   | Qdrant or Weaviate | BM25 + vector                |
| Enterprise      | Milvus or Weaviate | RBAC, audit logs             |

***

## Memory Requirements on Clore.ai

### RAM Estimation Formula

```
RAM needed ≈ (vectors × dimensions × 4 bytes) × 1.5 (overhead)

Example: 1M vectors × 1536 dims × 4 bytes × 1.5 = 9.2GB RAM

With quantization (Qdrant binary):
1M × 1536 / 8 × 1.5 = 0.29GB RAM (32× compression!)
```
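
The formula translates directly to code (the 4-bytes-per-dimension and 1.5× overhead factors come from the formula above; real overhead varies with index settings):

```python
def ram_gb(vectors: int, dims: int, bytes_per_dim: float = 4.0, overhead: float = 1.5) -> float:
    # float32 = 4 bytes/dim; binary quantization stores 1 bit/dim (1/8 byte)
    return vectors * dims * bytes_per_dim * overhead / 1e9

print(round(ram_gb(1_000_000, 1536), 1))                       # 9.2  (float32)
print(round(ram_gb(1_000_000, 1536, bytes_per_dim=1 / 8), 2))  # 0.29 (binary quantized)
```

Plug in your own vector count and embedding dimension before picking a server tier.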

### Recommended Server Specs

| Dataset Size | ChromaDB | Qdrant   | Milvus   | Weaviate |
| ------------ | -------- | -------- | -------- | -------- |
| 1M vectors   | 16GB RAM | 8GB RAM  | 32GB RAM | 16GB RAM |
| 10M vectors  | ❌        | 32GB RAM | 64GB RAM | 48GB RAM |
| 100M vectors | ❌        | 128GB+   | 256GB+   | 256GB+   |

***

## Quick Comparison: Docker Setup Time

| Database | `docker run` to ready | Dependencies      |
| -------- | --------------------- | ----------------- |
| ChromaDB | \~5 seconds           | None              |
| Qdrant   | \~3 seconds           | None              |
| Milvus   | \~60 seconds          | etcd + MinIO      |
| Weaviate | \~15 seconds          | None (standalone) |

***

## Pricing (Self-Hosted on Clore.ai)

All four databases are **free** to self-host. Cost is just Clore.ai server rental:

```
Example: 1M vectors RAG system
- Qdrant: 8GB RAM server ~$0.10/hr
- ChromaDB: 16GB RAM server ~$0.15/hr  
- Weaviate: 16GB RAM server ~$0.15/hr
- Milvus: 32GB RAM server ~$0.30/hr (+ overhead for etcd/minio)
```

***

## Useful Links

* [ChromaDB Docs](https://docs.trychroma.com)
* [Qdrant Docs](https://qdrant.tech/documentation)
* [Milvus Docs](https://milvus.io/docs)
* [Weaviate Docs](https://weaviate.io/developers/weaviate)
* [ANN Benchmarks](https://ann-benchmarks.com)
* [Vector DB Benchmark by Qdrant](https://qdrant.tech/benchmarks)

***

## Summary

| Start with... | If you need...                                          |
| ------------- | ------------------------------------------------------- |
| **ChromaDB**  | Quick prototype, <1M vectors, minimal setup             |
| **Qdrant**    | Production RAG, great filtering, operational simplicity |
| **Milvus**    | Billion-scale, enterprise, distributed architecture     |
| **Weaviate**  | Multi-modal, knowledge graphs, GraphQL querying         |

For most production RAG applications on Clore.ai, **Qdrant** offers the best balance of performance, features, and operational simplicity. For large-scale or enterprise needs, **Milvus** is the industry standard.

***

## Clore.ai GPU Recommendations

| Use Case            | Recommended GPU | Est. Cost on Clore.ai |
| ------------------- | --------------- | --------------------- |
| Development/Testing | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production          | RTX 4090 (24GB) | \~$0.70/gpu/hr        |
| Large Scale         | A100 80GB       | \~$1.20/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.
