# Vector Database Comparison

Choose the right vector database for your AI applications on Clore.ai GPU servers.

{% hint style="info" %}
**Vector databases** store and retrieve high-dimensional embeddings efficiently — the core infrastructure for RAG systems, semantic search, and recommendation engines. This guide compares the four most popular open-source options.
{% endhint %}
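Every engine compared here ranks stored embeddings by similarity to a query embedding, most commonly cosine similarity. As a quick refresher on the metric itself, here is a minimal pure-Python sketch (no external dependencies):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Same direction -> 1.0; orthogonal -> 0.0
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 5.0]))  # 0.0
```

In production the databases below never brute-force this over every vector; they use approximate indexes (HNSW, IVF) to get near-exact results at a fraction of the cost.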

***

## Quick Decision Matrix

|                   | ChromaDB               | Qdrant             | Milvus               | Weaviate             |
| ----------------- | ---------------------- | ------------------ | -------------------- | -------------------- |
| **Best for**      | Prototyping, local dev | Production RAG     | Billion-scale search | Knowledge graphs     |
| **Deployment**    | Embedded/Server        | Server/Cloud       | Server/Cloud         | Server/Cloud         |
| **Scalability**   | Single-node            | Multi-node         | Distributed          | Distributed          |
| **GitHub stars**  | 17K+                   | 21K+               | 31K+                 | 12K+                 |
| **License**       | Apache 2.0             | Apache 2.0         | Apache 2.0           | BSD 3-Clause         |
| **Managed cloud** | No                     | Yes (Qdrant Cloud) | Yes (Zilliz)         | Yes (Weaviate Cloud) |
| **Language**      | Python                 | Rust               | Go                   | Go                   |

***

## Overview

### ChromaDB

ChromaDB is the simplest vector database — designed for rapid prototyping and small-to-medium scale applications. It can run entirely in-memory or persist to disk.

**Philosophy**: Zero configuration, maximum developer experience.

```python
import chromadb

client = chromadb.PersistentClient(path="/data/chroma")
collection = client.create_collection("my_docs")

collection.add(
    documents=["Machine learning is great", "Deep learning uses neural networks"],
    ids=["doc1", "doc2"]
)

results = collection.query(
    query_texts=["What is AI?"],
    n_results=2
)
```

### Qdrant

Qdrant is a production-ready vector search engine written in Rust. It focuses on performance, filtering, and operational simplicity.

**Philosophy**: Production performance without operational complexity.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, VectorParams, PointStruct,
    Filter, FieldCondition, MatchValue,
)

client = QdrantClient("localhost", port=6333)
client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)

client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(id=1, vector=[...], payload={"text": "document 1"}),
    ]
)

results = client.search(
    collection_name="my_collection",
    query_vector=[...],
    limit=10,
    query_filter=Filter(must=[FieldCondition(key="category", match=MatchValue(value="tech"))])
)
```

### Milvus

Milvus is the most scalable open-source vector database, designed for billion-scale deployments. It has a distributed architecture with Kubernetes support.

**Philosophy**: Massive scale, cloud-native.

```python
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

connections.connect("default", host="localhost", port=19530)

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=1536),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65535),
]
schema = CollectionSchema(fields)
collection = Collection("my_collection", schema)

# Insert data: ids, embeddings, and texts are parallel lists, one entry per entity
collection.insert([ids, embeddings, texts])
collection.create_index(
    "embedding",
    {"metric_type": "COSINE", "index_type": "IVF_FLAT", "params": {"nlist": 128}}
)
collection.load()

results = collection.search(
    data=[query_embedding],
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"nprobe": 10}},
    limit=10
)
```

### Weaviate

Weaviate combines vector search with knowledge graphs and a GraphQL API. It supports multi-modal search (text, images, audio) out of the box.

**Philosophy**: Schema-rich, multi-modal, knowledge graph capabilities.

```python
import weaviate

client = weaviate.Client("http://localhost:8080")  # v3 Python client API (weaviate-client <4)

# Define schema with classes
client.schema.create_class({
    "class": "Document",
    "vectorizer": "text2vec-transformers",
    "properties": [
        {"name": "content", "dataType": ["text"]},
        {"name": "category", "dataType": ["string"]}
    ]
})

# Insert with auto-vectorization
client.data_object.create(
    {"content": "Machine learning tutorial", "category": "tech"},
    "Document"
)

# Semantic search
result = client.query.get("Document", ["content", "category"])\
    .with_near_text({"concepts": ["artificial intelligence"]})\
    .with_limit(5)\
    .do()
```

***

## Performance Benchmarks

### ANN Benchmarks (ann-benchmarks.com, 2024)

#### 1M vectors, 768 dimensions, Cosine similarity

| Database        | QPS (1 thread) | Recall\@10 | Build Time | Index Size |
| --------------- | -------------- | ---------- | ---------- | ---------- |
| ChromaDB (HNSW) | \~2,000        | 98.5%      | 45s        | 2.1GB      |
| Qdrant (HNSW)   | \~8,500        | 99.1%      | 32s        | 1.8GB      |
| Milvus (HNSW)   | \~12,000       | 98.9%      | 28s        | 1.9GB      |
| Weaviate (HNSW) | \~6,000        | 98.7%      | 38s        | 2.0GB      |

#### 10M vectors (scalability test)

| Database | QPS     | RAM Usage      | Notes                  |
| -------- | ------- | -------------- | ---------------------- |
| ChromaDB | \~800   | 22GB           | Struggles at scale     |
| Qdrant   | \~5,200 | 18GB           | Good with quantization |
| Milvus   | \~9,800 | 15GB (indexed) | Best at scale          |
| Weaviate | \~3,500 | 21GB           | Moderate               |

{% hint style="info" %}
**Benchmarks are guides, not gospel.** Performance varies greatly based on index type, hardware, vector dimensions, and query patterns. Always benchmark with your own data.
{% endhint %}

### Filtering Performance (Filtered ANN search)

Filtered search (vector similarity + metadata filter) is crucial for production RAG:

| Database | Filtered QPS | Pre-filter               | Post-filter |
| -------- | ------------ | ------------------------ | ----------- |
| ChromaDB | \~500        | ❌                        | ✅           |
| Qdrant   | \~6,000      | ✅ (HNSW + payload index) | ✅           |
| Milvus   | \~8,000      | ✅                        | ✅           |
| Weaviate | \~3,000      | ✅ (inverted index)       | ✅           |

**Winner for filtered search**: Qdrant and Milvus, which apply the filter during index traversal (true pre-filtering) instead of discarding results after the fact.
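Why the distinction matters: post-filtering can silently return fewer than `k` results, because the top-k is taken before the filter is applied. A toy brute-force illustration in pure Python with made-up scores (real engines do this inside the ANN index):

```python
# Toy corpus: (id, similarity_to_query, category). The scores stand in for
# vector similarities that a real engine would compute inside the index.
docs = [
    ("a", 0.95, "news"), ("b", 0.93, "news"), ("c", 0.90, "tech"),
    ("d", 0.88, "news"), ("e", 0.85, "tech"), ("f", 0.80, "tech"),
]

def post_filter(docs, category, k):
    """Take top-k by score first, THEN filter -> may return fewer than k hits."""
    top_k = sorted(docs, key=lambda d: d[1], reverse=True)[:k]
    return [d for d in top_k if d[2] == category]

def pre_filter(docs, category, k):
    """Filter first, then take top-k among matches -> always up to k hits."""
    matches = [d for d in docs if d[2] == category]
    return sorted(matches, key=lambda d: d[1], reverse=True)[:k]

print([d[0] for d in post_filter(docs, "tech", 3)])  # ['c'] — 1 result instead of 3
print([d[0] for d in pre_filter(docs, "tech", 3)])   # ['c', 'e', 'f']
```

With a selective filter, naive post-filtering loses recall exactly when filtering matters most, which is why the pre-filtering engines win this category.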

***

## Feature Comparison

### Storage and Indexing

| Feature              | ChromaDB | Qdrant | Milvus | Weaviate |
| -------------------- | -------- | ------ | ------ | -------- |
| HNSW index           | ✅        | ✅      | ✅      | ✅        |
| IVF index            | ❌        | ❌      | ✅      | ❌        |
| DiskANN              | ❌        | ✅      | ✅      | ❌        |
| Scalar quantization  | ❌        | ✅      | ✅      | ✅        |
| Product quantization | ❌        | ✅      | ✅      | ❌        |
| Binary quantization  | ❌        | ✅      | ✅      | ✅        |
| On-disk storage      | ✅        | ✅      | ✅      | ✅        |
| Memory-mapped files (mmap) | ❌        | ✅      | ✅      | ✅        |

### Query Capabilities

| Feature                     | ChromaDB  | Qdrant   | Milvus   | Weaviate    |
| --------------------------- | --------- | -------- | -------- | ----------- |
| Vector similarity           | ✅         | ✅        | ✅        | ✅           |
| Hybrid search (BM25+vector) | ❌         | ✅        | ✅        | ✅           |
| Metadata filtering          | ✅ (basic) | ✅ (rich) | ✅ (rich) | ✅ (GraphQL) |
| Keyword search              | ❌         | ✅        | ✅        | ✅           |
| Multi-vector search         | ❌         | ✅        | ✅        | ✅           |
| Sparse vectors (SPLADE)     | ❌         | ✅        | ✅        | ✅           |
| Named vectors               | ❌         | ✅        | ✅        | ✅           |
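Hybrid search has to merge a keyword (BM25) ranking and a vector ranking into one result list. Reciprocal Rank Fusion (RRF) is a common fusion method for this (both Qdrant and Weaviate offer RRF-style fusion); a minimal sketch with made-up rankings:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score(doc) = sum over rankings of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking   = ["doc3", "doc1", "doc7"]  # keyword hits
vector_ranking = ["doc1", "doc5", "doc3"]  # semantic hits

# doc1 ranks first: it scores well in both lists
print(rrf([bm25_ranking, vector_ranking]))
```

Because RRF uses only ranks, not raw scores, it needs no score normalization between the two retrievers, which is why it is a popular default.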

### Operational Features

| Feature                 | ChromaDB | Qdrant | Milvus | Weaviate |
| ----------------------- | -------- | ------ | ------ | -------- |
| REST API                | ✅        | ✅      | ✅      | ✅        |
| gRPC API                | ❌        | ✅      | ✅      | ✅        |
| GraphQL API             | ❌        | ❌      | ❌      | ✅        |
| Authentication          | Basic    | ✅      | ✅      | ✅        |
| RBAC                    | ❌        | ✅      | ✅      | ✅        |
| Horizontal scaling      | ❌        | ✅      | ✅      | ✅        |
| Kubernetes support      | ❌        | ✅      | ✅      | ✅        |
| Snapshots/Backup        | ❌        | ✅      | ✅      | ✅        |
| Monitoring (Prometheus) | ❌        | ✅      | ✅      | ✅        |

***

## ChromaDB: Deep Dive

### Strengths

✅ **Simplest setup** — `pip install chromadb` and you're done\
✅ **Embedded mode** — no separate server process\
✅ **Auto-embedding** — built-in embedding models\
✅ **LangChain/LlamaIndex** native integration\
✅ **Zero config** — great for prototyping

### Weaknesses

❌ **Limited scale** — struggles beyond 1-2M vectors\
❌ **No distributed mode** — single node only\
❌ **Limited filtering** — no pre-filtering\
❌ **No quantization** — higher memory usage\
❌ **Slow at scale** — Python-based operations

### Deployment on Clore.ai

```bash
# Client/server mode
docker run -d \
  --name chromadb \
  -p 8000:8000 \
  -v $(pwd)/chroma-data:/chroma/chroma \
  chromadb/chroma:latest

# Test
curl http://localhost:8000/api/v1/heartbeat
```

**Best for**: Jupyter notebooks, rapid RAG prototypes, <1M vectors

***

## Qdrant: Deep Dive

### Strengths

✅ **Best filtering** — true pre-filtered vector search\
✅ **Rust performance** — extremely fast, low latency\
✅ **Quantization** — binary/scalar reduces memory 4-32×\
✅ **Sparse vectors** — hybrid dense+sparse search\
✅ **Simple ops** — single binary, no dependencies\
✅ **Good documentation** — excellent guides and examples

### Weaknesses

❌ **Single-writer** in free tier (no distributed writes)\
❌ **Smaller ecosystem** than Milvus\
❌ **No GraphQL** — REST/gRPC only

### Deployment on Clore.ai

```bash
# Simple deployment
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -p 6334:6334 \
  -v $(pwd)/qdrant-storage:/qdrant/storage \
  qdrant/qdrant:latest

# With authentication
docker run -d \
  --name qdrant \
  -p 6333:6333 \
  -e QDRANT__SERVICE__API_KEY=your-secret-key \
  -v $(pwd)/qdrant-storage:/qdrant/storage \
  qdrant/qdrant:latest

# Test
curl http://localhost:6333/healthz
```

**Best for**: Production RAG, filtered search, 1-100M vectors

***

## Milvus: Deep Dive

### Strengths

✅ **Massive scale** — tested to 10B+ vectors\
✅ **Distributed** — cloud-native Kubernetes architecture\
✅ **Most index types** — IVF, HNSW, DiskANN, ScaNN\
✅ **GPU acceleration** — GPU-powered index building\
✅ **Enterprise features** — RBAC, audit logs, encryption\
✅ **Zilliz Cloud** — fully managed option

### Weaknesses

❌ **Complex deployment** — standalone requires etcd and MinIO; cluster mode adds Pulsar/Kafka\
❌ **Resource heavy** — minimum 3 nodes recommended\
❌ **Steeper learning curve** — more concepts to understand\
❌ **Overkill for small scale** — don't use for <1M vectors

### Deployment on Clore.ai (Standalone)

```yaml
# docker-compose.yml for Milvus standalone
version: "3.8"
services:
  etcd:
    image: quay.io/coreos/etcd:v3.5.5
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
    command: etcd -advertise-client-urls=http://etcd:2379 -listen-client-urls=http://0.0.0.0:2379

  minio:
    image: minio/minio:RELEASE.2023-03-13T19-46-17Z
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    command: minio server /minio_data --console-address ":9001"

  milvus:
    image: milvusdb/milvus:v2.4.0
    command: ["milvus", "run", "standalone"]
    environment:
      ETCD_ENDPOINTS: etcd:2379
      MINIO_ADDRESS: minio:9000
    ports:
      - "19530:19530"
      - "9091:9091"
    depends_on:
      - etcd
      - minio
```

```bash
docker compose up -d
# Takes ~60 seconds to fully start
```

**Best for**: Large-scale production, 100M+ vectors, enterprise deployments

***

## Weaviate: Deep Dive

### Strengths

✅ **Multi-modal** — text, images, audio, video\
✅ **Auto-vectorization** — built-in model integrations\
✅ **GraphQL API** — rich querying with graph traversal\
✅ **Module system** — pluggable vectorizers and readers\
✅ **Hybrid search** — BM25 + vector out of the box\
✅ **Generative search** — built-in RAG with generate module

### Weaknesses

❌ **Higher memory** — schema-aware storage is larger\
❌ **GraphQL-first querying** — more overhead than gRPC-first engines at very high QPS (gRPC support is recent)\
❌ **Complex schema** — requires upfront class definition\
❌ **Slower at extreme scale** than Milvus

### Deployment on Clore.ai

```bash
# Simple deployment
docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -p 50051:50051 \
  -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
  -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
  -e DEFAULT_VECTORIZER_MODULE=none \
  -e CLUSTER_HOSTNAME=node1 \
  -v $(pwd)/weaviate-data:/var/lib/weaviate \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0

# With transformer vectorizer
docker run -d \
  --name weaviate \
  -p 8080:8080 \
  -e DEFAULT_VECTORIZER_MODULE=text2vec-transformers \
  -e TRANSFORMERS_INFERENCE_API=http://t2v-transformers:8080 \
  cr.weaviate.io/semitechnologies/weaviate:1.25.0
```

**Best for**: Multi-modal search, knowledge graphs, generative search

***

## When to Use Which

### Scale-Based Decision

```
< 100K vectors    → ChromaDB (embedded)
100K - 10M        → Qdrant (best balance)
10M - 1B          → Milvus or Qdrant (clustered)
1B+               → Milvus (distributed)
```

### Use-Case-Based Decision

| Use Case        | Best Choice        | Why                          |
| --------------- | ------------------ | ---------------------------- |
| RAG prototype   | ChromaDB           | Zero setup, simple API       |
| Production RAG  | Qdrant             | Fast filtering, simple ops   |
| Semantic search | Qdrant or Milvus   | Best performance             |
| Multi-modal     | Weaviate           | Built-in image/audio support |
| Knowledge graph | Weaviate           | Graph traversal queries      |
| Billion-scale   | Milvus             | Distributed architecture     |
| Hybrid search   | Qdrant or Weaviate | BM25 + vector                |
| Enterprise      | Milvus or Weaviate | RBAC, audit logs             |

***

## Memory Requirements on Clore.ai

### RAM Estimation Formula

```
RAM needed ≈ (vectors × dimensions × 4 bytes) × 1.5 (overhead)

Example: 1M vectors × 1536 dims × 4 bytes × 1.5 = 9.2GB RAM

With quantization (Qdrant binary):
1M × 1536 / 8 × 1.5 = 0.29GB RAM (32× compression!)
```
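The formula above, wrapped as a small helper for sizing a Clore.ai rental (the 1.5× overhead factor is the rough rule of thumb used here, not a guarantee — always validate against your actual index):

```python
def estimate_ram_gb(num_vectors: int, dims: int,
                    bytes_per_dim: float = 4.0, overhead: float = 1.5) -> float:
    """RAM estimate in GB: vectors x dims x bytes per dim, times overhead."""
    return num_vectors * dims * bytes_per_dim * overhead / 1e9

# float32 vectors (4 bytes per dimension)
print(round(estimate_ram_gb(1_000_000, 1536), 1))        # 9.2

# Binary quantization: 1 bit per dimension = 1/8 byte
print(round(estimate_ram_gb(1_000_000, 1536, 1 / 8), 2)) # 0.29
```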

### Recommended Server Specs

| Dataset Size | ChromaDB | Qdrant   | Milvus   | Weaviate |
| ------------ | -------- | -------- | -------- | -------- |
| 1M vectors   | 16GB RAM | 8GB RAM  | 32GB RAM | 16GB RAM |
| 10M vectors  | ❌        | 32GB RAM | 64GB RAM | 48GB RAM |
| 100M vectors | ❌        | 128GB+   | 256GB+   | 256GB+   |

***

## Quick Comparison: Docker Setup Time

| Database | `docker run` to ready | Dependencies      |
| -------- | --------------------- | ----------------- |
| ChromaDB | \~5 seconds           | None              |
| Qdrant   | \~3 seconds           | None              |
| Milvus   | \~60 seconds          | etcd + MinIO      |
| Weaviate | \~15 seconds          | None (standalone) |

***

## Pricing (Self-Hosted on Clore.ai)

All four databases are **free** to self-host. Cost is just Clore.ai server rental:

```
Example: 1M vectors RAG system
- Qdrant: 8GB RAM server ~$0.10/hr
- ChromaDB: 16GB RAM server ~$0.15/hr  
- Weaviate: 16GB RAM server ~$0.15/hr
- Milvus: 32GB RAM server ~$0.30/hr (+ overhead for etcd/minio)
```

***

## Useful Links

* [ChromaDB Docs](https://docs.trychroma.com)
* [Qdrant Docs](https://qdrant.tech/documentation)
* [Milvus Docs](https://milvus.io/docs)
* [Weaviate Docs](https://weaviate.io/developers/weaviate)
* [ANN Benchmarks](https://ann-benchmarks.com)
* [Vector DB Benchmark by Qdrant](https://qdrant.tech/benchmarks)

***

## Summary

| Start with... | If you need...                                          |
| ------------- | ------------------------------------------------------- |
| **ChromaDB**  | Quick prototype, <1M vectors, minimal setup             |
| **Qdrant**    | Production RAG, great filtering, operational simplicity |
| **Milvus**    | Billion-scale, enterprise, distributed architecture     |
| **Weaviate**  | Multi-modal, knowledge graphs, GraphQL querying         |

For most production RAG applications on Clore.ai, **Qdrant** offers the best balance of performance, features, and operational simplicity. For large-scale or enterprise needs, **Milvus** is the industry standard.

***

## Clore.ai GPU Recommendations

| Use Case            | Recommended GPU | Est. Cost on Clore.ai |
| ------------------- | --------------- | --------------------- |
| Development/Testing | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production          | RTX 4090 (24GB) | \~$0.70/gpu/hr        |
| Large Scale         | A100 80GB       | \~$1.20/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.

