# ChromaDB

ChromaDB is the **leading open-source vector database** purpose-built for AI applications. It provides a simple, intuitive API for storing, querying, and managing high-dimensional embeddings — the backbone of modern RAG systems, semantic search, recommendation engines, and LLM memory.

ChromaDB abstracts away the complexity of vector similarity search, letting you focus on building AI applications. It supports both in-memory mode for development and a persistent server mode for production deployments, with Docker support for easy deployment on Clore.ai GPU servers.

Key features:

* 🚀 **Simple Python/JavaScript API** — get started in minutes
* 💾 **Persistent storage** — data survives container restarts
* 🔍 **Multiple distance metrics** — cosine, L2, inner product
* 📦 **Integrated embeddings** — built-in support for OpenAI, Cohere, sentence-transformers
* 🏗️ **Multi-tenant** — collections for organizing different datasets
* 🔌 **REST API** — language-agnostic HTTP interface
* ⚡ **Fast** — HNSW index for approximate nearest-neighbor search
* 🔗 **LangChain/LlamaIndex native** — first-class integration

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

***

## Server Requirements

| Parameter | Minimum                   | Recommended                          |
| --------- | ------------------------- | ------------------------------------ |
| GPU       | Any NVIDIA GPU (optional) | NVIDIA RTX 3080+ (for embeddings)    |
| VRAM      | Not required for ChromaDB | 8–16 GB (for local embedding models) |
| RAM       | 4 GB                      | 16–32 GB                             |
| CPU       | 2 cores                   | 8 cores                              |
| Disk      | 10 GB                     | 50–200 GB (for large datasets)       |
| OS        | Ubuntu 20.04+             | Ubuntu 22.04                         |
| Docker    | Required                  | Docker + Docker Compose              |
| Ports     | 22, 8000                  | 22, 8000                             |

{% hint style="info" %}
ChromaDB itself doesn't require a GPU — it runs efficiently on CPU. However, **generating embeddings** (converting text to vectors) benefits greatly from GPU acceleration. If you plan to use local embedding models (sentence-transformers, etc.), choose a server with a GPU.
{% endhint %}

***

## Quick Deploy on CLORE.AI

### 1. Find a suitable server

Go to [CLORE.AI Marketplace](https://clore.ai/marketplace) and choose:

* **CPU-only** for ChromaDB server + API (store pre-computed embeddings)
* **GPU server** if you want to generate embeddings locally as well

### 2. Configure your deployment

**Docker Image:**

```
chromadb/chroma:latest
```

**Port Mappings:**

```
22   → SSH access
8000 → ChromaDB HTTP API
```

**Environment Variables:**

```
IS_PERSISTENT=TRUE
ANONYMIZED_TELEMETRY=FALSE
```

**Startup Command:**

```bash
uvicorn chromadb.app:app --host 0.0.0.0 --port 8000
```

### 3. Test the deployment

```bash
curl http://<server-ip>:8000/api/v1/heartbeat
# Expected: {"nanosecond heartbeat": <timestamp>}
```

***

## Step-by-Step Setup

### Step 1: SSH into your server

```bash
ssh root@<your-clore-server-ip> -p <ssh-port>
```

### Step 2: Create data directory

```bash
mkdir -p /workspace/chromadb/data
mkdir -p /workspace/chromadb/config
```

### Step 3: Run ChromaDB container

```bash
docker run -d \
  --name chromadb \
  -p 8000:8000 \
  -v /workspace/chromadb/data:/chroma/chroma \
  -e IS_PERSISTENT=TRUE \
  -e ANONYMIZED_TELEMETRY=FALSE \
  -e CHROMA_SERVER_LOG_LEVEL=INFO \
  chromadb/chroma:latest
```

### Step 4: Verify it's running

```bash
# Check health
curl http://localhost:8000/api/v1/heartbeat

# Check version
curl http://localhost:8000/api/v1/version

# List collections
curl http://localhost:8000/api/v1/collections
```

### Step 5: Install Python client

```bash
pip install chromadb
pip install sentence-transformers  # For local GPU embeddings
pip install openai                  # For OpenAI embeddings
```

### Step 6: Test connectivity from Python

```python
import chromadb

client = chromadb.HttpClient(host="<server-ip>", port=8000)
print(f"ChromaDB version: {client.get_version()}")
print(f"Heartbeat: {client.heartbeat()}")
```
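On a freshly started container, the API can take a few seconds to accept connections. A small polling helper can make scripts robust to that; this is our own sketch (the function name, default timeout, and injectable `probe` parameter are illustrative, not part of Chroma's API):

```python
import time
import urllib.request

def wait_for_chroma(url="http://<server-ip>:8000/api/v1/heartbeat",
                    timeout=60, probe=None):
    """Poll the heartbeat endpoint until it responds or the timeout expires."""
    def http_probe():
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.status == 200
        except OSError:
            return False

    probe = probe or http_probe  # injectable for testing
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(1)
    return False
```

Call `wait_for_chroma(...)` before constructing the `HttpClient` in startup scripts so the first real request doesn't fail with a connection error.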

### Step 7: (Optional) Enable authentication

```bash
# Create auth credentials as a bcrypt htpasswd file
# (htpasswd comes from the apache2-utils package: apt install apache2-utils)
htpasswd -Bbc /workspace/chromadb/auth.txt admin <strong-password>

# Run with auth enabled
docker run -d \
  --name chromadb-auth \
  -p 8000:8000 \
  -v /workspace/chromadb/data:/chroma/chroma \
  -v /workspace/chromadb/auth.txt:/chroma/auth.txt \
  -e IS_PERSISTENT=TRUE \
  -e CHROMA_SERVER_AUTH_CREDENTIALS_FILE=/chroma/auth.txt \
  -e CHROMA_SERVER_AUTH_CREDENTIALS_PROVIDER=chromadb.auth.providers.HtpasswdFileServerAuthCredentialsProvider \
  -e CHROMA_SERVER_AUTH_PROVIDER=chromadb.auth.basic.BasicAuthServerProvider \
  chromadb/chroma:latest
```

***

## Usage Examples

### Example 1: Basic Vector Store Operations

```python
import chromadb
from chromadb.utils import embedding_functions

# Connect to ChromaDB on Clore.ai server
client = chromadb.HttpClient(
    host="<your-clore-server-ip>",
    port=8000
)

# Use sentence-transformers for embeddings (runs on GPU if available)
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

# Create a collection
collection = client.get_or_create_collection(
    name="clore_ai_docs",
    embedding_function=embedding_fn,
    metadata={"hnsw:space": "cosine"}  # Distance metric
)

# Add documents
documents = [
    "Clore.ai is a decentralized GPU cloud marketplace for AI workloads.",
    "You can rent NVIDIA RTX 4090, A100, and H100 GPUs on Clore.ai.",
    "Clore.ai supports Docker-based deployments for any AI framework.",
    "Pricing on Clore.ai is competitive compared to AWS and GCP.",
    "The Clore.ai marketplace has thousands of GPU servers worldwide.",
    "You can deploy PyTorch, TensorFlow, JAX, and other ML frameworks.",
    "Clore.ai offers spot pricing for cost-effective GPU computing.",
]

ids = [f"doc_{i}" for i in range(len(documents))]

collection.add(
    documents=documents,
    ids=ids,
    metadatas=[{"source": "docs", "index": i} for i in range(len(documents))]
)

print(f"Added {len(documents)} documents to collection")
print(f"Collection size: {collection.count()} documents")
```
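The sequential `doc_{i}` IDs above will collide if you re-run the script against a persistent collection. One option is to derive IDs from document content and use `upsert` instead of `add`, so reruns are idempotent. The `stable_id` helper below is our own illustrative sketch, not a Chroma API:

```python
import hashlib

def stable_id(text: str) -> str:
    """Deterministic ID derived from document content, so the same
    document always maps to the same ID across runs."""
    return "doc_" + hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]

docs = ["Clore.ai is a decentralized GPU cloud marketplace for AI workloads."]
ids = [stable_id(d) for d in docs]
print(ids[0])
# With a collection handle: collection.upsert(documents=docs, ids=ids)
```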

***

### Example 2: Semantic Search

```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.HttpClient(host="<server-ip>", port=8000)
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-MiniLM-L6-v2"
)

collection = client.get_collection(
    name="clore_ai_docs",
    embedding_function=embedding_fn
)

# Semantic search queries
queries = [
    "How much does GPU rental cost?",
    "What machine learning tools are available?",
    "Tell me about the GPU hardware options",
]

for query in queries:
    results = collection.query(
        query_texts=[query],
        n_results=3,
        include=["documents", "distances", "metadatas"]
    )

    print(f"\n🔍 Query: {query}")
    for i, (doc, dist) in enumerate(zip(
        results["documents"][0],
        results["distances"][0]
    )):
        similarity = 1 - dist  # Convert distance to similarity
        print(f"  {i+1}. [{similarity:.3f}] {doc[:100]}...")
```
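The `1 - dist` conversion above assumes the collection was created with the `cosine` space (as in Example 1); for the `l2` or `ip` spaces the reported distances scale differently. For reference, here is the cosine distance sketched in plain Python:

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 1 - cos(angle between a and b).
    0.0 means identical direction, 1.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical vectors -> 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors -> 1.0
```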

***

### Example 3: RAG Pipeline with ChromaDB + OpenAI

```python
import chromadb
from chromadb.utils import embedding_functions
from openai import OpenAI

# Initialize clients
chroma_client = chromadb.HttpClient(host="<server-ip>", port=8000)
openai_client = OpenAI(api_key="your-openai-api-key")

# Use OpenAI embeddings
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-openai-api-key",
    model_name="text-embedding-3-small"
)

# Get collection
collection = chroma_client.get_or_create_collection(
    name="knowledge_base",
    embedding_function=openai_ef
)

def add_to_knowledge_base(texts, ids=None, metadatas=None):
    """Add documents to the ChromaDB knowledge base."""
    if ids is None:
        ids = [f"doc_{i}" for i in range(len(texts))]
    # Pass metadatas through as-is: None is valid, but Chroma rejects empty {} dicts
    collection.add(documents=texts, ids=ids, metadatas=metadatas)
    print(f"✓ Added {len(texts)} documents. Total: {collection.count()}")

def rag_query(question, n_context=5):
    """Retrieve relevant context and generate answer with GPT-4."""
    # 1. Retrieve relevant documents
    results = collection.query(
        query_texts=[question],
        n_results=n_context,
        include=["documents", "distances"]
    )

    context_docs = results["documents"][0]
    distances = results["distances"][0]

    # 2. Build context string
    context = "\n\n".join([
        f"[Source {i+1} (relevance: {1-d:.2f})]: {doc}"
        for i, (doc, d) in enumerate(zip(context_docs, distances))
    ])

    # 3. Generate answer with LLM
    messages = [
        {
            "role": "system",
            "content": "You are a helpful assistant. Answer questions based on the provided context. If the answer isn't in the context, say so."
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}"
        }
    ]

    response = openai_client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
        temperature=0.1
    )

    answer = response.choices[0].message.content

    return {
        "question": question,
        "answer": answer,
        "sources": context_docs,
        "relevance_scores": [1 - d for d in distances]
    }

# Example usage
knowledge = [
    "Clore.ai is a GPU cloud marketplace with over 45,000 users.",
    "Clore.ai supports Docker-based workload deployment.",
    "GPU servers on Clore.ai range from GTX 1080 to H100.",
    "You can deploy AI applications with SSH access and custom ports.",
]
add_to_knowledge_base(knowledge)

result = rag_query("How many users does Clore.ai have?")
print(f"Q: {result['question']}")
print(f"A: {result['answer']}")
```
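Real documents are usually too long to embed whole; they should be split into chunks before being passed to `add_to_knowledge_base`. Below is a minimal word-based chunker with overlap; the chunk and overlap sizes are illustrative defaults, and production pipelines often use token- or sentence-aware splitters instead:

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping word-based chunks for embedding."""
    assert 0 <= overlap < chunk_size
    words = text.split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks

chunks = chunk_text("word " * 500, chunk_size=200, overlap=40)
print(len(chunks))  # -> 3 (starts at word 0, 160, 320)
```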

***

### Example 4: Multi-Collection Document Management

```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.HttpClient(host="<server-ip>", port=8000)
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-mpnet-base-v2"  # Higher quality embeddings
)

# Create separate collections for different document types
collections = {
    "technical_docs": client.get_or_create_collection("technical_docs", embedding_function=embedding_fn),
    "faq": client.get_or_create_collection("faq", embedding_function=embedding_fn),
    "blog_posts": client.get_or_create_collection("blog_posts", embedding_function=embedding_fn),
}

# Add documents to respective collections
collections["technical_docs"].add(
    documents=["Docker deployment guide for Clore.ai", "SSH configuration for GPU servers"],
    ids=["tech_001", "tech_002"],
    metadatas=[{"type": "guide", "version": "v2"}, {"type": "config"}]
)

collections["faq"].add(
    documents=["Q: How do I pay? A: Via cryptocurrency.", "Q: What GPUs? A: RTX to H100."],
    ids=["faq_001", "faq_002"],
    metadatas=[{"category": "payment"}, {"category": "hardware"}]
)

# Search across all collections
def search_all_collections(query, n_results=2):
    all_results = []
    for name, col in collections.items():
        results = col.query(query_texts=[query], n_results=n_results)
        for doc, dist in zip(results["documents"][0], results["distances"][0]):
            all_results.append({
                "collection": name,
                "document": doc,
                "similarity": 1 - dist
            })

    # Sort by relevance
    all_results.sort(key=lambda x: x["similarity"], reverse=True)
    return all_results[:n_results * 2]

results = search_all_collections("How do I deploy with Docker?")
for r in results:
    print(f"[{r['collection']}] ({r['similarity']:.3f}) {r['document'][:80]}...")
```

***

### Example 5: Filtering and Metadata Queries

```python
import chromadb
from chromadb.utils import embedding_functions

client = chromadb.HttpClient(host="<server-ip>", port=8000)

# Use the same embedding function the collection was created with (Example 4),
# otherwise query embeddings won't match the stored vectors
embedding_fn = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-mpnet-base-v2"
)
collection = client.get_collection("technical_docs", embedding_function=embedding_fn)

# Add documents with rich metadata
collection.add(
    documents=[
        "Guide: Running PyTorch on NVIDIA A100 GPU clusters",
        "Guide: TensorFlow distributed training on RTX 4090",
        "Tutorial: LLM fine-tuning with LoRA on GPU",
        "Reference: CUDA 12.1 compatibility matrix",
        "Guide: Docker networking for multi-GPU setups",
    ],
    ids=["d1", "d2", "d3", "d4", "d5"],
    metadatas=[
        {"type": "guide", "gpu": "A100", "framework": "pytorch", "year": 2024},
        {"type": "guide", "gpu": "RTX4090", "framework": "tensorflow", "year": 2024},
        {"type": "tutorial", "gpu": "any", "framework": "transformers", "year": 2024},
        {"type": "reference", "gpu": "any", "framework": "cuda", "year": 2023},
        {"type": "guide", "gpu": "multi", "framework": "docker", "year": 2024},
    ]
)

# Query with metadata filter
results = collection.query(
    query_texts=["GPU training guide"],
    n_results=3,
    where={"type": "guide"},  # Only return guides
    include=["documents", "metadatas", "distances"]
)

print("Filtered results (type=guide):")
for doc, meta, dist in zip(
    results["documents"][0],
    results["metadatas"][0],
    results["distances"][0]
):
    print(f"  [{1-dist:.3f}] {doc}")
    print(f"    Metadata: {meta}")
```
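Single-key filters like `{"type": "guide"}` can be combined with Chroma's `where` operators (`$and`, `$or`, `$eq`, `$gte`, `$in`, and friends). The helper below is our own convenience sketch for building the `$and` form from keyword arguments, not part of the Chroma client:

```python
def build_where(**equals):
    """Combine simple equality filters into Chroma's $and form."""
    clauses = [{key: {"$eq": value}} for key, value in equals.items()]
    return clauses[0] if len(clauses) == 1 else {"$and": clauses}

where = build_where(type="guide", year=2024)
print(where)
# -> {'$and': [{'type': {'$eq': 'guide'}}, {'year': {'$eq': 2024}}]}
# Usage: collection.query(query_texts=["GPU training guide"], where=where)
```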

***

## Configuration

### Docker Compose (Production)

```yaml
version: '3.8'

services:
  chromadb:
    image: chromadb/chroma:latest
    container_name: chromadb
    ports:
      - "8000:8000"
    volumes:
      - chromadb_data:/chroma/chroma
    environment:
      - IS_PERSISTENT=TRUE
      - ANONYMIZED_TELEMETRY=FALSE
      - CHROMA_SERVER_LOG_LEVEL=INFO
      - ALLOW_RESET=FALSE
      - CHROMA_SEGMENT_CACHE_POLICY=LRU
      - CHROMA_MEMORY_LIMIT_BYTES=2147483648  # 2 GB cache
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/api/v1/heartbeat"]
      interval: 30s
      timeout: 10s
      retries: 3

volumes:
  chromadb_data:
    driver: local
```

### Environment Variables Reference

| Variable                      | Default | Description                      |
| ----------------------------- | ------- | -------------------------------- |
| `IS_PERSISTENT`               | `FALSE` | Enable persistent storage        |
| `ANONYMIZED_TELEMETRY`        | `TRUE`  | Set `FALSE` to disable telemetry |
| `CHROMA_SERVER_LOG_LEVEL`     | `INFO`  | Log verbosity                    |
| `CHROMA_MEMORY_LIMIT_BYTES`   | None    | Max memory for segment cache     |
| `ALLOW_RESET`                 | `FALSE` | Allow resetting all data via API |
| `CHROMA_SERVER_AUTH_PROVIDER` | None    | Authentication provider class    |

***

## Performance Tips

### 1. Choose the Right Embedding Model

| Model                    | Dimensions | Speed | Quality | GPU Required |
| ------------------------ | ---------- | ----- | ------- | ------------ |
| `all-MiniLM-L6-v2`       | 384        | Fast  | Good    | No           |
| `all-mpnet-base-v2`      | 768        | Med   | Better  | Optional     |
| `text-embedding-3-small` | 1536       | Fast  | Great   | API only     |
| `BAAI/bge-large-en-v1.5` | 1024       | Med   | Best    | Yes          |

### 2. Batch Upserts for Speed

```python
# Add in batches of 100-1000 for best performance.
# Assumes all_documents is a list of dicts: {"id": ..., "text": ..., "meta": {...}}
BATCH_SIZE = 500

for i in range(0, len(all_documents), BATCH_SIZE):
    batch = all_documents[i:i+BATCH_SIZE]
    collection.add(
        documents=[d["text"] for d in batch],
        ids=[d["id"] for d in batch],
        metadatas=[d["meta"] for d in batch]
    )
    print(f"Batch {i//BATCH_SIZE + 1} done")
```
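During large ingests over the network, a transient error can fail a whole batch. A generic retry wrapper with exponential backoff (our own sketch, not a Chroma API) pairs well with the batching loop above:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on any exception.
    The last failure is re-raised."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)

# Usage inside the batching loop:
# with_retries(lambda: collection.add(documents=..., ids=..., metadatas=...))
```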

### 3. HNSW Index Tuning

```python
collection = client.create_collection(
    name="optimized",
    metadata={
        "hnsw:space": "cosine",
        "hnsw:construction_ef": 200,  # Higher = better index quality (slower build)
        "hnsw:search_ef": 100,        # Higher = better recall (slower search)
        "hnsw:M": 32,                 # Higher = better recall (more memory)
    }
)
```

### 4. Persistent Client for Local Use

```python
# For development on the Clore.ai server directly
import chromadb

client = chromadb.PersistentClient(path="/workspace/chromadb/data")
# No server needed, faster for single-process use
```

***

## Troubleshooting

### Issue: Cannot connect to ChromaDB

```bash
# Check container is running
docker ps | grep chromadb

# Check logs
docker logs chromadb --tail 20

# Test from inside container
docker exec chromadb curl http://localhost:8000/api/v1/heartbeat
```

### Issue: Data lost on container restart

```bash
# Ensure volume is mounted
docker inspect chromadb | grep Mounts -A 10

# Re-run with explicit volume
docker run -d -p 8000:8000 \
  -v /workspace/chromadb/data:/chroma/chroma \
  -e IS_PERSISTENT=TRUE \
  chromadb/chroma:latest
```

### Issue: Out of memory errors

```bash
# Limit memory cache
docker run -d -p 8000:8000 \
  -e CHROMA_MEMORY_LIMIT_BYTES=1073741824 \
  -v /workspace/chromadb/data:/chroma/chroma \
  chromadb/chroma:latest
```

### Issue: Slow embedding generation

```bash
# Verify GPU is being used for embeddings
python3 -c "
import torch
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2', device='cuda')
print(f'Embedding model on: {model.device}')
"
```

### Issue: Collection not found after restart

```bash
# Verify persistence is enabled
curl http://localhost:8000/api/v1/collections
# If empty, IS_PERSISTENT was not set or volume wasn't mounted
```

***

## Links

* **GitHub**: <https://github.com/chroma-core/chroma>
* **Official Docs**: <https://docs.trychroma.com>
* **Docker Hub**: <https://hub.docker.com/r/chromadb/chroma>
* **PyPI**: <https://pypi.org/project/chromadb>
* **Discord**: <https://discord.gg/MMeYNTmh3x>
* **CLORE.AI Marketplace**: <https://clore.ai/marketplace>

***

## Clore.ai GPU Recommendations

| Use Case                  | Recommended GPU | Est. Cost on Clore.ai |
| ------------------------- | --------------- | --------------------- |
| Development/Testing       | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production RAG            | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| High-throughput Embedding | RTX 4090 (24GB) | \~$0.70/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.

