# Weaviate

{% hint style="info" %}
**Weaviate** is an AI-native, open-source vector database designed for semantic search, hybrid search, and RAG (Retrieval-Augmented Generation) applications. It stores both objects and their vector embeddings and supports built-in ML model integration.
{% endhint %}

## Overview

Weaviate goes beyond traditional vector databases by natively integrating ML models for automatic vectorization at import and query time. It supports multiple data types (text, images, video, audio), built-in hybrid search combining BM25 and vector similarity, and multi-tenant deployments. Weaviate is production-ready, cloud-native, and designed to scale from prototypes to billions of vectors.

| Property         | Value                                                     |
| ---------------- | --------------------------------------------------------- |
| **Category**     | Vector Database / RAG Infrastructure                      |
| **Developer**    | Weaviate B.V.                                             |
| **License**      | BSD 3-Clause                                              |
| **GitHub**       | [weaviate/weaviate](https://github.com/weaviate/weaviate) |
| **Stars**        | 12K+                                                      |
| **Docker Image** | `cr.weaviate.io/semitechnologies/weaviate`                |
| **Ports**        | 22 (SSH), 8080 (HTTP API / GraphQL), 50051 (gRPC)         |

***

## Key Features

* **Vector + keyword hybrid search** — combine BM25 full-text with vector similarity in one query
* **Built-in vectorizers** — auto-vectorize data at import with OpenAI, Cohere, HuggingFace, or local models
* **Multi-modal** — store and search text, images, video, audio in one database
* **GraphQL API** — expressive query language for complex semantic queries
* **REST API** — full CRUD operations and schema management
* **Multi-tenancy** — isolate data per tenant with shared infrastructure
* **HNSW indexing** — fast approximate nearest-neighbor search
* **Filtered search** — combine vector search with traditional metadata filters
* **Generative search** — built-in RAG with LLM integration
* **Horizontal scaling** — shard and replicate across multiple nodes
* **Modules system** — plug in vectorizers, readers, generators

***

## Clore.ai Setup

### Step 1 — Choose Hardware

| Use Case                        | Recommended  | RAM    | Storage |
| ------------------------------- | ------------ | ------ | ------- |
| Development / prototyping       | CPU instance | 8 GB   | 20 GB   |
| Small production (< 1M vectors) | CPU instance | 16 GB  | 50 GB   |
| Large scale (10M+ vectors)      | GPU instance | 32 GB+ | 200 GB+ |
| GPU-accelerated vectorization   | RTX 4090     | 24 GB  | 100 GB  |

{% hint style="info" %}
Weaviate itself runs on CPU. Use GPU instances on Clore.ai when you need **local embedding model** inference (e.g., `text2vec-transformers` with a local model) for fast vectorization at import time.
{% endhint %}

### Step 2 — Rent a Server on Clore.ai

1. Go to [clore.ai](https://clore.ai) → **Marketplace**
2. For pure vector search: CPU instances with **≥ 16 GB RAM**
3. For GPU-accelerated embeddings: **RTX 3090 or 4090**
4. Open ports: **22**, **8080**, and **50051** (gRPC, used by the v4 Python client)
5. Ensure **≥ 50 GB disk** for vector storage

### Step 3 — Deploy with Docker

**Minimal deployment (no vectorizer):**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -p 50051:50051 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=none \
    -e ENABLE_MODULES="" \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With OpenAI vectorizer:**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=text2vec-openai \
    -e ENABLE_MODULES=text2vec-openai,generative-openai \
    -e OPENAI_APIKEY=<your-openai-key> \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With local HuggingFace vectorizer (GPU-accelerated):**

```yaml
# docker-compose.yml
version: '3.4'

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    restart: unless-stopped
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - /opt/weaviate/data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: 'text2vec-transformers,generative-openai'
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      CLUSTER_HOSTNAME: 'node1'

  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: '1'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Start:

```bash
mkdir -p /opt/weaviate/data
docker-compose up -d
```

***

## Accessing the API

### HTTP/REST API

```
http://<server-ip>:8080
```

### GraphQL Endpoint

```
http://<server-ip>:8080/v1/graphql
```

### Health Check

```bash
curl http://<server-ip>:8080/v1/.well-known/ready
# Returns: {}  (HTTP 200 = healthy)
```
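In a startup script it is often useful to block until the instance answers this endpoint. A minimal stdlib-only sketch (the helper name, timeout, and polling interval are illustrative):

```python
import time
import urllib.request
import urllib.error

def wait_until_ready(base_url, timeout=60, interval=2):
    """Poll /v1/.well-known/ready until it answers HTTP 200."""
    deadline = time.time() + timeout
    url = f"{base_url}/v1/.well-known/ready"
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not up yet, retry
        time.sleep(interval)
    return False

# wait_until_ready("http://<server-ip>:8080")
```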

### Via SSH

```bash
ssh root@<server-ip> -p 22
```

***

## Python Client

### Installation

```bash
pip install weaviate-client
```

### Connect

```python
import weaviate
import weaviate.classes as wvc

# Connect to your Clore.ai instance
client = weaviate.connect_to_custom(
    http_host="<server-ip>",
    http_port=8080,
    http_secure=False,
    grpc_host="<server-ip>",
    grpc_port=50051,
    grpc_secure=False,
)

print(client.is_ready())  # True if healthy
```

***

## Schema & Collections

### Create a Collection

```python
import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Create a collection (was called "class" in v3)
client.collections.create(
    name="Article",
    vectorizer_config=Configure.Vectorizer.none(),  # We'll provide our own vectors
    # Or: Configure.Vectorizer.text2vec_openai() for auto-vectorization
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="author", data_type=DataType.TEXT),
        Property(name="published_date", data_type=DataType.DATE),
        Property(name="tags", data_type=DataType.TEXT_ARRAY),
        Property(name="view_count", data_type=DataType.INT),
    ],
)
print("Collection 'Article' created")
```
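Index behavior can also be tuned at creation time. A hedged REST sketch of a class with explicit HNSW settings (field names follow Weaviate's schema API; the values are illustrative starting points, not tuned recommendations):

```bash
curl -X POST http://<server-ip>:8080/v1/schema \
    -H "Content-Type: application/json" \
    -d '{
        "class": "ArticleTuned",
        "vectorizer": "none",
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": {
            "distance": "cosine",
            "efConstruction": 128,
            "ef": 64
        }
    }'
```

Higher `efConstruction` improves recall at the cost of slower imports; `ef` trades query latency for recall at search time.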

***

## Importing Data

### Batch Import with Pre-computed Vectors

```python
import weaviate
import numpy as np
from sentence_transformers import SentenceTransformer

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Load embedding model
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Sample articles
articles = [
    {"title": "Introduction to RAG", "content": "RAG combines retrieval with generation..."},
    {"title": "Vector Databases Explained", "content": "Vector databases store high-dimensional embeddings..."},
    {"title": "Weaviate Best Practices", "content": "For production Weaviate deployments, consider..."},
    {"title": "GPU Cloud Computing", "content": "Clore.ai provides decentralized GPU access..."},
]

# Batch import with vectors
collection = client.collections.get("Article")

with collection.batch.dynamic() as batch:
    for article in articles:
        # Compute vector
        vector = encoder.encode(article["content"]).tolist()

        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            },
            vector=vector,
        )

print(f"Imported {len(articles)} articles")
```

### Auto-vectorize with OpenAI (at import)

```python
# When collection uses text2vec-openai vectorizer,
# just insert data — no vector needed
collection = client.collections.get("ArticleOpenAI")

with collection.batch.dynamic() as batch:
    for article in articles:
        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            }
            # No vector = Weaviate auto-vectorizes via OpenAI
        )
```

***

## Querying

### Semantic (Vector) Search

With `Vectorizer.none()` (as in the "Article" collection above), Weaviate stores the vectors you supply, so vectorize the query yourself and use `near_vector`. (`near_text` only works on collections with a vectorizer module configured.)

```python
# Find articles semantically similar to a query, reusing the
# SentenceTransformer encoder from the import step
query_vector = encoder.encode("how to store embeddings efficiently").tolist()

results = collection.query.near_vector(
    near_vector=query_vector,
    limit=5,
    return_properties=["title", "content"],
    return_metadata=wvc.query.MetadataQuery(distance=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Distance: {obj.metadata.distance:.4f}")
    print()
```
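The `distance` metadata uses the collection's distance metric, which is cosine by default in Weaviate, reported as 1 − cosine similarity. A few lines of Python make the scale concrete:

```python
import math

def cosine_distance(a, b):
    """Cosine distance as Weaviate reports it: 1 - cosine similarity
    (0 = same direction, 1 = orthogonal, 2 = opposite)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal vectors)
```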

### Hybrid Search (Vector + BM25)

```python
# Combine semantic and keyword search; with no vectorizer configured,
# pass the query vector explicitly
results = collection.query.hybrid(
    query="RAG retrieval augmented generation",
    vector=encoder.encode("RAG retrieval augmented generation").tolist(),
    alpha=0.5,  # 0.0 = pure BM25, 1.0 = pure vector, 0.5 = balanced
    limit=5,
    return_properties=["title", "content"],
    return_metadata=wvc.query.MetadataQuery(score=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Hybrid Score: {obj.metadata.score:.4f}")
```

### Keyword Search (BM25)

```python
results = collection.query.bm25(
    query="vector database indexing",
    limit=5,
    return_properties=["title"],
)
```

### Filtered Search

```python
from weaviate.classes.query import Filter

# Combine vector search with a metadata filter
# (near_text assumes a vectorizer module is configured on the collection)
results = collection.query.near_text(
    query="machine learning training",
    limit=10,
    filters=Filter.by_property("view_count").greater_than(1000),
    return_properties=["title", "view_count"],
)
```

### GraphQL Query

```python
import requests

query = """
{
    Get {
        Article(
            nearText: {concepts: ["artificial intelligence"]}
            limit: 5
        ) {
            title
            content
            _additional {
                distance
                id
            }
        }
    }
}
"""

response = requests.post(
    "http://<server-ip>:8080/v1/graphql",
    json={"query": query},
)
data = response.json()
for article in data["data"]["Get"]["Article"]:
    print(article["title"])
```

***

## Generative Search (RAG)

```python
# Requires a collection configured with a vectorizer module and the
# generative-openai module enabled (ENABLE_MODULES=generative-openai)

results = collection.generate.near_text(
    query="how to build a RAG system",
    limit=3,
    grouped_task="Summarize these articles and explain the key steps to build a RAG system.",
    grouped_properties=["title", "content"],
)

print("RAG Answer:")
print(results.generated)
print("\nSource articles:")
for obj in results.objects:
    print(f"  - {obj.properties['title']}")
```

***

## Multi-Tenancy

```python
from weaviate.classes.config import Configure, Property, DataType
from weaviate.classes.tenants import Tenant

# Create multi-tenant collection
client.collections.create(
    name="UserDocuments",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="filename", data_type=DataType.TEXT),
    ],
)

# Create tenants
collection = client.collections.get("UserDocuments")
collection.tenants.create([
    Tenant(name="user_alice"),
    Tenant(name="user_bob"),
])

# Insert data for specific tenant
tenant_collection = collection.with_tenant("user_alice")
tenant_collection.data.insert({"content": "Alice's private document", "filename": "doc1.pdf"})

# Query within tenant (near_text assumes a vectorizer module is configured)
results = collection.with_tenant("user_alice").query.near_text(
    query="private document",
    limit=5,
)
```

***

## REST API Examples

```bash
# Create schema class
curl -X POST http://<server-ip>:8080/v1/schema \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "vectorizer": "none",
        "properties": [
            {"name": "name", "dataType": ["text"]},
            {"name": "description", "dataType": ["text"]},
            {"name": "price", "dataType": ["number"]}
        ]
    }'

# Add object with vector
curl -X POST http://<server-ip>:8080/v1/objects \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "properties": {
            "name": "GPU Cloud Access",
            "description": "Decentralized GPU marketplace",
            "price": 0.5
        },
        "vector": [0.1, 0.2, 0.3, ...]
    }'

# List objects (quote the URL so the shell does not treat "&" as a control operator)
curl "http://<server-ip>:8080/v1/objects?class=Product&limit=5"

# Health check
curl http://<server-ip>:8080/v1/.well-known/ready
```
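The same calls work from Python without the client library. A stdlib-only sketch of the object insert (the helper name is illustrative; the path follows the `/v1/objects` endpoint above):

```python
import json
import urllib.request

def insert_object(base_url, class_name, properties, vector=None):
    """POST one object to /v1/objects and return the decoded JSON response."""
    payload = {"class": class_name, "properties": properties}
    if vector is not None:
        payload["vector"] = vector
    req = urllib.request.Request(
        f"{base_url}/v1/objects",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# insert_object("http://<server-ip>:8080", "Product",
#               {"name": "GPU Cloud Access", "price": 0.5}, vector=[0.1, 0.2])
```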

***

## Troubleshooting

{% hint style="warning" %}
**Weaviate not starting** — Check disk space (`df -h`). Weaviate needs writable space at the data path. Also verify port 8080 is open in Clore.ai settings.
{% endhint %}

{% hint style="warning" %}
**Slow import** — Use batch imports (`collection.batch.dynamic()` or `fixed_size()`) instead of single-object inserts for large datasets; batch sizes of 100–500 typically work well.
{% endhint %}

{% hint style="info" %}
**High memory usage** — Weaviate keeps vector index in RAM for fast search. For 1M 768-dim vectors: \~6 GB RAM. Plan accordingly when choosing Clore.ai instance size.
{% endhint %}
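The ~6 GB figure follows from simple arithmetic: float32 vectors take 4 bytes per dimension, and the HNSW graph roughly doubles the footprint (a rule of thumb, not an exact overhead factor):

```python
def estimate_ram_gb(num_vectors, dims, bytes_per_dim=4, overhead=2.0):
    """Rough RAM estimate: raw float32 vector storage times an HNSW overhead factor."""
    raw_bytes = num_vectors * dims * bytes_per_dim
    return raw_bytes * overhead / 1e9

print(f"{estimate_ram_gb(1_000_000, 768):.1f} GB")  # 6.1 GB for 1M x 768-dim
```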

{% hint style="info" %}
**Cannot connect via Python client** — Ensure both port 8080 (HTTP) and port 50051 (gRPC) are open. The v4 Python client uses gRPC by default.
{% endhint %}

| Issue                   | Fix                                                          |
| ----------------------- | ------------------------------------------------------------ |
| `Connection refused`    | Wait for startup (\~30 sec), check `docker ps`, verify ports |
| `Schema already exists` | Delete collection first: `client.collections.delete("Name")` |
| `Out of memory`         | Increase RAM or reduce vector dimensions                     |
| Slow vector search      | Tune HNSW `ef` or check dataset size vs available RAM        |

***

## Performance Tips

1. **Use batch imports** — 10x–50x faster than single inserts
2. **Choose right embedding model** — `all-MiniLM-L6-v2` (384 dims) is fast; `text-embedding-3-large` (3072 dims) is best quality but uses 8x more RAM
3. **Hybrid search alpha** — tune `alpha` for your use case: 0.25 for keyword-heavy queries, 0.75 for semantic queries
4. **HNSW parameters** — `ef` and `efConstruction` control recall vs. speed tradeoff
5. **Tenant isolation** — use multi-tenancy for SaaS apps; it scales much better than separate collections per user
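For tip 3, it helps to see how `alpha` blends the two result sets. A sketch assuming relative-score fusion (Weaviate's default fusion method in recent versions), where each score list is min-max normalized before the weighted sum:

```python
def normalize(scores):
    """Min-max normalize a list of scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5 for _ in scores]  # degenerate case: all scores equal
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(vector_scores, bm25_scores, alpha):
    """Blend normalized vector and BM25 scores; alpha weights the vector side."""
    v = normalize(vector_scores)
    k = normalize(bm25_scores)
    return [alpha * vs + (1 - alpha) * ks for vs, ks in zip(v, k)]

# alpha=0.25 leans keyword, alpha=0.75 leans semantic
print(hybrid_scores([0.9, 0.2, 0.5], [10.0, 30.0, 20.0], 0.5))
```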

***

## Related Tools

* [Qdrant](https://docs.clore.ai/guides/rag-and-vector-databases/qdrant) — Rust-based vector database with payload filters
* [ChromaDB](https://docs.clore.ai/guides/rag-and-vector-databases/chromadb) — lightweight embeddings database
* [Milvus](https://docs.clore.ai/guides/rag-and-vector-databases/milvus) — high-scale vector database

***

*Weaviate on Clore.ai gives you a production-grade vector database with GPU-accelerated vectorization — ideal for building scalable RAG systems and semantic search applications.*

***

## Clore.ai GPU Recommendations

| Use Case                  | Recommended GPU | Est. Cost on Clore.ai |
| ------------------------- | --------------- | --------------------- |
| Development/Testing       | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production Vector Search  | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| High-throughput Embedding | RTX 4090 (24GB) | \~$0.70/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.
