# Weaviate

{% hint style="info" %}
**Weaviate** is an AI-native, open-source vector database designed for semantic search, hybrid search, and RAG (Retrieval-Augmented Generation) applications. It stores both objects and their vector embeddings and supports built-in ML model integration.
{% endhint %}

## Overview

Weaviate goes beyond traditional vector databases by natively integrating ML models for automatic vectorization at import and query time. It supports multiple data types (text, images, video, audio), built-in hybrid search combining BM25 and vector similarity, and multi-tenant deployments. Weaviate is production-ready, cloud-native, and designed to scale from prototypes to billions of vectors.

| Property         | Value                                                     |
| ---------------- | --------------------------------------------------------- |
| **Category**     | Vector Database / RAG Infrastructure                      |
| **Developer**    | Weaviate B.V.                                             |
| **License**      | BSD 3-Clause                                              |
| **GitHub**       | [weaviate/weaviate](https://github.com/weaviate/weaviate) |
| **Stars**        | 12K+                                                      |
| **Docker Image** | `cr.weaviate.io/semitechnologies/weaviate`                |
| **Ports**        | 22 (SSH), 8080 (HTTP API / GraphQL), 50051 (gRPC)         |

***

## Key Features

* **Vector + keyword hybrid search** — combine BM25 full-text with vector similarity in one query
* **Built-in vectorizers** — auto-vectorize data at import with OpenAI, Cohere, HuggingFace, or local models
* **Multi-modal** — store and search text, images, video, audio in one database
* **GraphQL API** — expressive query language for complex semantic queries
* **REST API** — full CRUD operations and schema management
* **Multi-tenancy** — isolate data per tenant with shared infrastructure
* **HNSW indexing** — fast approximate nearest-neighbor search
* **Filtered search** — combine vector search with traditional metadata filters
* **Generative search** — built-in RAG with LLM integration
* **Horizontal scaling** — shard and replicate across multiple nodes
* **Modules system** — plug in vectorizers, readers, generators

***

## Clore.ai Setup

### Step 1 — Choose Hardware

| Use Case                        | Recommended  | RAM    | Storage |
| ------------------------------- | ------------ | ------ | ------- |
| Development / prototyping       | CPU instance | 8 GB   | 20 GB   |
| Small production (< 1M vectors) | CPU instance | 16 GB  | 50 GB   |
| Large scale (10M+ vectors)      | GPU instance | 32 GB+ | 200 GB+ |
| GPU-accelerated vectorization   | RTX 4090     | 24 GB  | 100 GB  |

{% hint style="info" %}
Weaviate itself runs on CPU. Use GPU instances on Clore.ai when you need **local embedding model** inference (e.g., `text2vec-transformers` with a local model) for fast vectorization at import time.
{% endhint %}

### Step 2 — Rent a Server on Clore.ai

1. Go to [clore.ai](https://clore.ai) → **Marketplace**
2. For pure vector search: CPU instances with **≥ 16 GB RAM**
3. For GPU-accelerated embeddings: **RTX 3090 or 4090**
4. Open ports: **22**, **8080**, and **50051** (gRPC, used by the v4 Python client)
5. Ensure **≥ 50 GB disk** for vector storage

### Step 3 — Deploy with Docker

**Minimal deployment (no vectorizer):**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -p 50051:50051 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=none \
    -e ENABLE_MODULES="" \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With OpenAI vectorizer:**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=text2vec-openai \
    -e ENABLE_MODULES=text2vec-openai,generative-openai \
    -e OPENAI_APIKEY=<your-openai-key> \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With local HuggingFace vectorizer (GPU-accelerated):**

```yaml
# docker-compose.yml

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    restart: unless-stopped
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - /opt/weaviate/data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: 'text2vec-transformers,generative-openai'
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      CLUSTER_HOSTNAME: 'node1'

  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: '1'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Start:

```bash
mkdir -p /opt/weaviate/data
docker compose up -d
```

***

## Accessing the API

### HTTP/REST API

```
http://<server-ip>:8080
```

### GraphQL Endpoint

```
http://<server-ip>:8080/v1/graphql
```

### Health Check

```bash
curl http://<server-ip>:8080/v1/.well-known/ready
# Returns: {}  (HTTP 200 = healthy)
```

### Via SSH

```bash
ssh root@<server-ip> -p 22
```

***

## Python Client

### Installation

```bash
pip install weaviate-client
```

### Connect

```python
import weaviate
import weaviate.classes as wvc

# Connect to your Clore.ai instance
client = weaviate.connect_to_custom(
    http_host="<server-ip>",
    http_port=8080,
    http_secure=False,
    grpc_host="<server-ip>",
    grpc_port=50051,
    grpc_secure=False,
)

print(client.is_ready())  # True if healthy
```

***

## Schema & Collections

### Create a Collection

```python
import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Create a collection (was called "class" in v3)
client.collections.create(
    name="Article",
    vectorizer_config=Configure.Vectorizer.none(),  # We'll provide our own vectors
    # Or: Configure.Vectorizer.text2vec_openai() for auto-vectorization
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="author", data_type=DataType.TEXT),
        Property(name="published_date", data_type=DataType.DATE),
        Property(name="tags", data_type=DataType.TEXT_ARRAY),
        Property(name="view_count", data_type=DataType.INT),
    ],
)
print("Collection 'Article' created")
```

***

## Importing Data

### Batch Import with Pre-computed Vectors

```python
import weaviate
import numpy as np
from sentence_transformers import SentenceTransformer

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Load embedding model
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Sample articles
articles = [
    {"title": "Introduction to RAG", "content": "RAG combines retrieval with generation..."},
    {"title": "Vector Databases Explained", "content": "Vector databases store high-dimensional embeddings..."},
    {"title": "Weaviate Best Practices", "content": "For production Weaviate deployments, consider..."},
    {"title": "GPU Cloud Computing", "content": "Clore.ai provides decentralized GPU access..."},
]

# Batch import with vectors
collection = client.collections.get("Article")

with collection.batch.dynamic() as batch:
    for article in articles:
        # Compute vector
        vector = encoder.encode(article["content"]).tolist()

        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            },
            vector=vector,
        )

print(f"Imported {len(articles)} articles")
```

### Auto-vectorize with OpenAI (at import)

```python
# When collection uses text2vec-openai vectorizer,
# just insert data — no vector needed
collection = client.collections.get("ArticleOpenAI")

with collection.batch.dynamic() as batch:
    for article in articles:
        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            }
            # No vector = Weaviate auto-vectorizes via OpenAI
        )
```

***

## Querying

### Semantic (Vector) Search

```python
from weaviate.classes.query import MetadataQuery

# The "Article" collection has no vectorizer, so embed the query
# yourself and search with near_vector. (near_text requires a
# vectorizer module configured on the collection.)
query_vector = encoder.encode("how to store embeddings efficiently").tolist()

results = collection.query.near_vector(
    near_vector=query_vector,
    limit=5,
    return_properties=["title", "content"],
    return_metadata=MetadataQuery(distance=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Distance: {obj.metadata.distance:.4f}")
    print()
```
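
The `distance` in the metadata is, by default, cosine distance: `1 − cosine similarity`, so `0` means the vectors point the same way and `2` means they point in opposite directions. A quick illustration of the arithmetic in plain NumPy, independent of Weaviate itself:

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance, Weaviate's default metric: 1 - cosine similarity."""
    cos_sim = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return 1.0 - cos_sim

v = np.array([1.0, 0.0])
print(cosine_distance(v, np.array([1.0, 0.0])))   # 0.0 (identical)
print(cosine_distance(v, np.array([0.0, 1.0])))   # 1.0 (orthogonal)
print(cosine_distance(v, np.array([-1.0, 0.0])))  # 2.0 (opposite)
```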

### Hybrid Search (Vector + BM25)

```python
# Combine semantic and keyword search.
# This collection has no vectorizer, so pass the query vector explicitly;
# with a vectorizer module configured, the `vector` argument can be omitted.
from weaviate.classes.query import MetadataQuery

query_text = "RAG retrieval augmented generation"
results = collection.query.hybrid(
    query=query_text,
    vector=encoder.encode(query_text).tolist(),
    alpha=0.5,  # 0.0 = pure BM25, 1.0 = pure vector, 0.5 = balanced
    limit=5,
    return_properties=["title", "content"],
    return_metadata=MetadataQuery(score=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Hybrid Score: {obj.metadata.score:.4f}")
```
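
Recent Weaviate versions default to *relative score fusion* for hybrid results: roughly, each result list is min-max normalized and then blended by `alpha`. The sketch below shows why `alpha` behaves the way the comment above describes; it is an approximation for intuition, not the server's exact implementation:

```python
def min_max(scores: list[float]) -> list[float]:
    """Normalize scores to [0, 1]; constant lists map to all 1.0."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]

def fuse(vector_scores: dict, bm25_scores: dict, alpha: float) -> dict:
    """Blend normalized vector and BM25 scores per document id."""
    v = dict(zip(vector_scores, min_max(list(vector_scores.values()))))
    b = dict(zip(bm25_scores, min_max(list(bm25_scores.values()))))
    return {
        doc: alpha * v.get(doc, 0.0) + (1 - alpha) * b.get(doc, 0.0)
        for doc in v.keys() | b.keys()
    }

fused = fuse(
    vector_scores={"a": 0.9, "b": 0.5, "c": 0.1},
    bm25_scores={"a": 2.0, "b": 8.0},
    alpha=0.5,
)
print(sorted(fused, key=fused.get, reverse=True))  # ['b', 'a', 'c']
```

With `alpha=0.0` only the BM25 column matters; with `alpha=1.0` only the vector column does.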

### Keyword Search (BM25)

```python
results = collection.query.bm25(
    query="vector database indexing",
    limit=5,
    return_properties=["title"],
)
```

### Filtered Search

```python
from weaviate.classes.query import Filter

# Combine vector search with a metadata filter
# (near_vector again, since this collection has no vectorizer)
results = collection.query.near_vector(
    near_vector=encoder.encode("machine learning training").tolist(),
    limit=10,
    filters=Filter.by_property("view_count").greater_than(1000),
    return_properties=["title", "view_count"],
)
```

### GraphQL Query

`nearText` in GraphQL likewise requires a vectorizer module on the collection; with `vectorizer: none`, use `nearVector` and supply the vector explicitly.

```python
import requests

query = """
{
    Get {
        Article(
            nearText: {concepts: ["artificial intelligence"]}
            limit: 5
        ) {
            title
            content
            _additional {
                distance
                id
            }
        }
    }
}
"""

response = requests.post(
    "http://<server-ip>:8080/v1/graphql",
    json={"query": query},
)
data = response.json()
for article in data["data"]["Get"]["Article"]:
    print(article["title"])
```

***

## Generative Search (RAG)

```python
# Requires a collection configured with a generative module
# (ENABLE_MODULES=generative-openai) and a vectorizer for near_text

results = collection.generate.near_text(
    query="how to build a RAG system",
    limit=3,
    grouped_task="Summarize these articles and explain the key steps to build a RAG system.",
    grouped_properties=["title", "content"],
)

print("RAG Answer:")
print(results.generated)
print("\nSource articles:")
for obj in results.objects:
    print(f"  - {obj.properties['title']}")
```
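
Conceptually, `grouped_task` concatenates the requested properties of the retrieved objects into a single prompt for the LLM. A simplified illustration of that idea (not Weaviate's exact prompt format):

```python
def build_rag_prompt(task: str, docs: list[dict], properties: list[str]) -> str:
    """Join retrieved properties into context, then append the instruction."""
    context = "\n\n".join(
        "\n".join(f"{p}: {d[p]}" for p in properties if p in d) for d in docs
    )
    return f"{context}\n\n{task}"

prompt = build_rag_prompt(
    task="Summarize these articles and explain the key steps.",
    docs=[
        {"title": "Introduction to RAG", "content": "RAG combines retrieval..."},
        {"title": "Vector Databases Explained", "content": "Vector databases..."},
    ],
    properties=["title", "content"],
)
print(prompt.splitlines()[0])  # title: Introduction to RAG
```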

***

## Multi-Tenancy

```python
from weaviate.classes.config import Configure, Property, DataType

# Create multi-tenant collection
client.collections.create(
    name="UserDocuments",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="filename", data_type=DataType.TEXT),
    ],
)

# Create tenants (Tenant lives in weaviate.classes.tenants in the v4 client)
from weaviate.classes.tenants import Tenant

collection = client.collections.get("UserDocuments")
collection.tenants.create([
    Tenant(name="user_alice"),
    Tenant(name="user_bob"),
])

# Insert data for specific tenant
tenant_collection = collection.with_tenant("user_alice")
tenant_collection.data.insert({"content": "Alice's private document", "filename": "doc1.pdf"})

# Query within tenant
results = collection.with_tenant("user_alice").query.near_text(
    query="private document",
    limit=5,
)
```

***

## REST API Examples

```bash
# Create schema class
curl -X POST http://<server-ip>:8080/v1/schema \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "vectorizer": "none",
        "properties": [
            {"name": "name", "dataType": ["text"]},
            {"name": "description", "dataType": ["text"]},
            {"name": "price", "dataType": ["number"]}
        ]
    }'

# Add object with vector
curl -X POST http://<server-ip>:8080/v1/objects \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "properties": {
            "name": "GPU Cloud Access",
            "description": "Decentralized GPU marketplace",
            "price": 0.5
        },
        "vector": [0.1, 0.2, 0.3, ...]
    }'

# List objects (vector search itself goes through GraphQL or a client)
curl "http://<server-ip>:8080/v1/objects?class=Product&limit=5"

# Health check
curl http://<server-ip>:8080/v1/.well-known/ready
```

***

## Troubleshooting

{% hint style="warning" %}
**Weaviate not starting** — Check disk space (`df -h`). Weaviate needs writable space at the data path. Also verify port 8080 is open in Clore.ai settings.
{% endhint %}

{% hint style="warning" %}
**Slow import** — Use batch imports (`collection.batch.dynamic()` or `fixed_size()`) instead of single-object inserts for large datasets. Batch sizes of 100–500 typically work well.
{% endhint %}

{% hint style="info" %}
**High memory usage** — Weaviate keeps vector index in RAM for fast search. For 1M 768-dim vectors: \~6 GB RAM. Plan accordingly when choosing Clore.ai instance size.
{% endhint %}
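
The hint above can be turned into a rough planning formula: float32 vectors take `n × dims × 4` bytes, and the HNSW graph plus object storage roughly double that. The `2.0` overhead factor below is a planning assumption, not an official Weaviate figure:

```python
def estimate_index_ram_gb(n_vectors: int, dims: int, overhead: float = 2.0) -> float:
    """Rough RAM estimate: raw float32 vectors times an overhead factor."""
    raw_bytes = n_vectors * dims * 4  # 4 bytes per float32 component
    return raw_bytes * overhead / 1024**3

print(f"{estimate_index_ram_gb(1_000_000, 768):.1f} GB")  # ~5.7 GB
print(f"{estimate_index_ram_gb(1_000_000, 384):.1f} GB")  # ~2.9 GB
```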

{% hint style="info" %}
**Cannot connect via Python client** — Ensure both port 8080 (HTTP) and port 50051 (gRPC) are open. The v4 Python client uses gRPC by default.
{% endhint %}

| Issue                   | Fix                                                          |
| ----------------------- | ------------------------------------------------------------ |
| `Connection refused`    | Wait for startup (\~30 sec), check `docker ps`, verify ports |
| `Schema already exists` | Delete collection first: `client.collections.delete("Name")` |
| `Out of memory`         | Increase RAM or reduce vector dimensions                     |
| Slow vector search      | Tune HNSW `ef`, ensure the index fits in available RAM       |

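The `Connection refused` row above is usually just startup lag. Instead of a hard-coded sleep, a small retry loop works with any probe — for example the v4 client's `client.is_ready`. The helper below takes the probe as a callable so it can be sketched (and tested) without a live server:

```python
import time

def wait_until_ready(check, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if check():
                return True
        except Exception:
            pass  # e.g. connection refused while Weaviate is still starting
        time.sleep(interval)
    return False

# Simulated probe that succeeds on the third attempt;
# against a real instance you would pass check=client.is_ready
attempts = iter([False, False, True])
print(wait_until_ready(lambda: next(attempts), timeout=10, interval=0.01))  # True
```
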
***

## Performance Tips

1. **Use batch imports** — 10x–50x faster than single inserts
2. **Choose right embedding model** — `all-MiniLM-L6-v2` (384 dims) is fast; `text-embedding-3-large` (3072 dims) is best quality but uses 8x more RAM
3. **Hybrid search alpha** — tune `alpha` for your use case: 0.25 for keyword-heavy queries, 0.75 for semantic queries
4. **HNSW parameters** — `ef` and `efConstruction` control recall vs. speed tradeoff
5. **Tenant isolation** — use multi-tenancy for SaaS apps; it scales much better than separate collections per user
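
Tip 1 also applies when records stream from disk or a database: read them in fixed-size groups and hand each group to the client's batcher rather than inserting one object at a time. A client-independent grouping helper (the group size of 500 is illustrative):

```python
from itertools import islice

def batched(iterable, size: int = 500):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        group = list(islice(it, size))
        if not group:
            return
        yield group

records = ({"id": i} for i in range(1050))
print([len(g) for g in batched(records)])  # [500, 500, 50]
```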

***

## Related Tools

* [Qdrant](https://docs.clore.ai/guides/rag-and-vector-databases/qdrant) — Rust-based vector database with payload filters
* [ChromaDB](https://docs.clore.ai/guides/rag-and-vector-databases/chromadb) — lightweight embeddings database
* [Milvus](https://docs.clore.ai/guides/rag-and-vector-databases/milvus) — high-scale vector database

***

*Weaviate on Clore.ai gives you a production-grade vector database with GPU-accelerated vectorization — ideal for building scalable RAG systems and semantic search applications.*

***

## Clore.ai GPU Recommendations

| Use Case                  | Recommended GPU | Est. Cost on Clore.ai |
| ------------------------- | --------------- | --------------------- |
| Development/Testing       | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| Production Vector Search  | RTX 3090 (24GB) | \~$0.12/gpu/hr        |
| High-throughput Embedding | RTX 4090 (24GB) | \~$0.70/gpu/hr        |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers. Browse available GPUs and rent by the hour — no commitments, full root access.

