# Weaviate

{% hint style="info" %}
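**Weaviate** is an AI-native, open-source vector database built for semantic search, hybrid search, and RAG (Retrieval-Augmented Generation) applications. It stores objects together with their vector embeddings and supports built-in machine learning model integrations.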
{% endhint %}

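## Overview

Weaviate goes beyond a traditional vector database by natively integrating machine learning models that vectorize data automatically at import and query time. It supports multiple data types (text, images, video, audio), built-in hybrid search that combines BM25 with vector similarity, and multi-tenant deployments. Weaviate is production-ready, cloud-native, and scales from a prototype to billions of vectors.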

| Property         | Value                                                     |
| ---------------- | --------------------------------------------------------- |
| **Category**     | Vector database / RAG infrastructure                      |
| **Developer**    | Weaviate B.V.                                             |
| **License**      | BSD 3-Clause                                              |
| **GitHub**       | [weaviate/weaviate](https://github.com/weaviate/weaviate) |
| **Stars**        | 12K+                                                      |
| **Docker image** | `cr.weaviate.io/semitechnologies/weaviate`                |
| **Ports**        | 22 (SSH), 8080 (HTTP API / GraphQL), 50051 (gRPC)         |

***

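## Key Features

* **Hybrid vector + keyword search**: combines BM25 full-text and vector similarity in a single query
* **Built-in vectorizers**: vectorize data automatically at import time with OpenAI, Cohere, HuggingFace, or local models
* **Multimodal**: store and search text, images, video, and audio in one database
* **GraphQL API**: an expressive query language for complex semantic queries
* **REST API**: full CRUD operations and schema management
* **Multi-tenancy**: isolate each tenant's data on shared infrastructure
* **HNSW index**: fast approximate nearest-neighbor search
* **Filtered search**: combine vector search with traditional metadata filters
* **Generative search**: built-in RAG through integrations with large language models
* **Horizontal scaling**: sharding and replication across multiple nodes
* **Module system**: plug in vectorizers, readers, generators, and other modules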

***

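## Clore.ai Setup

### Step 1: Choose Hardware

| Use case                         | Recommended configuration | RAM     | Storage |
| -------------------------------- | ------------------------- | ------- | ------- |
| Development / prototyping        | CPU instance              | 8 GB    | 20 GB   |
| Small production (< 1M vectors)  | CPU instance              | 16 GB   | 50 GB   |
| Large scale (10M+ vectors)       | GPU instance              | 32 GB+  | 200 GB+ |
| GPU-accelerated vectorization    | RTX 4090                  | 24 GB   | 100 GB  |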

{% hint style="info" %}
Weaviate itself runs on the CPU. Use a GPU instance on Clore.ai when you need **local embedding model** inference (for example, `text2vec-transformers` with a local model) so that vectorization at import time is fast.
{% endhint %}

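### Step 2: Rent a Server on Clore.ai

1. Go to [clore.ai](https://clore.ai) → **Marketplace**
2. For pure vector search: use a CPU instance with **≥ 16 GB RAM**
3. For GPU-accelerated embedding: **RTX 3090 or 4090**
4. Open ports: **22** and **8080**, plus **50051** if you will connect with the v4 Python client (gRPC)
5. Make sure there is **≥ 50 GB of disk** for vector storage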

### Step 3: Deploy with Docker

**Minimal deployment (no vectorizer):**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -p 50051:50051 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=none \
    -e ENABLE_MODULES="" \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With the OpenAI vectorizer:**

```bash
docker run -d \
    --name weaviate \
    -p 8080:8080 \
    -p 50051:50051 \
    -v /opt/weaviate/data:/var/lib/weaviate \
    -e QUERY_DEFAULTS_LIMIT=20 \
    -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true \
    -e PERSISTENCE_DATA_PATH=/var/lib/weaviate \
    -e DEFAULT_VECTORIZER_MODULE=text2vec-openai \
    -e ENABLE_MODULES=text2vec-openai,generative-openai \
    -e OPENAI_APIKEY=<your-openai-key> \
    -e CLUSTER_HOSTNAME=node1 \
    cr.weaviate.io/semitechnologies/weaviate:latest
```

**With a local HuggingFace vectorizer (GPU-accelerated):**

```yaml
# docker-compose.yml
version: '3.4'

services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    restart: unless-stopped
    ports:
      - "8080:8080"
      - "50051:50051"
    volumes:
      - /opt/weaviate/data:/var/lib/weaviate
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: 'text2vec-transformers,generative-openai'
      TRANSFORMERS_INFERENCE_API: 'http://t2v-transformers:8080'
      CLUSTER_HOSTNAME: 'node1'

  t2v-transformers:
    image: cr.weaviate.io/semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: '1'
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

Start the stack:

```bash
mkdir -p /opt/weaviate/data
docker-compose up -d
```

***

## Accessing the API

### HTTP/REST API

```
http://<server-ip>:8080
```

### GraphQL Endpoint

```
http://<server-ip>:8080/v1/graphql
```

### Health Check

```bash
curl http://<server-ip>:8080/v1/.well-known/ready
# Returns: {}  (HTTP 200 = healthy)
```
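
After `docker run` the container can take a little while before it answers, so scripts often poll the readiness endpoint instead of sleeping for a fixed time. The sketch below is one way to do that with only the standard library. The helper names `is_ready` and `wait_until_ready` are hypothetical (they are not part of Weaviate or its client), and the retry count and delay are assumptions to tune.

```python
import time
import urllib.error
import urllib.request


def is_ready(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/v1/.well-known/ready answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/.well-known/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(probe, attempts: int = 30, delay: float = 1.0) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: is_ready("http://<server-ip>:8080"))`, with the placeholder replaced by your server address. Passing the probe as a callable keeps the retry loop easy to test without a running server.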

### Via SSH

```bash
ssh root@<server-ip> -p 22
```

***

## Python Client

### Installation

```bash
pip install weaviate-client
```

### Connecting

```python
import weaviate
import weaviate.classes as wvc

# Connect to your Clore.ai instance
client = weaviate.connect_to_custom(
    http_host="<server-ip>",
    http_port=8080,
    http_secure=False,
    grpc_host="<server-ip>",
    grpc_port=50051,
    grpc_secure=False,
)

print(client.is_ready())  # True if the instance is healthy
```

***

## Schema & Collections

### Create a Collection

```python
import weaviate
import weaviate.classes as wvc
from weaviate.classes.config import Configure, Property, DataType

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Create a collection (called a "class" in v3)
client.collections.create(
    name="Article",
    vectorizer_config=Configure.Vectorizer.none(),  # we will supply our own vectors
    # or: Configure.Vectorizer.text2vec_openai() for automatic vectorization
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="content", data_type=DataType.TEXT),
        Property(name="author", data_type=DataType.TEXT),
        Property(name="published_date", data_type=DataType.DATE),
        Property(name="tags", data_type=DataType.TEXT_ARRAY),
        Property(name="view_count", data_type=DataType.INT),
    ],
)
print("Collection 'Article' created")
```
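
The `published_date` property above uses `DataType.DATE`, and Weaviate's `date` type expects RFC 3339 timestamps. A minimal sketch of preparing such a value with the standard library follows; the helper name `to_rfc3339` and the sample values are hypothetical, and treating naive datetimes as UTC is an assumption you may want to change.

```python
from datetime import datetime, timezone


def to_rfc3339(dt: datetime) -> str:
    """Format a datetime as an RFC 3339 string; naive values are assumed to be UTC."""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.isoformat()


# Illustrative object matching the Article properties defined above
article = {
    "title": "Introduction to RAG",
    "published_date": to_rfc3339(datetime(2024, 1, 15, 9, 30)),
    "tags": ["rag", "retrieval"],
    "view_count": 1200,
}
print(article["published_date"])  # 2024-01-15T09:30:00+00:00
```

Always attaching an explicit offset avoids ambiguity when the importing machine and the server sit in different time zones.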

***

## Importing Data

### Batch Import with Precomputed Vectors

```python
import weaviate
import numpy as np
from sentence_transformers import SentenceTransformer

client = weaviate.connect_to_custom(
    http_host="<server-ip>", http_port=8080,
    grpc_host="<server-ip>", grpc_port=50051,
    http_secure=False, grpc_secure=False,
)

# Load the embedding model
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Sample articles
articles = [
    {"title": "Introduction to RAG", "content": "RAG combines retrieval with generation..."},
    {"title": "Vector Databases Explained", "content": "Vector databases store high-dimensional embeddings..."},
    {"title": "Weaviate Best Practices", "content": "For production Weaviate deployments, consider..."},
    {"title": "GPU Cloud Computing", "content": "Clore.ai provides decentralized GPU access..."},
]

# Batch import with vectors
collection = client.collections.get("Article")

with collection.batch.dynamic() as batch:
    for article in articles:
        # Compute the vector
        vector = encoder.encode(article["content"]).tolist()

        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            },
            vector=vector,
        )

print(f"Imported {len(articles)} articles")
```
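
Running the import above twice would create each article twice, because Weaviate assigns a fresh random ID when none is given. One common way around this is to derive a deterministic UUID from a stable field and pass it as the `uuid` argument of `batch.add_object`. The sketch below shows only the ID derivation; the namespace string and the helper name `article_uuid` are hypothetical, and using the title as the key assumes titles are unique in your data.

```python
import uuid

# Hypothetical namespace for this guide; any fixed UUID works as long as it never changes.
ARTICLE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "articles.example.com")


def article_uuid(title: str) -> str:
    """Derive a stable UUIDv5 from the (normalized) article title."""
    return str(uuid.uuid5(ARTICLE_NAMESPACE, title.strip().lower()))


first = article_uuid("Introduction to RAG")
second = article_uuid("  introduction to rag ")
print(first == second)  # True: the same article always maps to the same ID
```

Inside the batch loop this would look like `batch.add_object(properties=..., vector=vector, uuid=article_uuid(article["title"]))`, so a repeated import should update the existing object rather than add a duplicate.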

### Automatic Vectorization with OpenAI at Import Time

```python
# When the collection uses the text2vec-openai vectorizer,
# just insert the data; no vectors need to be supplied
collection = client.collections.get("ArticleOpenAI")

with collection.batch.dynamic() as batch:
    for article in articles:
        batch.add_object(
            properties={
                "title": article["title"],
                "content": article["content"],
            }
            # No vector given = Weaviate vectorizes automatically through OpenAI
        )
```

***

## Querying

### Semantic (Vector) Search

```python
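# Find articles that are semantically similar to the query.
# near_text needs a collection with a vectorizer module (e.g. text2vec-openai);
# for a collection created with Vectorizer.none(), embed the query yourself and use near_vector.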
results = collection.query.near_text(
    query="how to store embeddings efficiently",
    limit=5,
    return_properties=["title", "content"],
    return_metadata=wvc.query.MetadataQuery(distance=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Distance: {obj.metadata.distance:.4f}")
    print()
```
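
The `distance` printed above is easier to interpret with the formula in hand. Assuming the collection uses the cosine metric (Weaviate's default unless configured otherwise), the distance is one minus the cosine similarity, so 0 means the same direction and 2 means opposite directions. The function below is an illustrative re-implementation in plain Python, not Weaviate's own code.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


same = cosine_distance([1.0, 2.0], [2.0, 4.0])        # parallel vectors
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # unrelated directions
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])   # opposite directions
```

Because only the angle matters, scaling a vector does not change its cosine distance to any other vector.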

### Hybrid Search (Vector + BM25)

```python
# Combine semantic and keyword search
results = collection.query.hybrid(
    query="RAG retrieval augmented generation",
    alpha=0.5,  # 0.0 = pure BM25, 1.0 = pure vector, 0.5 = balanced
    limit=5,
    return_properties=["title", "content"],
    return_metadata=wvc.query.MetadataQuery(score=True),
)

for obj in results.objects:
    print(f"Title: {obj.properties['title']}")
    print(f"Hybrid score: {obj.metadata.score:.4f}")
```
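
To build intuition for what `alpha` does, the sketch below blends two score lists the way the comment above describes: each list is normalized to the range 0 to 1, then combined as `alpha * vector + (1 - alpha) * bm25`. This is a simplified illustration under that assumption; Weaviate's actual fusion algorithms differ in detail, and the function names and sample scores here are made up.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, every item gets 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 1.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse(bm25: dict[str, float], vector: dict[str, float], alpha: float) -> dict[str, float]:
    """Blend keyword and vector scores (higher is better for both inputs)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    b, v = min_max(bm25), min_max(vector)
    return {
        doc: alpha * v.get(doc, 0.0) + (1.0 - alpha) * b.get(doc, 0.0)
        for doc in set(b) | set(v)
    }


# Made-up scores: doc "a" wins on keywords, doc "c" wins on vector similarity
bm25_scores = {"a": 4.0, "b": 2.0, "c": 0.0}
vector_scores = {"a": 0.2, "b": 0.6, "c": 1.0}

keyword_heavy = fuse(bm25_scores, vector_scores, alpha=0.25)
semantic_heavy = fuse(bm25_scores, vector_scores, alpha=0.75)
```

With these inputs, the keyword-heavy setting ranks "a" first and the semantic-heavy setting ranks "c" first, which is the trade-off `alpha` controls.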

### Keyword Search (BM25)

```python
results = collection.query.bm25(
    query="vector database indexing",
    limit=5,
    return_properties=["title"],
)
```

### Filtered Search

```python
from weaviate.classes.query import Filter

# Combine vector search with a metadata filter
results = collection.query.near_text(
    query="machine learning training",
    limit=10,
    filters=Filter.by_property("view_count").greater_than(1000),
    return_properties=["title", "view_count"],
)
```

### GraphQL Query

```python
import requests

query = """
{
    Get {
        Article(
            nearText: {concepts: ["artificial intelligence"]}
            limit: 5
        ) {
            title
            content
            _additional {
                distance
                id
            }
        }
    }
}
"""

response = requests.post(
    "http://<server-ip>:8080/v1/graphql",
    json={"query": query},
)
data = response.json()
for article in data["data"]["Get"]["Article"]:
    print(article["title"])
```
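
When the search concepts come from user input, pasting them into the query string by hand can break the GraphQL syntax (for example, if the text contains a double quote). A small builder that escapes the values is one way to avoid that. The sketch below relies on GraphQL string literals using the same escape rules as JSON for ordinary text; the function name `build_near_text_query` is hypothetical, and the collection and field names are trusted, not escaped.

```python
import json


def build_near_text_query(collection: str, concepts: list[str], fields: list[str], limit: int = 5) -> str:
    """Build a Get/nearText GraphQL query with safely quoted concept strings."""
    concepts_literal = json.dumps(concepts)  # e.g. ["artificial intelligence"]
    field_block = "\n      ".join(fields)
    return f"""{{
  Get {{
    {collection}(
      nearText: {{concepts: {concepts_literal}}}
      limit: {int(limit)}
    ) {{
      {field_block}
      _additional {{ distance id }}
    }}
  }}
}}"""


query = build_near_text_query("Article", ['say "hi"'], ["title", "content"], limit=3)
print(query)
```

The resulting string can be sent exactly like the handwritten query above, as the `query` field of the JSON body posted to `/v1/graphql`.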

***

## Generative Search (RAG)

```python
# Requires a collection set up with a generative module (OpenAI here)
# and ENABLE_MODULES to include generative-openai

results = collection.generate.near_text(
    query="how to build a RAG system",
    limit=3,
    grouped_task="Summarize these articles and explain the key steps to build a RAG system.",
    grouped_properties=["title", "content"],
)

print("RAG answer:")
print(results.generated)
print("\nSource articles:")
for obj in results.objects:
    print(f"  - {obj.properties['title']}")
```

***

## Multi-Tenancy

```python
from weaviate.classes.config import Configure, DataType, Property
from weaviate.classes.tenants import Tenant

# Create a multi-tenant collection
client.collections.create(
    name="UserDocuments",
    multi_tenancy_config=Configure.multi_tenancy(enabled=True),
    properties=[
        Property(name="content", data_type=DataType.TEXT),
        Property(name="filename", data_type=DataType.TEXT),
    ],
)

# Create tenants
collection = client.collections.get("UserDocuments")
collection.tenants.create([
    Tenant(name="user_alice"),
    Tenant(name="user_bob"),
])

# Insert data for a specific tenant
tenant_collection = collection.with_tenant("user_alice")
tenant_collection.data.insert({"content": "Alice's private document", "filename": "doc1.pdf"})

# Query within the tenant's scope
results = collection.with_tenant("user_alice").query.near_text(
    query="private document",
    limit=5,
)
```

***

## REST API Examples

```bash
# Create a schema class
curl -X POST http://<server-ip>:8080/v1/schema \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "vectorizer": "none",
        "properties": [
            {"name": "name", "dataType": ["text"]},
            {"name": "description", "dataType": ["text"]},
            {"name": "price", "dataType": ["number"]}
        ]
    }'

# Add an object with a vector
curl -X POST http://<server-ip>:8080/v1/objects \
    -H "Content-Type: application/json" \
    -d '{
        "class": "Product",
        "properties": {
            "name": "GPU Cloud Access",
            "description": "Decentralized GPU marketplace",
            "price": 0.5
        },
        "vector": [0.1, 0.2, 0.3, ...]
    }'

# List objects in the class (this is not a vector search; use GraphQL or the Python client for that)
# The URL is quoted so the shell does not treat "&" as a background operator
curl "http://<server-ip>:8080/v1/objects?class=Product&limit=5"

# Health check
curl http://<server-ip>:8080/v1/.well-known/ready
```
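
The same object-creation call can be prepared from Python with only the standard library, which is handy when the client package is not installed. The sketch below builds the request without sending it; `build_object_request` is a hypothetical helper, `http://localhost:8080` stands in for your server address, and the three-element vector is purely illustrative.

```python
import json
import urllib.request


def build_object_request(base_url: str, class_name: str, properties: dict, vector: list[float]) -> urllib.request.Request:
    """Prepare a POST /v1/objects request carrying the object and its vector."""
    payload = {"class": class_name, "properties": properties, "vector": vector}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/objects",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_object_request(
    "http://localhost:8080",  # assumption: replace with http://<server-ip>:8080
    "Product",
    {"name": "GPU Cloud Access", "description": "Decentralized GPU marketplace", "price": 0.5},
    [0.1, 0.2, 0.3],
)
print(req.get_method(), req.full_url)
```

Calling `urllib.request.urlopen(req)` would send it. The vector length has to match the dimension of the other vectors already stored in that class.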

***

## Troubleshooting

{% hint style="warning" %}
**Weaviate won't start**: check disk space (`df -h`). Weaviate needs writable space on the data path. Also verify in the Clore.ai settings that port 8080 is open.
{% endhint %}

{% hint style="warning" %}
**Slow imports**: use batch imports (`collection.batch.dynamic()` or `fixed_size()`). Avoid single-object inserts for large datasets. Batch sizes of 100–500 work best.
{% endhint %}
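
The same batching idea applies to the embedding step: encoding texts in groups is usually much faster than one call per text. A minimal chunking helper, using only the standard library, could look like the sketch below. The name `chunked` is hypothetical, and the size of 200 is simply a value inside the 100–500 range mentioned above.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    """Yield consecutive lists of at most `size` items from any iterable."""
    if size <= 0:
        raise ValueError("size must be positive")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# 1,000 placeholder documents split into groups of 200
documents = [f"document {i}" for i in range(1000)]
group_sizes = [len(group) for group in chunked(documents, 200)]
print(group_sizes)  # [200, 200, 200, 200, 200]
```

Each group can then be passed to the encoder in a single call and added to the Weaviate batch, which keeps memory use bounded for large datasets.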

{% hint style="info" %}
**High memory usage**: Weaviate keeps the vector index in memory for fast search. For 1M vectors of 768 dimensions, plan for roughly 6 GB of RAM. Size your Clore.ai instance accordingly.
{% endhint %}
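
The 6 GB figure can be reproduced with simple arithmetic: 1M vectors × 768 dimensions × 4 bytes per float32 value is about 3 GB of raw vector data, and doubling it to leave room for the index brings it to about 6 GB. The sketch below encodes that calculation. The overhead factor of 2 is an assumption chosen to match the figure above, not a measured value, and `estimate_index_memory_gb` is a hypothetical helper.

```python
def estimate_index_memory_gb(
    num_vectors: int,
    dimensions: int,
    bytes_per_value: int = 4,      # float32
    overhead_factor: float = 2.0,  # assumption: index overhead roughly doubles raw size
) -> float:
    """Rough RAM estimate in GB (10^9 bytes) for an in-memory vector index."""
    raw_bytes = num_vectors * dimensions * bytes_per_value
    return raw_bytes * overhead_factor / 1e9


print(round(estimate_index_memory_gb(1_000_000, 768), 1))  # 6.1
```

The estimate scales linearly with both the vector count and the dimension, so it is a quick way to compare embedding models before renting an instance.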

{% hint style="info" %}
**Can't connect with the Python client**: make sure both port 8080 (HTTP) and port 50051 (gRPC) are open. The v4 Python client uses gRPC by default.
{% endhint %}

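| Problem                 | Fix                                                                              |
| ----------------------- | -------------------------------------------------------------------------------- |
| `Connection refused`    | Wait for startup (about 30 seconds), check `docker ps`, verify the ports         |
| `Schema already exists` | Delete the collection first: `client.collections.delete("Name")`                 |
| `Out of memory`         | Add RAM or reduce the vector dimension                                           |
| Slow vector search      | Use an HNSW index (Weaviate's default) and compare dataset size to available RAM |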

***

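## Best Practices

1. **Use batch imports**: 10x–50x faster than single inserts
2. **Pick the right embedding model**: `all-MiniLM-L6-v2` (384 dimensions) is fast; `text-embedding-3-large` (3072 dimensions) gives the best quality but uses 8 times more memory
3. **Hybrid search alpha**: tune `alpha` for your use case, for example 0.25 for keyword-heavy queries and 0.75 for semantic-heavy queries
4. **HNSW parameters**: `ef` and `efConstruction` control the trade-off between recall and speed
5. **Tenant isolation**: use multi-tenancy for SaaS applications; it scales much better than creating a separate collection per user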

***

## Related Tools

* [Qdrant](/guides/guides_v2-zh/rag-yu-xiang-liang-shu-ju-ku/qdrant.md): Rust-based vector database with payload filtering
* [ChromaDB](/guides/guides_v2-zh/rag-yu-xiang-liang-shu-ju-ku/chromadb.md): lightweight embedding database
* [Milvus](/guides/guides_v2-zh/rag-yu-xiang-liang-shu-ju-ku/milvus.md): vector database for very large scale

***

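*Weaviate on Clore.ai gives you a production-grade vector database with GPU-accelerated vectorization, well suited to building scalable RAG systems and semantic search applications.*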

***

## Clore.ai GPU Recommendations

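| Use case                                 | Recommended GPU  | Estimated cost on Clore.ai |
| ---------------------------------------- | ---------------- | -------------------------- |
| Development / testing                    | RTX 3090 (24GB)  | \~$0.12 per GPU per hour   |
| Production vector search                 | RTX 4090 (24GB)  |                            |
| High-throughput embedding (large scale)  | A100 80GB        |                            |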

> Browse available GPU servers on [Clore.ai](https://clore.ai/marketplace) and rent by the hour: no commitment, full root access.

