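# 3D Gaussian Splatting

**3D Gaussian Splatting** is a real-time 3D scene reconstruction technique with more than **15,000 GitHub stars**. Unlike NeRF-based methods, Gaussian Splatting represents a scene as millions of tiny 3D Gaussians that can be **rendered in real time** (100+ FPS) while reaching photorealistic quality. Deploy it on Clore.ai's GPU cloud to reconstruct and explore 3D scenes from your own photos.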

***

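## What is 3D Gaussian Splatting?

Traditional NeRF methods encode the scene implicitly in a neural network and need per-pixel ray marching at render time. Gaussian Splatting takes a fundamentally different approach:

1. **Initialization:** start from a sparse point cloud (produced by COLMAP)
2. **Representation:** expand each point into a 3D Gaussian with position, scale, rotation, opacity, and spherical-harmonics color
3. **Optimization:** render the Gaussians differentiably and optimize them against the training images
4. **Rendering:** project the Gaussians onto the image plane and blend them with alpha compositing (extremely fast)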
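
The blending in step 4 can be illustrated for a single pixel. The sketch below is a plain-Python illustration of front-to-back alpha compositing, not the CUDA rasterizer from the repository; the function name `composite_pixel` and the input format are assumptions made for this example.

```python
def composite_pixel(splats):
    """Blend depth-sorted splats covering one pixel, front to back.

    `splats` is a list of (rgb, alpha) pairs, nearest first, where rgb is a
    3-tuple in [0, 1] and alpha is the splat's opacity at this pixel.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light still passing through
    for rgb, alpha in splats:
        weight = alpha * transmittance
        color = [c + weight * ch for c, ch in zip(color, rgb)]
        transmittance *= 1.0 - alpha
        if transmittance < 1e-4:  # stop once the pixel is effectively opaque
            break
    return color, transmittance


# A half-transparent red splat in front of an opaque blue one.
pixel, remaining = composite_pixel([((1, 0, 0), 0.5), ((0, 0, 1), 1.0)])
print(pixel, remaining)
```

Because each splat only needs the accumulated transmittance of the splats in front of it, a pixel can stop early once it is opaque, which is one reason this style of rendering is fast.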

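**Key advantages over NeRF:**

* Real-time rendering (100+ FPS at 1080p)
* Better reconstruction of fine detail
* Explicit 3D representation (editable, exportable)
* Faster training (30–60 minutes versus several hours)
* Runs on consumer GPUs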

***

## Prerequisites

| Requirement | Minimum       | Recommended     |
| ----------- | ------------- | --------------- |
| GPU VRAM    | 12 GB         | 24 GB           |
| GPU         | RTX 3080 12GB | RTX 4090 / A100 |
| System RAM  | 16 GB         | 32 GB           |
| Storage     | 30 GB         | 60 GB           |
| CUDA        | 11.7+         | 12.1+           |

{% hint style="warning" %}
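Gaussian Splatting is strict about CUDA versions: the CUDA version at run time must match the one the `diff-gaussian-rasterization` extension was compiled against. Building from the provided Dockerfile avoids these compatibility problems.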
{% endhint %}

***

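## Step 1: Rent a GPU on Clore.ai

1. Log in to [clore.ai](https://clore.ai).
2. Click **Marketplace** and filter for VRAM ≥ 16 GB.
3. Pick a server. The RTX 4090 trains fastest among consumer cards, while the RTX 3090 is the cheapest per training run (see the GPU recommendations table).
4. Set the Docker image to your custom image (see Step 2).
5. Open ports `22` (SSH) and `8080` (web viewer).
6. Click **Rent**.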

***

## Step 2: Dockerfile

Build a custom Docker image that contains all dependencies:

```dockerfile
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel

ENV DEBIAN_FRONTEND=noninteractive
ENV TORCH_CUDA_ARCH_LIST="6.0;6.1;7.0;7.5;8.0;8.6;8.9;9.0+PTX"

RUN apt-get update && apt-get install -y \
    git wget curl cmake build-essential \
    libboost-program-options-dev libboost-filesystem-dev \
    libboost-graph-dev libboost-system-dev libboost-test-dev \
    libeigen3-dev libflann-dev libfreeimage-dev \
    libmetis-dev libgoogle-glog-dev libgflags-dev \
    libsqlite3-dev libglew-dev qtbase5-dev libqt5opengl5-dev \
    libcgal-dev libceres-dev \
    ffmpeg libgl1 libglib2.0-0 \
    openssh-server \
    python3-pip python3-dev \
    && rm -rf /var/lib/apt/lists/*

# Install COLMAP
RUN apt-get update && apt-get install -y colmap && rm -rf /var/lib/apt/lists/*

# Configure SSH (change the default root password before exposing the container)
RUN mkdir /var/run/sshd && \
    echo 'root:clore123' | chpasswd && \
    sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config

WORKDIR /workspace

# Clone the original 3DGS repository
RUN git clone https://github.com/graphdeco-inria/gaussian-splatting /workspace/gaussian-splatting \
    --recursive

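# Install Python dependencies.
# Assumption: upstream documents its dependencies in environment.yml, so plyfile
# and tqdm are installed explicitly and requirements.txt is used only if present.
RUN cd /workspace/gaussian-splatting && \
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 && \
    pip install plyfile tqdm && \
    if [ -f requirements.txt ]; then pip install -r requirements.txt; fi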

# Build the CUDA extensions
RUN cd /workspace/gaussian-splatting && \
    pip install submodules/diff-gaussian-rasterization && \
    pip install submodules/simple-knn

# Install web viewer dependencies (plyfile is imported by the viewer script in Step 7)
RUN pip install viser==0.1.29 nerfview==0.0.4 trimesh plyfile

EXPOSE 22 8080

CMD service ssh start && tail -f /dev/null
```

### Build and push

Build the image and push it to your own Docker Hub account (replace `YOUR_DOCKERHUB_USERNAME` with your actual username):

```bash
docker build -t YOUR_DOCKERHUB_USERNAME/gaussian-splatting:latest .
docker push YOUR_DOCKERHUB_USERNAME/gaussian-splatting:latest
```

{% hint style="info" %}
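There is no official prebuilt 3D Gaussian Splatting image on Docker Hub. The official repository [graphdeco-inria/gaussian-splatting](https://github.com/graphdeco-inria/gaussian-splatting) does not publish one, so build it from the Dockerfile above. The image must be built with CUDA architecture flags that match the target GPU.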
{% endhint %}

Use `YOUR_DOCKERHUB_USERNAME/gaussian-splatting:latest` as the image in your Clore.ai configuration.

***

## Step 3: Connect via SSH

```bash
ssh root@<clore-host> -p <assigned-ssh-port>
```

Verify the build:

```bash
cd /workspace/gaussian-splatting
python -c "from diff_gaussian_rasterization import GaussianRasterizationSettings; print('CUDA extension OK')"
```

***

## Step 4: Prepare your dataset

### Option A: Use the Tandt (Tanks and Temples) dataset

A classic benchmark dataset for quick tests:

```bash
mkdir -p /workspace/data && cd /workspace/data

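# Download the small test scenes (upstream publishes Tanks and Temples
# together with Deep Blending as tandt_db.zip)
wget https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/datasets/input/tandt_db.zip
unzip tandt_db.zip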
```

### Option B: Process your own photos

```bash
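# Upload the photos (run this on your local machine)
scp -P <port> -r ./my_photos/ root@<clore-host>:/workspace/data/

# Run the COLMAP processing script that ships with 3DGS (on the server).
# convert.py reads images from <source>/input, so move the photos into
# /workspace/data/my_photos/input/ first.
cd /workspace/gaussian-splatting

python convert.py \
    -s /workspace/data/my_photos \
    --no_gpu   # optional: use if COLMAP's GPU solver causes conflicts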
```

{% hint style="info" %}
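The `convert.py` script runs the full COLMAP pipeline: feature extraction, matching, sparse reconstruction, and undistortion. Depending on the number of images, this takes 5–30 minutes.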
{% endhint %}
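
`convert.py` looks for the source images in an `input/` subfolder of the path passed with `-s`. A minimal sketch of a helper that moves loose image files into that layout follows; the function name `stage_images` and the list of accepted suffixes are assumptions for this example, not part of the 3DGS repository.

```python
import shutil
from pathlib import Path

# Hypothetical whitelist of image suffixes; extend it to match your photos.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def stage_images(scene_dir):
    """Move loose image files in `scene_dir` into `scene_dir/input`.

    Returns the names of the files that were moved, in sorted order.
    """
    scene = Path(scene_dir)
    target = scene / "input"
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(scene.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.move(str(path), str(target / path.name))
            moved.append(path.name)
    return moved
```

Calling `stage_images("/workspace/data/my_photos")` before `convert.py` leaves non-image files where they are and only relocates the photos.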

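### Option C: Process a video

```bash
# Extract frames at 2 fps into the input/ folder that convert.py reads from
mkdir -p /workspace/data/frames/input
ffmpeg -i /workspace/data/my_video.mp4 \
    -vf fps=2 \
    /workspace/data/frames/input/frame_%04d.jpg

# Then run the COLMAP processing on those frames
cd /workspace/gaussian-splatting
python convert.py -s /workspace/data/frames
```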
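
The extraction rate decides how many frames COLMAP has to process. As a rough sketch, the rate needed to hit a target frame count can be computed from the clip length; `extraction_fps` is a hypothetical helper for this example and the file names in the printed command are placeholders.

```python
def extraction_fps(duration_s, target_frames):
    """Return the ffmpeg fps value that yields about `target_frames` frames."""
    if duration_s <= 0 or target_frames <= 0:
        raise ValueError("duration and target frame count must be positive")
    return target_frames / duration_s


# A 90-second clip sampled down to roughly 180 frames.
fps = extraction_fps(duration_s=90, target_frames=180)
print(f"ffmpeg -i my_video.mp4 -vf fps={fps:g} input/frame_%04d.jpg")
```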

***

## Step 5: Train the Gaussian Splatting model

### Standard training

```bash
cd /workspace/gaussian-splatting

python train.py \
    -s /workspace/data/my_photos \
    -m /workspace/output/my_scene \
    --iterations 30000 \
    --eval
```

### Train on the Tandt dataset

```bash
python train.py \
    -s /workspace/data/tandt/truck \
    -m /workspace/output/truck \
    --iterations 30000 \
    --eval
```

### Fast training (quick preview)

```bash
python train.py \
    -s /workspace/data/my_photos \
    -m /workspace/output/my_scene_fast \
    --iterations 7000
```

{% hint style="info" %}
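On an RTX 4090, training to 7,000 iterations takes about 10 minutes and gives a good quality preview. The full 30,000 iterations take about 30–40 minutes and produce the final-quality result.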
{% endhint %}

### Training progress

Monitor the training output. You will see metrics such as:

```
[ITER 1000] Evaluating train: L1 0.04, PSNR 26.12 dB
[ITER 7000] Evaluating train: L1 0.02, PSNR 29.45 dB
[ITER 30000] Evaluating train: L1 0.01, PSNR 32.80 dB
```

A PSNR above 30 dB indicates a high-quality reconstruction.
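
For reference, PSNR is derived from the mean squared error between a rendered image and its ground-truth photo. The sketch below is a plain-Python illustration on flat lists of pixel values in [0, 1]; it is not the repository's implementation, and the function name `psnr` is chosen for this example.

```python
import math


def psnr(rendered, reference, max_value=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(rendered) != len(reference) or not rendered:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel off by 0.1 on a [0, 1] scale gives an MSE of about 0.01.
print(round(psnr([0.5, 0.5, 0.5], [0.6, 0.6, 0.6]), 2))
```

Each additional 10 dB corresponds to a tenfold reduction in mean squared error, so the step from 20 dB to 30 dB is substantial.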

***

## Step 6: Render and visualize

### Render from the trained model

```bash
python render.py \
    -m /workspace/output/my_scene \
    --skip_train
```

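Renders are saved to `/workspace/output/my_scene/test/ours_30000/renders/`. With `--skip_train`, only the held-out test views are rendered, and these exist when the model was trained with `--eval`.

### Create a fly-through video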

```bash
# Convert the rendered frames to a video
ffmpeg -framerate 24 \
    -pattern_type glob \
    -i '/workspace/output/my_scene/test/ours_30000/renders/*.png' \
    -c:v libx264 \
    -pix_fmt yuv420p \
    /workspace/output/flythrough.mp4
```

### Evaluation metrics

```bash
python metrics.py -m /workspace/output/my_scene
```

Example output (values vary by scene):

```
SSIM : 0.8324
PSNR : 32.81
LPIPS: 0.1893
```
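
To compare several runs, the printed values can be collected into a dictionary. This is a small sketch that parses text in the `NAME : value` shape shown above; the helper name `parse_metrics` is an assumption for this example and lines that do not fit the shape are skipped.

```python
def parse_metrics(text):
    """Turn lines such as 'PSNR : 32.81' into a {name: float} dict."""
    metrics = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line
        try:
            metrics[name.strip()] = float(value)
        except ValueError:
            continue  # value is not a number, e.g. a path or a header
    return metrics


sample = """SSIM : 0.8324
PSNR : 32.81
LPIPS: 0.1893"""
scores = parse_metrics(sample)
print(scores["PSNR"] > 30.0)
```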

***

## Step 7: Interactive web viewer

To explore the trained scene interactively:

### Using nerfview/viser

```python
# /workspace/view_splat.py
# Shows the Gaussian centers as a plain gray point cloud (not a full splat render).
import time

import numpy as np
import viser
from plyfile import PlyData

server = viser.ViserServer(host="0.0.0.0", port=8080)
print("Viewer running at http://0.0.0.0:8080")

# Load the PLY file written by training
ply_path = "/workspace/output/my_scene/point_cloud/iteration_30000/point_cloud.ply"
plydata = PlyData.read(ply_path)

xyz = np.stack([
    plydata['vertex']['x'],
    plydata['vertex']['y'],
    plydata['vertex']['z'],
], axis=-1)

# Add the point cloud to the viewer
server.add_point_cloud(
    name="/splat",
    points=xyz,
    colors=np.ones((len(xyz), 3)) * 0.7,
    point_size=0.003,
)

# Keep the process alive so the server keeps serving
while True:
    time.sleep(1.0)
```

```bash
python /workspace/view_splat.py &
```

Then open: `http://<clore-host>:<public-port-8080>`
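
The viewer script colors every point the same gray. Trained 3DGS files store each Gaussian's base color as degree-0 spherical-harmonics coefficients; in the reference implementation these are the `f_dc_0`, `f_dc_1`, and `f_dc_2` vertex fields, which is an assumption worth checking against your own `.ply`. A minimal sketch of the conversion to RGB, with the helper name `sh_dc_to_rgb` chosen for this example:

```python
SH_C0 = 0.28209479177387814  # value of the degree-0 spherical harmonic basis


def sh_dc_to_rgb(f_dc):
    """Convert one Gaussian's degree-0 SH coefficients to an RGB triple in [0, 1]."""
    return tuple(min(1.0, max(0.0, 0.5 + SH_C0 * c)) for c in f_dc)


# Zero maps to mid-gray; out-of-range results are clamped.
print(sh_dc_to_rgb((0.0, 1.0, -3.0)))
```

Applied to every vertex, the resulting triples can replace the constant gray passed as `colors` in the viewer script.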

### Alternative: SuperSplat (browser-based viewer)

Download the `.ply` file and open it in [SuperSplat](https://playcanvas.com/super-splat):

```bash
# Run this on your local machine to download the file
scp -P <port> root@<clore-host>:/workspace/output/my_scene/point_cloud/iteration_30000/point_cloud.ply ./
```

Then drag and drop the `.ply` file into SuperSplat in your browser at `https://playcanvas.com/super-splat`.

***

## Advanced options

### Controlling the number of Gaussians

```bash
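# Densification controls for more detailed scenes. The values shown are the
# defaults; lower --densify_grad_threshold or raise --densify_until_iter to
# create more Gaussians.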
python train.py \
    -s /workspace/data/my_photos \
    -m /workspace/output/my_scene \
    --densify_until_iter 15000 \
    --densify_grad_threshold 0.0002
```

### White background (for objects)

```bash
python train.py \
    -s /workspace/data/my_object \
    -m /workspace/output/my_object \
    --white_background
```

### Large-scale scenes

```bash
# Increase the opacity reset interval for outdoor scenes
python train.py \
    -s /workspace/data/outdoor \
    -m /workspace/output/outdoor \
    --opacity_reset_interval 5000 \
    --iterations 50000
```

***

## Alternative: Gaussian Splatting with gsplat

`gsplat` is a faster, more memory-efficient implementation:

```bash
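pip install gsplat

# The example trainer lives in the gsplat repository, not in the pip package
git clone https://github.com/nerfstudio-project/gsplat /workspace/gsplat
cd /workspace/gsplat
pip install -r examples/requirements.txt

# Train with gsplat (arguments may differ between gsplat versions)
python examples/simple_trainer.py \
    --data_dir /workspace/data/my_photos \
    --result_dir /workspace/gsplat_output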
```

***

## Troubleshooting

### CUDA extension build failure

```
error: no kernel image is available for execution on the device
```

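**Solution:** this error means the extension was compiled without kernels for your GPU. Rebuild it for your specific GPU architecture: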

```bash
export TORCH_CUDA_ARCH_LIST="8.6"  # RTX 3090; use "8.9" for an RTX 4090
cd /workspace/gaussian-splatting
pip install submodules/diff-gaussian-rasterization --force-reinstall
```
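
The right value depends on the GPU in the rented server. The following sketch formats compute capabilities into a `TORCH_CUDA_ARCH_LIST` string and, when PyTorch with CUDA is available, reads them from the visible devices. The helper names `arch_list` and `detect_capabilities` are assumptions for this example.

```python
def arch_list(capabilities):
    """Format (major, minor) compute capabilities as a TORCH_CUDA_ARCH_LIST value."""
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


def detect_capabilities():
    """Return the compute capability of every visible GPU, or [] without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_capability(i) for i in range(torch.cuda.device_count())]


found = detect_capabilities()
print(arch_list(found) if found else "no CUDA device visible")
```

Export the printed value as `TORCH_CUDA_ARCH_LIST` before reinstalling the submodules.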

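### COLMAP fails to reconstruct

**Solutions:**

* Make sure neighboring images overlap by ≥ 50%
* Use more photos (100+ recommended)
* For video frames, try sequential matching. `convert.py` does not appear to expose a matcher option, so this means running COLMAP's feature extraction, `sequential_matcher`, and mapper yourself and then calling `convert.py` with `--skip_matching`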

### Out of memory during training

```bash
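# Create fewer Gaussians and keep the training images in system RAM.
# (The reference train.py has no option that caps the Gaussian count directly;
# these options limit densification instead.)
python train.py \
    -s /workspace/data/my_photos \
    -m /workspace/output/my_scene \
    --densify_grad_threshold 0.0004 \
    --densify_until_iter 10000 \
    --data_device cpu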
```

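### Floaters in the scene

Floating artifacts left over from Gaussian initialization:

* Increase `--densify_grad_threshold` so that densification is more selective
* Lower `--opacity_reset_interval` so that opacities are reset, and low-opacity Gaussians pruned, more often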

***

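## GPU recommendations for Clore.ai

Gaussian Splatting training is compute-intensive and launches CUDA kernels constantly. VRAM sets the upper limit on scene complexity (the number of Gaussians), and compute throughput sets the training speed.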
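
To see why the Gaussian count drives memory use, count the parameters each Gaussian carries: position, scale, a rotation quaternion, opacity, and spherical-harmonics color coefficients. The sketch below assumes SH degree 3 and 32-bit floats. It gives only the raw parameter size, which is a lower bound, since training also holds gradients, optimizer state, and the training images.

```python
def floats_per_gaussian(sh_degree=3):
    """Number of float parameters stored for one Gaussian."""
    position, scale, rotation, opacity = 3, 3, 4, 1  # rotation is a quaternion
    sh_coefficients = 3 * (sh_degree + 1) ** 2  # RGB x SH basis functions
    return position + scale + rotation + opacity + sh_coefficients


def parameter_gib(num_gaussians, sh_degree=3, bytes_per_float=4):
    """Size of the raw parameters in GiB (a lower bound, not training VRAM)."""
    return num_gaussians * floats_per_gaussian(sh_degree) * bytes_per_float / 2**30


print(floats_per_gaussian())
print(round(parameter_gib(6_000_000), 2))
```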

| GPU           | VRAM  | Clore.ai price | 30K-iteration training | Max Gaussians   |
| ------------- | ----- | -------------- | ---------------------- | --------------- |
| RTX 3090      | 24 GB | \~$0.12/hr     | \~45–55 min            | \~6M            |
| RTX 4090      | 24 GB | \~$0.70/hr     | \~30–35 min            | \~6M            |
| A100 40GB     | 40 GB | \~$1.20/hr     | \~12–18 min            | \~10M+          |
| RTX 3080 12GB | 12 GB | \~$0.08/hr     | \~70 min               | \~3M (limited)  |

{% hint style="info" %}
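**At about $0.12/hour, the RTX 3090 is the best-value choice for Gaussian Splatting.** A full 30K-iteration training run costs roughly $0.09–0.11 of GPU time, so training several scenes in one session adds little cost.

For quick experiments, train to 7,000 iterations first (about 15 minutes on an RTX 3090, roughly $0.03) and check the quality in the web viewer. Run the full 30K iterations only for the final output.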
{% endhint %}
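
The per-run costs follow directly from the hourly price and the training time. A short sketch using the approximate figures from the table above (the helper name `run_cost` is chosen for this example, and actual marketplace prices will vary):

```python
def run_cost(price_per_hour, minutes):
    """GPU rental cost in dollars for a run of the given length."""
    return price_per_hour * minutes / 60.0


# Approximate prices and 30K-iteration times from the table above.
options = {
    "RTX 3090": (0.12, (45, 55)),
    "RTX 4090": (0.70, (30, 35)),
    "A100 40GB": (1.20, (12, 18)),
}
for gpu, (price, (fast, slow)) in options.items():
    print(f"{gpu}: ${run_cost(price, fast):.2f}-${run_cost(price, slow):.2f}")
```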

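**COLMAP preprocessing note:** COLMAP (Structure-from-Motion) can use both the CPU and the GPU, but most of the work runs on the CPU. For scenes with fewer than 200 images, the CPU of most Clore.ai servers is sufficient. For datasets with 500+ images, choose a server with 16 or more CPU cores.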

***

## Useful resources

* [3D Gaussian Splatting GitHub](https://github.com/graphdeco-inria/gaussian-splatting)
* [Original paper (SIGGRAPH 2023)](https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/)
* [gsplat: fast implementation](https://github.com/nerfstudio-project/gsplat)
* [SuperSplat: browser viewer](https://playcanvas.com/super-splat)
* [Gaussian Splatting community (Reddit)](https://www.reddit.com/r/gaussiansplatting/)
* [Awesome Gaussian Splatting collection](https://github.com/MrNeRF/awesome-3D-gaussian-splatting)

