# InstantID

Generate images of any facial identity from a single reference photo.

{% hint style="success" %}
All examples can be run on GPU servers rented from the [CLORE.AI marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Go to the [CLORE.AI marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-demand** (fixed rate) or **Spot** (bid pricing)
4. Configure your order:
   * Select a Docker image
   * Set ports (TCP for SSH, HTTP for web interfaces)
   * Add environment variables if needed
   * Enter the startup command
5. Choose a payment method: **CLORE**, **BTC**, or **USDT/USDC**
6. Create the order and wait for deployment

### Accessing Your Server

* Find connection details under **My Orders**
* Web interface: use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## What Is InstantID?

InstantID preserves facial identity:

* Works with any reference face
* Zero-shot — no training required
* Works with any style or prompt
* Stronger identity preservation than per-subject LoRA training

## Requirements

| Mode         | VRAM  | Recommended GPU |
| ------------ | ----- | --------------- |
| Basic        | 12GB  | RTX 4080        |
| High quality | 16GB  | RTX 4080        |
| Pose-guided  | 16GB+ | RTX 4090        |
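The table above can be turned into a small pre-flight check. `supported_modes` is a hypothetical helper for illustration (not part of InstantID); on the server you would get the actual VRAM from `torch.cuda.get_device_properties(0).total_memory`.

```python
def supported_modes(vram_gb: float) -> list[str]:
    """Return the InstantID modes runnable at a given VRAM budget (thresholds from the table above)."""
    modes = []
    if vram_gb >= 12:
        modes.append("basic")
    if vram_gb >= 16:
        modes += ["high-quality", "pose-guided"]
    return modes

print(supported_modes(16))  # → ['basic', 'high-quality', 'pose-guided']
```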

## Quick Deployment

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate opencv-python insightface onnxruntime-gpu gradio && \
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints && \
python instantid_app.py
```

## Accessing Your Service

After deployment, find your `http_pub` URL under **My Orders**:

1. Go to the **My Orders** page
2. Click on your order
3. Look for the `http_pub` URL (e.g., `abc123.clorecloud.net`)

In the examples below, use `https://YOUR_HTTP_PUB_URL` instead of `localhost`.

## Installation

```bash
pip install diffusers transformers accelerate
pip install opencv-python insightface onnxruntime-gpu
pip install huggingface_hub

# Download the model checkpoints
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints
```
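Before launching anything, it is worth confirming the packages actually import. `missing_packages` is a hypothetical helper, not part of any of these libraries; note that `opencv-python` imports as `cv2`.

```python
import importlib.util

def missing_packages(names):
    """Return the packages from `names` that are not importable."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["diffusers", "transformers", "accelerate", "cv2", "insightface"]
print(missing_packages(required))  # an empty list means everything is installed
```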

## Basic Usage

```python
import torch
import cv2
import numpy as np
from PIL import Image
from diffusers import StableDiffusionXLPipeline, ControlNetModel, DDIMScheduler
from insightface.app import FaceAnalysis

# Initialize the face analyzer
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# Load the pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16
).to("cuda")

# Load the InstantID ControlNet component
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)

# Load the IP-Adapter for faces
pipe.load_ip_adapter(
    "./checkpoints",
    subfolder="",
    weight_name="ip-adapter.bin"
)

# Process the reference face
face_image = cv2.imread("reference_face.jpg")
faces = app.get(face_image)
face_emb = faces[0].normed_embedding

# Generate with the facial identity
image = pipe(
    prompt="portrait of a person as an astronaut, space background",
    negative_prompt="ugly, blurry, low quality",
    ip_adapter_image_embeds=[torch.tensor(face_emb).unsqueeze(0)],
    num_inference_steps=30,  # increase steps for more stable results
    guidance_scale=7.5
).images[0]

image.save("output.png")
```
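The `normed_embedding` returned by insightface is a 512-dimensional, L2-normalized vector. The sketch below uses a random stand-in vector to show the sanity checks worth running before passing an embedding to the pipeline:

```python
import numpy as np

# Stand-in for faces[0].normed_embedding: a 512-d unit vector
rng = np.random.default_rng(0)
emb = rng.normal(size=512).astype(np.float32)
emb /= np.linalg.norm(emb)

# Checks worth doing before handing the embedding to the pipeline
assert emb.shape == (512,)
assert abs(float(np.linalg.norm(emb)) - 1.0) < 1e-5

# The pipeline expects a batch dimension, i.e. shape (1, 512)
batched = emb[None, :]
print(batched.shape)  # (1, 512)
```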

## Using the InstantID Pipeline

```python
# StableDiffusionXLInstantIDPipeline ships with the InstantID repository
# (pipeline_stable_diffusion_xl_instantid.py), not with diffusers itself
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline
from diffusers import ControlNetModel
from insightface.app import FaceAnalysis
import torch
import cv2

# Load the face analyzer
app = FaceAnalysis(name='antelopev2', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0)

# Load the pipeline with the InstantID ControlNet
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

# Get the face embedding and keypoints
face_image = cv2.imread("face.jpg")
face_info = app.get(face_image)[0]

face_emb = face_info.normed_embedding
face_kps = face_info.kps

# Generate
image = pipe(
    prompt="watercolor portrait painting, artistic",
    face_emb=face_emb,
    face_kps=face_kps,
    num_inference_steps=30
).images[0]

image.save("portrait.png")
```

## Style Examples

### Professional Headshot

```python
prompt = "professional corporate headshot, studio lighting, gray background, business attire"
negative = "cartoon, anime, illustration, blurry"
```

### Artistic Portrait

```python
prompt = "oil painting portrait in the style of Rembrandt, dramatic lighting, museum quality"
negative = "photo, realistic, modern"
```

### Fantasy Character

```python
prompt = "fantasy elf character, pointed ears, magical forest background, ethereal lighting"
negative = "human ears, modern clothing, realistic"
```

### Anime Style

```python
prompt = "anime character portrait, studio ghibli style, detailed, beautiful"
negative = "realistic, photo, 3d render"
```

## Pose-Guided Control

```python
from diffusers.utils import load_image

# Load the pose reference
pose_image = load_image("pose_reference.jpg")

# Generate with face and pose together
image = pipe(
    prompt="person in action pose, dynamic, high quality",
    face_emb=face_emb,
    face_kps=face_kps,
    image=pose_image,  # pose reference
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30
).images[0]
```

## Gradio Interface

```python
import gradio as gr
import torch
import cv2
import numpy as np
from diffusers import ControlNetModel
# StableDiffusionXLInstantIDPipeline ships with the InstantID repository
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline
from insightface.app import FaceAnalysis

app = FaceAnalysis(name='antelopev2', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0)

controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

def generate(face_image, prompt, negative_prompt, strength, steps):
    # Convert to cv2 format
    face_cv = cv2.cvtColor(np.array(face_image), cv2.COLOR_RGB2BGR)

    # Get face info
    faces = app.get(face_cv)
    if len(faces) == 0:
        return None, "No face detected!"

    face_info = faces[0]
    face_emb = face_info.normed_embedding
    face_kps = face_info.kps

    # Generate
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        face_emb=face_emb,
        face_kps=face_kps,
        ip_adapter_scale=strength,
        num_inference_steps=steps
    ).images[0]

    return image, "Success!"

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Image(type="pil", label="Reference Face"),
        gr.Textbox(label="Prompt", value="professional portrait"),
        gr.Textbox(label="Negative Prompt", value="ugly, blurry"),
        gr.Slider(0.1, 1.0, value=0.8, label="Identity Strength"),
        gr.Slider(10, 50, value=30, step=1, label="Steps")
    ],
    outputs=[
        gr.Image(label="Generated Image"),
        gr.Textbox(label="Status")
    ],
    title="InstantID - Identity Preserving Generation"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Batch Generation

```python
# Batch processing — assumes `app` and `pipe` from the sections above
import cv2
from pathlib import Path

def batch_generate(face_image_path, prompts, output_dir):
    # Load the face
    face_cv = cv2.imread(face_image_path)
    face_info = app.get(face_cv)[0]
    face_emb = face_info.normed_embedding
    face_kps = face_info.kps

    Path(output_dir).mkdir(parents=True, exist_ok=True)

    for i, prompt in enumerate(prompts):
        print(f"Generating {i+1}/{len(prompts)}: {prompt[:50]}...")

        image = pipe(
            prompt=prompt,
            negative_prompt="ugly, blurry, deformed",
            face_emb=face_emb,
            face_kps=face_kps,
            num_inference_steps=30
        ).images[0]

        image.save(f"{output_dir}/output_{i:03d}.png")

# Usage
prompts = [
    "astronaut in space suit, Earth background",
    "medieval knight in armor",
    "scientist in laboratory",
    "chef in restaurant kitchen",
    "athlete on sports field"
]

batch_generate("my_face.jpg", prompts, "./outputs")
```

## Identity Strength Control

```python
# Low strength - more style, less identity
image_stylized = pipe(
    prompt=prompt,
    face_emb=face_emb,
    ip_adapter_scale=0.4,  # low
    num_inference_steps=30
).images[0]

# High strength - more identity, less style
image_faithful = pipe(
    prompt=prompt,
    face_emb=face_emb,
    ip_adapter_scale=0.9,  # high
    num_inference_steps=30
).images[0]
```
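A practical way to find the right trade-off is to sweep `ip_adapter_scale` and compare the outputs side by side. The helper below only plans the sweep; the filename pattern and the `plan_scale_sweep` name are made up for illustration, and each planned scale would be passed to the pipe call shown above.

```python
def plan_scale_sweep(start=0.3, stop=0.9, steps=4):
    """Return (scale, filename) pairs for an identity-strength sweep."""
    step = (stop - start) / (steps - 1)
    plan = []
    for i in range(steps):
        scale = round(start + i * step, 2)
        plan.append((scale, f"sweep_scale_{scale:.2f}.png"))
    return plan

for scale, name in plan_scale_sweep():
    print(scale, name)  # scales 0.3, 0.5, 0.7, 0.9
```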

## Memory Optimization

```python
# Enable optimizations
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

# Or use sequential offload for very low VRAM
pipe.enable_sequential_cpu_offload()
```

## Performance

| Mode        | Resolution | Time per image |
| ----------- | ---------- | -------------- |
| Basic       | 512x512    | \~8s           |
| Pose-guided | 512x512    | \~12s          |
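At roughly 8 s per 512x512 image in basic mode, throughput works out as follows (simple arithmetic, not a benchmark):

```python
seconds_per_image = 8  # basic mode at 512x512
images_per_hour = 3600 // seconds_per_image
print(images_per_hour)  # → 450
```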

## Troubleshooting

### No Face Detected

* Make sure the face is clearly visible
* Use a well-lit reference image
* The face should be front-facing

### Identity Not Preserved

* Increase `ip_adapter_scale`
* Use a sharper reference photo
* Avoid extreme angles

### Style Not Applied

* Decrease `ip_adapter_scale`
* Use a more descriptive prompt
* Increase `guidance_scale`
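For reproducible results across runs, fix the random seed: with diffusers pipelines this is done by passing `generator=torch.Generator("cuda").manual_seed(seed)` to the pipe call. The stdlib sketch below (with a made-up `fake_latents` stand-in) only illustrates the principle that the same seed yields the same output.

```python
import random

def fake_latents(seed: int, n: int = 4):
    """Stand-in for diffusion latents: same seed, same numbers."""
    rng = random.Random(seed)
    return [round(rng.random(), 6) for _ in range(n)]

assert fake_latents(42) == fake_latents(42)  # fixed seed -> identical output
assert fake_latents(42) != fake_latents(43)  # different seed -> different output
```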

## Cost Estimate

Typical rates on the CLORE.AI marketplace (as of 2024):

| GPU       | Hourly rate | Daily rate | 4-hour session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

* Prices vary by provider and demand. Check the [CLORE.AI marketplace](https://clore.ai/marketplace) for current rates.
* Pay with **CLORE** to save on fees.
* Use **spot** pricing for flexible workloads (typically 30-50% cheaper).
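Combining a per-image generation time with an hourly rate gives a rough job cost. The figures below (8 s per image, $0.25/hour) are illustrative examples, not quotes:

```python
seconds_per_image = 8
rate_per_hour = 0.25  # example hourly rate in USD

def job_cost(n_images: int) -> float:
    """Approximate USD cost of generating n_images at the rates above."""
    hours = n_images * seconds_per_image / 3600
    return round(hours * rate_per_hour, 2)

print(job_cost(1000))  # → 0.56
```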

## Related Guides

* [IP-Adapter](https://docs.clore.ai/guides/guides_v2-zh/ren-lian-yu-shen-fen/ip-adapter) - image prompting
* Stable Diffusion WebUI - InstantID extension
* [ControlNet](https://docs.clore.ai/guides/guides_v2-zh/tu-xiang-chu-li/controlnet-advanced) - pose control

