# SDXL Turbo and LCM

Generate images in 1-4 steps with SDXL Turbo and Latent Consistency Models (LCM) on CLORE.AI GPUs.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Why SDXL Turbo / LCM?

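* **Real-time speed** - generate images in 1-4 steps instead of 30-50
* **Comparable quality** - close to full SDXL with roughly 10x fewer steps
* **Interactive** - fast enough for real-time applications
* **Low VRAM** - efficient memory use
* **LoRA compatible** - works with existing SDXL LoRAs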

## Model comparison

| Model           | Steps | Speed     | Quality   | VRAM |
| --------------- | ----- | --------- | --------- | ---- |
| SDXL Turbo      | 1-4   | Fastest   | Good      | 8GB  |
| SDXL Lightning  | 2-8   | Very fast | Great     | 8GB  |
| LCM-SDXL        | 4-8   | Fast      | Great     | 8GB  |
| LCM-LoRA + SDXL | 4-8   | Fast      | Excellent | 10GB |
| SD Turbo (2.1)  | 1-4   | Fastest   | Good      | 4GB  |

## Quick deploy on CLORE.AI

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate gradio && \
python -c "
import gradio as gr
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    'stabilityai/sdxl-turbo',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

def generate(prompt, steps, seed):
    generator = torch.Generator('cuda').manual_seed(int(seed)) if seed > 0 else None
    image = pipe(prompt, num_inference_steps=int(steps), guidance_scale=0.0, generator=generator).images[0]
    return image

gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label='Prompt'),
        gr.Slider(1, 4, value=1, step=1, label='Steps'),
        gr.Number(value=-1, label='Seed')
    ],
    outputs=gr.Image(),
    title='SDXL Turbo - Real-time generation'
).launch(server_name='0.0.0.0', server_port=7860)
"
```
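The `Seed` field in the interface above falls back to a random generator when the value is not positive. A minimal sketch of making that choice explicit, so that a random run can still be logged and reproduced later, might look like this. `resolve_seed` and `MAX_SEED` are hypothetical names, not part of diffusers or Gradio, and unlike the interface above this helper accepts `0` as a valid seed.

```python
import random

MAX_SEED = 2**32 - 1  # assumed upper bound, chosen for illustration


def resolve_seed(seed):
    """Return a usable integer seed; draw a random one when seed is negative."""
    seed = int(seed)  # a numeric form field may deliver a float such as 42.0
    if seed < 0:
        return random.randint(0, MAX_SEED)
    return seed


print(resolve_seed(42.0))     # 42
print(resolve_seed(-1) >= 0)  # True
```

The returned value can be passed to `torch.Generator('cuda').manual_seed(...)` and shown next to the image so the same result can be regenerated.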

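## Accessing your service

After deployment, find your `http_pub` URL under **My Orders**:

1. Go to the **My Orders** page
2. Click your order
3. Find the `http_pub` URL (for example, `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in the examples below.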
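As a small illustration of that substitution, the sketch below builds an HTTPS URL from whatever was copied from the order page. `service_url` is a hypothetical helper, and the hostname is the placeholder from the list above, not a real endpoint.

```python
def service_url(http_pub, path="/"):
    """Build an HTTPS URL from the http_pub value shown on the order page."""
    host = http_pub.strip()
    # Drop a scheme if one was pasted along with the hostname.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    return f"https://{host}/{path.lstrip('/')}"


print(service_url("abc123.clorecloud.net"))  # https://abc123.clorecloud.net/
```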

## Hardware requirements

| Model          | Minimum GPU   | Recommended |
| -------------- | ------------- | ----------- |
| SD Turbo       | RTX 3060 8GB  | RTX 3070    |
| SDXL Turbo     | RTX 3070 8GB  | RTX 3080    |
| SDXL Lightning | RTX 3070 8GB  | -           |
| LCM-SDXL       | RTX 3080 10GB | -           |

## Installation

```bash
pip install diffusers transformers accelerate torch
```

## SDXL Turbo

### Basic usage

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Generate in a single step
image = pipe(
    prompt="A cinematic shot of a baby raccoon wearing an intricate Italian priest robe",
    num_inference_steps=1,
    guidance_scale=0.0  # Turbo does not use CFG
).images[0]

image.save("raccoon.png")
```

### Optimal settings

```python
# 1 step - fastest, good quality
image = pipe(prompt, num_inference_steps=1, guidance_scale=0.0).images[0]

# 2 steps - better detail
image = pipe(prompt, num_inference_steps=2, guidance_scale=0.0).images[0]

# 4 steps - best quality Turbo can deliver
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
```
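To see how these step counts trade speed for quality on a particular GPU, a rough timing loop can help. The sketch below is an assumption-laden helper rather than a rigorous benchmark: `time_steps` is a hypothetical name, and the stand-in workload only sleeps so the example runs without a GPU.

```python
import time


def time_steps(generate, step_counts, warmup=1):
    """Return {steps: seconds} for a callable generate(steps)."""
    for _ in range(warmup):
        generate(step_counts[0])  # the first call often includes one-off setup cost
    timings = {}
    for steps in step_counts:
        start = time.perf_counter()
        generate(steps)
        timings[steps] = time.perf_counter() - start
    return timings


# Stand-in workload; with the pipeline above, pass instead:
#   lambda n: pipe(prompt, num_inference_steps=n, guidance_scale=0.0)
fake_generate = lambda steps: time.sleep(0.01 * steps)

for steps, seconds in time_steps(fake_generate, [1, 2, 4]).items():
    print(f"{steps} step(s): {seconds:.3f}s")
```

A single run per step count is noisy; averaging several runs gives a steadier picture.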

## SDXL Lightning

### 2-step generation

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_2step_unet.safetensors"

# Load the base model
pipe = StableDiffusionXLPipeline.from_pretrained(
    base,
    torch_dtype=torch.float16,
    variant="fp16"
).to("cuda")

# Load the Lightning UNet weights (.safetensors files need load_file, not torch.load)
pipe.unet.load_state_dict(
    load_file(hf_hub_download(repo, ckpt), device="cuda")
)

# Configure the scheduler with "trailing" timestep spacing
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    timestep_spacing="trailing"
)

# Generate in 2 steps (matches the 2-step checkpoint)
image = pipe(
    "A girl smiling in a garden",
    num_inference_steps=2,
    guidance_scale=0.0
).images[0]

image.save("lightning.png")
```

### 4 steps (higher quality)

```python
ckpt = "sdxl_lightning_4step_unet.safetensors"
# ... same setup as above, loading this checkpoint instead ...

image = pipe(
    prompt,
    num_inference_steps=4,
    guidance_scale=0.0
).images[0]
```
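Because each Lightning checkpoint is tied to a step count, it can be convenient to derive the filename from the number of steps so the two never drift apart. The following is a sketch only: `lightning_ckpt` is a hypothetical helper, the 2-step and 4-step filenames come from the examples above, and the 8-step name is assumed to follow the same pattern.

```python
LIGHTNING_STEPS = (2, 4, 8)  # 8 is an assumption based on the naming pattern


def lightning_ckpt(steps):
    """Return the UNet checkpoint filename for a given step count."""
    if steps not in LIGHTNING_STEPS:
        raise ValueError(f"expected one of {LIGHTNING_STEPS}, got {steps}")
    return f"sdxl_lightning_{steps}step_unet.safetensors"


print(lightning_ckpt(4))  # sdxl_lightning_4step_unet.safetensors
```

Pass the same `steps` value as `num_inference_steps` when calling the pipeline.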

## LCM-LoRA

Works with any SDXL model for fast generation:

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Load LCM-LoRA
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

# Switch to the LCM scheduler
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Generate in 4 steps
image = pipe(
    "An astronaut in the jungle, cold color palette, muted colors, detailed, 8k",
    num_inference_steps=4,
    guidance_scale=1.0  # LCM uses a low CFG value
).images[0]

image.save("lcm_lora.png")
```

### Using custom LoRAs

```python
# Load LCM-LoRA and a style LoRA on top of the base pipeline
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl", adapter_name="lcm")
pipe.load_lora_weights("your-style-lora", adapter_name="style")

# Combine the adapters
pipe.set_adapters(["lcm", "style"], adapter_weights=[1.0, 0.8])

image = pipe(prompt, num_inference_steps=4, guidance_scale=1.5).images[0]
```

## SD Turbo (SD 2.1)

For lower VRAM requirements:

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sd-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

image = pipe(
    "A photo of a cat",
    num_inference_steps=1,
    guidance_scale=0.0
).images[0]
```

## Image-to-Image

### SDXL Turbo Img2Img

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

init_image = load_image("input.jpg").resize((512, 512))

image = pipe(
    prompt="cat wizard, gandalf, lord of the rings, detailed, fantasy",
    image=init_image,
    num_inference_steps=2,
    strength=0.5,  # num_inference_steps * strength must be at least 1
    guidance_scale=0.0
).images[0]
```
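In img2img, `strength` scales how many of the requested steps are actually run, which matters when the step count is already this low. The sketch below approximates that relationship as I understand diffusers to compute it; treat it as an illustration, not the library's exact code. `effective_steps` and `min_steps_for` are hypothetical helpers.

```python
def effective_steps(num_inference_steps, strength):
    """Approximate number of denoising steps an img2img call runs."""
    return min(int(num_inference_steps * strength), num_inference_steps)


def min_steps_for(strength):
    """Smallest num_inference_steps that still yields at least one step."""
    if not 0 < strength <= 1:
        raise ValueError("strength must be in (0, 1]")
    steps = 1
    while effective_steps(steps, strength) < 1:
        steps += 1
    return steps


print(effective_steps(2, 0.5))  # 1
print(effective_steps(1, 0.5))  # 0 -> nothing left to denoise
print(min_steps_for(0.3))       # 4
```

With `strength=0.5`, two requested steps give one real denoising step, which is why the example above uses `num_inference_steps=2`.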

## Batch generation

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16
).to("cuda")

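prompts = [
    "Sunset over the mountains",
    "A futuristic city at night",
    "A cute robot in a garden",
    "An ancient temple in the fog"
]

# Generate the whole batch in one call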
images = pipe(
    prompts,
    num_inference_steps=1,
    guidance_scale=0.0
).images

for i, img in enumerate(images):
    img.save(f"batch_{i}.png")
```
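Passing every prompt in one call keeps the GPU busy, but a long prompt list may not fit in VRAM. One option is to split the list into fixed-size batches, as in the sketch below. `chunked` is a hypothetical helper and the batch size of 2 is arbitrary.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


prompts = ["sunset", "city", "robot", "temple", "forest"]
for batch in chunked(prompts, 2):
    print(batch)
    # With the pipeline above:
    #   images = pipe(batch, num_inference_steps=1, guidance_scale=0.0).images
```

Lower the batch size if generation runs out of memory, and raise it while it still fits.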

## Real-time streaming

```python
import gradio as gr
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16
).to("cuda")

def generate_realtime(prompt):
    if not prompt:
        return None
    image = pipe(
        prompt,
        num_inference_steps=1,
        guidance_scale=0.0,
        width=512,
        height=512
    ).images[0]
    return image

demo = gr.Interface(
    fn=generate_realtime,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated"),
    live=True,  # update while typing
    title="Real-time SDXL Turbo"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Performance comparison

| Model          | Steps | Resolution | RTX 3090 | RTX 4090 | A100  |
| -------------- | ----- | ---------- | -------- | -------- | ----- |
| SDXL (base)    | 30    | 1024x1024  | 8s       | 5s       | -     |
| SDXL Turbo     | 1     | 512x512    | 0.3s     | 0.2s     | 0.15s |
| SDXL Turbo     | 4     | 512x512    | 0.8s     | 0.5s     | 0.4s  |
| SDXL Lightning | 2     | 1024x1024  | 0.8s     | 0.5s     | 0.4s  |
| SDXL Lightning | 4     | 1024x1024  | 1.2s     | 0.8s     | 0.6s  |
| LCM-SDXL       | 4     | 1024x1024  | 1.5s     | 1.0s     | 0.7s  |
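## Quality comparison

| Aspect         | SDXL 30 steps | Turbo 4 steps | Lightning 4 steps |
| -------------- | ------------- | ------------- | ----------------- |
| Detail         | Excellent     | Good          | Great             |
| Text rendering | Good          | Poor          | Poor              |
| Faces          | Great         | Good          | Good              |
| Consistency    | Excellent     | Good          | Great             |
| Style variety  | Excellent     | Good          | Great             |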

## When to use which

| Use case           | Recommended    | Steps |
| ------------------ | -------------- | ----- |
| Real-time preview  | SDXL Turbo     | 1     |
| Interactive apps   | SDXL Turbo     | 1-2   |
| Fast iteration     | SDXL Lightning | 2-4   |
| With custom LoRAs  | LCM-LoRA       | 4-8   |
| Highest quality    | SDXL Lightning | 8     |
| Low VRAM           | SD Turbo       | 1-2   |

## Cost estimate

Typical CLORE.AI marketplace rates:

| GPU           | Hourly rate | Images per hour (1 step) |
| ------------- | ----------- | ------------------------ |
| RTX 3060 12GB | \~$0.03     | \~3,000                  |
| RTX 3090 24GB | \~$0.06     | \~8,000                  |
| RTX 4090 24GB | \~$0.10     | \~12,000                 |
| A100 40GB     | \~$0.17     | \~15,000                 |

*Prices vary. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
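The table can be turned into a cost per 1,000 images, which is often easier to compare across GPUs. The sketch below assumes the price column is an hourly rate in USD and uses the approximate figures from three of the rows above; `cost_per_thousand` is a hypothetical helper.

```python
def cost_per_thousand(hourly_rate_usd, images_per_hour):
    """Cost in USD of generating 1,000 images at a given rate and throughput."""
    return hourly_rate_usd / images_per_hour * 1000


# (hourly rate in USD, images per hour at 1 step), approximate values from the table
rates = {
    "RTX 3060 12GB": (0.03, 3000),
    "RTX 3090 24GB": (0.06, 8000),
    "RTX 4090 24GB": (0.10, 12000),
}

for gpu, (rate, per_hour) in rates.items():
    print(f"{gpu}: ${cost_per_thousand(rate, per_hour):.4f} per 1,000 images")
```

On these numbers the faster cards come out slightly cheaper per image despite the higher hourly rate.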

## Troubleshooting

### Blurry results

* SDXL Turbo natively outputs 512x512
* Use SDXL Lightning for 1024x1024
* Add an upscaling step as post-processing

### Wrong guidance\_scale

```python
# SDXL Turbo: always use 0.0
image = pipe(prompt, guidance_scale=0.0).images[0]

# LCM: use 1.0-2.0
image = pipe(prompt, guidance_scale=1.5).images[0]

# Lightning: use 0.0
image = pipe(prompt, guidance_scale=0.0).images[0]
```
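One way to avoid mixing these values up is to keep the per-family defaults in one place and unpack them into the pipeline call. This is a sketch under assumptions: the `guidance_scale` values come from the snippet above, the default step counts are illustrative picks from the ranges in this guide, and `sampling_kwargs` is a hypothetical helper.

```python
SAMPLING_DEFAULTS = {
    "turbo": {"guidance_scale": 0.0, "num_inference_steps": 1},
    "lightning": {"guidance_scale": 0.0, "num_inference_steps": 4},
    "lcm": {"guidance_scale": 1.5, "num_inference_steps": 4},
}


def sampling_kwargs(family, **overrides):
    """Return pipeline keyword arguments for a model family, with overrides applied."""
    if family not in SAMPLING_DEFAULTS:
        raise ValueError(f"unknown model family: {family!r}")
    kwargs = dict(SAMPLING_DEFAULTS[family])  # copy so the defaults stay untouched
    kwargs.update(overrides)
    return kwargs


print(sampling_kwargs("lcm", num_inference_steps=8))
# {'guidance_scale': 1.5, 'num_inference_steps': 8}
```

Usage with a loaded pipeline would then be `pipe(prompt, **sampling_kwargs("turbo")).images[0]`.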

### LoRA not working

```python
# LCM-LoRA requires the LCMScheduler
from diffusers import LCMScheduler

pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
```

### Out of memory

```python
# Enable memory optimizations
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

# Or switch to a smaller model:
# use SD Turbo instead of SDXL Turbo
```

## Next steps

* [FLUX.1](https://docs.clore.ai/guides/guides_v2-zh/tu-xiang-sheng-cheng/flux) - highest-quality generation
* [Stable Diffusion WebUI](https://docs.clore.ai/guides/guides_v2-zh/tu-xiang-sheng-cheng/stable-diffusion-webui) - full-featured user interface
* [ComfyUI](https://docs.clore.ai/guides/guides_v2-zh/tu-xiang-sheng-cheng/comfyui) - node-based workflows
* [Real-ESRGAN](https://docs.clore.ai/guides/guides_v2-zh/tu-xiang-chu-li/real-esrgan-upscaling) - upscale the results

