# SDXL Turbo & LCM

Generate images in 1-4 steps with SDXL Turbo and Latent Consistency Models on CLORE.AI GPUs.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Why SDXL Turbo / LCM?

* **Real-time speed** - Generate images in 1-4 steps vs 30-50
* **Near-SDXL quality** - Output comparable to full SDXL with \~10x fewer steps
* **Interactive** - Fast enough for real-time applications
* **Low VRAM** - Efficient memory usage
* **LoRA compatible** - Use with existing SDXL LoRAs

## Model Variants

| Model           | Steps | Speed     | Quality   | VRAM |
| --------------- | ----- | --------- | --------- | ---- |
| SDXL Turbo      | 1-4   | Fastest   | Good      | 8GB  |
| SDXL Lightning  | 2-8   | Very Fast | Great     | 8GB  |
| LCM-SDXL        | 4-8   | Fast      | Great     | 8GB  |
| LCM-LoRA + SDXL | 4-8   | Fast      | Excellent | 10GB |
| SD Turbo (1.5)  | 1-4   | Fastest   | Good      | 4GB  |

## Quick Deploy on CLORE.AI

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate gradio && \
python -c "
import gradio as gr
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    'stabilityai/sdxl-turbo',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

def generate(prompt, steps, seed):
    generator = torch.Generator('cuda').manual_seed(int(seed)) if seed > 0 else None
    image = pipe(prompt, num_inference_steps=steps, guidance_scale=0.0, generator=generator).images[0]
    return image

gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label='Prompt'),
        gr.Slider(1, 4, value=1, step=1, label='Steps'),
        gr.Number(value=-1, label='Seed')
    ],
    outputs=gr.Image(),
    title='SDXL Turbo - Real-time Generation'
).launch(server_name='0.0.0.0', server_port=7860)
"
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.

## Hardware Requirements

| Model          | Minimum GPU   | Recommended |
| -------------- | ------------- | ----------- |
| SD Turbo       | RTX 3060 8GB  | RTX 3070    |
| SDXL Turbo     | RTX 3070 8GB  | RTX 3080    |
| SDXL Lightning | RTX 3070 8GB  | RTX 3090    |
| LCM-SDXL       | RTX 3080 10GB | RTX 4090    |

## Installation

```bash
pip install diffusers transformers accelerate torch
```

## SDXL Turbo

### Basic Usage

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Generate in 1 step!
image = pipe(
    prompt="A cinematic shot of a baby raccoon wearing an intricate italian priest robe",
    num_inference_steps=1,
    guidance_scale=0.0  # Turbo doesn't use CFG
).images[0]

image.save("raccoon.png")
```

### Best Settings

```python
# 1 step - fastest, good quality
image = pipe(prompt, num_inference_steps=1, guidance_scale=0.0).images[0]

# 2 steps - better details
image = pipe(prompt, num_inference_steps=2, guidance_scale=0.0).images[0]

# 4 steps - best quality for turbo
image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
```

## SDXL Lightning

### 2-Step Generation

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_2step_unet.safetensors"

# Load base model
pipe = StableDiffusionXLPipeline.from_pretrained(
    base,
    torch_dtype=torch.float16,
    variant="fp16"
).to("cuda")

# Load Lightning UNet weights (safetensors checkpoint - use load_file, not torch.load)
from safetensors.torch import load_file
pipe.unet.load_state_dict(
    load_file(hf_hub_download(repo, ckpt), device="cuda")
)

# Configure scheduler
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config,
    timestep_spacing="trailing"
)

# Generate in 2 steps
image = pipe(
    "A girl smiling in a garden",
    num_inference_steps=2,
    guidance_scale=0.0
).images[0]

image.save("lightning.png")
```

### 4-Step (Higher Quality)

```python
ckpt = "sdxl_lightning_4step_unet.safetensors"
# ... same setup ...

image = pipe(
    prompt,
    num_inference_steps=4,
    guidance_scale=0.0
).images[0]
```

## LCM-LoRA

Use with any SDXL model for fast generation:

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Load LCM-LoRA
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

# Set LCM scheduler
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Generate in 4 steps
image = pipe(
    "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k",
    num_inference_steps=4,
    guidance_scale=1.0  # LCM uses low CFG
).images[0]

image.save("lcm_lora.png")
```

### With Custom LoRAs

```python
# Load base + LCM-LoRA + style LoRA
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl", adapter_name="lcm")
pipe.load_lora_weights("your-style-lora", adapter_name="style")

# Combine adapters
pipe.set_adapters(["lcm", "style"], adapter_weights=[1.0, 0.8])

image = pipe(prompt, num_inference_steps=4, guidance_scale=1.5).images[0]
```

## SD Turbo (SD 1.5)

For lower VRAM requirements:

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sd-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

image = pipe(
    "A photo of a cat",
    num_inference_steps=1,
    guidance_scale=0.0
).images[0]
```

## Image-to-Image

### SDXL Turbo Img2Img

```python
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

init_image = load_image("input.jpg").resize((512, 512))

image = pipe(
    prompt="cat wizard, gandalf, lord of the rings, detailed, fantasy",
    image=init_image,
    num_inference_steps=2,
    strength=0.5,
    guidance_scale=0.0
).images[0]
```

## Batch Generation

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16
).to("cuda")

prompts = [
    "A sunset over mountains",
    "A futuristic city at night",
    "A cute robot in a garden",
    "An ancient temple in fog"
]

# Batch generate
images = pipe(
    prompts,
    num_inference_steps=1,
    guidance_scale=0.0
).images

for i, img in enumerate(images):
    img.save(f"batch_{i}.png")
```

## Real-time Streaming

```python
import gradio as gr
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo",
    torch_dtype=torch.float16
).to("cuda")

def generate_realtime(prompt):
    if not prompt:
        return None
    image = pipe(
        prompt,
        num_inference_steps=1,
        guidance_scale=0.0,
        width=512,
        height=512
    ).images[0]
    return image

demo = gr.Interface(
    fn=generate_realtime,
    inputs=gr.Textbox(label="Prompt"),
    outputs=gr.Image(label="Generated"),
    live=True,  # Update as you type
    title="Real-time SDXL Turbo"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Performance Comparison

| Model          | Steps | Resolution | RTX 3090 | RTX 4090 | A100  |
| -------------- | ----- | ---------- | -------- | -------- | ----- |
| SDXL (base)    | 30    | 1024x1024  | 8s       | 5s       | 4s    |
| SDXL Turbo     | 1     | 512x512    | 0.3s     | 0.2s     | 0.15s |
| SDXL Turbo     | 4     | 512x512    | 0.8s     | 0.5s     | 0.4s  |
| SDXL Lightning | 2     | 1024x1024  | 0.8s     | 0.5s     | 0.4s  |
| SDXL Lightning | 4     | 1024x1024  | 1.2s     | 0.8s     | 0.6s  |
| LCM-SDXL       | 4     | 1024x1024  | 1.5s     | 1.0s     | 0.7s  |

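The latencies above translate directly into throughput figures. A minimal helper (a back-of-envelope sketch that ignores batching, I/O, and queueing overhead, so real sustained rates will be lower):

```python
def images_per_hour(seconds_per_image: float) -> int:
    """Rough throughput estimate from per-image latency."""
    return int(3600 / seconds_per_image)

# SDXL Turbo, 1 step on RTX 3090 (~0.3 s/image)
print(images_per_hour(0.3))  # 12000

# SDXL Lightning, 2 steps on RTX 3090 (~0.8 s/image)
print(images_per_hour(0.8))  # 4500
```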
## Quality Comparison

| Aspect         | SDXL 30 steps | Turbo 4 steps | Lightning 4 steps |
| -------------- | ------------- | ------------- | ----------------- |
| Details        | Excellent     | Good          | Great             |
| Text rendering | Good          | Poor          | Poor              |
| Faces          | Great         | Good          | Good              |
| Consistency    | Excellent     | Good          | Great             |
| Style variety  | Excellent     | Good          | Great             |

## When to Use What

| Use Case          | Recommended    | Steps |
| ----------------- | -------------- | ----- |
| Real-time preview | SDXL Turbo     | 1     |
| Interactive apps  | SDXL Turbo     | 1-2   |
| Quick iterations  | SDXL Lightning | 2-4   |
| With custom LoRAs | LCM-LoRA       | 4-8   |
| Maximum quality   | SDXL Lightning | 8     |
| Low VRAM          | SD Turbo       | 1-2   |
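The decision table above can be encoded as a small lookup for a serving script. The keys and the `pick` helper below are illustrative (not a CLORE.AI API); the model IDs mirror the repos used earlier in this guide:

```python
# (model repo, steps) per use case, mirroring the table above
RECOMMENDATIONS = {
    "realtime_preview": ("stabilityai/sdxl-turbo", 1),
    "interactive_app":  ("stabilityai/sdxl-turbo", 2),
    "quick_iteration":  ("ByteDance/SDXL-Lightning", 4),
    "custom_loras":     ("latent-consistency/lcm-lora-sdxl", 4),
    "max_quality":      ("ByteDance/SDXL-Lightning", 8),
    "low_vram":         ("stabilityai/sd-turbo", 1),
}

def pick(use_case: str) -> tuple:
    """Return the recommended (model, steps) pair for a use case."""
    return RECOMMENDATIONS[use_case]

print(pick("low_vram"))  # ('stabilityai/sd-turbo', 1)
```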

## Cost Estimate

Typical CLORE.AI marketplace rates:

| GPU           | Hourly Rate | Images/Hour (1-step) |
| ------------- | ----------- | -------------------- |
| RTX 3060 12GB | \~$0.03     | \~3,000              |
| RTX 3090 24GB | \~$0.06     | \~8,000              |
| RTX 4090 24GB | \~$0.10     | \~12,000             |
| A100 40GB     | \~$0.17     | \~15,000             |

*Prices vary. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
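Hourly rate and throughput combine into a per-image cost. A quick sketch using the example figures above (marketplace prices fluctuate, so treat the result as an order-of-magnitude estimate):

```python
def cost_per_image(hourly_rate_usd: float, images_per_hour: float) -> float:
    """Approximate USD cost per generated image."""
    return hourly_rate_usd / images_per_hour

# RTX 3090 at ~$0.06/h generating ~8,000 images/h
print(f"${cost_per_image(0.06, 8000):.8f}")  # $0.00000750
```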

## Troubleshooting

### Blurry Results

* SDXL Turbo outputs 512x512 natively
* Use SDXL Lightning for 1024x1024
* Add upscaling post-process
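For the upscaling post-process, a minimal sketch using Pillow's Lanczos resampling (a plain resize that sharpens nothing - for synthesized detail use a learned upscaler like Real-ESRGAN, linked below):

```python
from PIL import Image

def upscale(image: Image.Image, factor: int = 2) -> Image.Image:
    """Upscale with Lanczos resampling; cheap, but adds no new detail."""
    w, h = image.size
    return image.resize((w * factor, h * factor), Image.LANCZOS)

# e.g. take a 512x512 Turbo output up to 1024x1024
img = Image.new("RGB", (512, 512))
print(upscale(img).size)  # (1024, 1024)
```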

### guidance\_scale Error

```python
# SDXL Turbo: always use 0.0
image = pipe(prompt, guidance_scale=0.0).images[0]

# LCM: use 1.0-2.0
image = pipe(prompt, guidance_scale=1.5).images[0]

# Lightning: use 0.0
image = pipe(prompt, guidance_scale=0.0).images[0]
```

### LoRA Not Working

```python
# For LCM-LoRA, must use LCMScheduler
from diffusers import LCMScheduler

pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")
```

### Out of Memory

```python
# Enable memory optimizations
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

# Or use smaller model
# SD Turbo instead of SDXL Turbo
```
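As a rule of thumb, fp16 weights take about 2 bytes per parameter, so you can sanity-check whether a model's weights fit in VRAM before renting a GPU. This is a rough estimate for weights alone; activations, the text encoders, and the VAE add several more GB on top:

```python
def fp16_weight_gb(num_params_billion: float) -> float:
    """Approx GB of VRAM for fp16 weights alone (2 bytes/param)."""
    return num_params_billion * 2.0

# The SDXL UNet is roughly 2.6B parameters
print(round(fp16_weight_gb(2.6), 1))  # 5.2
```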

## Next Steps

* [FLUX.1](https://docs.clore.ai/guides/image-generation/flux) - Highest quality generation
* [Stable Diffusion WebUI](https://docs.clore.ai/guides/image-generation/stable-diffusion-webui) - Full UI
* [ComfyUI](https://docs.clore.ai/guides/image-generation/comfyui) - Node-based workflows
* [Real-ESRGAN](https://docs.clore.ai/guides/image-processing/real-esrgan-upscaling) - Upscale results

