# DreamBooth

Train Stable Diffusion to generate images of specific subjects.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter startup command
5. Select payment: **CLORE**, **BTC**, or **USDT/USDC**
6. Create order and wait for deployment

### Access Your Server

* Find connection details in **My Orders**
* Web interfaces: Use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## What is DreamBooth?

DreamBooth fine-tunes Stable Diffusion on a small set of images of your subject:

* Train on 5-20 images
* Generate new images of your subject
* Any style or context
* Works with SD 1.5 and SDXL

## Requirements

| Model         | VRAM | Training Time |
| ------------- | ---- | ------------- |
| SD 1.5        | 12GB | 15-30 min     |
| SDXL          | 24GB | 30-60 min     |
| SD 1.5 + LoRA | 8GB  | 10-20 min     |
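The table above can be turned into a quick helper for deciding which training option fits the GPU you rent (a minimal sketch; the thresholds come straight from the table and the function name is hypothetical):

```python
def recommended_model(vram_gb: float) -> str:
    """Pick a training option from the requirements table by available VRAM (GB)."""
    if vram_gb >= 24:
        return "SDXL"
    if vram_gb >= 12:
        return "SD 1.5"
    if vram_gb >= 8:
        return "SD 1.5 + LoRA"
    return "insufficient VRAM"

print(recommended_model(24))  # SDXL
print(recommended_model(10))  # SD 1.5 + LoRA
```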

## Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate bitsandbytes && \
pip install xformers peft && \
python dreambooth_train.py
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.

## Installation

```bash
pip install diffusers transformers accelerate
pip install bitsandbytes xformers peft
```

## Prepare Training Data

1. Collect 5-20 images of your subject
2. Crop to face/subject
3. Resize to 512x512 (or 1024x1024 for SDXL)
4. Remove backgrounds if needed

```python
from PIL import Image
import os

def prepare_images(input_dir, output_dir, size=512):
    os.makedirs(output_dir, exist_ok=True)

    for filename in os.listdir(input_dir):
        # Case-insensitive check so .JPG/.PNG files aren't silently skipped
        if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
            img = Image.open(os.path.join(input_dir, filename))
            img = img.convert('RGB')

            # Center crop to square
            min_dim = min(img.size)
            left = (img.width - min_dim) // 2
            top = (img.height - min_dim) // 2
            img = img.crop((left, top, left + min_dim, top + min_dim))

            # Resize
            img = img.resize((size, size), Image.LANCZOS)
            img.save(os.path.join(output_dir, filename))

prepare_images("./raw_photos", "./training_data")
```

## DreamBooth with LoRA (Recommended)

Memory-efficient training:

```python
from diffusers import AutoencoderKL, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer
from peft import LoraConfig, get_peft_model

# Load models
model_id = "runwayml/stable-diffusion-v1-5"
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")

# Add LoRA to UNet
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    lora_dropout=0.1,
)

unet = get_peft_model(unet, lora_config)
```
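To see why this is memory-efficient: LoRA freezes the original weights and trains only two small matrices per targeted layer. A rough sketch of the per-layer overhead (the helper is illustrative, not part of peft):

```python
def lora_extra_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) beside a frozen d_out x d_in weight."""
    return r * d_in + d_out * r

# A 320x320 attention projection in the SD 1.5 UNet at rank r=8:
full = 320 * 320            # 102,400 frozen weights
extra = lora_extra_params(320, 320, 8)
print(extra, extra / full)  # 5120 0.05 -- ~5% of the layer, and only this part trains
```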

## Using diffusers Training Script

```bash
# Clone training scripts
git clone https://github.com/huggingface/diffusers
cd diffusers/examples/dreambooth

# Install requirements
pip install -r requirements.txt

# Train with LoRA
accelerate launch train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
    --instance_data_dir="./training_data" \
    --instance_prompt="a photo of sks person" \
    --output_dir="./dreambooth_model" \
    --resolution=512 \
    --train_batch_size=1 \
    --gradient_accumulation_steps=1 \
    --learning_rate=1e-4 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps=500 \
    --seed=42
```

## Training Parameters

| Parameter          | Recommended               | Effect                          |
| ------------------ | ------------------------- | ------------------------------- |
| learning\_rate     | 5e-6 to 1e-4              | Higher = faster but less stable |
| max\_train\_steps  | 400-1000                  | More steps = tighter fit, risks overfitting |
| train\_batch\_size | 1-2                       | Higher needs more VRAM          |
| resolution         | 512 (SD1.5) / 1024 (SDXL) | Training size                   |
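Note that `train_batch_size` and `gradient_accumulation_steps` combine: gradients are accumulated over several small batches before each optimizer step, so VRAM use follows the batch size while the optimizer sees the product. A one-line illustration:

```python
def effective_batch_size(train_batch_size: int, grad_accum_steps: int) -> int:
    """Images per optimizer step; VRAM cost scales only with train_batch_size."""
    return train_batch_size * grad_accum_steps

# The SDXL command in this guide uses batch 1 with 4 accumulation steps:
print(effective_batch_size(1, 4))  # 4
```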

## Instance Prompt

Choose a unique identifier:

```bash
# Good prompts
"a photo of sks person"      # sks = unique token
"a photo of xyz dog"
"a photo of abc car"

# The token (sks, xyz, abc) should be rare
```

## With Class Preservation

Prevent overfitting:

```bash
accelerate launch train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
    --instance_data_dir="./my_dog_photos" \
    --instance_prompt="a photo of sks dog" \
    --class_data_dir="./regular_dog_photos" \
    --class_prompt="a photo of dog" \
    --with_prior_preservation \
    --prior_loss_weight=1.0 \
    --num_class_images=200 \
    --output_dir="./dreambooth_dog" \
    --max_train_steps=800
```

## SDXL DreamBooth

```bash
accelerate launch train_dreambooth_lora_sdxl.py \
    --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
    --instance_data_dir="./training_data" \
    --instance_prompt="a photo of sks person" \
    --output_dir="./dreambooth_sdxl" \
    --resolution=1024 \
    --train_batch_size=1 \
    --gradient_accumulation_steps=4 \
    --learning_rate=1e-4 \
    --max_train_steps=500 \
    --mixed_precision="fp16"
```

## Using Trained Model

### Load LoRA

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load your trained LoRA
pipe.load_lora_weights("./dreambooth_model")

# Generate
image = pipe(
    "a photo of sks person as an astronaut on mars",
    num_inference_steps=30,
    guidance_scale=7.5
).images[0]

image.save("astronaut.png")
```

### Full Fine-tune

```python
pipe = StableDiffusionPipeline.from_pretrained(
    "./dreambooth_model",
    torch_dtype=torch.float16
).to("cuda")

image = pipe("a photo of sks person in a suit").images[0]
```

## Gradio Interface

```python
import gradio as gr
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

pipe.load_lora_weights("./dreambooth_model")

def generate(prompt, negative_prompt, steps, guidance, seed):
    generator = torch.Generator("cuda").manual_seed(seed) if seed > 0 else None

    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=steps,
        guidance_scale=guidance,
        generator=generator
    ).images[0]

    return image

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt (use 'sks' for your subject)"),
        gr.Textbox(label="Negative Prompt", value="blurry, ugly"),
        gr.Slider(20, 50, value=30, step=1, label="Steps"),
        gr.Slider(5, 15, value=7.5, step=0.5, label="Guidance"),
        gr.Number(value=-1, label="Seed")
    ],
    outputs=gr.Image(label="Generated Image"),
    title="DreamBooth Portrait Generator"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Training Tips

### For People

* Use varied angles (front, side, 3/4)
* Different lighting conditions
* Various expressions
* Clear, high-resolution photos

### For Objects

* Multiple angles
* Different backgrounds
* Consistent lighting
* No occlusion

### For Styles

* 10-20 example images
* Consistent artistic style
* Various subjects in that style

## Troubleshooting

### Overfitting

* Reduce max\_train\_steps
* Lower learning\_rate
* Use prior preservation
* More training images

### Underfitting

* Increase max\_train\_steps
* Higher learning\_rate
* More training images
* Check image quality

### Style Not Learned

* Increase LoRA rank (r=16 or 32)
* Train longer
* Use more examples
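For example, raising the rank in the earlier `LoraConfig` gives the adapter more capacity (a config fragment; requires peft from the Installation section):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,               # doubled from 8; try 32 if the style still doesn't take
    lora_alpha=32,
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    lora_dropout=0.1,
)
```

Higher ranks train more parameters, so expect slightly longer training and larger output files.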

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
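As a worked example, combining the rates above with the training times from the Requirements table (the rates are rough 2024 estimates, so treat the result as an order of magnitude):

```python
def session_cost(hourly_rate_usd: float, minutes: float) -> float:
    """Rough rental cost for a single training session."""
    return round(hourly_rate_usd * minutes / 60, 3)

# A 30-minute SD 1.5 LoRA run on an RTX 4090 at ~$0.10/h:
print(session_cost(0.10, 30))  # 0.05
# A 60-minute SDXL run on an A100 40GB at ~$0.17/h:
print(session_cost(0.17, 60))  # 0.17
```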

**Save money:**

* Use **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across different providers

## Next Steps

* [Kohya Training](https://docs.clore.ai/guides/training/kohya-training) - Advanced Stable Diffusion training
* Stable Diffusion WebUI - Run your trained models interactively
* [LoRA Fine-tuning](https://docs.clore.ai/guides/training/kohya-training) - More LoRA training options
