
# DreamBooth

Train Stable Diffusion to generate images of specific subjects.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Go to the [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select a Docker image
   * Set ports (TCP for SSH, HTTP for the web UI)
   * Add environment variables if needed
   * Enter the startup command
5. Choose a payment method: **CLORE**, **BTC**, or **USDT/USDC**
6. Create the order and wait for deployment

### Access Your Server

* Find the connection details under **My Orders**
* Web interface: use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## What Is DreamBooth?

DreamBooth fine-tunes Stable Diffusion (SD) on your own images:

* Trains on 5-20 images
* Generates new images of your subject
* Works in any style or context
* Works with both SD 1.5 and SDXL

## Requirements

| Model         | VRAM | Training time |
| ------------- | ---- | ------------- |
| SD 1.5        | 12GB | 15-30 min     |
| SDXL          | 24GB | 30-60 min     |
| SD 1.5 + LoRA | 8GB  | 10-20 min     |
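
As a rough aid for picking a GPU, the sketch below encodes the VRAM figures from the table and lists which setups fit a given card. The function name `feasible_setups` is hypothetical, and the thresholds are only the approximate values quoted above, not hard limits.

```python
# VRAM requirements in GB, copied from the requirements table above.
REQUIREMENTS_GB = {
    "SD 1.5 + LoRA": 8,
    "SD 1.5": 12,
    "SDXL": 24,
}


def feasible_setups(vram_gb):
    """Return the setups from the table whose VRAM requirement fits the card."""
    return [name for name, need in REQUIREMENTS_GB.items() if vram_gb >= need]


print(feasible_setups(16))  # ['SD 1.5 + LoRA', 'SD 1.5']
```

Actual memory use depends on resolution, batch size, and optimizer settings, so treat the result as a starting point.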

## Quick Deploy

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate bitsandbytes && \
pip install xformers peft && \
python dreambooth_train.py
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to the **My Orders** page
2. Click your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in the examples below.
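
The substitution can be sketched as a small helper. `service_url` is a hypothetical name, and the host is the example value from step 3; replace it with the `http_pub` value from your own order.

```python
def service_url(http_pub, path=""):
    """Build the public HTTPS URL for a service from its http_pub host."""
    host = http_pub.strip()
    # Tolerate a value pasted with a scheme or a trailing slash.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    url = f"https://{host}"
    if path:
        url += "/" + path.lstrip("/")
    return url


print(service_url("abc123.clorecloud.net"))  # https://abc123.clorecloud.net
```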

## Installation

```bash
pip install diffusers transformers accelerate
pip install bitsandbytes xformers peft
# Needed only for the image preparation and Gradio examples below
pip install pillow gradio
```

## Prepare Training Data

1. Collect 5-20 images of your subject
2. Crop to the face/subject
3. Resize to 512x512 (or 1024x1024 for SDXL)
4. Remove the background if needed

```python
from PIL import Image
import os

def prepare_images(input_dir, output_dir, size=512):
    os.makedirs(output_dir, exist_ok=True)

    for filename in os.listdir(input_dir):
        # Skip anything that is not a common image file
        if not filename.lower().endswith((".jpg", ".jpeg", ".png", ".webp")):
            continue

        img = Image.open(os.path.join(input_dir, filename))
        img = img.convert("RGB")

        # Center crop to square
        min_dim = min(img.size)
        left = (img.width - min_dim) // 2
        top = (img.height - min_dim) // 2
        img = img.crop((left, top, left + min_dim, top + min_dim))

        # Resize to the training resolution
        img = img.resize((size, size), Image.LANCZOS)
        img.save(os.path.join(output_dir, filename))

prepare_images("./raw_photos", "./training_data")
```
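
Before training, it can help to confirm that the prepared folder holds the 5-20 images recommended above. The following is a minimal sketch using only the standard library; `check_dataset` and the extension list are assumptions for illustration, not part of any library.

```python
from pathlib import Path

# Assumed set of image formats; adjust to match your files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def check_dataset(folder, min_images=5, max_images=20):
    """Return (ok, files): whether the image count is in range, plus the file names."""
    path = Path(folder)
    if not path.is_dir():
        return False, []
    files = sorted(
        p.name
        for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    return min_images <= len(files) <= max_images, files


ok, files = check_dataset("./training_data")
print(f"{len(files)} images found, count in recommended range: {ok}")
```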

## DreamBooth with LoRA (Recommended)

Memory-efficient training:

```python
# Setup portion of a LoRA training script (the training loop is not shown)
import torch
from accelerate import Accelerator
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DConditionModel
from transformers import CLIPTextModel, CLIPTokenizer
from peft import LoraConfig, get_peft_model

# Load the individual model components
model_id = "runwayml/stable-diffusion-v1-5"
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")

# Only the LoRA layers are trained; the VAE and text encoder stay frozen
vae.requires_grad_(False)
text_encoder.requires_grad_(False)

# Add LoRA to the UNet attention projections
lora_config = LoraConfig(
    r=8,                # LoRA rank
    lora_alpha=32,      # scaling numerator (updates are scaled by lora_alpha / r)
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    lora_dropout=0.1,
)

unet = get_peft_model(unet, lora_config)
```
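
To see why LoRA is memory-efficient, the arithmetic below compares the parameters LoRA adds to one linear layer with the size of the layer itself. It assumes the standard LoRA formulation (two low-rank matrices and a scale of `lora_alpha / r`); the 320x320 layer size is an illustrative value, not read from the model.

```python
def lora_extra_params(in_features, out_features, r):
    """Parameters LoRA adds to one linear layer: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


r, lora_alpha = 8, 32           # values from the LoraConfig above
scaling = lora_alpha / r        # factor applied to the low-rank update

# Illustrative 320 -> 320 projection (assumed size, for the arithmetic only)
full = 320 * 320
extra = lora_extra_params(320, 320, r)

print(scaling, extra, f"{extra / full:.1%}")  # 4.0 5120 5.0%
```

Raising `r` to 16 or 32 doubles or quadruples the added parameters, which is the trade-off behind the rank suggestion in the troubleshooting section.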

## Using the diffusers Training Script

```bash

# Clone training scripts
git clone https://github.com/huggingface/diffusers
cd diffusers/examples/dreambooth

# Install requirements
pip install -r requirements.txt

# Train with LoRA
accelerate launch train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
    --instance_data_dir="./training_data" \
    --instance_prompt="a photo of sks person" \
    --output_dir="./dreambooth_model" \
    --resolution=512 \
    --train_batch_size=1 \
    --gradient_accumulation_steps=1 \
    --learning_rate=1e-4 \
    --lr_scheduler="constant" \
    --lr_warmup_steps=0 \
    --max_train_steps=500 \
    --seed=42
```

## Training Parameters

| Parameter          | Recommended               | Effect                                          |
| ------------------ | ------------------------- | ----------------------------------------------- |
| learning\_rate     | 5e-6 to 1e-4              | Higher = faster, lower = more stable            |
| max\_train\_steps  | 400-1000                  | More = closer fit (too many risks overfitting)  |
| train\_batch\_size | 1-2                       | Larger values need more VRAM                    |
| resolution         | 512 (SD1.5) / 1024 (SDXL) | Training image size                             |
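
These parameters interact: the number of times each image is seen depends on the step count, the batch size, and gradient accumulation. The sketch below works through that arithmetic for the LoRA command above, assuming 10 training images (an illustrative number), a single GPU, and no prior preservation. Both function names are hypothetical.

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_gpus=1):
    """Images consumed per optimizer step."""
    return train_batch_size * gradient_accumulation_steps * num_gpus


def passes_over_dataset(max_train_steps, batch_size, num_images):
    """Approximate number of times each training image is seen."""
    return max_train_steps * batch_size / num_images


batch = effective_batch_size(train_batch_size=1, gradient_accumulation_steps=1)
print(batch, passes_over_dataset(500, batch, num_images=10))  # 1 50.0
```

With fewer images, the same step count means more passes per image, which is one reason small datasets overfit sooner.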

## Instance Prompt

Choose a unique identifier:

```bash

# Good prompts
"a photo of sks person"      # sks = unique token
"a photo of xyz dog"
"a photo of abc car"

# The token (sks, xyz, abc) should be rare
```
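
The prompt pattern above can be generated from a token and a class noun. This is a small sketch; `build_prompts` is a hypothetical helper, and the class prompt it returns is only needed when training with prior preservation.

```python
def build_prompts(token, class_noun):
    """Return (instance_prompt, class_prompt) for a rare token and a class noun."""
    instance_prompt = f"a photo of {token} {class_noun}"
    class_prompt = f"a photo of {class_noun}"
    return instance_prompt, class_prompt


print(build_prompts("sks", "dog"))  # ('a photo of sks dog', 'a photo of dog')
```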

## With Class Preservation

Prior preservation regularizes training with generic class images, which helps prevent overfitting:

```bash
accelerate launch train_dreambooth_lora.py \
    --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
    --instance_data_dir="./my_dog_photos" \
    --instance_prompt="a photo of sks dog" \
    --class_data_dir="./regular_dog_photos" \
    --class_prompt="a photo of dog" \
    --with_prior_preservation \
    --prior_loss_weight=1.0 \
    --num_class_images=200 \
    --output_dir="./dreambooth_dog" \
    --max_train_steps=800
```
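
The role of `--prior_loss_weight` can be illustrated with a toy calculation: the total loss is the loss on your subject images plus the weighted loss on the generic class images. The sketch uses plain Python lists in place of tensors, and the numbers are made up for illustration; it is not the training script's code.

```python
def mse(pred, target):
    """Mean squared error between two equal-length lists of numbers."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target, class_pred, class_target,
                    prior_loss_weight=1.0):
    """Instance loss plus weighted prior-preservation (class) loss."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss


# Toy values: instance loss = 2.0, prior loss = 1.0
print(dreambooth_loss([1.0, 2.0], [1.0, 4.0], [0.0, 0.0], [1.0, 1.0]))  # 3.0
```

A larger weight keeps the model closer to its original notion of the class; a weight of 0 reduces to plain DreamBooth.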

## SDXL DreamBooth

```bash
accelerate launch train_dreambooth_lora_sdxl.py \
    --pretrained_model_name_or_path="stabilityai/stable-diffusion-xl-base-1.0" \
    --instance_data_dir="./training_data" \
    --instance_prompt="a photo of sks person" \
    --output_dir="./dreambooth_sdxl" \
    --resolution=1024 \
    --train_batch_size=1 \
    --gradient_accumulation_steps=4 \
    --learning_rate=1e-4 \
    --max_train_steps=500 \
    --mixed_precision="fp16"
```

## Using the Trained Model

### Load the LoRA

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load your trained LoRA weights
pipe.load_lora_weights("./dreambooth_model")

# Generate
image = pipe(
    "a photo of sks person as an astronaut on mars",
    num_inference_steps=30,
    guidance_scale=7.5
).images[0]

image.save("astronaut.png")
```

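### Full Fine-Tune

```python
from diffusers import StableDiffusionPipeline
import torch

# The directory must contain a complete pipeline saved by a full (non-LoRA) training run
pipe = StableDiffusionPipeline.from_pretrained(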
    "./dreambooth_model",
    torch_dtype=torch.float16
).to("cuda")

image = pipe("a photo of sks person in a suit").images[0]
```

## Gradio Interface

```python
import gradio as gr
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

pipe.load_lora_weights("./dreambooth_model")

def generate(prompt, negative_prompt, steps, guidance, seed):
    # gr.Number can return a float, but manual_seed() needs an int.
    # A seed of 0 or below means "use a random seed".
    seed = int(seed)
    generator = torch.Generator("cuda").manual_seed(seed) if seed > 0 else None

    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        num_inference_steps=int(steps),
        guidance_scale=guidance,
        generator=generator
    ).images[0]

    return image

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Textbox(label="Prompt (use 'sks' for your subject)"),
        gr.Textbox(label="Negative Prompt", value="blurry, ugly"),
        gr.Slider(20, 50, value=30, step=1, label="Steps"),
        gr.Slider(5, 15, value=7.5, step=0.5, label="Guidance"),
        gr.Number(value=-1, label="Seed")
    ],
    outputs=gr.Image(label="Generated Image"),
    title="DreamBooth Portrait Generator"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Training Tips

### For People

* Use varied angles (front, side, 3/4)
* Varied lighting conditions
* Varied expressions
* Clear, high-resolution photos

### For Objects

* Multiple angles
* Varied backgrounds
* Consistent lighting
* No occlusion

### For Styles

* 10-20 example images
* Consistent artistic style
* Varied subjects in that style

## Troubleshooting

### Overfitting

* Reduce max\_train\_steps
* Lower learning\_rate
* Use prior preservation
* Add more training images

### Underfitting

* Increase max\_train\_steps
* Raise learning\_rate
* Add more training images
* Check image quality

### Style Not Learned

* Increase the LoRA rank (r=16 or 32)
* Train longer
* Use more examples

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly rate | Daily rate | 4-hour session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
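
Because DreamBooth runs are short, the cost of a single training session can be estimated from the hourly rate and the training time. The sketch below uses the approximate hourly rates from the table; `session_cost` is a hypothetical helper, and real prices will differ.

```python
# Approximate hourly rates in USD, copied from the table above.
HOURLY_RATE_USD = {
    "RTX 3060": 0.03,
    "RTX 3090": 0.06,
    "RTX 4090": 0.10,
    "A100 40GB": 0.17,
    "A100 80GB": 0.25,
}


def session_cost(gpu, minutes):
    """Estimated cost in USD of renting `gpu` for `minutes`."""
    return HOURLY_RATE_USD[gpu] * minutes / 60


# A 30-minute run on an RTX 3090 and a 60-minute run on an RTX 4090
print(f"${session_cost('RTX 3090', 30):.3f}")  # $0.030
print(f"${session_cost('RTX 4090', 60):.3f}")  # $0.100
```

The estimate covers training time only; setup, model downloads, and idle time are billed as well.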

**Save money:**

* Use the **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across providers

## Next Steps

* [Kohya Training](/guides/guides_v2-hi/training/kohya-training.md) - advanced training
* Stable Diffusion WebUI - use your models
* [LoRA Fine-Tuning](/guides/guides_v2-hi/training/kohya-training.md) - LLM training

