
# ControlNet

Master ControlNet for precise control over AI image generation.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit the [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed price) or **Spot** (bid price)
4. Configure your order:
   * Select a Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter the startup command
5. Select a payment method: **CLORE**, **BTC**, or **USDT/USDC**
6. Create the order and wait for deployment

### Accessing Your Server

* Find the connection details under **My Orders**
* Web interfaces: use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`
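The SSH pattern above can also be assembled in a script, for example when several orders are managed at once. This is a minimal sketch: `build_ssh_command` is a hypothetical helper, and the port and host shown are placeholders to be replaced with the values from your order.

```python
import shlex


def build_ssh_command(port, proxy_address, user="root"):
    """Return the ssh argument list for a rented server.

    port and proxy_address are the values shown in the order details;
    the ones used below are placeholders, not real endpoints.
    """
    return ["ssh", "-p", str(port), f"{user}@{proxy_address}"]


# Print a copy-pasteable command line (placeholder values)
print(shlex.join(build_ssh_command(2222, "proxy.example.com")))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting issues.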

## What Is ControlNet?

ControlNet adds conditional control to Stable Diffusion. This guide covers the following control types:

* **Canny** - edge detection
* **Depth** - 3D depth maps
* **Pose** - human poses
* **Scribble** - rough sketches
* **Segmentation** - semantic masks
* **Line Art** - clean lines
* **IP-Adapter** - style transfer (a separate image-prompt adapter, not a ControlNet model)

## Requirements

| Control Type         | Min. VRAM | Recommended GPU |
| -------------------- | --------- | --------------- |
| Single ControlNet    | 8 GB      | RTX 3070        |
| Multiple ControlNets | 12 GB     | RTX 3090        |
| SDXL ControlNet      | 16 GB     | RTX 4090        |
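To check a rented server against the table above before loading any models, something like the following sketch may help. The thresholds are copied from the table; `fits` and `detect_vram_gb` are hypothetical helper names, and the 0.5 GB tolerance is an assumption to account for cards that report slightly less than their nominal size.

```python
# Minimum VRAM in GB per setup, copied from the requirements table
MIN_VRAM_GB = {"single": 8, "multi": 12, "sdxl": 16}


def fits(setup, available_gb, tolerance_gb=0.5):
    """Return True if the available VRAM covers the listed minimum.

    tolerance_gb is an assumed allowance: an "8 GB" card typically
    reports a little under 8 GiB of total memory.
    """
    return available_gb + tolerance_gb >= MIN_VRAM_GB[setup]


def detect_vram_gb():
    """Return total VRAM of GPU 0 in GiB, or None without torch/CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(0).total_memory / 1024**3


vram = detect_vram_gb()
if vram is not None:
    for setup in MIN_VRAM_GB:
        print(f"{setup}: {'ok' if fits(setup, vram) else 'too little VRAM'}")
```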

## Quick Deployment with A1111

**Command:**

```bash
cd /workspace/stable-diffusion-webui && \
cd extensions && \
git clone https://github.com/Mikubill/sd-webui-controlnet && \
cd .. && \
python launch.py --listen --enable-insecure-extension-access
```

### Downloading Models

```bash
cd /workspace/stable-diffusion-webui/extensions/sd-webui-controlnet/models

# SD 1.5 ControlNets
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_canny.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11f1p_sd15_depth.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_openpose.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_scribble.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_lineart.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_softedge.pth
wget https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_seg.pth

# SDXL ControlNets
wget https://huggingface.co/diffusers/controlnet-canny-sdxl-1.0/resolve/main/diffusion_pytorch_model.safetensors -O controlnet-canny-sdxl.safetensors
```
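The same SD 1.5 downloads can be scripted in Python so that files already present are skipped on a rerun. This is a sketch under assumptions: `model_url` and `download_missing` are hypothetical helpers, the base URL and file names are taken from the `wget` commands above, and an interrupted download may leave a partial file that has to be deleted by hand.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and file names as used in the wget commands above
BASE_URL = "https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main"
SD15_MODELS = [
    "control_v11p_sd15_canny.pth",
    "control_v11f1p_sd15_depth.pth",
    "control_v11p_sd15_openpose.pth",
    "control_v11p_sd15_scribble.pth",
    "control_v11p_sd15_lineart.pth",
    "control_v11p_sd15_softedge.pth",
    "control_v11p_sd15_seg.pth",
]


def model_url(filename):
    """Build the download URL for one model file."""
    return f"{BASE_URL}/{filename}"


def download_missing(target_dir, filenames=SD15_MODELS):
    """Download every listed file that is not yet in target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for name in filenames:
        dest = target / name
        if dest.exists():
            continue  # already downloaded on an earlier run
        urlretrieve(model_url(name), dest)
        downloaded.append(name)
    return downloaded
```

Calling `download_missing(".")` from the extension's `models` directory would fetch the same files as the shell commands.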

## Python with Diffusers

### Canny Edge Control

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import CannyDetector

# Load the ControlNet
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny",
    torch_dtype=torch.float16
)

# Load the pipeline
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
)
# CPU offload moves each submodule to the GPU on demand,
# so do not also call pipe.to("cuda")
pipe.enable_model_cpu_offload()

# Prepare the control image
image = load_image("input.jpg")
canny = CannyDetector()
control_image = canny(image)

# Generate
output = pipe(
    prompt="a beautiful woman in a garden, high quality",
    negative_prompt="ugly, blurry",
    image=control_image,
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0
).images[0]

output.save("canny_output.png")
```

### Depth Control

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import MidasDetector

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Estimate a depth map from the input image
image = load_image("input.jpg")
depth_estimator = MidasDetector.from_pretrained("lllyasviel/Annotators")
depth_image = depth_estimator(image)

# Generate with depth guidance
output = pipe(
    prompt="a futuristic city, sci-fi, detailed",
    image=depth_image,
    num_inference_steps=30
).images[0]
```

### OpenPose (Human Poses)

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import OpenposeDetector

# Extract the pose from the input image
image = load_image("input.jpg")
pose_detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_image = pose_detector(image)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

output = pipe(
    prompt="a ballerina dancing, elegant, studio lighting",
    image=pose_image,
    num_inference_steps=30
).images[0]
```

### Scribble/Sketch

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import HEDdetector

# Detect edges and render them as a scribble
image = load_image("input.jpg")
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")
scribble_image = hed(image, scribble=True)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_scribble",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

output = pipe(
    prompt="a detailed painting of a landscape",
    image=scribble_image,
    num_inference_steps=30
).images[0]
```

## Multi-ControlNet

Combine multiple controls by passing a list of ControlNets, with one control image and one weight per ControlNet in the same order:

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import CannyDetector, MidasDetector

# Load multiple ControlNets
controlnet_canny = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny",
    torch_dtype=torch.float16
)

controlnet_depth = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth",
    torch_dtype=torch.float16
)

# Create a pipeline with multiple ControlNets
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=[controlnet_canny, controlnet_depth],
    torch_dtype=torch.float16
).to("cuda")

# Prepare one control image per ControlNet, in the same order
image = load_image("input.jpg")
canny_image = CannyDetector()(image)
depth_image = MidasDetector.from_pretrained("lllyasviel/Annotators")(image)

# Generate with multiple controls
output = pipe(
    prompt="a beautiful portrait",
    image=[canny_image, depth_image],
    controlnet_conditioning_scale=[1.0, 0.8],  # per-ControlNet weights
    num_inference_steps=30
).images[0]
```
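Because the ControlNets, control images, and weights are matched by position, a mismatch in list lengths is an easy mistake to make. The following sketch checks the inputs before an expensive pipeline call; `check_multi_control` is a hypothetical helper, not part of diffusers.

```python
def check_multi_control(controlnets, images, scales):
    """Validate that images and scales line up with the ControlNets.

    A single number for scales is expanded to one weight per
    ControlNet. Returns the scales as a list of floats.
    """
    if len(images) != len(controlnets):
        raise ValueError(
            f"expected {len(controlnets)} control images, got {len(images)}"
        )
    if isinstance(scales, (int, float)):
        scales = [scales] * len(controlnets)
    if len(scales) != len(controlnets):
        raise ValueError(
            f"expected {len(controlnets)} scales, got {len(scales)}"
        )
    return [float(s) for s in scales]


# Placeholder strings stand in for the real models and images
print(check_multi_control(["canny", "depth"], ["canny.png", "depth.png"], [1.0, 0.8]))
```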

## SDXL ControlNet

```python
import torch
from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import CannyDetector

# Load the SDXL ControlNet
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Prepare the Canny control image
image = load_image("input.jpg")
canny = CannyDetector()
control_image = canny(image, low_threshold=100, high_threshold=200)

output = pipe(
    prompt="a professional photo, detailed, 8k",
    image=control_image,
    controlnet_conditioning_scale=0.5,
    num_inference_steps=30
).images[0]
```

## IP-Adapter (Style Transfer)

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16
).to("cuda")

# Load the IP-Adapter weights into the pipeline
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="models",
    weight_name="ip-adapter_sd15.bin"
)

# How strongly the reference image influences the result
pipe.set_ip_adapter_scale(0.6)

# Style reference image
style_image = load_image("style_reference.jpg")

output = pipe(
    prompt="a cat sitting on a chair",
    ip_adapter_image=style_image,
    num_inference_steps=30
).images[0]
```

## Preprocessors

Commonly used preprocessors from `controlnet_aux`:

```python
from diffusers.utils import load_image
from controlnet_aux import (
    CannyDetector,           # Edge detection
    HEDdetector,             # Soft edge/scribble
    MidasDetector,           # Depth estimation
    OpenposeDetector,        # Human pose
    MLSDdetector,            # Straight-line detection
    LineartDetector,         # Line art
    LineartAnimeDetector,    # Anime line art
    NormalBaeDetector,       # Normal maps
    ContentShuffleDetector,  # Content shuffle
    ZoeDetector,             # Better depth
    MediapipeFaceDetector,   # Face mesh
)

# Example usage
image = load_image("input.jpg")

canny = CannyDetector()
canny_image = canny(image, low_threshold=100, high_threshold=200)

depth = MidasDetector.from_pretrained("lllyasviel/Annotators")
depth_image = depth(image)

pose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_image = pose(image, hand_and_face=True)
```
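The `low_threshold` and `high_threshold` arguments passed to the Canny detector above correspond to Canny's double-threshold (hysteresis) step: strong gradients are kept, weak ones are dropped, and in-between values survive only when connected to a strong edge. The toy below illustrates the idea on a one-dimensional row of gradient magnitudes. It is a simplified sketch, not the detector's actual implementation, and `hysteresis_1d` is a hypothetical name.

```python
def hysteresis_1d(magnitudes, low=100, high=200):
    """Toy 1-D version of Canny's double-threshold step.

    Values >= high are strong edges. Values between low and high are
    kept only if they touch an already kept neighbour. Values below
    low are always discarded.
    """
    n = len(magnitudes)
    keep = [m >= high for m in magnitudes]
    changed = True
    while changed:  # grow strong edges into adjacent weak pixels
        changed = False
        for i, m in enumerate(magnitudes):
            if keep[i] or m < low:
                continue
            left = i > 0 and keep[i - 1]
            right = i < n - 1 and keep[i + 1]
            if left or right:
                keep[i] = True
                changed = True
    return keep


row = [50, 120, 250, 150, 90, 130]
print(hysteresis_1d(row))
```

In this row, 120 and 150 are kept because they sit next to the strong value 250, while 130 is dropped because its only neighbour (90) falls below the low threshold. Lowering both thresholds generally yields denser edge maps and therefore tighter control.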

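## Control Weights

Adjust the influence of each ControlNet with `controlnet_conditioning_scale`:

```python
# pipe and control_image as defined in the examples above

# Full control
output = pipe(prompt="...", image=control_image, controlnet_conditioning_scale=1.0)

# Partial control (more creative freedom)
output = pipe(prompt="...", image=control_image, controlnet_conditioning_scale=0.5)

# Very light guidance
output = pipe(prompt="...", image=control_image, controlnet_conditioning_scale=0.3)
```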
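A practical way to pick a weight is to render the same prompt at several scales and compare the results side by side. The sketch below only generates the scale values and output file names; `scale_sweep` and `sweep_filenames` are hypothetical helpers, and the default range is an assumption.

```python
def scale_sweep(start=0.4, stop=1.0, count=4):
    """Return count evenly spaced conditioning scales from start to stop."""
    if count < 2:
        return [round(start, 2)]
    step = (stop - start) / (count - 1)
    return [round(start + i * step, 2) for i in range(count)]


def sweep_filenames(stem, scales):
    """Name one output file per scale so results sort naturally."""
    return [f"{stem}_scale_{s:.2f}.png" for s in scales]


scales = scale_sweep()
for scale, name in zip(scales, sweep_filenames("portrait", scales)):
    # In a real run, call the pipeline here with
    # controlnet_conditioning_scale=scale and save the image as name
    print(scale, name)
```

For a fair comparison, the pipeline call would also need a fixed seed, for example by passing the same `generator` on each iteration.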

### Step-Based Control

```python
# Apply control only during part of the denoising schedule
output = pipe(
    prompt="...",
    image=control_image,
    controlnet_conditioning_scale=1.0,
    control_guidance_start=0.0,  # start at the first step
    control_guidance_end=0.5,    # stop after 50% of the steps
    num_inference_steps=30
).images[0]
```
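To see which denoising steps a given window covers, the fractions can be converted into step indices. The sketch below assumes a step counts as controlled only when it lies entirely inside the window, which is how I understand the diffusers behavior; `controlled_steps` is a hypothetical helper for illustration.

```python
def controlled_steps(num_steps, start=0.0, end=1.0):
    """Return the 0-based step indices where the ControlNet is active.

    Assumption: step i spans the fraction [i/N, (i+1)/N] of the
    schedule and is controlled only if that span lies in [start, end].
    """
    return [
        i
        for i in range(num_steps)
        if i / num_steps >= start and (i + 1) / num_steps <= end
    ]


steps = controlled_steps(30, start=0.0, end=0.5)
print(f"{len(steps)} of 30 steps controlled: {steps[0]}..{steps[-1]}")
```

With the settings from the example above, the ControlNet shapes the first 15 steps, where the overall composition forms, and leaves the remaining steps to refine details freely.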

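## Inpainting with ControlNet

```python
import torch
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import CannyDetector

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Source image, mask (white = area to repaint), and control image.
# The file names are placeholders.
init_image = load_image("input.jpg")
mask = load_image("mask.png")
canny_image = CannyDetector()(init_image)

output = pipe(
    prompt="a red sports car",
    image=init_image,
    mask_image=mask,
    control_image=canny_image,
    num_inference_steps=30
).images[0]
```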

## Batch Processing

```python
import os
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from controlnet_aux import CannyDetector
from PIL import Image
import torch

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_canny",
    torch_dtype=torch.float16
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

canny = CannyDetector()

input_dir = "./inputs"
output_dir = "./outputs"
os.makedirs(output_dir, exist_ok=True)

prompt = "beautiful landscape painting, detailed, artistic"

# Process every image in the input directory with the same prompt
for filename in sorted(os.listdir(input_dir)):
    if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
        # Convert to RGB so PNGs with an alpha channel are handled too
        image = Image.open(os.path.join(input_dir, filename)).convert("RGB")
        control_image = canny(image)

        output = pipe(
            prompt=prompt,
            image=control_image,
            num_inference_steps=30
        ).images[0]

        output.save(os.path.join(output_dir, f"cn_{filename}"))
```
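On a rented server a long batch may be interrupted, so it can be useful to skip inputs whose output already exists. The following sketch builds the list of remaining jobs and uses the same `cn_` prefix as the loop above; `pending_jobs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def pending_jobs(input_dir, output_dir, prefix="cn_"):
    """Return (source, destination) pairs that still need processing."""
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for src in sorted(in_dir.iterdir()):
        if src.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # not an image
        dst = out_dir / f"{prefix}{src.name}"
        if dst.exists():
            continue  # finished in an earlier run
        jobs.append((src, dst))
    return jobs


# Small demonstration with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    inputs, outputs = Path(tmp, "inputs"), Path(tmp, "outputs")
    inputs.mkdir()
    outputs.mkdir()
    for name in ("a.png", "b.jpg"):
        (inputs / name).touch()
    (outputs / "cn_a.png").touch()  # pretend a.png is already done
    print([src.name for src, _ in pending_jobs(inputs, outputs)])
```

The batch loop would then iterate over `pending_jobs(input_dir, output_dir)` instead of `os.listdir(input_dir)`.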

## Control Type Guide

| Control  | Best Suited For          | Strength |
| -------- | ------------------------ | -------- |
| Canny    | Architecture, objects    | 0.8-1.0  |
| Depth    | 3D scenes, perspective   | 0.6-0.8  |
| Pose     | People, characters       | 0.8-1.0  |
| Scribble | Sketches, concepts       | 0.6-0.8  |
| Line Art | Illustrations            | 0.7-0.9  |
| Softedge | General guidance         | 0.5-0.7  |
| Seg      | Scene composition        | 0.6-0.8  |
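The strength ranges in the table can be kept in a small lookup so that scripts start from a sensible value. This is a sketch: the ranges are copied from the table, while choosing the midpoint as the starting value is an assumption, and `starting_scale` is a hypothetical helper.

```python
# Strength ranges copied from the control type guide
STRENGTH_RANGES = {
    "canny": (0.8, 1.0),
    "depth": (0.6, 0.8),
    "pose": (0.8, 1.0),
    "scribble": (0.6, 0.8),
    "lineart": (0.7, 0.9),
    "softedge": (0.5, 0.7),
    "seg": (0.6, 0.8),
}


def starting_scale(control_type):
    """Return the midpoint of the listed range as a first value to try."""
    low, high = STRENGTH_RANGES[control_type.lower()]
    return round((low + high) / 2, 2)


# Starting weights for a Canny + depth Multi-ControlNet setup
print([starting_scale(t) for t in ("canny", "depth")])
```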

## Performance

| Setup             | GPU      | Resolution | Time |
| ----------------- | -------- | ---------- | ---- |
| Single CN SD1.5   | RTX 3090 | 512x512    | \~3s |
| Multiple CN SD1.5 | RTX 3090 | 512x512    | \~5s |
| Single CN SDXL    | RTX 4090 | 1024x1024  | \~8s |
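The timings above translate into rough throughput figures for planning a batch. The sketch below treats the approximate times from the table as seconds per image and ignores model loading and preprocessing, so actual runs will take longer; the setup keys and helper names are hypothetical.

```python
import math

# Approximate seconds per image, taken from the performance table
SECONDS_PER_IMAGE = {"single_sd15": 3, "multi_sd15": 5, "single_sdxl": 8}


def images_per_hour(setup):
    """Rough upper bound on images generated in one hour."""
    return 3600 // SECONDS_PER_IMAGE[setup]


def batch_minutes(setup, num_images):
    """Rough generation time in whole minutes for a batch."""
    return math.ceil(num_images * SECONDS_PER_IMAGE[setup] / 60)


for setup in SECONDS_PER_IMAGE:
    print(setup, images_per_hour(setup), "images/hour")
```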

## Memory Optimization

```python
# Enable memory-efficient attention (requires the xformers package)
pipe.enable_xformers_memory_efficient_attention()

# CPU offload
pipe.enable_model_cpu_offload()

# Attention slicing
pipe.enable_attention_slicing()
```
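Since xformers is an optional dependency, a script may want to fall back to attention slicing when it is not installed. The sketch below wraps the three calls shown above; `apply_memory_optimizations` is a hypothetical helper, and catching a broad `Exception` is an assumption because the exact error type can vary between versions.

```python
def apply_memory_optimizations(pipe, use_offload=True):
    """Enable memory savers on a diffusers pipeline.

    Returns the names of the optimizations that were applied.
    """
    applied = []
    try:
        pipe.enable_xformers_memory_efficient_attention()
        applied.append("xformers")
    except Exception:
        # xformers missing or unusable: fall back to attention slicing
        pipe.enable_attention_slicing()
        applied.append("attention_slicing")
    if use_offload:
        # With CPU offload, do not also move the pipeline to "cuda"
        pipe.enable_model_cpu_offload()
        applied.append("cpu_offload")
    return applied
```

Calling `apply_memory_optimizations(pipe)` right after `from_pretrained` would replace the three separate calls.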

## Troubleshooting

### Weak Control Effect

* Increase `controlnet_conditioning_scale`
* Check the quality of the preprocessor output
* Use a higher-resolution control image
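When raising the control image resolution, it helps to keep the aspect ratio and use sizes that are multiples of 8, since Stable Diffusion works on latents downsampled by a factor of 8. The sketch below computes such a size; `fit_dimensions` is a hypothetical helper and the default short side of 512 is an assumption suited to SD 1.5.

```python
def fit_dimensions(width, height, short_side=512, multiple=8):
    """Scale so the shorter side equals short_side, then snap both
    sides to the nearest multiple of `multiple`."""
    scale = short_side / min(width, height)

    def snap(value):
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(width), snap(height)


# A 1920x1080 photo scaled for SD 1.5, and for a larger control image
print(fit_dimensions(1920, 1080))
print(fit_dimensions(1920, 1080, short_side=768))
```

The resulting values could be passed as `width` and `height` to the pipeline call.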

### Artifacts

* Lower the control scale
* Use a softer preprocessor (softedge instead of canny)
* Add a negative prompt that targets the artifacts

### VRAM Issues

* Use CPU offload
* Reduce the resolution
* Use one ControlNet at a time

## Cost Estimate

Typical CLORE.AI Marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current prices.*

**Saving money:**

* Use the **Spot** market for flexible workloads (often 30–50% cheaper)
* Pay with the **CLORE** token
* Compare prices across providers

## Next Steps

* Stable Diffusion WebUI
* ComfyUI Workflows
* [Kohya Training](/guides/guides_v2-de/training/kohya-training.md)

