# Stable Video Diffusion

{% hint style="info" %}
**Newer alternatives available!** Consider [**FramePack**](/guides/guides_v2-de/video-generierung/framepack.md) (only 6 GB VRAM), [**Wan2.1**](/guides/guides_v2-de/video-generierung/wan-video.md) (higher quality), or [**LTX-2**](/guides/guides_v2-de/video-generierung/ltx-video-2.md) (video with native audio).
{% endhint %}

Generate short videos from a single image with Stability AI's SVD model.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## What is Stable Video Diffusion?

SVD (Stable Video Diffusion) generates a short video clip from a single input image:

* Output of 14 or 25 frames
* Resolution of 1024x576 (width x height)
* Smooth motion generation
* Publicly available weights (see the model card for license terms)

## Resources

* **HuggingFace:** [stabilityai/stable-video-diffusion-img2vid-xt](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
* **GitHub:** [Stability-AI/generative-models](https://github.com/Stability-AI/generative-models)
* **Paper:** [SVD paper](https://arxiv.org/abs/2311.15127)

## Hardware Requirements

| Model              | VRAM  | Recommended GPU |
| ------------------ | ----- | --------------- |
| SVD (14 frames)    | 16 GB | RTX 4090        |
| SVD-XT (25 frames) | 24 GB | RTX 4090 / A100 |
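
The table above can be turned into a small selection helper. This is only a sketch: `pick_checkpoint` is a hypothetical name, the thresholds are copied from the table, and the third return value simply flags GPUs below 16 GB where the offloading options from the memory optimization section would probably be needed.

```python
# Hypothetical helper: map available VRAM to an SVD checkpoint,
# using the thresholds from the hardware table on this page.
SVD = "stabilityai/stable-video-diffusion-img2vid"        # 14 frames
SVD_XT = "stabilityai/stable-video-diffusion-img2vid-xt"  # 25 frames


def pick_checkpoint(vram_gb: float) -> tuple[str, int, bool]:
    """Return (model_id, num_frames, needs_offload) for a GPU with `vram_gb` of memory."""
    if vram_gb >= 24:
        return SVD_XT, 25, False
    if vram_gb >= 16:
        return SVD, 14, False
    # Below the table's minimum: assume the smaller model plus CPU offloading.
    return SVD, 14, True


for vram in (24, 16, 12):
    print(vram, "GB ->", pick_checkpoint(vram))
```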

## Quick Deploy

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate && \
pip install gradio "imageio[ffmpeg]" && \
python -c "
import gradio as gr
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video

# Load the 25-frame SVD-XT checkpoint in half precision
pipe = StableVideoDiffusionPipeline.from_pretrained(
    'stabilityai/stable-video-diffusion-img2vid-xt',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

def generate(image, seed, fps):
    # Resize to the 1024x576 resolution used elsewhere on this page
    image = image.resize((1024, 576))
    # Gradio may pass numeric inputs as floats, so cast them defensively
    generator = torch.manual_seed(int(seed))
    frames = pipe(image, num_frames=25, generator=generator).frames[0]
    export_to_video(frames, 'output.mp4', fps=int(fps))
    return 'output.mp4'

# Serve the UI on all interfaces so the published HTTP port can reach it
gr.Interface(
    fn=generate,
    inputs=[gr.Image(type='pil'), gr.Number(value=42, label='Seed'), gr.Slider(6, 30, value=7, step=1, label='FPS')],
    outputs=gr.Video(),
    title='Stable Video Diffusion'
).launch(server_name='0.0.0.0', server_port=7860)
"
```

## Accessing Your Service

After deployment, you can find your `http_pub` URL under **My Orders**:

1. Go to the **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g. `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in the examples below.
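
As a small illustration of the substitution described above, the following sketch builds the public address from the `http_pub` value. `service_url` is a hypothetical helper, not part of any CLORE.AI tooling, and the hostname is the sample value from the steps above.

```python
from urllib.parse import urlunsplit


def service_url(http_pub: str, path: str = "/") -> str:
    """Build the public HTTPS URL for a host copied from the `http_pub` field."""
    # Accept the value with or without a scheme or trailing slash.
    host = http_pub.strip().removeprefix("https://").removeprefix("http://").rstrip("/")
    return urlunsplit(("https", host, path, "", ""))


# Sample hostname from this page; replace it with your own http_pub value.
print(service_url("abc123.clorecloud.net"))
```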

## Installation

```bash
pip install diffusers transformers accelerate torch

# For video export (quoted so the brackets survive shells such as zsh)
pip install "imageio[ffmpeg]"
```

## Basic Usage

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the pipeline in half precision
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Load the input image and resize it to 1024x576 (width x height)
image = load_image("input.jpg")
image = image.resize((1024, 576))

# Generate the video frames with a fixed seed for reproducibility
generator = torch.manual_seed(42)
frames = pipe(image, num_frames=25, generator=generator).frames[0]

# Save the video
export_to_video(frames, "output.mp4", fps=7)
```
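
A plain `resize((1024, 576))` stretches any image that is not already 16:9. One possible workaround is to center-crop to the target aspect ratio first. The sketch below only computes the crop box with integer arithmetic; `center_crop_box` is a hypothetical helper, and the commented usage line assumes a Pillow image such as the one returned by `load_image`.

```python
def center_crop_box(width, height, target_w=1024, target_h=576):
    """Return the (left, upper, right, lower) box that crops a width x height
    image to the target aspect ratio, keeping the crop centered."""
    if width * target_h > height * target_w:
        # Image is too wide: keep the full height and trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is too tall (or already matches): keep the full width, trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


# Usage with a Pillow image (assumption: `image` is a PIL.Image.Image):
# image = image.crop(center_crop_box(*image.size)).resize((1024, 576))
print(center_crop_box(1000, 1000))
```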

## SVD vs. SVD-XT

| Feature           | SVD    | SVD-XT   |
| ----------------- | ------ | -------- |
| Frames            | 14     | 25       |
| Duration at 7 fps | \~2 s  | \~3.5 s  |
| VRAM              | 16 GB  | 24 GB    |
| Quality           | Good   | Better   |
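
The durations in the table follow from dividing the frame count by the playback rate, using the 7 fps from the export examples on this page. A minimal sketch of that arithmetic (`clip_seconds` is a hypothetical helper name):

```python
def clip_seconds(num_frames: int, fps: int = 7) -> float:
    """Length of a clip in seconds when played back at `fps` frames per second."""
    return num_frames / fps


for name, frames in (("SVD", 14), ("SVD-XT", 25)):
    print(f"{name}: {frames} frames at 7 fps = {clip_seconds(frames):.2f} s")
```

Exporting at a higher `fps` makes the same frames play faster and therefore shortens the clip.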

## Memory Optimization

```python

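# Choose ONE offload strategy and do not call pipe.to("cuda") when using it.

# Option 1: model CPU offload (moves each sub-model to the GPU only while it runs)
pipe.enable_model_cpu_offload()

# Optional: attention slicing trades speed for lower peak memory
pipe.enable_attention_slicing()

# Option 2, for very low VRAM: sequential CPU offload (slowest; use instead of option 1)
# pipe.enable_sequential_cpu_offload()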
```
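
Beyond the pipeline-level switches above, diffusers also exposes feed-forward chunking on the UNet and a `decode_chunk_size` argument on the pipeline call, both of which reduce peak memory at some cost in speed. The wrapper below is a sketch under assumptions: `generate_low_vram` is a hypothetical name, `pipe` is an already loaded `StableVideoDiffusionPipeline` that has not been moved to the GPU, and the default chunk size of 2 is only a starting point.

```python
def generate_low_vram(pipe, image, num_frames=25, decode_chunk_size=2, generator=None):
    """Run an already loaded SVD pipeline with memory-saving settings.

    Assumes `pipe` was NOT moved to the GPU with pipe.to("cuda").
    """
    # Keep sub-models on the CPU and move each one to the GPU only while it runs.
    pipe.enable_model_cpu_offload()
    # Run the UNet feed-forward layers in chunks instead of all at once.
    pipe.unet.enable_forward_chunking()
    # Decode only a few frames at a time in the VAE.
    result = pipe(
        image,
        num_frames=num_frames,
        decode_chunk_size=decode_chunk_size,
        generator=generator,
    )
    return result.frames[0]
```

A larger `decode_chunk_size` is faster but uses more memory while the frames are decoded.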

## Batch Processing

```python
from pathlib import Path

from diffusers.utils import load_image, export_to_video

# Reuses the `pipe` object created earlier on this page
input_dir = Path("./images")
output_dir = Path("./videos")
output_dir.mkdir(exist_ok=True)

for img_path in sorted(input_dir.glob("*.jpg")):
    # Resize each image to 1024x576 before generation
    image = load_image(str(img_path)).resize((1024, 576))
    frames = pipe(image, num_frames=25).frames[0]
    export_to_video(frames, str(output_dir / f"{img_path.stem}.mp4"), fps=7)
    print(f"Generated: {img_path.stem}.mp4")
```
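
On rented GPUs it can help to make a batch run resumable, so that a restarted job does not regenerate clips that already exist. The sketch below only plans the work; `pending_jobs` is a hypothetical helper and the extra file patterns are an assumption beyond the `*.jpg` used above.

```python
from pathlib import Path


def pending_jobs(input_dir, output_dir, patterns=("*.jpg", "*.jpeg", "*.png")):
    """Return sorted (image_path, video_path) pairs whose video does not exist yet."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for pattern in patterns:
        for img_path in input_dir.glob(pattern):
            video_path = output_dir / f"{img_path.stem}.mp4"
            # Skip images that were already rendered in an earlier run.
            if not video_path.exists():
                jobs.append((img_path, video_path))
    return sorted(jobs)
```

Each returned pair can then be passed through the same load, generate, and export steps as in the loop above.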

## ComfyUI Integration

SVD can also be used from ComfyUI:

1. Install ComfyUI
2. Download the SVD model into `models/checkpoints/`
3. Use the SVD nodes to build an img2vid workflow

## Troubleshooting

{% hint style="danger" %}
**Out of memory**
{% endhint %}

* Use `enable_model_cpu_offload()`
* Reduce `num_frames` to 14
* Use the fp16 variant

### Video too short

* Use SVD-XT (25 frames) instead of SVD (14 frames)
* Interpolate with RIFE for a smoother result
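
To see what interpolation buys, the sketch below counts frames under the assumption that the interpolator inserts `factor - 1` new frames between each neighbouring pair. Actual RIFE tools may count slightly differently, and `interpolated_frames` is a hypothetical helper.

```python
def interpolated_frames(num_frames: int, factor: int = 2) -> int:
    """Frame count after inserting (factor - 1) frames between each neighbouring pair."""
    return (num_frames - 1) * factor + 1


frames = interpolated_frames(25, factor=2)
# Same frames, two playback choices: a longer clip or a smoother one.
print(f"{frames} frames: {frames / 7:.1f} s at 7 fps, {frames / 14:.1f} s at 14 fps")
```

Keeping the original frame rate lengthens the clip, while doubling the frame rate keeps the length and smooths the motion.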

### Poor motion quality

* Use high-quality input images
* Make sure the image is 1024x576 (or 576x1024)
* Try different seeds

### CUDA errors

* Update PyTorch and diffusers
* Check CUDA version compatibility
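
Before updating anything, it is useful to record which versions are actually installed. The sketch below reads package metadata without importing the libraries, so it also works when an import itself fails; `installed_versions` is a hypothetical helper and the package list is an assumption based on the install commands on this page.

```python
from importlib import metadata


def installed_versions(packages=("torch", "diffusers", "transformers", "accelerate")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")

# With PyTorch installed, torch.version.cuda and torch.cuda.is_available()
# additionally show the CUDA build and whether a GPU is visible.
```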

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly rate | Daily rate | 4-hour session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
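
For session lengths not listed in the table, the cost is simply the hourly rate times the number of hours. The sketch below uses the approximate hourly rates from the table, which are indicative 2024 figures and not live prices; `session_cost` is a hypothetical helper.

```python
# Approximate hourly rates in USD, copied from the table on this page.
HOURLY_USD = {
    "RTX 3060": 0.03,
    "RTX 3090": 0.06,
    "RTX 4090": 0.10,
    "A100 40GB": 0.17,
    "A100 80GB": 0.25,
}


def session_cost(gpu: str, hours: float) -> float:
    """Estimated cost in USD of renting `gpu` for `hours` hours."""
    return round(HOURLY_USD[gpu] * hours, 2)


print(f"RTX 4090 for 4 hours: about ${session_cost('RTX 4090', 4):.2f}")
```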

## Next Steps

* AnimateDiff - Animate SD images
* [RIFE Interpolation](/guides/guides_v2-de/videoverarbeitung/rife-interpolation.md) - Increase the FPS
* [Hunyuan Video](/guides/guides_v2-de/video-generierung/hunyuan-video.md) - Text-to-video

