
# AI Video Generation

Generate videos with Stable Video Diffusion, AnimateDiff, and other models.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit the [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed price) or **Spot** (bid price)
4. Configure your order:
   * Select a Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter the startup command
5. Select a payment method: **CLORE**, **BTC**, or **USDT/USDC**
6. Create the order and wait for deployment

### Accessing Your Server

* Find the connection details in **My Orders**
* Web interfaces: use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## Available Models

| Model       | Type           | VRAM  | Duration    |
| ----------- | -------------- | ----- | ----------- |
| SVD         | Image-to-video | 16 GB | 4 seconds   |
| SVD-XT      | Image-to-video | 20 GB | 4 seconds   |
| AnimateDiff | Text-to-video  | 12 GB | 2–4 seconds |
| CogVideoX   | Text-to-video  | 24 GB | 6 seconds   |
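
The VRAM column can serve as a quick filter when choosing a GPU on the marketplace. The following is a minimal sketch: the `MODELS` dictionary only mirrors the table above, and `models_that_fit` is a hypothetical helper name, not part of any library.

```python
# VRAM requirements in GB, copied from the table above.
MODELS = {
    "SVD": 16,
    "SVD-XT": 20,
    "AnimateDiff": 12,
    "CogVideoX": 24,
}


def models_that_fit(gpu_vram_gb):
    """Return the model names whose listed VRAM requirement fits on the GPU."""
    return sorted(name for name, need in MODELS.items() if need <= gpu_vram_gb)


if __name__ == "__main__":
    for vram in (12, 16, 24):
        print(f"{vram} GB: {models_that_fit(vram)}")
```

The figures are the guide's rough requirements; actual usage depends on resolution, frame count, and the memory options described further down.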

## Stable Video Diffusion (SVD)

### Quick Deployment

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

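**Command** (assumes the Gradio script further down this page has been saved as `svd_server.py` in the working directory):

```bash
pip install diffusers transformers accelerate gradio imageio imageio-ffmpeg && \
python svd_server.py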
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to the **My Orders** page
2. Click your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in the examples below.
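
The substitution can also be done in code. This is a small sketch using only string handling; `service_url` is a hypothetical helper, and the host name is the placeholder from the list above, not a real endpoint.

```python
def service_url(http_pub, path="/"):
    """Build an HTTPS URL from the http_pub host shown for the order."""
    host = http_pub.strip()
    # Tolerate a pasted scheme or a trailing slash.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not path.startswith("/"):
        path = "/" + path
    return f"https://{host}{path}"


if __name__ == "__main__":
    print(service_url("abc123.clorecloud.net"))
```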

### SVD Script

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from PIL import Image
import imageio

# Load the model
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
)
# CPU offload manages device placement itself, so do not also call pipe.to("cuda")
pipe.enable_model_cpu_offload()

# Load the input image, force RGB, and resize it to 1024x576
image = Image.open("input.png").convert("RGB").resize((1024, 576))

# Generate the video frames
frames = pipe(
    image,
    decode_chunk_size=8,
    num_frames=25,
    motion_bucket_id=127,
    noise_aug_strength=0.02,
).frames[0]

# Save as GIF
imageio.mimsave("output.gif", frames, fps=6)

# Save as MP4 (needs an ffmpeg backend such as imageio-ffmpeg)
imageio.mimsave("output.mp4", frames, fps=6)
```
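
The clip length follows directly from the number of frames and the playback rate passed to `imageio.mimsave`. The sketch below just performs that division for the settings used on this page; `clip_duration_seconds` is a hypothetical helper name.

```python
def clip_duration_seconds(num_frames, fps):
    """Playback length of a clip with num_frames frames shown at fps frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    # (model, num_frames, fps) as used in the scripts on this page
    for name, frames, fps in [("SVD-XT", 25, 6), ("AnimateDiff", 16, 8), ("CogVideoX", 49, 8)]:
        print(f"{name}: {clip_duration_seconds(frames, fps):.2f} s")
```

Raising `fps` without generating more frames shortens the clip rather than smoothing it; smoother motion at the same length requires additional frames, for example from frame interpolation.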

### SVD with Gradio UI

```python
import gradio as gr
import torch
from diffusers import StableVideoDiffusionPipeline
from PIL import Image
import imageio
import tempfile

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

def generate_video(image, motion_bucket, fps, num_frames):
    # Force RGB and resize to 1024x576
    image = image.convert("RGB").resize((1024, 576))

    frames = pipe(
        image,
        decode_chunk_size=4,
        # Sliders may deliver floats; the pipeline needs integers
        num_frames=int(num_frames),
        motion_bucket_id=int(motion_bucket),
    ).frames[0]

    # Reserve a temp file name, then write the MP4 once the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
        out_path = f.name
    imageio.mimsave(out_path, frames, fps=int(fps))
    return out_path

demo = gr.Interface(
    fn=generate_video,
    inputs=[
        gr.Image(type="pil", label="Input image"),
        gr.Slider(1, 255, value=127, step=1, label="Motion strength"),
        gr.Slider(1, 30, value=6, step=1, label="FPS"),
        gr.Slider(14, 25, value=25, step=1, label="Frames"),
    ],
    outputs=gr.Video(label="Generated Video"),
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## AnimateDiff

### Installation

```bash
pip install diffusers transformers accelerate imageio
```

### Generate Video from Text

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
import imageio

# Load the motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# Load the pipeline
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)
# CPU offload manages device placement itself, so pipe.to("cuda") is not needed
pipe.enable_model_cpu_offload()

# Generate
output = pipe(
    prompt="A cat walking through a garden, beautiful flowers, sunny day",
    negative_prompt="bad quality, blurry",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
)

# Save
frames = output.frames[0]
imageio.mimsave("animatediff.gif", frames, fps=8)
```

### AnimateDiff with a Custom Model

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# Use a custom checkpoint (e.g. RealisticVision)
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
```

## AnimateDiff in ComfyUI

### Install Nodes

```bash
cd /workspace/ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved.git
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
```

### Download Motion Models

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models
wget https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt
```

## CogVideoX

### Text-to-Video

```python
import torch
from diffusers import CogVideoXPipeline
import imageio

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)
# CPU offload manages device placement itself, so pipe.to("cuda") is not needed
pipe.enable_model_cpu_offload()

prompt = "A drone flying over a beautiful mountain landscape at sunset"

video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
).frames[0]

imageio.mimsave("cogvideo.mp4", video, fps=8)
```

## Video Upscaling

### Real-ESRGAN for Video

```python
import cv2
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(
    scale=4,
    model_path='RealESRGAN_x4plus.pth',
    model=model,
    tile=400,
    tile_pad=10,
    pre_pad=0,
    half=True
)

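# Process the video frame by frame
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
writer = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # enhance() returns the upscaled BGR frame and the image mode
    upscaled, _ = upsampler.enhance(frame, outscale=4)
    if writer is None:
        # Create the writer once the output size is known
        h, w = upscaled.shape[:2]
        writer = cv2.VideoWriter("output.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    writer.write(upscaled)

cap.release()
if writer is not None:
    writer.release()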
```

## Interpolation (Smooth Videos)

### FILM Frame Interpolation

```python

# Install first (shell command): pip install tensorflow tensorflow_hub

import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/film/1")

def interpolate(frame1, frame2, num_interpolations=3):
    # Returns interpolated frames between frame1 and frame2
    ...
```
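
The `interpolate` stub above leaves the model call open. One common way to obtain several in-between frames from a model that predicts a single midpoint is to apply it recursively. The sketch below shows only that recursion: `midpoint_fn` is a hypothetical stand-in for the model call and is replaced by a plain average in the usage example, so this is not FILM's actual API.

```python
def interpolate_recursive(frame_a, frame_b, depth, midpoint_fn):
    """Return the in-between frames in playback order, endpoints excluded.

    Each level of recursion doubles the frame rate:
    depth=1 yields 1 new frame, depth=2 yields 3, depth=3 yields 7.
    """
    if depth <= 0:
        return []
    mid = midpoint_fn(frame_a, frame_b)
    left = interpolate_recursive(frame_a, mid, depth - 1, midpoint_fn)
    right = interpolate_recursive(mid, frame_b, depth - 1, midpoint_fn)
    return left + [mid] + right


if __name__ == "__main__":
    # Numbers stand in for frames; averaging stands in for the model.
    average = lambda a, b: (a + b) / 2
    print(interpolate_recursive(0.0, 1.0, 2, average))
```

With real frames, `midpoint_fn` would wrap whichever interpolation model is loaded; the recursion itself stays the same.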

### RIFE (Real-Time)

```python
# Install first (shell command): pip install rife-ncnn-vulkan-python

from rife_ncnn_vulkan_python import Rife

rife = Rife(gpuid=0)

# Interpolate frames
```

## Batch Video Generation

```python
import imageio

# Assumes `pipe` is a text-to-video pipeline that is already loaded (e.g. AnimateDiff)
prompts = [
    "A rocket launching into space",
    "Ocean waves crashing against rocks",
    "A butterfly flying through flowers",
]

for i, prompt in enumerate(prompts):
    print(f"Generating {i+1}/{len(prompts)}")
    output = pipe(prompt, num_frames=16)
    imageio.mimsave(f"video_{i:03d}.mp4", output.frames[0], fps=8)
```
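
In a long batch, one failing prompt should not abort the remaining ones. The following sketch wraps the loop above so failures are collected instead of raised. `run_batch`, `generate`, and `save` are hypothetical names; the two callables are passed in so the loop can be tried without a GPU.

```python
def run_batch(prompts, generate, save):
    """Generate and save one video per prompt; return the prompts that failed."""
    failed = []
    for i, prompt in enumerate(prompts):
        out_path = f"video_{i:03d}.mp4"
        try:
            frames = generate(prompt)
            save(out_path, frames)
        except RuntimeError as err:
            # CUDA errors, including out-of-memory, are RuntimeError subclasses
            failed.append((prompt, str(err)))
    return failed


if __name__ == "__main__":
    # Dry run with stand-ins for the pipeline and the file writer.
    written = {}
    failures = run_batch(
        ["A rocket launching into space", "Ocean waves crashing against rocks"],
        generate=lambda prompt: [prompt],
        save=lambda path, frames: written.update({path: frames}),
    )
    print(sorted(written), failures)
```

With a loaded pipeline, `generate` could be `lambda p: pipe(p, num_frames=16).frames[0]` and `save` could be `lambda path, frames: imageio.mimsave(path, frames, fps=8)`.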

## Memory Tips

### For Limited VRAM

```python

# Enable CPU offload
pipe.enable_model_cpu_offload()

# Enable VAE slicing
pipe.enable_vae_slicing()

# Enable attention slicing
pipe.enable_attention_slicing()

# Reduce the number of frames
num_frames = 14  # instead of 25
```

### Chunked Decoding

```python
frames = pipe(
    image,
    decode_chunk_size=2,  # Decode 2 frames at a time
    num_frames=25,
).frames[0]
```
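
A smaller `decode_chunk_size` lowers peak memory during decoding at the price of more decoding passes. The sketch below only illustrates how the frames split into chunks for a given setting; `decode_chunks` is a hypothetical helper and not the diffusers implementation.

```python
def decode_chunks(num_frames, decode_chunk_size):
    """Return (start, end) frame index ranges, one per decoding pass."""
    if decode_chunk_size <= 0:
        raise ValueError("decode_chunk_size must be positive")
    return [
        (start, min(start + decode_chunk_size, num_frames))
        for start in range(0, num_frames, decode_chunk_size)
    ]


if __name__ == "__main__":
    for size in (8, 4, 2):
        print(f"decode_chunk_size={size}: {len(decode_chunks(25, size))} passes")
```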

## Converting the Output

### GIF to MP4

```bash
ffmpeg -i input.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" output.mp4
```

### Frame Sequence to Video

```bash
ffmpeg -framerate 8 -i frame_%04d.png -c:v libx264 -pix_fmt yuv420p output.mp4
```

### Add Audio

```bash
ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac -shortest output_with_audio.mp4
```
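
When conversion is part of a Python pipeline, the same ffmpeg calls can be issued through `subprocess`. This is a minimal sketch for the frame-sequence command above; `frames_to_mp4_cmd` and `run_ffmpeg` are hypothetical helper names, and the added `-y` flag (overwrite existing output) is not in the original command.

```python
import shlex
import subprocess


def frames_to_mp4_cmd(pattern, output, fps=8):
    """Build the ffmpeg argument list for turning a frame sequence into an MP4."""
    return [
        "ffmpeg", "-y",  # -y overwrites an existing output file
        "-framerate", str(fps),
        "-i", pattern,
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",
        output,
    ]


def run_ffmpeg(cmd):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = frames_to_mp4_cmd("frame_%04d.png", "output.mp4")
    print(shlex.join(cmd))  # inspect the command, then call run_ffmpeg(cmd)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.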

## Performance

| Model       | GPU      | Frames | Time   |
| ----------- | -------- | ------ | ------ |
| SVD-XT      | RTX 3090 | 25     | \~120s |
| SVD-XT      | RTX 4090 | 25     | \~80s  |
| SVD-XT      | A100     | 25     | \~50s  |
| AnimateDiff | RTX 3090 | 16     | \~30s  |
| CogVideoX   | A100     | 49     | \~180s |

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current prices.*

**Saving money:**

* Use the **Spot** market for flexible workloads (often 30–50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across providers
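
Combining the generation times with the hourly rates gives a rough cost per clip. The sketch below uses the approximate figures from the two tables on this page. The performance table does not say which A100 variant was measured, so the 40 GB rate is an assumption, and `cost_per_clip` is a hypothetical helper name.

```python
def cost_per_clip(seconds_per_clip, hourly_rate_usd):
    """Rental cost in USD for one generation run of the given length."""
    return seconds_per_clip / 3600 * hourly_rate_usd


# (model, GPU, seconds per clip, hourly rate in USD), approximate values from this page.
# Assumption: the A100 timing is paired with the A100 40GB rate.
EXAMPLES = [
    ("SVD-XT", "RTX 3090", 120, 0.06),
    ("SVD-XT", "RTX 4090", 80, 0.10),
    ("SVD-XT", "A100 40GB", 50, 0.17),
]

if __name__ == "__main__":
    for model, gpu, seconds, rate in EXAMPLES:
        print(f"{model} on {gpu}: ${cost_per_clip(seconds, rate):.4f} per clip")
```

The estimate covers generation time only; model download, loading, and idle time on the rented server add to the real bill.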

## Troubleshooting

### OOM Errors

* Reduce num\_frames
* Enable CPU offload
* Use a smaller decode\_chunk\_size
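
These fallbacks can be applied automatically by retrying with progressively lighter settings. The following is a sketch under stated assumptions: `generate_with_fallback` and `ATTEMPTS` are hypothetical names, the setting values are the ones used elsewhere on this page, and out-of-memory errors are detected by their message text.

```python
def generate_with_fallback(generate, attempts):
    """Try each settings dict in order; return (result, settings) for the first that fits."""
    last_error = None
    for settings in attempts:
        try:
            return generate(**settings), settings
        except RuntimeError as err:
            # Only swallow out-of-memory errors; re-raise anything else.
            if "out of memory" not in str(err).lower():
                raise
            last_error = err
    raise RuntimeError("all fallback settings ran out of memory") from last_error


# Ordered from best quality to lowest memory use.
ATTEMPTS = [
    {"num_frames": 25, "decode_chunk_size": 8},
    {"num_frames": 25, "decode_chunk_size": 2},
    {"num_frames": 14, "decode_chunk_size": 2},
]
```

With an SVD pipeline, `generate` could be `lambda **s: pipe(image, **s).frames[0]`. Calling `torch.cuda.empty_cache()` between attempts may help release memory left over from the failed run.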

### Flickering Video

* Increase num\_inference\_steps
* Try a different motion\_bucket\_id
* Use frame interpolation

### Poor Quality

* Use a higher-resolution input image (SVD)
* Write better prompts (AnimateDiff)
* Increase guidance\_scale

