# Stable Video Diffusion

{% hint style="info" %}
**Newer alternatives available!** Consider [**FramePack**](/guides/video-generation/framepack.md) (only 6GB VRAM!), [**Wan2.1**](/guides/video-generation/wan-video.md) (higher quality), or [**LTX-2**](/guides/video-generation/ltx-video-2.md) (video with native audio).
{% endhint %}

Generate videos from images using Stability AI's SVD model.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## What is Stable Video Diffusion?

SVD (Stable Video Diffusion) generates short video clips from a single image:

* 14 or 25 frame outputs
* 1024x576 resolution (width x height)
* Smooth motion generation
* Open source weights

## Resources

* **HuggingFace:** [stabilityai/stable-video-diffusion-img2vid-xt](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
* **GitHub:** [Stability-AI/generative-models](https://github.com/Stability-AI/generative-models)
* **Paper:** [SVD Paper](https://arxiv.org/abs/2311.15127)

## Hardware Requirements

| Model              | VRAM | Recommended GPU |
| ------------------ | ---- | --------------- |
| SVD (14 frames)    | 16GB | RTX 4090        |
| SVD-XT (25 frames) | 24GB | RTX 4090 / A100 |

## Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate && \
pip install gradio "imageio[ffmpeg]" && \
python -c "
import gradio as gr
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video
import torch

pipe = StableVideoDiffusionPipeline.from_pretrained(
    'stabilityai/stable-video-diffusion-img2vid-xt',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

def generate(image, seed, fps):
    # Gradio may pass numbers as floats, so cast before seeding and exporting
    generator = torch.manual_seed(int(seed))
    frames = pipe(image, num_frames=25, generator=generator).frames[0]
    export_to_video(frames, 'output.mp4', fps=int(fps))
    return 'output.mp4'

gr.Interface(
    fn=generate,
    inputs=[gr.Image(type='pil'), gr.Number(value=42, label='Seed'), gr.Slider(6, 30, value=7, label='FPS')],
    outputs=gr.Video(),
    title='Stable Video Diffusion'
).launch(server_name='0.0.0.0', server_port=7860)
"
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to the **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Open `https://YOUR_HTTP_PUB_URL` in your browser to reach the Gradio interface, and use this address wherever an example refers to `localhost`.

## Installation

```bash
pip install diffusers transformers accelerate torch

# For video export
pip install "imageio[ffmpeg]"
```

## Basic Usage

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load pipeline
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Load and resize image
image = load_image("input.jpg")
image = image.resize((1024, 576))

# Generate video
generator = torch.manual_seed(42)
frames = pipe(image, num_frames=25, generator=generator).frames[0]

# Save video
export_to_video(frames, "output.mp4", fps=7)
```
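
Calling `resize((1024, 576))` directly stretches any input that is not already 16:9. One way to avoid that distortion is to center-crop to the target aspect ratio before resizing. The sketch below computes the crop box with plain integer arithmetic; `center_crop_box` is a hypothetical helper name, not part of diffusers.

```python
def center_crop_box(width, height, target_w=1024, target_h=576):
    """Return (left, top, right, bottom) of the largest centered crop
    with the same aspect ratio as target_w x target_h."""
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * target_h > height * target_w:
        # Source is wider than the target: trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is as tall as or taller than the target: trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


print(center_crop_box(1920, 1080))  # already 16:9, nothing is trimmed
print(center_crop_box(1000, 1000))  # square input, top and bottom are trimmed
```

With Pillow, the returned box can be passed to `image.crop(box)` before `image.resize((1024, 576))`.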

## SVD vs SVD-XT

| Feature           | SVD     | SVD-XT    |
| ----------------- | ------- | --------- |
| Frames            | 14      | 25        |
| Duration at 7 FPS | \~2 sec | \~3.5 sec |
| VRAM              | 16GB    | 24GB      |
| Quality           | Good    | Better    |
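
The durations in the table follow from dividing the frame count by the playback rate. A minimal sketch of that arithmetic, assuming the 7 FPS export rate used in the examples on this page (`clip_duration` is a hypothetical helper name):

```python
def clip_duration(num_frames, fps=7):
    """Playback length in seconds for a clip of num_frames at fps."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


for name, frames in [("SVD", 14), ("SVD-XT", 25)]:
    print(f"{name}: {clip_duration(frames):.2f} s at 7 FPS")
```

Exporting the same frames at a higher FPS shortens the clip proportionally, since the number of frames does not change.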

## Memory Optimization

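```python
# Offload idle submodels to the CPU; call this instead of pipe.to("cuda")
pipe.enable_model_cpu_offload()

# Run the UNet feed-forward layers in chunks to lower peak memory
pipe.unet.enable_forward_chunking()

# Decode fewer frames at a time in the VAE (slower, but uses less VRAM)
frames = pipe(image, num_frames=25, decode_chunk_size=2).frames[0]

# For very low VRAM, use this instead of enable_model_cpu_offload() (much slower)
# pipe.enable_sequential_cpu_offload()
```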
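
Which of these options to enable depends on how much VRAM the rented GPU has. The sketch below is one rough way to encode that choice. The 24 GB and 16 GB cutoffs are taken from the Hardware Requirements table, but the offload mode assigned to each tier is an assumption to adjust after testing, and `pick_settings` is a hypothetical helper name.

```python
def pick_settings(vram_gb):
    """Map available VRAM (GB) to a frame count and an offload mode.

    Assumption: the cutoffs mirror the Hardware Requirements table;
    the offload mode per tier is a starting guess, not a measured result.
    """
    if vram_gb >= 24:
        return {"num_frames": 25, "offload": "none"}
    if vram_gb >= 16:
        return {"num_frames": 14, "offload": "model"}
    return {"num_frames": 14, "offload": "sequential"}


for gb in (24, 16, 8):
    print(gb, pick_settings(gb))
```

Here `"none"` would correspond to `pipe.to("cuda")`, `"model"` to `pipe.enable_model_cpu_offload()`, and `"sequential"` to `pipe.enable_sequential_cpu_offload()`.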

## Batch Processing

```python
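from pathlib import Path

from diffusers.utils import load_image, export_to_video

# Assumes `pipe` is already loaded as shown in Basic Usage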

input_dir = Path("./images")
output_dir = Path("./videos")
output_dir.mkdir(exist_ok=True)

for img_path in input_dir.glob("*.jpg"):
    image = load_image(str(img_path)).resize((1024, 576))
    frames = pipe(image, num_frames=25).frames[0]
    export_to_video(frames, str(output_dir / f"{img_path.stem}.mp4"), fps=7)
    print(f"Generated: {img_path.stem}.mp4")
```
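
On a rented server, a batch run may be interrupted before it finishes. A minimal sketch of a resumable job list, assuming outputs are named after the input file stem as in the loop above; `pending_jobs` and `IMAGE_SUFFIXES` are hypothetical names introduced for illustration:

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extend as needed


def pending_jobs(input_dir, output_dir):
    """Return (image_path, video_path) pairs that still need rendering."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    jobs = []
    for img_path in sorted(input_dir.iterdir()):
        if img_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        video_path = output_dir / f"{img_path.stem}.mp4"
        if not video_path.exists():  # skip clips rendered in an earlier run
            jobs.append((img_path, video_path))
    return jobs
```

Iterating over `pending_jobs("./images", "./videos")` in place of the `glob` call would skip clips that already exist, so a restarted run continues where it stopped.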

## ComfyUI Integration

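ComfyUI supports SVD through its built-in img2vid nodes:

1. Install ComfyUI
2. Download the SVD checkpoint to `models/checkpoints/`
3. Build an img2vid workflow with the `Image Only Checkpoint Loader (img2vid model)` and `SVD_img2vid_Conditioning` nodes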

## Troubleshooting

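### Out of memory

* Call `pipe.enable_model_cpu_offload()` instead of `pipe.to("cuda")`
* Reduce `num_frames` to 14
* Load the fp16 variant (`torch_dtype=torch.float16`, `variant="fp16"`)
* Pass a smaller `decode_chunk_size` to the pipeline call so the VAE decodes fewer frames at once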

### Video too short

* Use SVD-XT (25 frames) instead of SVD (14 frames)
* Interpolate with RIFE to add in-between frames: this gives smoother motion at a higher FPS, or a longer clip if you keep the original FPS

### Poor motion quality

* Use high-quality input images
* Resize or crop the input to 1024x576 (landscape) or 576x1024 (portrait)
* Try different seeds
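
Besides the seed, the diffusers pipeline accepts a `motion_bucket_id` argument that conditions the amount of motion (higher values give more motion). The sketch below enumerates a small grid of seeds and motion values to try. `sweep` is a hypothetical helper, and apart from the diffusers default of 127, the motion values are arbitrary examples.

```python
from itertools import product


def sweep(seeds=(0, 1, 2), motion_bucket_ids=(64, 127, 180)):
    """Yield one settings dict per (seed, motion_bucket_id) pair.

    127 is the diffusers default for motion_bucket_id; 64 and 180 are
    arbitrary example values, not recommendations.
    """
    for seed, motion in product(seeds, motion_bucket_ids):
        yield {"seed": seed, "motion_bucket_id": motion}


for cfg in sweep():
    print(cfg)
    # With a loaded pipeline and input image (see Basic Usage):
    # generator = torch.manual_seed(cfg["seed"])
    # frames = pipe(image, num_frames=25, generator=generator,
    #               motion_bucket_id=cfg["motion_bucket_id"]).frames[0]
    # export_to_video(frames, f"seed{cfg['seed']}_m{cfg['motion_bucket_id']}.mp4", fps=7)
```

Rendering the whole grid multiplies generation time, so it may be worth starting with two or three combinations.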

### CUDA errors

* Update PyTorch and diffusers
* Check CUDA version compatibility

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
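
The session column is the hourly rate multiplied by the session length, rounded. A minimal sketch of that arithmetic, extended to a per-clip estimate; the 120-second generation time is a placeholder rather than a measured figure, and both function names are hypothetical:

```python
def session_cost(hourly_rate, hours):
    """Rental cost in USD for a session of the given length."""
    return hourly_rate * hours


def cost_per_clip(hourly_rate, seconds_per_clip):
    """Rental cost in USD attributable to one generated clip."""
    return hourly_rate * seconds_per_clip / 3600


rate = 0.10  # approximate RTX 4090 hourly rate from the table above
print(f"4-hour session: ${session_cost(rate, 4):.2f}")
# seconds_per_clip=120 is a placeholder; measure your own generation time
print(f"Per clip: ${cost_per_clip(rate, 120):.4f}")
```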

## Next Steps

* AnimateDiff - Animate SD images
* [RIFE Interpolation](/guides/video-processing/rife-interpolation.md) - Increase FPS
* [Hunyuan Video](/guides/video-generation/hunyuan-video.md) - Text-to-video

