# Stable Video Diffusion

{% hint style="info" %}
**Newer alternatives are available!** Consider [**FramePack**](/guides/guides_v2-es/generacion-de-video/framepack.md) (only 6 GB of VRAM), [**Wan2.1**](/guides/guides_v2-es/generacion-de-video/wan-video.md) (higher quality), or [**LTX-2**](/guides/guides_v2-es/generacion-de-video/ltx-video-2.md) (video with native audio).
{% endhint %}

Generate videos from still images with Stability AI's SVD model.

{% hint style="success" %}
All examples can be run on GPU servers rented through the [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## What is Stable Video Diffusion?

SVD (Stable Video Diffusion) generates short video clips from a single image:

* 14-frame (SVD) or 25-frame (SVD-XT) output
* 1024x576 resolution (width x height)
* Smooth motion generation
* Open weights

## Resources

* **HuggingFace:** [stabilityai/stable-video-diffusion-img2vid-xt](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
* **GitHub:** [Stability-AI/generative-models](https://github.com/Stability-AI/generative-models)
* **Paper:** [SVD paper](https://arxiv.org/abs/2311.15127)

## Hardware requirements

| Model              | VRAM  | Recommended GPU |
| ------------------ | ----- | --------------- |
| SVD (14 frames)    | 16 GB | RTX 4090        |
| SVD-XT (25 frames) | 24 GB | RTX 4090 / A100 |
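The table can be turned into a small lookup that picks a checkpoint for a given card. This is a minimal sketch: the helper names `pick_checkpoint` and `detected_vram_gb` are hypothetical, the thresholds are simply the VRAM figures from the table, and the GPU query only runs if PyTorch with CUDA happens to be installed.

```python
from typing import Optional

# VRAM figures (GB) copied from the hardware table; treat them as guidelines.
VRAM_REQUIREMENTS_GB = {
    "stabilityai/stable-video-diffusion-img2vid-xt": 24,  # SVD-XT, 25 frames
    "stabilityai/stable-video-diffusion-img2vid": 16,     # SVD, 14 frames
}


def pick_checkpoint(vram_gb: float) -> Optional[str]:
    """Return the largest checkpoint whose VRAM figure fits, or None."""
    by_size = sorted(VRAM_REQUIREMENTS_GB.items(), key=lambda kv: kv[1], reverse=True)
    for repo_id, needed_gb in by_size:
        if vram_gb >= needed_gb:
            return repo_id
    return None


def detected_vram_gb() -> Optional[int]:
    """Best-effort VRAM query for GPU 0; None if PyTorch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Cards report slightly less than their nominal size, so round to whole GB.
    return round(total_bytes / 1024**3)


if __name__ == "__main__":
    vram = detected_vram_gb()
    print("Detected VRAM (GB):", vram)
    if vram is not None:
        print("Suggested checkpoint:", pick_checkpoint(vram))
```

If `pick_checkpoint` returns `None`, the card is below both figures in the table and the memory optimizations described later in this guide become relevant.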

## Quick deploy

**Docker image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate && \
pip install gradio "imageio[ffmpeg]" && \
python -c "
import gradio as gr
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video
import torch

pipe = StableVideoDiffusionPipeline.from_pretrained(
    'stabilityai/stable-video-diffusion-img2vid-xt',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

def generate(image, seed, fps):
    generator = torch.manual_seed(seed)
    frames = pipe(image, num_frames=25, generator=generator).frames[0]
    export_to_video(frames, 'output.mp4', fps=fps)
    return 'output.mp4'

gr.Interface(
    fn=generate,
    inputs=[gr.Image(type='pil'), gr.Number(value=42, label='Seed'), gr.Slider(6, 30, value=7, label='FPS')],
    outputs=gr.Video(),
    title='Stable Video Diffusion'
).launch(server_name='0.0.0.0', server_port=7860)
"
```

## Accessing your service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to the **My Orders** page
2. Click your order
3. Find the `http_pub` URL (for example, `abc123.clorecloud.net`)

Open `https://YOUR_HTTP_PUB_URL` in your browser to reach the Gradio interface, and use it wherever you would otherwise use `localhost:7860`.
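If you script against the service, the host swap can be done once in a helper. The sketch below uses only the standard library; `public_url` is a hypothetical name, and `abc123.clorecloud.net` is the placeholder host from the steps above, not a real endpoint.

```python
from urllib.parse import urlsplit, urlunsplit


def public_url(http_pub: str, local_url: str = "http://localhost:7860/") -> str:
    """Rewrite a localhost URL so it points at the public http_pub host."""
    host = http_pub.strip()
    # Accept the value pasted with or without a scheme.
    if "://" in host:
        host = urlsplit(host).netloc
    host = host.rstrip("/")
    parts = urlsplit(local_url)
    # Keep path, query and fragment; replace scheme and host.
    return urlunsplit(("https", host, parts.path or "/", parts.query, parts.fragment))


if __name__ == "__main__":
    print(public_url("abc123.clorecloud.net"))
```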

## Installation

```bash
pip install diffusers transformers accelerate torch

# For video export (quoted so shells such as zsh do not expand the brackets)
pip install "imageio[ffmpeg]"
```

## Basic usage

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video

# Load the pipeline
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16"
)
pipe.to("cuda")

# Load the image and resize it to 1024x576 (width x height)
image = load_image("input.jpg")
image = image.resize((1024, 576))

# Generate the video with a fixed seed for reproducibility
generator = torch.manual_seed(42)
frames = pipe(image, num_frames=25, generator=generator).frames[0]

# Save the video
export_to_video(frames, "output.mp4", fps=7)
```
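A plain `resize((1024, 576))` stretches any input that is not already 16:9. One way to avoid the distortion is to center-crop to the target aspect ratio first. The sketch below computes the crop box with integer arithmetic; `center_crop_box` is a hypothetical helper, not part of diffusers or Pillow.

```python
from typing import Tuple


def center_crop_box(
    width: int, height: int, target_w: int = 1024, target_h: int = 576
) -> Tuple[int, int, int, int]:
    """Return a (left, upper, right, lower) box with the target aspect ratio."""
    if width * target_h > height * target_w:
        # Image is wider than the target ratio: trim the left and right edges.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is taller than (or equal to) the target ratio: trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


if __name__ == "__main__":
    print(center_crop_box(2000, 1000))
```

With a Pillow image, the box plugs straight into `image.crop(center_crop_box(*image.size)).resize((1024, 576))`.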

## SVD vs SVD-XT

| Feature             | SVD   | SVD-XT  |
| ------------------- | ----- | ------- |
| Frames              | 14    | 25      |
| Duration (at 7 fps) | \~2 s | \~3.6 s |
| VRAM                | 16 GB | 24 GB   |
| Quality             | Good  | Better  |
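The duration row follows directly from frame count divided by playback rate. A minimal sketch, assuming the 7 fps used in the export examples in this guide (`clip_duration_seconds` is a hypothetical helper name):

```python
def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Playback length of a clip: number of frames divided by frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


if __name__ == "__main__":
    for name, frames in (("SVD", 14), ("SVD-XT", 25)):
        print(f"{name}: {clip_duration_seconds(frames, 7):.2f} s at 7 fps")
```

Exporting the same frames at a higher `fps` makes the clip shorter, not smoother, so the frame count is the real limit on length.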

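## Memory optimization

```python
# The two offload calls are alternatives: pick ONE of them.

# Option 1: keep only the active sub-model on the GPU (moderate savings, small slowdown)
pipe.enable_model_cpu_offload()

# Option 2, for very low VRAM: offload layer by layer (largest savings, much slower)
# pipe.enable_sequential_cpu_offload()

# Optional, can be combined with either option: compute attention in slices
pipe.enable_attention_slicing()
```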
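Since the offload calls are alternatives rather than settings to stack, it can help to route the choice through one function. The sketch below is a hypothetical helper (`apply_memory_strategy` and its strategy names are not part of diffusers); it only calls pipeline methods already shown in this section, and a small stand-in object is used so the example runs without a GPU.

```python
def apply_memory_strategy(pipe, strategy: str = "model_offload") -> None:
    """Enable exactly one CPU-offload strategy on a diffusers pipeline."""
    if strategy == "model_offload":
        pipe.enable_model_cpu_offload()
    elif strategy == "sequential_offload":
        pipe.enable_sequential_cpu_offload()
    elif strategy != "none":
        raise ValueError(f"unknown strategy: {strategy!r}")


class FakePipeline:
    """Stand-in that records calls; replace with the real `pipe` in practice."""

    def __init__(self):
        self.calls = []

    def enable_model_cpu_offload(self):
        self.calls.append("model_offload")

    def enable_sequential_cpu_offload(self):
        self.calls.append("sequential_offload")


if __name__ == "__main__":
    demo = FakePipeline()
    apply_memory_strategy(demo, "sequential_offload")
    print(demo.calls)
```

In diffusers examples these offload calls are typically used in place of `pipe.to("cuda")`, because offloading manages device placement itself.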

## Batch processing

```python
from pathlib import Path

from diffusers.utils import load_image, export_to_video

# Reuses the `pipe` object created in "Basic usage"
input_dir = Path("./images")
output_dir = Path("./videos")
output_dir.mkdir(exist_ok=True)

# Generate one video per JPEG in the input directory
for img_path in input_dir.glob("*.jpg"):
    image = load_image(str(img_path)).resize((1024, 576))
    frames = pipe(image, num_frames=25).frames[0]
    export_to_video(frames, str(output_dir / f"{img_path.stem}.mp4"), fps=7)
    print(f"Generated: {img_path.stem}.mp4")
```
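On a rented GPU, a long batch may be interrupted, so it can be useful to skip images that already have a video. The following sketch builds the list of remaining jobs with the standard library only; `pending_jobs` and the set of accepted extensions are assumptions for illustration.

```python
from pathlib import Path
from typing import List, Tuple

# Extensions treated as input images (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def pending_jobs(input_dir: Path, output_dir: Path) -> List[Tuple[Path, Path]]:
    """Pair each input image with its .mp4 target, skipping finished ones."""
    jobs = []
    for src in sorted(input_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        dst = output_dir / f"{src.stem}.mp4"
        if not dst.exists():
            jobs.append((src, dst))
    return jobs


if __name__ == "__main__":
    in_dir, out_dir = Path("./images"), Path("./videos")
    if in_dir.is_dir():
        for src, dst in pending_jobs(in_dir, out_dir):
            print(f"{src.name} -> {dst}")
```

The returned pairs can replace the `glob("*.jpg")` loop: load `src`, generate, and export to `dst`.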

## ComfyUI integration

SVD works well in ComfyUI:

1. Install ComfyUI
2. Download the SVD model to `models/checkpoints/`
3. Use the SVD nodes for the img2vid workflow

## Troubleshooting

### Out of memory

* Use `enable_model_cpu_offload()`
* Reduce `num_frames` to 14
* Use the fp16 variant

### Video too short

* Use SVD-XT (25 frames) instead of SVD (14 frames)
* Interpolate with RIFE for a smoother result
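To see what interpolation buys, it helps to count frames. The sketch below assumes an interpolator that inserts `factor - 1` new frames between each adjacent pair of originals; that is an assumption about the tool, and actual RIFE front ends may count slightly differently.

```python
def interpolated_frame_count(num_frames: int, factor: int) -> int:
    """Frames after inserting (factor - 1) frames between each adjacent pair."""
    if num_frames < 1 or factor < 1:
        raise ValueError("num_frames and factor must be at least 1")
    return (num_frames - 1) * factor + 1


if __name__ == "__main__":
    frames = interpolated_frame_count(25, 2)
    print(f"25 frames at 2x -> {frames} frames")
    print(f"at 14 fps: {frames / 14:.1f} s, at 7 fps: {frames / 7:.1f} s")
```

Under this assumption, a 25-frame clip becomes 49 frames at 2x: played at 14 fps it keeps roughly the original length but moves more smoothly, while played at the original 7 fps it runs about twice as long.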

### Poor motion quality

* Use high-quality input images
* Make sure the image is 1024x576 (or 576x1024)
* Try different seeds

### CUDA errors

* Update PyTorch and diffusers
* Check CUDA version compatibility

## Cost estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly rate | Daily rate | 4-hour session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider. Check the* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*
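For session lengths not listed in the table, the estimate is simply the hourly rate multiplied by the number of hours. A minimal sketch, using the approximate hourly figures above as example inputs (`session_cost` is a hypothetical helper, and real marketplace prices will differ):

```python
def session_cost(hourly_rate_usd: float, hours: float) -> float:
    """Estimated rental cost in USD, rounded to cents."""
    if hourly_rate_usd < 0 or hours < 0:
        raise ValueError("rate and hours must be non-negative")
    return round(hourly_rate_usd * hours, 2)


if __name__ == "__main__":
    # Approximate hourly rates taken from the table above.
    for gpu, rate in (("RTX 4090", 0.10), ("A100 80GB", 0.25)):
        print(f"{gpu}: 4 h = ${session_cost(rate, 4):.2f}, 24 h = ${session_cost(rate, 24):.2f}")
```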

## Next steps

* AnimateDiff - Animate SD images
* [RIFE Interpolation](/guides/guides_v2-es/procesamiento-de-video/rife-interpolation.md) - Increase FPS
* [Hunyuan Video](/guides/guides_v2-es/generacion-de-video/hunyuan-video.md) - Text-to-video

