# AI Video Generation

Generate videos using Stable Video Diffusion, AnimateDiff, and other models.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter startup command
5. Select payment: **CLORE**, **BTC**, or **USDT/USDC**
6. Create order and wait for deployment

### Access Your Server

* Find connection details in **My Orders**
* Web interfaces: Use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## Available Models

| Model       | Type           | VRAM | Duration    |
| ----------- | -------------- | ---- | ----------- |
| SVD         | Image-to-Video | 16GB | 4 seconds   |
| SVD-XT      | Image-to-Video | 20GB | 4 seconds   |
| AnimateDiff | Text-to-Video  | 12GB | 2-4 seconds |
| CogVideoX   | Text-to-Video  | 24GB | 6 seconds   |
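
The VRAM column can double as a quick filter when browsing marketplace listings. The sketch below copies the table into a hypothetical `MODEL_VRAM_GB` dictionary and returns the models whose listed requirement fits a given card. The figures are the approximate ones from the table, not measured limits, and the function name is illustrative only.

```python
# Approximate VRAM requirements copied from the table above (in GB).
MODEL_VRAM_GB = {
    "SVD": 16,
    "SVD-XT": 20,
    "AnimateDiff": 12,
    "CogVideoX": 24,
}


def models_that_fit(gpu_vram_gb: float) -> list[str]:
    """Return the models whose listed VRAM requirement fits the given GPU."""
    return [name for name, need in MODEL_VRAM_GB.items() if need <= gpu_vram_gb]


if __name__ == "__main__":
    # Check which models a 16 GB card covers by these figures.
    print(models_that_fit(16))
```

Memory-saving options such as CPU offload can lower the real requirement, so treat the result as a starting point rather than a hard limit.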

## Stable Video Diffusion (SVD)

### Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate gradio imageio imageio-ffmpeg && \
python svd_server.py
```

### Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.
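
As a small illustration of that substitution, the helper below turns an `http_pub` value into a full HTTPS URL. The function name `service_url` is hypothetical, and the hostname is the placeholder example from the steps above, not a real endpoint.

```python
def service_url(http_pub: str, path: str = "/") -> str:
    """Build the public HTTPS URL for a service from its http_pub value."""
    host = http_pub.strip()
    # Accept values that were pasted with a scheme or a trailing slash.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    if not path.startswith("/"):
        path = "/" + path
    return f"https://{host}{path}"


if __name__ == "__main__":
    print(service_url("abc123.clorecloud.net"))
```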

### SVD Script

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from PIL import Image
import imageio

# Load model
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
)
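# Offload idle sub-models to the CPU to save VRAM. This call manages device
# placement itself, so a separate pipe.to("cuda") is not needed.
pipe.enable_model_cpu_offload()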

# Load and resize image
image = Image.open("input.png").resize((1024, 576))

# Generate video
frames = pipe(
    image,
    decode_chunk_size=8,
    num_frames=25,
    motion_bucket_id=127,
    noise_aug_strength=0.02
).frames[0]

# Save as GIF
imageio.mimsave("output.gif", frames, fps=6)

# Save as MP4
imageio.mimsave("output.mp4", frames, fps=6)
```
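
The script resizes the input straight to 1024x576, which stretches any image that is not already 16:9. One way to avoid that, sketched here under the assumption that a center crop is acceptable for your image, is to compute a 16:9 crop box first. The helper name `center_crop_box` is hypothetical.

```python
def center_crop_box(width: int, height: int,
                    target_w: int = 1024, target_h: int = 576) -> tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop
    that has the same aspect ratio as target_w x target_h."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Source is wider than the target: trim the sides.
        new_w = round(height * target_ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target: trim top and bottom.
    new_h = round(width / target_ratio)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


if __name__ == "__main__":
    print(center_crop_box(1024, 1024))
```

With Pillow, the box can be applied before resizing: `image.crop(center_crop_box(*image.size)).resize((1024, 576))`.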

### SVD with Gradio UI

```python
import gradio as gr
import torch
from diffusers import StableVideoDiffusionPipeline
from PIL import Image
import imageio
import tempfile

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

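def generate_video(image, motion_bucket, fps, num_frames):
    image = image.resize((1024, 576))

    # Gradio sliders can return floats; the pipeline expects integers
    frames = pipe(
        image,
        decode_chunk_size=4,
        num_frames=int(num_frames),
        motion_bucket_id=int(motion_bucket),
    ).frames[0]

    # Reserve a temp path, then write after the handle is closed
    with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
        output_path = f.name
    imageio.mimsave(output_path, frames, fps=int(fps))
    return output_path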

demo = gr.Interface(
    fn=generate_video,
    inputs=[
        gr.Image(type="pil", label="Input Image"),
        gr.Slider(1, 255, value=127, step=1, label="Motion Amount"),
        gr.Slider(1, 30, value=6, step=1, label="FPS"),
        gr.Slider(14, 25, value=25, step=1, label="Frames")
    ],
    outputs=gr.Video(label="Generated Video"),
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## AnimateDiff

### Installation

```bash
pip install diffusers transformers accelerate imageio imageio-ffmpeg
```

### Generate Video from Text

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter, DDIMScheduler
import imageio

# Load motion adapter
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# Load pipeline
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    subfolder="scheduler",
    clip_sample=False,
    timestep_spacing="linspace",
    beta_schedule="linear",
    steps_offset=1,
)
# CPU offload manages device placement, so pipe.to("cuda") is not needed
pipe.enable_model_cpu_offload()

# Generate
output = pipe(
    prompt="A cat walking through a garden, beautiful flowers, sunny day",
    negative_prompt="bad quality, blurry",
    num_frames=16,
    guidance_scale=7.5,
    num_inference_steps=25,
)

# Save
frames = output.frames[0]
imageio.mimsave("animatediff.gif", frames, fps=8)
```

### AnimateDiff with Custom Model

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")

# Use a custom checkpoint (e.g., RealisticVision)
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)
```

## AnimateDiff in ComfyUI

### Install Nodes

```bash
cd /workspace/ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved.git
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
```

### Download Motion Models

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models
wget https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt
```

## CogVideoX

### Text-to-Video

```python
import torch
from diffusers import CogVideoXPipeline
import imageio

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-2b",
    torch_dtype=torch.float16
)
# CPU offload manages device placement, so pipe.to("cuda") is not needed
pipe.enable_model_cpu_offload()

prompt = "A drone flying over a beautiful mountain landscape at sunset"

video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
).frames[0]

imageio.mimsave("cogvideo.mp4", video, fps=8)
```

## Video Upscaling

### Real-ESRGAN for Video

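```python
import cv2
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

# RRDBNet configuration for the RealESRGAN_x4plus weights
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(
    scale=4,
    model_path='RealESRGAN_x4plus.pth',
    model=model,
    tile=400,      # Process in 400px tiles to limit VRAM use
    tile_pad=10,
    pre_pad=0,
    half=True      # FP16 inference
)

# Process video frame by frame
cap = cv2.VideoCapture("input.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
writer = None

while True:
    ok, frame = cap.read()  # BGR uint8 frame
    if not ok:
        break
    upscaled, _ = upsampler.enhance(frame, outscale=4)
    if writer is None:
        # Create the writer once the upscaled frame size is known;
        # "upscaled.mp4" is an example output name
        h, w = upscaled.shape[:2]
        writer = cv2.VideoWriter("upscaled.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    writer.write(upscaled)

cap.release()
if writer is not None:
    writer.release()
```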

## Interpolation (Smooth Videos)

### FILM Frame Interpolation

```python
# Install first (run in a shell): pip install tensorflow tensorflow_hub

import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/film/1")

def interpolate(frame1, frame2, num_interpolations=3):
    # Returns interpolated frames between frame1 and frame2
    ...
```
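
Interpolation changes the frame count, so the playback FPS has to rise if the clip should keep its original length. The arithmetic is sketched below, assuming `num_interpolations` new frames are inserted between every adjacent pair; both function names are hypothetical.

```python
def interpolated_frame_count(num_frames: int, num_interpolations: int) -> int:
    """Frame count after inserting num_interpolations frames between each adjacent pair."""
    if num_frames < 2:
        return num_frames
    return num_frames + (num_frames - 1) * num_interpolations


def playback_fps(original_fps: float, num_frames: int, num_interpolations: int) -> float:
    """FPS that keeps the clip duration unchanged after interpolation."""
    new_count = interpolated_frame_count(num_frames, num_interpolations)
    return original_fps * new_count / num_frames


if __name__ == "__main__":
    # A 25-frame clip at 6 FPS with 3 interpolated frames per pair
    print(interpolated_frame_count(25, 3), playback_fps(6, 25, 3))
```

For a 25-frame clip this gives 25 + 24 × 3 = 97 frames, so roughly 23 FPS preserves the original duration.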

### RIFE (Real-Time)

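```python
# Install first (run in a shell): pip install rife-ncnn-vulkan-python

# Module and argument names follow the package README; verify them
# against the version you install
from rife_ncnn_vulkan_python import Rife
rife = Rife(gpuid=0)

# Interpolate frames
```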

## Batch Video Generation

```python
import imageio

# Assumes `pipe` is an already loaded text-to-video pipeline,
# such as the AnimateDiff pipeline above
prompts = [
    "A rocket launching into space",
    "Ocean waves crashing on rocks",
    "A butterfly flying through flowers",
]

for i, prompt in enumerate(prompts):
    print(f"Generating {i+1}/{len(prompts)}")
    output = pipe(prompt, num_frames=16)
    imageio.mimsave(f"video_{i:03d}.mp4", output.frames[0], fps=8)
```

## Memory Tips

### For Limited VRAM

```python
# Enable CPU offload
pipe.enable_model_cpu_offload()

# Enable VAE slicing (only on pipelines that provide it, e.g. AnimateDiff)
pipe.enable_vae_slicing()

# Enable attention slicing
pipe.enable_attention_slicing()

# Reduce frame count
num_frames = 14  # Instead of 25
```

### Chunked Decoding

```python
frames = pipe(
    image,
    decode_chunk_size=2,  # Decode 2 frames at a time
    num_frames=25,
).frames[0]
```
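
`decode_chunk_size` sets how many frames the VAE decodes per pass, so a smaller value lowers peak memory at the cost of more passes. The sketch below only illustrates that trade-off with plain arithmetic; the helper names are hypothetical and it does not call the pipeline.

```python
import math


def decode_passes(num_frames: int, decode_chunk_size: int) -> int:
    """Number of VAE decode passes needed for a clip."""
    return math.ceil(num_frames / decode_chunk_size)


def chunk_ranges(num_frames: int, decode_chunk_size: int) -> list[tuple[int, int]]:
    """Half-open (start, end) frame ranges, one per decode pass."""
    return [
        (start, min(start + decode_chunk_size, num_frames))
        for start in range(0, num_frames, decode_chunk_size)
    ]


if __name__ == "__main__":
    for size in (8, 4, 2):
        print(f"decode_chunk_size={size}: {decode_passes(25, size)} passes")
```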

## Converting Output

### GIF to MP4

```bash
ffmpeg -i input.gif -movflags faststart -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" output.mp4
```

### Frame Sequence to Video

```bash
ffmpeg -framerate 8 -i frame_%04d.png -c:v libx264 -pix_fmt yuv420p output.mp4
```

### Add Audio

```bash
ffmpeg -i video.mp4 -i audio.mp3 -c:v copy -c:a aac -shortest output_with_audio.mp4
```
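
To run these conversions from a generation script rather than a shell, the same arguments can be passed to `subprocess`. This is a minimal sketch that assumes `ffmpeg` is installed and on `PATH`; the function names are hypothetical, and the flags are the ones from the commands above.

```python
import shlex
import subprocess


def gif_to_mp4_command(src: str, dst: str) -> list[str]:
    """Argument list for the GIF-to-MP4 conversion."""
    return [
        "ffmpeg", "-i", src,
        "-movflags", "faststart",
        "-pix_fmt", "yuv420p",
        # Round width and height down to even numbers, as yuv420p requires
        "-vf", "scale=trunc(iw/2)*2:trunc(ih/2)*2",
        dst,
    ]


def frames_to_mp4_command(pattern: str, dst: str, fps: int = 8) -> list[str]:
    """Argument list for encoding a numbered frame sequence."""
    return [
        "ffmpeg", "-framerate", str(fps), "-i", pattern,
        "-c:v", "libx264", "-pix_fmt", "yuv420p", dst,
    ]


def run(command: list[str]) -> None:
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so this works without ffmpeg
    print(shlex.join(gif_to_mp4_command("input.gif", "output.mp4")))
```

Passing a list rather than a single string avoids shell quoting problems with the `scale=` filter expression.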

## Performance

| Model       | GPU      | Frames | Time   |
| ----------- | -------- | ------ | ------ |
| SVD-XT      | RTX 3090 | 25     | \~120s |
| SVD-XT      | RTX 4090 | 25     | \~80s  |
| SVD-XT      | A100     | 25     | \~50s  |
| AnimateDiff | RTX 3090 | 16     | \~30s  |
| CogVideoX   | A100     | 49     | \~180s |

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*

**Save money:**

* Use **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across different providers
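
Combining the Performance and Cost Estimate tables gives a rough per-clip cost. The sketch below uses the approximate figures from those tables (about 80 seconds per SVD-XT clip on an RTX 4090 at about $0.10 per hour). It ignores model download, loading time, and idle time, so real costs will be somewhat higher. The function names are hypothetical.

```python
def cost_per_clip(seconds_per_clip: float, hourly_rate_usd: float) -> float:
    """Rental cost of generating one clip, in USD."""
    return seconds_per_clip / 3600 * hourly_rate_usd


def clips_per_hour(seconds_per_clip: float) -> float:
    """How many clips fit into one hour of continuous generation."""
    return 3600 / seconds_per_clip


if __name__ == "__main__":
    # Approximate figures from the tables above: SVD-XT on an RTX 4090
    seconds, rate = 80, 0.10
    print(f"~${cost_per_clip(seconds, rate):.4f} per clip, "
          f"~{clips_per_hour(seconds):.0f} clips per hour")
```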

## Troubleshooting

### OOM Error

* Reduce `num_frames`
* Enable CPU offload (`pipe.enable_model_cpu_offload()`)
* Use a smaller `decode_chunk_size`
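
These steps can also be automated by retrying with progressively lighter settings. The sketch below is one possible approach, not part of any library: `generate_with_fallback` and the settings ladder are hypothetical, and the default exception type is Python's built-in `MemoryError` so the example runs without a GPU. With PyTorch you would likely pass `oom_errors=(torch.cuda.OutOfMemoryError,)` instead.

```python
from typing import Any, Callable, Iterable


def generate_with_fallback(
    generate: Callable[..., Any],
    settings_ladder: Iterable[dict],
    oom_errors: tuple = (MemoryError,),
) -> tuple[Any, dict]:
    """Try each settings dict in order; return the first result and the settings used."""
    last_error = None
    for settings in settings_ladder:
        try:
            return generate(**settings), settings
        except oom_errors as err:
            # Out of memory: remember the error and try the next, lighter settings
            last_error = err
    raise RuntimeError("All settings ran out of memory") from last_error


if __name__ == "__main__":
    # Stand-in for a pipeline call that fails when the workload is too large
    def fake_generate(num_frames: int, decode_chunk_size: int) -> str:
        if num_frames * decode_chunk_size > 50:
            raise MemoryError("simulated OOM")
        return f"{num_frames} frames"

    ladder = [
        {"num_frames": 25, "decode_chunk_size": 8},
        {"num_frames": 25, "decode_chunk_size": 2},
        {"num_frames": 14, "decode_chunk_size": 2},
    ]
    print(generate_with_fallback(fake_generate, ladder))
```

On a real GPU it may also help to call `torch.cuda.empty_cache()` between attempts.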

### Flickering Video

* Increase `num_inference_steps`
* Try a different `motion_bucket_id`
* Use frame interpolation

### Poor Quality

* Use a higher-resolution input image (SVD)
* Write more detailed prompts (AnimateDiff)
* Increase `guidance_scale`


