# AnimateDiff

AnimateDiff is a plug-and-play module that **animates your existing Stable Diffusion models** without any additional training. With over 10,000 GitHub stars, it is the go-to framework for turning still-image SD checkpoints into smooth, temporally-consistent video generators. Run it on a Clore.ai GPU instance using ComfyUI as the front-end for maximum flexibility.

***

## What is AnimateDiff?

AnimateDiff inserts a **motion module** into a frozen Stable Diffusion U-Net. The motion module is trained once on video data and can be combined with any fine-tuned SD 1.5 checkpoint — DreamBooth models, LoRAs, ControlNet adapters — without re-training. The result is short animated clips (typically 16–32 frames at 8 fps) that preserve the style of the base model.

**Key highlights:**

* Works with any SD 1.5 checkpoint out of the box
* Compatible with ControlNet, IP-Adapter, LoRAs, and other extensions
* ComfyUI node ecosystem provides full pipeline control
* SDXL motion modules available for higher-resolution output
* Community-maintained model zoo with domain-specific motion modules

***

## Prerequisites

| Requirement | Minimum  | Recommended     |
| ----------- | -------- | --------------- |
| GPU VRAM    | 8 GB     | 16–24 GB        |
| GPU         | RTX 3080 | RTX 4090 / A100 |
| RAM         | 16 GB    | 32 GB           |
| Storage     | 20 GB    | 50+ GB          |

{% hint style="info" %}
AnimateDiff with a standard 16-frame sequence at 512×512 consumes approximately 8–10 GB VRAM. For 768×768 or longer sequences, 16+ GB is recommended.
{% endhint %}
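
After renting an instance (Step 1 below), you can confirm the GPU and its memory headroom over SSH:

```bash
# Show GPU model, total VRAM, and current usage
nvidia-smi --query-gpu=name,memory.total,memory.used --format=csv
```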

***

## Step 1 — Rent a GPU on Clore.ai

1. Go to [clore.ai](https://clore.ai) and sign in.
2. Click **Marketplace** and filter by VRAM (≥ 16 GB for best results).
3. Select a server — RTX 4090 or A6000 offers the best price/performance.
4. Under **Docker image**, enter your custom image (see Step 2 below).
5. Configure **open ports**: `22` (SSH) and `8188` (ComfyUI web UI).
6. Click **Rent** and wait for the instance to start (\~1–2 minutes).

{% hint style="info" %}
Use the **Advanced** port configuration to map port `8188` to a public port. Note the assigned public port — you will use it to access the ComfyUI web interface.
{% endhint %}

***

## Step 2 — Docker Image

There is no single official AnimateDiff Docker image. The recommended approach is to use a **ComfyUI-based image** with AnimateDiff nodes pre-installed.

**Recommended public image:**

```
yanwk/comfyui-boot:latest
```
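
To sanity-check this image on any Docker host before renting, run it with GPU access; it serves ComfyUI on port 8188 by default:

```bash
# Pull and run the prebuilt ComfyUI image
docker run --rm --gpus all -p 8188:8188 yanwk/comfyui-boot:latest
```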

Or build your own:

```dockerfile
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime

RUN apt-get update && apt-get install -y \
    git wget curl ffmpeg libgl1 libglib2.0-0 \
    openssh-server && \
    rm -rf /var/lib/apt/lists/*

# Set up SSH (change this default root password before using the image publicly)
RUN mkdir /var/run/sshd && \
    echo 'root:clore123' | chpasswd && \
    sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config

# Clone ComfyUI
RUN git clone https://github.com/comfyanonymous/ComfyUI /workspace/ComfyUI && \
    cd /workspace/ComfyUI && pip install -r requirements.txt

# Install ComfyUI Manager
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/ltdrdata/ComfyUI-Manager

# Install AnimateDiff-Evolved nodes
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved && \
    pip install -r ComfyUI-AnimateDiff-Evolved/requirements.txt

# Install VideoHelperSuite for output
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite && \
    pip install -r ComfyUI-VideoHelperSuite/requirements.txt

EXPOSE 22 8188

CMD service ssh start && \
    python /workspace/ComfyUI/main.py --listen 0.0.0.0 --port 8188 --enable-cors-header
```
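
Build and test locally before publishing (the image tag below is an example), then push to a public registry such as Docker Hub so Clore.ai can pull it:

```bash
# Build the image and run it with GPU access for a local test
docker build -t my-animatediff-comfyui .
docker run --gpus all -p 8188:8188 -p 2222:22 my-animatediff-comfyui
```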

***

## Step 3 — Connect via SSH

Once the instance is running, connect via SSH to download models:

```bash
ssh root@<clore-host> -p <assigned-ssh-port>
```

Replace `<clore-host>` and `<assigned-ssh-port>` with the values shown in your Clore.ai dashboard.

***

## Step 4 — Download Models

AnimateDiff requires at minimum a **base SD 1.5 checkpoint** and a **motion module**.

### Download Motion Module

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models

# v3 motion module (recommended)
wget -O v3_sd15_mm.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_mm.ckpt"

# v2 motion module (wider compatibility)
wget -O mm_sd_v15_v2.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt"
```
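
Motion modules are large files, and a truncated download will fail to load in ComfyUI. Verify the sizes before moving on:

```bash
# Both modules should be well over 1 GB
ls -lh /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models
```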

### Download a Base SD 1.5 Checkpoint

```bash
cd /workspace/ComfyUI/models/checkpoints

# Realistic Vision (popular for AnimateDiff)
wget -O realisticVisionV60B1_v51VAE.safetensors \
  "https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE/resolve/main/Realistic_Vision_V6.0_B1_fp16-no-ema.safetensors"
```

{% hint style="info" %}
You can use any SD 1.5 fine-tune. Popular choices include DreamShaper, Deliberate, and epiCPhotoGasm. Download from CivitAI or Hugging Face.
{% endhint %}
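
For example, DreamShaper is hosted in the `Lykon/DreamShaper` repository on Hugging Face (verify the exact file name in the repo, as it may change):

```bash
cd /workspace/ComfyUI/models/checkpoints

wget -O DreamShaper_8_pruned.safetensors \
  "https://huggingface.co/Lykon/DreamShaper/resolve/main/DreamShaper_8_pruned.safetensors"
```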

### (Optional) Download Additional Motion Modules

TemporalDiff is a community motion module fine-tuned for steadier motion; note that it targets SD 1.5, not SDXL. For SDXL checkpoints, use the SDXL beta motion module from the same `guoyww/animatediff` repository.

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models

# TemporalDiff: alternative SD 1.5 motion module
wget -O temporaldiff-v1-animatediff.safetensors \
  "https://huggingface.co/CiaraRowles/TemporalDiff/resolve/main/temporaldiff-v1-animatediff.safetensors"

# SDXL beta motion module (pair with an SDXL checkpoint)
wget -O mm_sdxl_v10_beta.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/mm_sdxl_v10_beta.ckpt"
```

***

## Step 5 — Access ComfyUI

Open your browser and navigate to:

```
http://<clore-host>:<public-port-8188>
```

You should see the ComfyUI node editor interface.

{% hint style="info" %}
Bookmark this URL. ComfyUI autosaves your workflow as you work — no need to manually save unless exporting JSON.
{% endhint %}

***

## Step 6 — Load an AnimateDiff Workflow

### Basic AnimateDiff Workflow

In ComfyUI, press **Load** to import a saved workflow JSON, or build the graph manually with these nodes:

**Core node chain:**

1. `Load Checkpoint` → your SD 1.5 checkpoint
2. `CLIP Text Encode (Prompt)` → positive and negative prompts
3. `Empty Latent Image` → set `batch_size` to the frame count (e.g. 16)
4. `AnimateDiff Loader` → select your motion module
5. `KSampler` → sampling settings
6. `VAE Decode` → decode latents
7. `Video Combine` (VideoHelperSuite) → export as GIF/MP4

### Recommended Sampling Settings

| Parameter      | Value           |
| -------------- | --------------- |
| Steps          | 20–25           |
| CFG Scale      | 7–8             |
| Sampler        | DPM++ 2M Karras |
| Width × Height | 512 × 512       |
| Frames         | 16              |
| Context Length | 16              |

***

## Step 7 — Run Your First Animation

1. In the `CLIP Text Encode` node, enter your prompt:

   ```
   A majestic lion walking through tall grass at sunset, cinematic, 4k
   ```
2. In the negative prompt node:

   ```
   worst quality, low quality, blurry, watermark, deformed, nsfw
   ```
3. In `AnimateDiff Loader`, select `v3_sd15_mm.ckpt`
4. Click **Queue Prompt**

{% hint style="info" %}
Generation time for 16 frames at 512×512 with 20 steps is approximately **30–60 seconds** on an RTX 4090. Time scales roughly linearly with frame count and with pixel count, so 768×768 costs about 2.25× as much as 512×512.
{% endhint %}

***

## Advanced Techniques

### Using ControlNet with AnimateDiff

AnimateDiff works with ControlNet for guided video generation:

```bash
# Download ControlNet model
cd /workspace/ComfyUI/models/controlnet
wget -O control_v11p_sd15_openpose.pth \
  "https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_openpose.pth"
```

Add an `Apply ControlNet` node: feed it the conditioning from `CLIP Text Encode`, the model from `Load ControlNet Model`, and an OpenPose skeleton image, then route its output conditioning into the `KSampler`.
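
The workflow also needs OpenPose skeleton images as input. To extract skeletons from reference photos directly on the instance, you can install the community preprocessor pack alongside the other custom nodes:

```bash
# Install ControlNet preprocessors (OpenPose, depth, canny, and others)
cd /workspace/ComfyUI/custom_nodes
git clone https://github.com/Fannovel16/comfyui_controlnet_aux
pip install -r comfyui_controlnet_aux/requirements.txt
```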

### Prompt Travel (Keyframe Animation)

The AnimateDiff-Evolved node supports **prompt travel** — different text prompts at different frames:

```
"A forest at dawn" → frame 0
"A forest at noon" → frame 8
"A forest at sunset" → frame 16
```

This creates smooth transitions between scenes without manual keyframing.
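
AnimateDiff-Evolved does not schedule prompts on its own; a common setup (an assumption here, verify against the node pack's docs) pairs it with the `BatchPromptSchedule` node from the FizzNodes pack, where prompts are written as `"frame": "prompt"` pairs:

```bash
# Install FizzNodes, which provides BatchPromptSchedule
cd /workspace/ComfyUI/custom_nodes
git clone https://github.com/FizzleDorf/ComfyUI_FizzNodes
pip install numexpr  # runtime dependency of FizzNodes
```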

### Using Motion LoRAs with AnimateDiff

```bash
# Download motion LoRAs (camera moves) into the AnimateDiff-Evolved motion_lora folder
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/motion_lora
wget -O v2_lora_PanLeft.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/v2_lora_PanLeft.ckpt"
```

Load motion LoRAs with the `AnimateDiff LoRA Loader` node connected to the `AnimateDiff Loader`, not the standard `LoRA Loader`. The repository offers camera motion effects such as PanLeft, PanRight, ZoomIn, ZoomOut, and RollingAnticlockwise; they are designed for the v2 motion module (`mm_sd_v15_v2.ckpt`).

***

## Output Formats

AnimateDiff via VideoHelperSuite supports:

| Format     | Node            | Notes               |
| ---------- | --------------- | ------------------- |
| GIF        | `Video Combine` | Best for sharing    |
| MP4 (h264) | `Video Combine` | Smallest file size  |
| WebP       | `Video Combine` | Good quality/size   |
| PNG frames | `Save Image`    | For post-processing |
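
If you export PNG frames for post-processing, `ffmpeg` (installed by the Dockerfile in Step 2) can stitch them back into a video. The input pattern below is an example; match it to the file names `Save Image` actually produces:

```bash
# Assemble PNG frames into an H.264 MP4 at AnimateDiff's native 8 fps
ffmpeg -framerate 8 -i ComfyUI_%05d_.png -c:v libx264 -pix_fmt yuv420p output.mp4
```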

***

## Troubleshooting

### Out of Memory (CUDA OOM)

```
RuntimeError: CUDA out of memory
```

**Solutions:**

* Reduce frame count (try 8 instead of 16)
* Reduce resolution (512×512 is the sweet spot for SD 1.5)
* Start ComfyUI with the `--lowvram` flag (see the command below)
* Force half precision with `--force-fp16`, or download `fp16` checkpoint variants
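
For example, restart ComfyUI (stop the running process first) with memory-saving flags:

```bash
# Memory-constrained startup command
python /workspace/ComfyUI/main.py --listen 0.0.0.0 --port 8188 --lowvram --force-fp16
```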

### Motion Module Not Found

```
Error: motion module not found
```

**Solution:** Verify the `.ckpt` file is in:

```
/workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models/
```

Refresh the ComfyUI page to reload available models.

### Flickering / Inconsistent Frames

**Solutions:**

* Increase `context_length` to match total frame count
* Use `v3_sd15_mm.ckpt` instead of v2 (better temporal consistency)
* Lower CFG scale (try 7 instead of 9)
* Use a deterministic sampler such as `DPM++ 2M Karras` or `Euler`; ancestral samplers like `Euler a` inject fresh noise at every step and can increase flicker

### SSH Connection Refused

```bash
ssh: connect to host <ip> port <port>: Connection refused
```

**Solution:** Wait 1–2 minutes for the SSH daemon to start, or check if the container has fully initialized via the Clore.ai dashboard logs.

***

## Clore.ai GPU Recommendations

AnimateDiff runs on the SD 1.5 backbone, so its VRAM requirements are modest compared to modern video models, making it budget-friendly.

| GPU           | VRAM  | Clore.ai Price | 16-frame @ 512px | Notes                                              |
| ------------- | ----- | -------------- | ---------------- | -------------------------------------------------- |
| RTX 3090      | 24 GB | \~$0.12/hr     | \~50s            | Best value — run multiple queued batches           |
| RTX 4090      | 24 GB | \~$0.70/hr     | \~30s            | Fastest consumer GPU                               |
| A100 40GB     | 40 GB | \~$1.20/hr     | \~18s            | Overkill for SD 1.5, but good for SDXL+AnimateDiff |
| RTX 3080 10GB | 10 GB | \~$0.07/hr     | \~90s            | Budget minimum — limited to 512px, shorter clips   |

{% hint style="info" %}
**RTX 3090 is the AnimateDiff sweet spot** at \~$0.12/hr. A 16-frame animation takes \~50 seconds, meaning roughly 70 clips per hour, or about 600 clips per dollar. For high-volume content creation, batch queue in ComfyUI and run overnight.
{% endhint %}

**SDXL AnimateDiff users:** The SDXL motion modules require 12GB+ VRAM for 768px. RTX 3090/4090 handle this well. RTX 3080 (10GB) is too limited for SDXL workflows.

***

## Useful Resources

* [AnimateDiff GitHub](https://github.com/guoyww/AnimateDiff)
* [ComfyUI-AnimateDiff-Evolved](https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved)
* [ComfyUI Official](https://github.com/comfyanonymous/ComfyUI)
* [AnimateDiff Motion Models (HuggingFace)](https://huggingface.co/guoyww/animatediff)
* [CivitAI — SD Checkpoints](https://civitai.com)
