# AnimateDiff

AnimateDiff is a plug-and-play motion module that **animates existing Stable Diffusion checkpoints** without retraining them. With over 10,000 GitHub stars, it is the go-to framework for turning still-image SD checkpoints into smooth, temporally consistent video generators. This guide runs it on a Clore.ai GPU instance with ComfyUI as the front-end for maximum flexibility.

***

## What is AnimateDiff?

AnimateDiff inserts a **motion module** into a frozen Stable Diffusion U-Net. The motion module is trained once on video data and can be combined with any fine-tuned SD 1.5 checkpoint — DreamBooth models, LoRAs, ControlNet adapters — without re-training. The result is short animated clips (typically 16–32 frames at 8 fps) that preserve the style of the base model.

**Key highlights:**

* Works with any SD 1.5 checkpoint out of the box
* Compatible with ControlNet, IP-Adapter, LoRAs, and other extensions
* ComfyUI node ecosystem provides full pipeline control
* SDXL motion modules available for higher-resolution output
* Community-maintained model zoo with domain-specific motion modules

***

## Prerequisites

| Requirement | Minimum  | Recommended     |
| ----------- | -------- | --------------- |
| GPU VRAM    | 8 GB     | 16–24 GB        |
| GPU         | RTX 3080 | RTX 4090 / A100 |
| RAM         | 16 GB    | 32 GB           |
| Storage     | 20 GB    | 50+ GB          |

{% hint style="info" %}
AnimateDiff with a standard 16-frame sequence at 512×512 consumes approximately 8–10 GB VRAM. For 768×768 or longer sequences, 16+ GB is recommended.
{% endhint %}

***

## Step 1 — Rent a GPU on Clore.ai

1. Go to [clore.ai](https://clore.ai) and sign in.
2. Click **Marketplace** and filter by VRAM (≥ 16 GB for best results).
3. Select a server — RTX 4090 or A6000 offers the best price/performance.
4. Under **Docker image**, enter your custom image (see Step 2 below).
5. Configure **open ports**: `22` (SSH) and `8188` (ComfyUI web UI).
6. Click **Rent** and wait for the instance to start (\~1–2 minutes).

{% hint style="info" %}
Use the **Advanced** port configuration to map port `8188` to a public port. Note the assigned public port — you will use it to access the ComfyUI web interface.
{% endhint %}

***

## Step 2 — Docker Image

There is no single official AnimateDiff Docker image. The recommended approach is to use a **ComfyUI-based image** with AnimateDiff nodes pre-installed.

**Recommended public image:**

```
yanwk/comfyui-boot:latest
```
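
If you want to sanity-check the image before renting, a minimal local test run might look like the sketch below (it assumes a local NVIDIA GPU with the NVIDIA Container Toolkit installed):

```bash
# Pull the image and run it locally, exposing the ComfyUI port
docker run --rm -it --gpus all -p 8188:8188 yanwk/comfyui-boot:latest
```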

Or build your own:

```dockerfile
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime

RUN apt-get update && apt-get install -y \
    git wget curl ffmpeg libgl1 libglib2.0-0 \
    openssh-server && \
    rm -rf /var/lib/apt/lists/*

# Set up SSH (change this default root password before deploying)
RUN mkdir /var/run/sshd && \
    echo 'root:clore123' | chpasswd && \
    sed -i 's/#PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config

# Clone ComfyUI
RUN git clone https://github.com/comfyanonymous/ComfyUI /workspace/ComfyUI && \
    cd /workspace/ComfyUI && pip install -r requirements.txt

# Install ComfyUI Manager
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/ltdrdata/ComfyUI-Manager

# Install AnimateDiff-Evolved nodes
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved && \
    pip install -r ComfyUI-AnimateDiff-Evolved/requirements.txt

# Install VideoHelperSuite for output
RUN cd /workspace/ComfyUI/custom_nodes && \
    git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite && \
    pip install -r ComfyUI-VideoHelperSuite/requirements.txt

EXPOSE 22 8188

CMD service ssh start && \
    python /workspace/ComfyUI/main.py --listen 0.0.0.0 --port 8188 --enable-cors-header
```
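
To use a custom image on Clore.ai, build it and push it to a registry the marketplace can pull from. A sketch, where `youruser/comfyui-animatediff` is a placeholder for your own Docker Hub namespace:

```bash
# Build the image from the Dockerfile above and push it to Docker Hub
docker build -t youruser/comfyui-animatediff:latest .
docker push youruser/comfyui-animatediff:latest
```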

***

## Step 3 — Connect via SSH

Once the instance is running, connect via SSH to download models:

```bash
ssh root@<clore-host> -p <assigned-ssh-port>
```

Replace `<clore-host>` and `<assigned-ssh-port>` with the values shown in your Clore.ai dashboard.

***

## Step 4 — Download Models

AnimateDiff requires at minimum a **base SD 1.5 checkpoint** and a **motion module**.

### Download Motion Module

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models

# v3 motion module (recommended)
wget -O v3_sd15_mm.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_mm.ckpt"

# v2 motion module (wider compatibility)
wget -O mm_sd_v15_v2.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/mm_sd_v15_v2.ckpt"
```

### Download a Base SD 1.5 Checkpoint

```bash
cd /workspace/ComfyUI/models/checkpoints

# Realistic Vision (popular for AnimateDiff)
wget -O realisticVisionV60B1_fp16-no-ema.safetensors \
  "https://huggingface.co/SG161222/Realistic_Vision_V6.0_B1_noVAE/resolve/main/Realistic_Vision_V6.0_B1_fp16-no-ema.safetensors"
```

{% hint style="info" %}
You can use any SD 1.5 fine-tune. Popular choices include DreamShaper, Deliberate, and epiCPhotoGasm. Download from CivitAI or Hugging Face.
{% endhint %}

### (Optional) Download an Alternative Motion Module

TemporalDiff is a community fine-tune of the SD 1.5 motion module, trained at a higher resolution for improved temporal stability:

```bash
cd /workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models

wget -O temporaldiff-v1-animatediff.safetensors \
  "https://huggingface.co/CiaraRowles/TemporalDiff/resolve/main/temporaldiff-v1-animatediff.safetensors"
```

For SDXL workflows, a beta SDXL motion module (`mm_sdxl_v10_beta.ckpt`) is available in the same [guoyww/animatediff](https://huggingface.co/guoyww/animatediff) repository on Hugging Face.

***

## Step 5 — Access ComfyUI

Open your browser and navigate to:

```
http://<clore-host>:<public-port-8188>
```

You should see the ComfyUI node editor interface.
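
If the editor does not load, you can check from your SSH session whether ComfyUI is actually listening. ComfyUI exposes a small JSON status endpoint:

```bash
# Returns JSON with system and GPU/VRAM info when ComfyUI is running
curl -s http://127.0.0.1:8188/system_stats
```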

{% hint style="info" %}
Bookmark this URL. ComfyUI autosaves your workflow as you work — no need to manually save unless exporting JSON.
{% endhint %}

***

## Step 6 — Load an AnimateDiff Workflow

### Basic AnimateDiff Workflow

In ComfyUI, press **Load** to import a workflow JSON (example workflows are available in the ComfyUI-AnimateDiff-Evolved repository), or build the graph manually from these nodes:

**Core node chain:**

1. `Load Checkpoint` → your SD 1.5 checkpoint
2. `CLIP Text Encode (Prompt)` → positive and negative prompts
3. `AnimateDiff Loader` → select your motion module
4. `KSampler` → sampling settings
5. `VAE Decode` → decode latents
6. `Video Combine` (VideoHelperSuite) → export as GIF/MP4

### Recommended Sampling Settings

| Parameter      | Value           |
| -------------- | --------------- |
| Steps          | 20–25           |
| CFG Scale      | 7–8             |
| Sampler        | DPM++ 2M Karras |
| Width × Height | 512 × 512       |
| Frames         | 16              |
| Context Length | 16              |

***

## Step 7 — Run Your First Animation

1. In the `CLIP Text Encode` node, enter your prompt:

   ```
   A majestic lion walking through tall grass at sunset, cinematic, 4k
   ```
2. In the negative prompt node:

   ```
   worst quality, low quality, blurry, watermark, deformed, nsfw
   ```
3. In `AnimateDiff Loader`, select `v3_sd15_mm.ckpt`
4. Click **Queue Prompt**

{% hint style="info" %}
Generation time for 16 frames at 512×512 with 20 steps is approximately **30–60 seconds** on an RTX 4090. Longer sequences and higher resolutions scale roughly linearly in frame count and pixel area.
{% endhint %}
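
You can also queue generations without the browser through ComfyUI's HTTP API, which is useful for batch runs. A sketch, assuming you have exported your workflow in API format (enable dev mode in the ComfyUI settings, then use **Save (API Format)**) to a file named `workflow_api.json`:

```bash
# Queue the exported workflow via ComfyUI's /prompt endpoint
curl -s -X POST http://127.0.0.1:8188/prompt \
  -H "Content-Type: application/json" \
  -d "{\"prompt\": $(cat workflow_api.json)}"
```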

***

## Advanced Techniques

### Using ControlNet with AnimateDiff

AnimateDiff works with ControlNet for guided video generation:

```bash
# Download ControlNet model
cd /workspace/ComfyUI/models/controlnet
wget -O control_v11p_sd15_openpose.pth \
  "https://huggingface.co/lllyasviel/ControlNet-v1-1/resolve/main/control_v11p_sd15_openpose.pth"
```

Add a `ControlNet Apply` node that takes the model from `Load ControlNet Model` and your positive conditioning, then feed its output into the `KSampler`. Use an OpenPose skeleton image as the conditioning input.

### Prompt Travel (Keyframe Animation)

The AnimateDiff-Evolved node pack supports **prompt travel**, i.e. different text prompts at different frames:

```
"A forest at dawn" → frame 0
"A forest at noon" → frame 8
"A forest at sunset" → frame 16
```

This creates smooth transitions between scenes without manual keyframing.
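
In ComfyUI this is typically wired up with a scheduled-prompt node such as `BatchPromptSchedule` from the FizzNodes pack (not included in the Dockerfile above; install it via ComfyUI Manager). Its schedule text maps frame numbers to prompts, roughly:

```
"0": "A forest at dawn",
"8": "A forest at noon",
"16": "A forest at sunset"
```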

### Using LoRA with AnimateDiff

```bash
# Download motion LoRA
cd /workspace/ComfyUI/models/loras
wget -O v2_lora_PanLeft.ckpt \
  "https://huggingface.co/guoyww/animatediff/resolve/main/v2_lora_PanLeft.ckpt"
```

Add a `LoRA Loader` node to apply camera-motion effects: PanLeft, PanRight, ZoomIn, ZoomOut, TiltUp, TiltDown, RollingClockwise, and RollingAnticlockwise (the full set can be fetched with the loop below).
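
All eight v2 motion LoRAs in the [guoyww/animatediff](https://huggingface.co/guoyww/animatediff) repository share the same naming pattern, so one loop downloads them all:

```bash
# Download all eight v2 camera-motion LoRAs
cd /workspace/ComfyUI/models/loras
for motion in PanLeft PanRight ZoomIn ZoomOut TiltUp TiltDown \
              RollingClockwise RollingAnticlockwise; do
  wget -O "v2_lora_${motion}.ckpt" \
    "https://huggingface.co/guoyww/animatediff/resolve/main/v2_lora_${motion}.ckpt"
done
```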

***

## Output Formats

AnimateDiff via VideoHelperSuite supports:

| Format     | Node            | Notes               |
| ---------- | --------------- | ------------------- |
| GIF        | `Video Combine` | Best for sharing    |
| MP4 (h264) | `Video Combine` | Smallest file size  |
| WebP       | `Video Combine` | Good quality/size   |
| PNG frames | `Save Image`    | For post-processing |
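
Finished files land in ComfyUI's `output` directory inside the container. To copy them to your local machine, reuse the SSH credentials from Step 3:

```bash
# Pull all generated animations from the instance
scp -P <assigned-ssh-port> -r \
  root@<clore-host>:/workspace/ComfyUI/output ./animatediff-output
```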

***

## Troubleshooting

### Out of Memory (CUDA OOM)

```
RuntimeError: CUDA out of memory
```

**Solutions:**

* Reduce frame count (try 8 instead of 16)
* Reduce resolution (512×512 is the sweet spot for SD 1.5)
* Start ComfyUI with the `--lowvram` flag (see the sketch below)
* Use an fp16 checkpoint file, or start ComfyUI with `--force-fp16`
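
A sketch of the adjusted startup command with the low-VRAM flag (stop the running ComfyUI process first):

```bash
# Restart ComfyUI with aggressive VRAM offloading
python /workspace/ComfyUI/main.py --listen 0.0.0.0 --port 8188 --lowvram
```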

### Motion Module Not Found

```
Error: motion module not found
```

**Solution:** Verify the `.ckpt` file is in:

```
/workspace/ComfyUI/custom_nodes/ComfyUI-AnimateDiff-Evolved/models/
```

Refresh the ComfyUI page to reload available models.

### Flickering / Inconsistent Frames

**Solutions:**

* Keep `context_length` at 16 (the window length the motion modules were trained on); for longer clips, rely on the sliding context window with overlap rather than raising it
* Use `v3_sd15_mm.ckpt` instead of v2 (better temporal consistency)
* Lower CFG scale (try 7 instead of 9)
* Prefer a deterministic sampler such as `DPM++ 2M Karras`; ancestral samplers like `Euler a` inject fresh noise at every step and can worsen flicker

### SSH Connection Refused

```bash
ssh: connect to host <ip> port <port>: Connection refused
```

**Solution:** Wait 1–2 minutes for the SSH daemon to start, or check if the container has fully initialized via the Clore.ai dashboard logs.

***

## Clore.ai GPU Recommendations

AnimateDiff runs on an SD 1.5 backbone, so its VRAM requirements are modest compared to modern video models, making it budget-friendly.

| GPU           | VRAM  | Clore.ai Price | 16-frame @ 512px | Notes                                              |
| ------------- | ----- | -------------- | ---------------- | -------------------------------------------------- |
| RTX 3090      | 24 GB | \~$0.12/hr     | \~50s            | Best value — run multiple queued batches           |
| RTX 4090      | 24 GB | \~$0.70/hr     | \~30s            | Fastest consumer GPU                               |
| A100 40GB     | 40 GB | \~$1.20/hr     | \~18s            | Overkill for SD 1.5, but good for SDXL+AnimateDiff |
| RTX 3080 10GB | 10 GB | \~$0.07/hr     | \~90s            | Budget minimum — limited to 512px, shorter clips   |

{% hint style="info" %}
**RTX 3090 is the AnimateDiff sweet spot** at \~$0.12/hr. A 16-frame animation takes \~50 seconds, so one hour yields roughly 70 clips, which works out to about 600 clips per dollar. For high-volume content creation, batch queue in ComfyUI and run overnight.
{% endhint %}

**SDXL AnimateDiff users:** The SDXL motion modules require 12GB+ VRAM for 768px. RTX 3090/4090 handle this well. RTX 3080 (10GB) is too limited for SDXL workflows.

***

## Useful Resources

* [AnimateDiff GitHub](https://github.com/guoyww/AnimateDiff)
* [ComfyUI-AnimateDiff-Evolved](https://github.com/Kosinkadink/ComfyUI-AnimateDiff-Evolved)
* [ComfyUI Official](https://github.com/comfyanonymous/ComfyUI)
* [AnimateDiff Motion Models (HuggingFace)](https://huggingface.co/guoyww/animatediff)
* [CivitAI — SD Checkpoints](https://civitai.com)

