# FramePack Video Generation

FramePack is a breakthrough in AI video generation: it can create videos up to 2 minutes long using **just 6GB of VRAM**. Built on the HunyuanVideo architecture, its key innovation is compressing the context of already-generated frames into a fixed-length representation, so GPU memory use stays constant no matter how long the video gets. This makes AI video generation accessible on budget GPUs that were previously too limited.

## Key Features

* **6GB VRAM minimum**: Works on RTX 3060, RTX 3070, even GTX 1060!
* **Up to 2-minute videos**: Constant VRAM usage regardless of video length
* **Image-to-Video**: Animate any image with a text prompt
* **Web UI included**: Gradio-based interface for easy use
* **Built on HunyuanVideo**: Leverages Tencent's video diffusion architecture
* **Open source**: GitHub with active development

## Requirements

| Component | Minimum      | Recommended   |
| --------- | ------------ | ------------- |
| GPU       | GTX 1060 6GB | RTX 4090 24GB |
| VRAM      | 6GB          | 12GB+         |
| RAM       | 16GB         | 32GB          |
| Disk      | 30GB         | 50GB          |
| CUDA      | 11.8+        | 12.0+         |
| Python    | 3.10+        | 3.11          |

**Recommended Clore.ai GPU**: RTX 3080 10GB (\~$0.2–0.5/day) — great quality at low cost!

### Speed Reference

| GPU           | Time per Frame | 60-frame Video (\~2s at 30fps) |
| ------------- | -------------- | ------------------------------ |
| RTX 3060 12GB | \~30 sec       | \~30 min                       |
| RTX 3080 10GB | \~18 sec       | \~18 min                       |
| RTX 4080 16GB | \~12 sec       | \~12 min                       |
| RTX 4090 24GB | \~8 sec        | \~8 min                        |
| RTX 5090 32GB | \~5 sec        | \~5 min                        |
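
The table is a simple linear relationship: total time ≈ seconds-per-frame × frame count. A quick sketch (the per-frame figures are the rough numbers above, not guarantees):

```python
def estimate_minutes(seconds_per_frame: float, num_frames: int) -> float:
    """Wall-clock estimate: FramePack's generation time scales linearly with frames."""
    return seconds_per_frame * num_frames / 60

# RTX 3080 at ~18 s/frame, 60 frames (~2 s of video at 30 fps):
print(f"{estimate_minutes(18, 60):.0f} min")  # → 18 min
```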

## Installation

```bash
# Clone repository
git clone https://github.com/lllyasviel/FramePack.git
cd FramePack

# Create conda environment (recommended)
conda create -n framepack python=3.11 -y
conda activate framepack

# Install PyTorch with CUDA first (so requirements don't pull in a mismatched build)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# Install remaining dependencies
pip install -r requirements.txt
```

### Docker Setup

A containerized run looks like this (the image tag is illustrative — substitute your own build or a community image if this one isn't available):

```bash
docker run --gpus all -p 7860:7860 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -v ./outputs:/app/outputs \
  ghcr.io/lllyasviel/framepack:latest
```

## Quick Start — Web UI

The easiest way to use FramePack:

```bash
cd FramePack
python demo_gradio.py --port 7860

# FramePack checks free VRAM at startup and enables CPU offloading
# automatically on low-VRAM (6–8GB) GPUs — no extra flag is needed.

# Access at http://localhost:7860
```

**Web UI workflow:**

1. Upload a source image (the first frame)
2. Enter a text prompt describing the motion ("camera slowly zooms in", "person walks forward")
3. Set video length (number of frames)
4. Click Generate
5. Download the MP4

## Usage

FramePack is a **Gradio web application**, not a Python library. The primary interface is the web UI.

### API Access via Gradio Client

You can call a running FramePack instance programmatically with the Gradio client. The endpoint name and argument order below are illustrative — check your instance's actual API with `client.view_api()` first:

```python
from gradio_client import Client

# Connect to running FramePack instance
client = Client("http://localhost:7860")

# Generate video from image + prompt
result = client.predict(
    "input_photo.jpg",                              # source image
    "the person smiles and turns their head slowly", # prompt
    60,                                              # num frames
    7.5,                                             # guidance scale
    30,                                              # inference steps
    42,                                              # seed
    api_name="/generate"
)
print(f"Video saved to: {result}")
```

### Batch Processing with Gradio Client

```python
from gradio_client import Client

client = Client("http://localhost:7860")

jobs = [
    ("photo1.jpg", "gentle camera zoom with soft lighting"),
    ("photo2.jpg", "wind blowing through hair, clouds moving"),
    ("photo3.jpg", "slow zoom out revealing the full scene"),
]

for img_path, prompt in jobs:
    # -1 seed = random; adjust the argument order to match your instance's API
    result = client.predict(img_path, prompt, 60, 7.5, 30, -1, api_name="/generate")
    print(f"Done: {img_path} → {result}")
```
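
Long batch runs occasionally hit transient failures (timeouts, brief OOM spikes). A small retry wrapper — my own helper, not part of the Gradio client API — keeps the batch going:

```python
import time

def with_retries(fn, *args, attempts=3, wait_s=5.0, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception up to `attempts` times."""
    for i in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if i == attempts - 1:
                raise
            time.sleep(wait_s)

# Usage inside the batch loop above:
# result = with_retries(client.predict, img_path, prompt, 60, 7.5, 30, -1,
#                       api_name="/generate")
```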

## Resolution Guide

| VRAM | Max Resolution | Quality               |
| ---- | -------------- | --------------------- |
| 6GB  | 512×512        | Good for social media |
| 8GB  | 640×640        | Better detail         |
| 10GB | 512×768        | Portrait/landscape    |
| 12GB | 768×768        | High quality          |
| 24GB | 1024×768       | Best quality          |
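
The guide above can be encoded as a lookup — a hypothetical helper, not part of FramePack:

```python
# (min VRAM in GB, (width, height)) — taken from the resolution guide above
RESOLUTION_BY_VRAM = [
    (24, (1024, 768)),
    (12, (768, 768)),
    (10, (512, 768)),
    (8, (640, 640)),
    (6, (512, 512)),
]

def max_resolution(vram_gb: float) -> tuple:
    """Largest resolution from the guide that fits in the given VRAM."""
    for min_vram, res in RESOLUTION_BY_VRAM:
        if vram_gb >= min_vram:
            return res
    raise ValueError("FramePack needs at least 6 GB of VRAM")

print(max_resolution(10))  # → (512, 768)
```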

## Tips for Clore.ai Users

* **Budget-friendly**: This is one of the few video AI models that works on cheap GPUs ($0.15–0.3/day for RTX 3060!)
* **Low VRAM handled automatically**: On 6–8GB GPUs, FramePack enables CPU offloading by itself — slower, but generation stays within memory
* **512×512 is fine**: For social media (TikTok, Reels), 512px is perfectly acceptable
* **Longer ≠ more VRAM**: Unlike other video models, FramePack keeps VRAM constant — generate longer videos freely
* **Pre-download models**: First run downloads \~15GB. Run once, then your Clore session has models cached
* **Combine with upscaling**: Generate at 512×512, then use Real-ESRGAN to upscale to 2K/4K
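
For the upscaling tip, a typical pipeline is: extract frames with ffmpeg, upscale each with Real-ESRGAN, then reassemble. A sketch that only builds the commands (the binary name `realesrgan-ncnn-vulkan` varies by install — adjust to yours):

```python
def upscale_commands(video: str, workdir: str = "frames", scale: int = 4, fps: int = 30):
    """Build the three commands: extract frames, upscale them, reassemble the video."""
    return [
        ["ffmpeg", "-i", video, f"{workdir}/%05d.png"],
        ["realesrgan-ncnn-vulkan", "-i", workdir, "-o", f"{workdir}_up", "-s", str(scale)],
        ["ffmpeg", "-framerate", str(fps), "-i", f"{workdir}_up/%05d.png",
         "-c:v", "libx264", "-pix_fmt", "yuv420p", "upscaled.mp4"],
    ]

# Run each with subprocess.run(cmd, check=True)
```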

## Prompt Tips

Good prompts describe **motion**, not just appearance:

```
✅ "the camera slowly pans right, revealing a mountain landscape"
✅ "the person blinks and smiles gently, wind moves their hair"
✅ "zoom out slowly, showing the full building"

❌ "a beautiful sunset" (no motion described)
❌ "high quality, 4K, detailed" (style words don't help much)
```
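
A toy heuristic (my own, not part of FramePack) can flag prompts that forgot to describe motion:

```python
# Words that usually signal motion — extend to taste
MOTION_WORDS = {"pan", "zoom", "walk", "turn", "move", "blink", "smile",
                "blow", "reveal", "rotate", "sway", "drift"}

def describes_motion(prompt: str) -> bool:
    """True if any word in the prompt contains a motion stem (e.g. 'pans' → 'pan')."""
    words = {w.strip(",.").lower() for w in prompt.split()}
    return any(stem in w for w in words for stem in MOTION_WORDS)

print(describes_motion("a beautiful sunset"))            # → False
print(describes_motion("the camera slowly pans right"))  # → True
```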

## Troubleshooting

| Issue                  | Solution                                                              |
| ---------------------- | --------------------------------------------------------------------- |
| CUDA out of memory     | Reduce resolution to 512×512; close other GPU processes               |
| Very slow generation   | Normal for 6GB GPUs (\~30s/frame). Use RTX 4090 for 4x speed          |
| Black/corrupted frames | Update PyTorch: `pip install torch --upgrade`                         |
| Model download hangs   | Check disk space (needs 30GB free). Try `HF_HUB_ENABLE_HF_TRANSFER=1` |
| Web UI won't start     | Check port 7860 is free: `lsof -i :7860`                              |
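
Two of the issues above (disk space, busy port) can be caught before launch with a stdlib-only preflight check:

```python
import shutil
import socket

def preflight(path: str = ".", port: int = 7860, need_gb: float = 30.0):
    """Return (enough_disk, port_free) — both should be True before first launch."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    with socket.socket() as s:
        port_free = s.connect_ex(("127.0.0.1", port)) != 0
    return free_gb >= need_gb, port_free

disk_ok, port_ok = preflight()
print(f"Disk OK: {disk_ok}, port 7860 free: {port_ok}")
```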

## Further Reading

* [GitHub Repository](https://github.com/lllyasviel/FramePack)
* [HunyuanVideo (base model)](https://github.com/Tencent/HunyuanVideo)
* [Clore.ai GPU Comparison](https://docs.clore.ai/guides/getting-started/gpu-comparison) — find the cheapest GPU for your needs
