Mochi-1 Video
Mochi-1 is Genmo's open-source 10-billion-parameter video generation model, producing 848×480 output at 30 fps with physically realistic motion. It uses an asymmetric diffusion transformer (AsymmDiT) architecture and ranks among the highest-quality open-source video models for motion fidelity. Deploy it on Clore.ai's GPU cloud to generate professional-grade videos at a fraction of commercial API costs.
What is Mochi-1?
Mochi-1 is a 10-billion parameter video diffusion model trained to produce videos with:
Smooth, physically plausible motion
High temporal consistency
Strong prompt adherence
848×480 resolution at 30 fps
It uses an asymmetric diffusion transformer (AsymmDiT) architecture that processes text and video jointly but allocates most of its parameters to the visual stream, enabling efficient inference at scale. The weights are released under the Apache 2.0 license, free for research and commercial use.
Model highlights:
10B parameters
Native 848×480 @ 30 fps output
High-motion fidelity (ranked top in community benchmarks)
Available on Hugging Face with diffusers integration
Gradio demo UI for easy interaction
Prerequisites
| Requirement | Minimum | Recommended |
| --- | --- | --- |
| GPU VRAM | 24 GB | 40–80 GB |
| GPU | RTX 4090 | A100 / H100 |
| RAM | 32 GB | 64 GB |
| Storage | 60 GB | 100 GB |
| CUDA | 11.8+ | 12.1+ |
Mochi-1 is a large model (≈40 GB in fp8 / ≈80 GB in bf16). A single RTX 4090 (24 GB) can run it with quantization. For full quality, use an A100 40 GB or larger. Multi-GPU setups are supported.
Step 1 — Rent a GPU on Clore.ai
Go to clore.ai and sign in.
Click Marketplace and filter:
VRAM: ≥ 24 GB (RTX 4090 minimum, A100 recommended)
For multi-GPU: filter by GPU count ≥ 2
Select your server and click Configure.
Set Docker image to `pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel` (base image; we install Mochi inside).
Set open ports: `22` (SSH) and `7860` (Gradio UI).
Click Rent.
Clore.ai lists A100 40 GB instances starting from ~$0.60–$0.90/hr. For Mochi-1 at full quality, this is the most cost-effective choice.
Step 2 — Custom Dockerfile
Build your own image or use this Dockerfile to create a ready-to-use Mochi-1 environment:
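A minimal Dockerfile sketch, assuming the upstream repo at github.com/genmoai/mochi; check its README for the exact, current install steps:

```dockerfile
FROM pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel

# System dependencies: git to clone the repo, ffmpeg for video export
RUN apt-get update && \
    apt-get install -y --no-install-recommends git ffmpeg && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /workspace
RUN git clone https://github.com/genmoai/mochi.git
WORKDIR /workspace/mochi

# Install Mochi plus the demo UI and the Hugging Face download CLI
RUN pip install --no-cache-dir -e . gradio "huggingface_hub[cli]"

EXPOSE 7860
CMD ["sleep", "infinity"]
```

The `sleep infinity` entrypoint keeps the container alive so you can attach over SSH and launch the demo manually.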
Build and Push to Docker Hub
Build the image locally and push it to your own Docker Hub account (replace YOUR_DOCKERHUB_USERNAME with your actual username):
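Assuming the Dockerfile above is saved in the current directory:

```shell
docker build -t YOUR_DOCKERHUB_USERNAME/mochi-1:latest .
docker login
docker push YOUR_DOCKERHUB_USERNAME/mochi-1:latest
```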
Then use YOUR_DOCKERHUB_USERNAME/mochi-1:latest as your Docker image in Clore.ai.
There is no official pre-built Docker image for Mochi-1 on Docker Hub. You need to build from the Dockerfile above. Alternatively, use pytorch/pytorch:2.4.1-cuda12.4-cudnn9-devel as the base image directly and run the setup commands manually via SSH.
Step 3 — Connect via SSH
Once your instance is running:
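Clore.ai exposes SSH through a mapped port shown on your instance page; `<server-ip>` and `<ssh-port>` below are placeholders:

```shell
ssh root@<server-ip> -p <ssh-port>

# Once connected, verify the GPU is visible
nvidia-smi
```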
Step 4 — Download Mochi-1 Weights
The model weights are hosted on Hugging Face. Download them via the huggingface_hub CLI:
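For example (genmo/mochi-1-preview is the official repo id on Hugging Face; the weights land in `./weights`):

```shell
pip install -U "huggingface_hub[cli]"
huggingface-cli download genmo/mochi-1-preview --local-dir weights
```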
The full bf16 model is approximately 80 GB. The fp8 quantized version is ~40 GB and runs on RTX 4090 (24 GB) with CPU offloading. Specify --include "*fp8*" to download only quantized weights.
Alternative: Download Only fp8 Quantized Weights
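Per the note above, pass `--include` to fetch only the fp8 files (~40 GB):

```shell
huggingface-cli download genmo/mochi-1-preview --include "*fp8*" --local-dir weights
```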
Step 5 — Launch the Gradio Demo
Mochi-1 ships with a Gradio web UI for easy text-to-video generation:
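Assuming the genmoai/mochi repo's demo script layout (`demos/gradio_ui.py`) and weights downloaded to `./weights`; check the upstream README if your paths differ:

```shell
cd mochi
python3 ./demos/gradio_ui.py --model_dir ../weights/
```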
For low-VRAM mode (RTX 4090, 24 GB):
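The same launch with offloading enabled:

```shell
python3 ./demos/gradio_ui.py --model_dir ../weights/ --cpu_offload
```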
The --cpu_offload flag moves model layers to CPU RAM when not in use, reducing peak VRAM to ~18–20 GB at the cost of ~2× slower generation.
Step 6 — Access the Web UI
Open your browser and navigate to:
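Replace `<server-ip>` with your instance's public IP; if Clore.ai maps port 7860 to a different external port, use the mapped port shown on the instance page:

```
http://<server-ip>:7860
```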
You will see the Mochi-1 Gradio interface with:
A text prompt input
Generation settings (steps, guidance scale, seed)
Video output player
Step 7 — Generate Your First Video
Example Prompts
Nature scene (illustrative): "A hummingbird hovering over a red flower in soft morning light, slow motion, shallow depth of field"
Action scene (illustrative): "A skateboarder landing a kickflip in a sunlit concrete plaza, low tracking camera, crisp motion"
Abstract/artistic (illustrative): "Ink clouds swirling and diffusing through water, macro shot, deep blues and gold highlights"
Recommended Settings
| Setting | Recommended value |
| --- | --- |
| Steps | 64 |
| Guidance scale | 4.5 |
| Duration | 5.1 seconds (default) |
| Resolution | 848×480 (native) |
Generation time varies significantly by GPU. On an A100 80 GB, a 5-second video takes approximately 2–4 minutes. On RTX 4090 with CPU offload, expect 8–15 minutes.
Python API Usage
For programmatic generation, use the diffusers pipeline:
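A minimal sketch using the `MochiPipeline` integration in diffusers (v0.31+); it requires a CUDA GPU and downloads the weights from the genmo/mochi-1-preview Hub repo on first run:

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    variant="bf16",
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keeps peak VRAM manageable
pipe.enable_vae_tiling()         # decodes the video in tiles

frames = pipe(
    prompt="A close-up of a hummingbird hovering over a red flower",
    num_inference_steps=64,
    guidance_scale=4.5,
    num_frames=84,
).frames[0]

export_to_video(frames, "output.mp4", fps=30)
```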
Batch Generation Script
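No batch script ships with the demo; the following is a hypothetical driver around the same diffusers pipeline, reading one prompt per line from a `prompts.txt` file (an assumed filename):

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview", variant="bf16", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()

# One prompt per line; blank lines are skipped
with open("prompts.txt") as f:
    prompts = [line.strip() for line in f if line.strip()]

for i, prompt in enumerate(prompts):
    frames = pipe(prompt, num_inference_steps=64, guidance_scale=4.5).frames[0]
    export_to_video(frames, f"out_{i:03d}.mp4", fps=30)
```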
Multi-GPU Inference
For faster generation with multiple GPUs:
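The simplest multi-GPU option with diffusers is `device_map="balanced"`, which places the pipeline's components (text encoder, transformer, VAE) on different GPUs; for true tensor-parallel inference, check the upstream genmoai/mochi README. A sketch:

```python
import torch
from diffusers import MochiPipeline
from diffusers.utils import export_to_video

# Spreads the pipeline's components across all visible GPUs
pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    variant="bf16",
    torch_dtype=torch.bfloat16,
    device_map="balanced",
)

frames = pipe(
    "A drone shot over a coastline at sunset",
    num_inference_steps=64,
    guidance_scale=4.5,
).frames[0]
export_to_video(frames, "multi_gpu.mp4", fps=30)
```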
Clore.ai offers multi-GPU servers (2×, 4× RTX 4090 or A100). With 2× A100 80 GB, generation time drops to roughly 60–90 seconds for a 5-second clip.
Troubleshooting
CUDA Out of Memory
Solutions:
Add `--cpu_offload` to the gradio command
Enable VAE slicing: `pipe.enable_vae_slicing()`
Reduce `num_frames` (try 24 instead of 84)
Use fp8 quantized weights instead of bf16
Model Loading Slow
Solution: Ensure weights are on a fast NVMe drive, not HDD. Check storage speed:
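A rough sequential-write check with `dd` (writes and then deletes a 1 GiB test file in the current directory):

```shell
# NVMe typically reports well over 1 GB/s; a spinning disk closer to 100-200 MB/s
dd if=/dev/zero of=ddtest.bin bs=1M count=1024 conv=fdatasync
rm ddtest.bin
```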
Video Artifacts / Temporal Flickering
Solutions:
Increase inference steps (try 80–100)
Adjust guidance scale (3.5–5.0 range is usually best)
Use a specific seed for reproducibility and iteration
Port 7860 Not Accessible
Check that the port was correctly opened in Clore.ai and the Gradio server is binding to 0.0.0.0:
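Two quick checks; `GRADIO_SERVER_NAME` is a standard Gradio environment variable that controls the bind address:

```shell
# Confirm something is listening on 7860 on all interfaces (0.0.0.0)
ss -tlnp | grep 7860

# If it only shows 127.0.0.1:7860, relaunch bound to all interfaces
GRADIO_SERVER_NAME=0.0.0.0 python3 ./demos/gradio_ui.py --model_dir ../weights/
```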
Cost Estimation
Marketplace rates fluctuate by listing and over time; the figures below are indicative.

| GPU | VRAM | Approx. price | Time per 5-second clip |
| --- | --- | --- | --- |
| RTX 4090 | 24 GB | ~$0.35/hr | ~10–15 min |
| A100 40 GB | 40 GB | ~$0.70/hr | ~3–5 min |
| A100 80 GB | 80 GB | ~$1.20/hr | ~2–3 min |
| 2× A100 80 GB | 160 GB | ~$2.20/hr | ~60–90 sec |
Clore.ai GPU Recommendations
Mochi-1 is VRAM-hungry — the 10B parameter model requires careful GPU selection.
| GPU | VRAM | Approx. price | Mode | Time per clip |
| --- | --- | --- | --- | --- |
| RTX 4090 | 24 GB | ~$0.70/hr | fp8 quantized only | ~10–15 min |
| A100 40 GB | 40 GB | ~$1.20/hr | bf16 recommended | ~3–5 min |
| A100 80 GB | 80 GB | ~$2.00/hr | full bf16, fast | ~2–3 min |
| 2× A100 80 GB | 160 GB | ~$4.00/hr | tensor parallel, fastest | ~60–90 sec |
RTX 3090 (24 GB) is not recommended: Mochi-1 in fp8 mode uses nearly all of its 24 GB, leaving almost no headroom, and the card is considerably slower than the 4090. The RTX 4090 (24 GB) works in fp8 but OOMs frequently on longer sequences. Start with an A100 40 GB for reliable results.
Best value for quality: A100 40GB at ~$1.20/hr generates a 5-second clip in 3–5 minutes. That's ~$0.08–0.10 per video clip — significantly cheaper than Runway ML ($0.25–0.50/clip) or Pika Labs subscriptions.
Useful Resources
Last updated
Was this helpful?