# LTX-Video Real-Time Generation

LTX-Video by Lightricks is one of the fastest open-source video generation models available. On an RTX 4090 it produces a 5-second 768×512 clip in roughly 4 seconds, faster than real-time playback. The model supports both text-to-video (T2V) and image-to-video (I2V) workflows through native `diffusers` integration via `LTXPipeline` and `LTXImageToVideoPipeline`.

Renting a GPU on [Clore.ai](https://clore.ai/) gives you instant access to the hardware LTX-Video needs, with no upfront investment and per-hour billing.

## Key Features

* **Faster than real-time** — 5-second video generated in \~4 seconds on an RTX 4090.
* **Text-to-Video** — produce clips from natural language descriptions.
* **Image-to-Video** — animate a static reference image with motion and camera control.
* **Lightweight architecture** — 2B parameter video DiT with a compact latent space.
* **Native diffusers** — `LTXPipeline` and `LTXImageToVideoPipeline` in `diffusers >= 0.32`.
* **Open weights** — Apache-2.0 license; fully commercial use permitted.
* **Temporal VAE** — 1:192 compression ratio across space and time; efficient decoding.

## Requirements

| Component  | Minimum | Recommended |
| ---------- | ------- | ----------- |
| GPU VRAM   | 16 GB   | 24 GB       |
| System RAM | 16 GB   | 32 GB       |
| Disk       | 15 GB   | 30 GB       |
| Python     | 3.10+   | 3.11        |
| CUDA       | 12.1+   | 12.4        |
| diffusers  | 0.32+   | latest      |

**Clore.ai GPU recommendation:** An **RTX 4090** (24 GB, \~$0.5–2/day) is ideal for maximum throughput. An **RTX 3090** (24 GB, \~$0.3–1/day) still runs faster than many competing models at a fraction of the cost.

## Quick Start

```bash
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
pip install diffusers transformers accelerate sentencepiece "imageio[ffmpeg]"

python -c "import torch; print(torch.cuda.get_device_name(0))"
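python -c "import diffusers; print(diffusers.__version__)"   # needs >= 0.32 for LTXPipeline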
```

## Usage Examples

### Text-to-Video

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

prompt = (
    "A drone shot gliding over a turquoise coral reef, "
    "schools of tropical fish darting below, golden hour light "
    "refracting through the water surface"
)

video_frames = pipe(
    prompt=prompt,
    negative_prompt="blurry, low quality, distorted",
    num_frames=121,               # ~5 sec at 24 fps
    width=768,
    height=512,
    num_inference_steps=30,
    guidance_scale=7.5,
    generator=torch.Generator("cuda").manual_seed(0),
).frames[0]

export_to_video(video_frames, "coral_reef.mp4", fps=24)
print("Saved coral_reef.mp4")
```
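
If you are closer to the 16 GB minimum from the requirements table than the recommended 24 GB, the full pipeline may not fit on the GPU at once. A minimal low-memory sketch, assuming a recent diffusers build (`enable_model_cpu_offload` is a standard pipeline method; tiled VAE decoding may not be available in every version, so treat that call as optional):

```python
import torch
from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    torch_dtype=torch.bfloat16,
)

# Keep submodules in system RAM and move each to the GPU only while it runs.
# Slower than fully resident inference, but cuts peak VRAM substantially.
pipe.enable_model_cpu_offload()

# Decode the video latents in tiles to bound VRAM use during VAE decoding.
# Remove this line if your diffusers version's LTX VAE lacks the method.
pipe.vae.enable_tiling()
```

With CPU offload enabled, do not also call `pipe.to("cuda")`; the offload hooks manage device placement.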

### Image-to-Video

```python
import torch
from PIL import Image
from diffusers import LTXImageToVideoPipeline
from diffusers.utils import export_to_video

pipe = LTXImageToVideoPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = Image.open("cityscape.png").resize((768, 512))

video_frames = pipe(
    prompt="Camera slowly pans right, city lights flicker on at dusk",
    negative_prompt="static, blurry",
    image=image,
    num_frames=121,
    width=768,                # match the conditioning image dimensions
    height=512,
    num_inference_steps=30,
    guidance_scale=7.5,
).frames[0]

export_to_video(video_frames, "cityscape_animated.mp4", fps=24)
```

### Batch Generation Script

```python
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", torch_dtype=torch.bfloat16
).to("cuda")

prompts = [
    "A cat stretching on a sunlit windowsill, dust motes floating",
    "Aerial view of waves crashing on black volcanic sand",
    "Time-lapse of storm clouds rolling over a prairie",
]

for i, prompt in enumerate(prompts):
    frames = pipe(
        prompt=prompt,
        num_frames=121,
        width=768,
        height=512,
        num_inference_steps=30,
        guidance_scale=7.5,
    ).frames[0]
    export_to_video(frames, f"batch_{i:03d}.mp4", fps=24)
    print(f"[{i+1}/{len(prompts)}] Done")
```

## Tips for Clore.ai Users

1. **Speed benchmark** — on an RTX 4090, LTX-Video generates 121 frames in \~4 seconds; use this as a sanity check that your rental is performing correctly.
2. **bf16 precision** — the checkpoint is trained in bf16; do not switch to fp16 or you risk quality degradation.
3. **Cache weights** — set `HF_HOME=/workspace/hf_cache` on a persistent volume. The model is \~6 GB; re-downloading on every container start wastes time.
4. **Prompt engineering** — LTX-Video responds well to cinematic language: "drone shot", "slow motion", "golden hour", "tracking shot". Be specific about camera motion.
5. **Batch overnight** — LTX-Video is fast enough to generate hundreds of clips per hour on a 4090. Queue prompts from a file and let it run; a minimal queue script follows this list.
6. **SSH + tmux** — always run generation inside a `tmux` session so dropped connections don't interrupt long batch jobs.
7. **Monitor VRAM** — `watch -n1 nvidia-smi` in a second terminal to ensure you're not hitting swap.
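
For tip 5, here is a minimal queue script, assuming one prompt per line in a hypothetical `prompts.txt` (the output filename pattern is also a placeholder). Skipping files that already exist makes the job safe to restart after a dropped connection or container restart:

```python
import torch
from pathlib import Path
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video", torch_dtype=torch.bfloat16
).to("cuda")

# One prompt per line; blank lines are ignored.
prompts = [p.strip() for p in Path("prompts.txt").read_text().splitlines() if p.strip()]

for i, prompt in enumerate(prompts):
    out = Path(f"queue_{i:04d}.mp4")
    if out.exists():  # already rendered in a previous run; skip
        continue
    frames = pipe(
        prompt=prompt,
        num_frames=121,
        width=768,
        height=512,
        num_inference_steps=30,
        guidance_scale=7.5,
    ).frames[0]
    export_to_video(frames, str(out), fps=24)
    print(f"[{i + 1}/{len(prompts)}] wrote {out}")
```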

## Troubleshooting

| Problem                      | Fix                                                                             |
| ---------------------------- | ------------------------------------------------------------------------------- |
| `OutOfMemoryError`           | Reduce `num_frames` to 81 or `width`/`height` to 512×320                        |
| Model not found in diffusers | Upgrade: `pip install -U diffusers` — LTXPipeline requires diffusers ≥ 0.32     |
| Black or static output       | Ensure you pass a `negative_prompt`; increase `guidance_scale` to 8–9           |
| `ImportError: imageio`       | `pip install imageio[ffmpeg]` — ffmpeg backend needed for MP4 export            |
| Slow first inference         | First run compiles CUDA kernels and downloads weights; subsequent runs are fast |
| Color banding artifacts      | Use `torch.bfloat16` (not float16); bfloat16 has wider dynamic range            |
| Container restarted mid-job  | Set `HF_HOME` to persistent storage; partial HF downloads auto-resume           |


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.clore.ai/guides/video-generation/ltx-video.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
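
For example, using only the Python standard library (a sketch; the question string is illustrative and the response is assumed to be plain text):

```python
import urllib.parse
import urllib.request

# Any specific, self-contained natural-language question works here.
question = "What resolutions does LTX-Video support?"

url = (
    "https://docs.clore.ai/guides/video-generation/ltx-video.md"
    "?ask=" + urllib.parse.quote(question)
)

with urllib.request.urlopen(url) as resp:
    print(resp.read().decode("utf-8"))
```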
