FLUX.1

Run FLUX.1 image generation from Black Forest Labs on Clore.ai


Faster alternative! FLUX.2 Klein generates images in < 0.5 seconds (vs 10–30s for FLUX.1) with comparable quality. This guide is still relevant for LoRA training and ControlNet workflows.

State-of-the-art image generation model from Black Forest Labs on CLORE.AI GPUs.


Why FLUX.1?

  • Best quality - Superior to SDXL and Midjourney v5

  • Text rendering - Actually readable text in images

  • Prompt following - Excellent instruction adherence

  • Fast variants - FLUX.1-schnell for quick generation

Model Variants

| Model | Speed | Quality | VRAM | License |
| --- | --- | --- | --- | --- |
| FLUX.1-schnell | Fast (4 steps) | Great | 12GB+ | Apache 2.0 |
| FLUX.1-dev | Medium (20 steps) | Excellent | 16GB+ | Non-commercial |
| FLUX.1-pro | API only | Best | – | Commercial |

Quick Deploy on CLORE.AI

Docker Image:

Use any recent CUDA-enabled PyTorch base image and install ComfyUI or diffusers inside it. (The `text-generation-inference` image sometimes listed here serves LLMs, not diffusion models — it will not run FLUX.)

Ports:

ComfyUI's web UI listens on port 8188 by default; expose that port in your CLORE.AI deployment.

For easiest deployment, use ComfyUI with its built-in FLUX nodes.
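A hedged setup sketch for a fresh instance (assumes NVIDIA drivers and Python are already present; paths follow ComfyUI's standard layout):

```shell
# Install ComfyUI and launch it on the exposed port
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt

# FLUX weights go into ComfyUI's standard model folders:
#   models/unet/flux1-schnell.safetensors   (or flux1-dev)
#   models/clip/clip_l.safetensors
#   models/clip/t5xxl_fp16.safetensors
#   models/vae/ae.safetensors

python main.py --listen 0.0.0.0 --port 8188
```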

Installation Methods

Method 1: ComfyUI (recommended)

Install per the Quick Deploy steps above; the node graph is covered in the ComfyUI Workflow section below.

Method 2: Diffusers

Method 3: SD WebUI Forge

Mainline Fooocus is built around SDXL and does not load FLUX checkpoints; SD WebUI Forge, its sibling project from the same author, has built-in FLUX support:
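A hedged install sketch (note: FLUX support here comes from SD WebUI Forge by lllyasviel, Fooocus's sibling project; mainline Fooocus targets SDXL):

```shell
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge
cd stable-diffusion-webui-forge
# Drop flux1-schnell / flux1-dev .safetensors into models/Stable-diffusion/
./webui.sh --listen --port 7860
```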

ComfyUI Workflow

FLUX.1-schnell (Fast)

Nodes needed:

  1. Load Diffusion Model → flux1-schnell.safetensors

  2. DualCLIPLoader → clip_l.safetensors + t5xxl_fp16.safetensors

  3. CLIP Text Encode → your prompt

  4. Empty SD3 Latent Image → set dimensions

  5. KSampler → steps: 4, cfg: 1.0

  6. VAE Decode (VAE from a Load VAE node → ae.safetensors)

  7. Save Image
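Once the graph works in the browser, you can drive it headlessly over ComfyUI's HTTP API. A sketch — export your graph first via "Save (API format)" as `workflow_api.json` (the filename is a placeholder):

```python
# Submit an exported ComfyUI workflow to a running server's /prompt endpoint.
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"  # adjust to your instance's address

def queue_prompt(workflow_path: str = "workflow_api.json") -> dict:
    """POST a workflow JSON to ComfyUI and return its queue response."""
    with open(workflow_path) as f:
        workflow = json.load(f)
    req = urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=json.dumps({"prompt": workflow}).encode(),
        headers={"Content-Type": "application/json"},
    )
    return json.loads(urllib.request.urlopen(req).read())
```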

FLUX.1-dev (Quality)

Same workflow, but:

  • Steps: 20–50

  • Keep the KSampler cfg at 1.0

  • Add a FluxGuidance node (guidance ≈ 3.5) after CLIP Text Encode — FLUX-dev uses this distilled guidance instead of classic CFG

Python API

Basic Generation
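A minimal sketch, assuming diffusers ≥ 0.30 and a CUDA GPU (the first run downloads the full model weights, tens of GB):

```python
MODEL_ID = "black-forest-labs/FLUX.1-schnell"

def generate(prompt: str, out_path: str = "flux.png") -> str:
    # Heavy imports stay inside the function so the sketch can be read
    # without the GPU stack installed.
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    image = pipe(
        prompt,
        num_inference_steps=4,  # schnell is distilled for 4 steps
        guidance_scale=0.0,     # schnell ignores classifier-free guidance
        height=1024,
        width=1024,
    ).images[0]
    image.save(out_path)
    return out_path
```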

With Memory Optimization
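The standard diffusers memory switches apply to FluxPipeline; a sketch for cards around 12GB:

```python
def load_low_vram_pipe():
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    # Streams each sub-model (text encoders, transformer, VAE) to the GPU
    # only while it runs — fits ~12GB cards at some speed cost.
    pipe.enable_model_cpu_offload()
    # Decode latents in slices/tiles to cap the VAE's VRAM spike.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe
```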

Batch Generation
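A loop with per-prompt seeds keeps batches reproducible (sketch; `pipe` is a loaded FluxPipeline):

```python
PROMPTS = [
    "A neon sign that says 'OPEN 24/7'",
    "A busy Tokyo street at night with reflections",
    "Oil painting of a harbor in the style of Monet",
]

def generate_batch(pipe, prompts=PROMPTS, seed: int = 0):
    import torch
    images = []
    for i, prompt in enumerate(prompts):
        # Fixed per-prompt seed: re-running the job reproduces every image.
        gen = torch.Generator("cuda").manual_seed(seed + i)
        images.append(
            pipe(prompt, num_inference_steps=4, guidance_scale=0.0,
                 generator=gen).images[0]
        )
    return images
```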

FLUX.1-dev (Higher Quality)
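The dev variant uses the same pipeline class with real guidance; note the repo is gated on HuggingFace (accept the license and run `huggingface-cli login` first):

```python
DEV_DEFAULTS = dict(
    num_inference_steps=30,  # 20–50; more steps, more detail
    guidance_scale=3.5,      # dev's recommended distilled guidance
    height=1024,
    width=1024,
)

def generate_dev(prompt: str):
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")
    return pipe(prompt, **DEV_DEFAULTS).images[0]
```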

Prompt Tips

FLUX excels at:

  • Text in images: "A neon sign that says 'OPEN 24/7'"

  • Complex scenes: "A busy Tokyo street at night with reflections"

  • Specific styles: "Oil painting in the style of Monet"

  • Detailed descriptions: Long, detailed prompts work well

Example Prompts

  • "A neon coffee-shop sign that says 'BREW & BYTES', rainy street at night, bokeh"

  • "Oil painting of a lavender field at sunset in the style of Monet"

  • "Product photo of a matte-black mechanical keyboard on a walnut desk, softbox lighting"

Memory Optimization

For 12GB VRAM (RTX 3060)
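An illustrative 12GB recipe: schnell, bf16, model offload, and lower-resolution previews (resolution is the biggest memory lever after offloading):

```python
# Call arguments for low-VRAM preview renders.
LOW_VRAM_CALL = dict(num_inference_steps=4, guidance_scale=0.0,
                     height=768, width=768)

def make_12gb_pipe():
    import torch
    from diffusers import FluxPipeline
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # streams sub-models to the GPU on demand
    return pipe
```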

For 8GB VRAM

Use a quantized checkpoint, or ComfyUI with GGUF:
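One community route is the ComfyUI-GGUF custom node with quantized FLUX transformers (a sketch — exact quant filenames vary by repo):

```shell
cd ComfyUI/custom_nodes
git clone https://github.com/city96/ComfyUI-GGUF
pip install -r ComfyUI-GGUF/requirements.txt
# Put a quantized transformer (e.g. a Q4 flux1-schnell .gguf) into
# ComfyUI/models/unet/ and swap "Load Diffusion Model" for the
# "Unet Loader (GGUF)" node in your workflow.
```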

Performance Comparison

| Model | Steps | Time (RTX 4090) | Quality |
| --- | --- | --- | --- |
| FLUX.1-schnell | 4 | ~3 sec | Great |
| FLUX.1-dev | 20 | ~12 sec | Excellent |
| FLUX.1-dev | 50 | ~30 sec | Best |
| SDXL | 30 | ~8 sec | Good |

GPU Requirements

| Setup | Minimum | Recommended |
| --- | --- | --- |
| FLUX.1-schnell | 12GB | 16GB+ |
| FLUX.1-dev | 16GB | 24GB+ |
| With CPU offload | 8GB | 12GB+ |
| Quantized (GGUF) | 6GB | 8GB+ |

GPU Presets

RTX 3060 12GB (Budget)

RTX 3090 24GB (Optimal)

RTX 4090 24GB (Performance)

A100 40GB/80GB (Production)
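The four tiers above can be summarized as illustrative starting points (not official presets — tune steps and resolution to your workload):

```python
# Suggested defaults per GPU tier; cpu_offload trades speed for VRAM.
PRESETS = {
    "RTX 3060 12GB": dict(model="FLUX.1-schnell", dtype="bf16", steps=4,
                          cpu_offload=True,  resolution=768),
    "RTX 3090 24GB": dict(model="FLUX.1-dev",    dtype="bf16", steps=28,
                          cpu_offload=False, resolution=1024),
    "RTX 4090 24GB": dict(model="FLUX.1-dev",    dtype="bf16", steps=30,
                          cpu_offload=False, resolution=1024),
    "A100 40GB":     dict(model="FLUX.1-dev",    dtype="bf16", steps=50,
                          cpu_offload=False, resolution=1024),
}
```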

Cost Estimate

| GPU | Hourly | Images/Hour |
| --- | --- | --- |
| RTX 3060 12GB | ~$0.03 | ~200 (schnell) |
| RTX 3090 24GB | ~$0.06 | ~600 (schnell) |
| RTX 4090 24GB | ~$0.10 | ~1000 (schnell) |
| A100 40GB | ~$0.17 | ~1500 (schnell) |
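Cost per image is simply hourly rate divided by throughput; using the estimates from the table above:

```python
RATES = {  # GPU: (USD per hour, schnell images per hour) — table estimates
    "RTX 3060 12GB": (0.03, 200),
    "RTX 3090 24GB": (0.06, 600),
    "RTX 4090 24GB": (0.10, 1000),
    "A100 40GB":     (0.17, 1500),
}

def cost_per_image(gpu: str) -> float:
    """USD per image = hourly rate / images per hour."""
    hourly, per_hour = RATES[gpu]
    return hourly / per_hour

for gpu in RATES:
    print(f"{gpu}: ${cost_per_image(gpu):.5f}/image")
```

Even on the fastest card, a schnell image costs a fraction of a cent at these rates.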

Troubleshooting

Out of Memory
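If generation OOMs, trade speed for VRAM, from least to most aggressive (sketch; `pipe` is a loaded FluxPipeline):

```python
def apply_oom_fixes(pipe, aggressive: bool = False):
    """Reduce peak VRAM on an existing FluxPipeline."""
    if aggressive:
        pipe.enable_sequential_cpu_offload()  # fits ~8GB cards, much slower
    else:
        pipe.enable_model_cpu_offload()       # fits ~12GB cards
    pipe.vae.enable_slicing()  # decode latents in slices
    pipe.vae.enable_tiling()   # caps the VAE decode spike
    return pipe
```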

Slow Generation

  • Use FLUX.1-schnell (4 steps)

  • Enable torch.compile: pipe.transformer = torch.compile(pipe.transformer) (FLUX uses a transformer, not a UNet — pipe.unet does not exist)

  • Use fp16 instead of bf16 on pre-Ampere GPUs, which lack native bf16 support

Poor Quality

  • Use more steps (FLUX-dev: 30-50)

  • Increase guidance_scale (3.0-4.0 for dev)

  • Write more detailed prompts


FLUX LoRA

LoRA (Low-Rank Adaptation) weights allow you to fine-tune FLUX for specific styles, characters, or concepts without retraining the full model. Hundreds of community LoRAs are available on HuggingFace and CivitAI.

Installation

Loading a Single LoRA
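With diffusers, loading a local LoRA file is one call (the path is a placeholder — point it at any FLUX LoRA `.safetensors`):

```python
def load_local_lora(pipe, path: str = "loras/my_style.safetensors"):
    # Works on a loaded FluxPipeline; weights apply to subsequent calls.
    pipe.load_lora_weights(path)
    return pipe
```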

Loading from HuggingFace Hub
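Hub repos load the same way; the repo id and filename below are made-up examples — substitute a real FLUX LoRA:

```python
def load_hub_lora(pipe):
    pipe.load_lora_weights(
        "some-user/flux-watercolor-lora",      # hypothetical repo id
        weight_name="watercolor.safetensors",  # file within the repo
        adapter_name="watercolor",             # handle for set_adapters()
    )
    return pipe
```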

LoRA Scale (Strength)
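Strength is set per adapter via diffusers' PEFT integration (sketch; assumes a LoRA was loaded earlier with an adapter name):

```python
def set_lora_strength(pipe, name: str = "my_lora", scale: float = 0.8):
    # Assumes pipe.load_lora_weights(..., adapter_name="my_lora") was called.
    # 0.6–1.0 is the usual range; lower values give a subtler effect.
    pipe.set_adapters(name, adapter_weights=scale)
    return pipe
```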

Combining Multiple LoRAs
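Multiple adapters can be active at once with per-adapter weights (repo ids are illustrative; adapter names are local handles):

```python
def combine_loras(pipe):
    pipe.load_lora_weights("user/flux-style-lora", adapter_name="style")
    pipe.load_lora_weights("user/flux-character-lora", adapter_name="character")
    # Combined strength much above ~1.5 tends to over-bake the image.
    pipe.set_adapters(["style", "character"], adapter_weights=[0.8, 0.6])
    return pipe
```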

Unloading LoRA
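To return to the untouched base model:

```python
def reset_to_base(pipe):
    pipe.unload_lora_weights()  # restores the base FLUX weights
    return pipe
```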

Training Your Own FLUX LoRA
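One popular community route is ostris/ai-toolkit (a sketch — check the repo for current example-config names; a rank-16 FLUX LoRA typically wants a 24GB card):

```shell
git clone https://github.com/ostris/ai-toolkit
cd ai-toolkit
pip install -r requirements.txt
# Copy an example config and edit: dataset folder, trigger word, rank, steps
cp config/examples/train_lora_flux_24gb.yaml config/my_lora.yaml
python run.py config/my_lora.yaml
```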

Where to Find FLUX LoRAs

| Source | URL | Notes |
| --- | --- | --- |
| CivitAI | civitai.com | Large community library |
| HuggingFace | huggingface.co/models | Filter by FLUX |
| Replicate | replicate.com | Browse trained models |


ControlNet for FLUX

ControlNet allows guiding FLUX generation with structural inputs like canny edges, depth maps, and pose skeletons. XLabs-AI has released the first ControlNet models specifically for FLUX.1.

Installation
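The diffusers route needs the control-image preprocessors as well (versions are indicative):

```shell
pip install -U "diffusers>=0.30" transformers accelerate
pip install controlnet-aux opencv-python   # canny/depth/pose preprocessors
```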

FLUX ControlNet Canny (XLabs-AI)
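A hedged end-to-end sketch: extract an edge map with OpenCV, then condition FLUX.1-dev on it (assumes a CUDA GPU and a local reference image):

```python
CANNY_REPO = "XLabs-AI/flux-controlnet-canny-diffusers"

def canny_generate(prompt: str, image_path: str, low: int = 100, high: int = 200):
    import cv2
    import numpy as np
    import torch
    from PIL import Image
    from diffusers import FluxControlNetModel, FluxControlNetPipeline

    # Single-channel Canny edges, replicated to 3 channels for the pipeline.
    edges = cv2.Canny(cv2.imread(image_path), low, high)
    control = Image.fromarray(np.stack([edges] * 3, axis=-1))

    cn = FluxControlNetModel.from_pretrained(CANNY_REPO, torch_dtype=torch.bfloat16)
    pipe = FluxControlNetPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", controlnet=cn, torch_dtype=torch.bfloat16
    ).to("cuda")
    return pipe(
        prompt,
        control_image=control,
        controlnet_conditioning_scale=0.7,  # 0.5–0.8 per the tips below
        num_inference_steps=24,
        guidance_scale=3.5,
    ).images[0]
```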

FLUX ControlNet Depth
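Depth works the same way — swap the preprocessor and the ControlNet repo. A sketch of the depth-map step using controlnet-aux:

```python
def depth_map(image_path: str):
    # MiDaS depth preprocessor; feed the result as control_image to
    # XLabs-AI/flux-controlnet-depth-diffusers, as in the canny example.
    from controlnet_aux import MidasDetector
    from PIL import Image

    midas = MidasDetector.from_pretrained("lllyasviel/Annotators")
    return midas(Image.open(image_path))
```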

Multi-ControlNet for FLUX
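Several ControlNets can run together via diffusers' multi-ControlNet wrapper (a sketch; pass one control image and one conditioning scale per net):

```python
def multi_controlnet_pipe():
    import torch
    from diffusers import (FluxControlNetModel, FluxControlNetPipeline,
                           FluxMultiControlNetModel)

    canny = FluxControlNetModel.from_pretrained(
        "XLabs-AI/flux-controlnet-canny-diffusers", torch_dtype=torch.bfloat16)
    depth = FluxControlNetModel.from_pretrained(
        "XLabs-AI/flux-controlnet-depth-diffusers", torch_dtype=torch.bfloat16)
    pipe = FluxControlNetPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev",
        controlnet=FluxMultiControlNetModel([canny, depth]),
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    # Call with lists, one entry per ControlNet; keep total strength modest:
    #   pipe(prompt, control_image=[canny_img, depth_img],
    #        controlnet_conditioning_scale=[0.6, 0.4], ...)
    return pipe
```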

Available FLUX ControlNet Models

Model
Repo
Use Case

Canny

XLabs-AI/flux-controlnet-canny-diffusers

Edge-guided generation

Depth

XLabs-AI/flux-controlnet-depth-diffusers

Depth-guided generation

HED/Soft Edge

XLabs-AI/flux-controlnet-hed-diffusers

Soft structural control

Pose

XLabs-AI/flux-controlnet-openpose-diffusers

Pose-guided portraits

ControlNet Tips

  • conditioning_scale 0.5–0.8 works best for FLUX (too high loses creativity)

  • Use 1024×1024 or multiples for best quality

  • Combine with LoRA for style + structure control

  • Fewer steps (20–25) are usually sufficient with ControlNet


FLUX.1-schnell: Fast Generation Mode

FLUX.1-schnell is the distilled, speed-optimized variant of FLUX. It generates high-quality images in just 4 steps (vs 20–50 for FLUX.1-dev), making it ideal for rapid prototyping and high-throughput workflows.

Key Differences vs FLUX.1-dev

| Feature | FLUX.1-schnell | FLUX.1-dev |
| --- | --- | --- |
| Steps | 4 | 20–50 |
| Speed (RTX 4090) | ~3 sec | ~12–30 sec |
| License | Apache 2.0 (free commercial) | Non-commercial |
| guidance_scale | 0.0 (no CFG) | 3.5 |
| Quality | Great | Excellent |
| VRAM | 12GB+ | 16GB+ |

License note: FLUX.1-schnell is Apache 2.0 — you can use it in commercial products freely. FLUX.1-dev requires a separate commercial license from Black Forest Labs.

Quick Start
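A minimal schnell run (sketch; assumes diffusers with FLUX support and a CUDA GPU):

```python
SCHNELL_ARGS = dict(num_inference_steps=4, guidance_scale=0.0)

def quickstart(prompt: str = "A cinematic photo of a lighthouse in a storm"):
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    ).to("cuda")
    gen = torch.Generator("cuda").manual_seed(42)  # fixed seed, reproducible
    return pipe(prompt, generator=gen, **SCHNELL_ARGS).images[0]
```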

High-Throughput Batch Generation
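For throughput, load the pipeline once and feed prompts in chunks — a list of prompts goes through one denoising pass, amortizing overhead (sketch):

```python
def chunked(seq, n):
    """Pure helper: split `seq` into lists of at most `n` items."""
    return [seq[i : i + n] for i in range(0, len(seq), n)]

def batch_generate(prompts, batch_size: int = 4):
    import torch
    from diffusers import FluxPipeline

    # Load once and reuse — model load dominates small jobs.
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    ).to("cuda")
    images = []
    for chunk in chunked(prompts, batch_size):
        images += pipe(chunk, num_inference_steps=4, guidance_scale=0.0).images
    return images
```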

Multiple Aspect Ratios with schnell
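FLUX is happiest around one megapixel with dimensions in multiples of 16; some common pairs (illustrative presets, not an official list):

```python
ASPECT_RATIOS = {
    "1:1":  (1024, 1024),
    "16:9": (1344, 768),
    "9:16": (768, 1344),
    "3:2":  (1216, 832),
    "2:3":  (832, 1216),
}

def dims(name: str):
    """Return (width, height) to pass as width=/height= to the pipeline."""
    return ASPECT_RATIOS[name]
```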

schnell with Memory Optimizations
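The same switches from the Memory Optimization section apply to schnell; note that at only 4 steps, offload overhead is a proportionally larger share of runtime, so try lower resolution before sequential offload (sketch):

```python
def schnell_low_vram_pipe():
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # ~12GB cards
    pipe.vae.enable_tiling()         # caps the decode-time VRAM spike
    return pipe
```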

Performance Benchmarks (schnell)

| GPU | VRAM | Time/image (1024px) | Images/hour |
| --- | --- | --- | --- |
| RTX 3060 12GB | 12GB | ~8 sec | ~450 |
| RTX 3090 24GB | 24GB | ~4 sec | ~900 |
| RTX 4090 24GB | 24GB | ~3 sec | ~1200 |
| A100 40GB | 40GB | ~2 sec | ~1800 |

When to Use schnell vs dev

Use FLUX.1-schnell when:

  • Rapid prototyping / testing prompts

  • High-volume batch generation

  • Commercial projects (Apache 2.0)

  • Limited GPU budget

  • Real-time or near-real-time applications

Use FLUX.1-dev when:

  • Maximum image quality is priority

  • Fine detail and complex scenes

  • Research / artistic work

  • Combining with LoRA/ControlNet (dev tends to respond better)


Next Steps

  • Prototype prompts with FLUX.1-schnell, then switch to dev for final renders

  • Train a custom LoRA for your style (see FLUX LoRA above)

  • Add structural control with ControlNet (see ControlNet for FLUX above)
