# Docker Images

Ready-to-deploy Docker images for AI workloads on CLORE.AI.

{% hint style="success" %}
Deploy these images directly at [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Quick Deploy Reference

### Most Popular

| Task                 | Image                                | Ports     |
| -------------------- | ------------------------------------ | --------- |
| Chat with AI         | `ollama/ollama`                      | 22, 11434 |
| ChatGPT-like UI      | `ghcr.io/open-webui/open-webui`      | 22, 8080  |
| Image Generation     | `universonic/stable-diffusion-webui` | 22, 7860  |
| Node-based Image Gen | `yanwk/comfyui-boot`                 | 22, 8188  |
| LLM API Server       | `vllm/vllm-openai`                   | 22, 8000  |

***

## Language Models

### Ollama

**Universal LLM runner - easiest way to run any model.**

```
Image: ollama/ollama
Ports: 22/tcp, 11434/http
Command: ollama serve
```

**After deploy:**

```bash
# SSH into server
ssh -p <port> root@<proxy>

# Pull and run a model
ollama pull llama3.2
ollama run llama3.2
```

**Environment variables:**

```
OLLAMA_HOST=0.0.0.0
OLLAMA_MODELS=/root/.ollama/models
```

***

### Open WebUI

**ChatGPT-like interface for Ollama.**

```
Image: ghcr.io/open-webui/open-webui:ollama
Ports: 22/tcp, 8080/http
```

Includes Ollama built-in. Access via HTTP port.

**Standalone (connect to existing Ollama):**

```
Image: ghcr.io/open-webui/open-webui:main
Ports: 22/tcp, 8080/http
Environment: OLLAMA_BASE_URL=http://localhost:11434
```

***

### vLLM

**High-performance LLM serving with OpenAI-compatible API.**

```
Image: vllm/vllm-openai:latest
Ports: 22/tcp, 8000/http
Command: python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct --host 0.0.0.0
```

**For larger models (multi-GPU):**

```bash
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3.1-70B-Instruct \
    --tensor-parallel-size 2 \
    --host 0.0.0.0
```

**Environment variables:**

```
HUGGING_FACE_HUB_TOKEN=<your-token>  # For gated models
```

***

### Text Generation Inference (TGI)

**HuggingFace's production LLM server.**

```
Image: ghcr.io/huggingface/text-generation-inference:latest
Ports: 22/tcp, 8080/http
Command: --model-id meta-llama/Meta-Llama-3.1-8B-Instruct
```

**Environment variables:**

```
HUGGING_FACE_HUB_TOKEN=<your-token>
MAX_INPUT_LENGTH=4096
MAX_TOTAL_TOKENS=8192
```

***

## Image Generation

### Stable Diffusion WebUI (AUTOMATIC1111)

**Most popular SD interface with extensions.**

```
Image: universonic/stable-diffusion-webui:latest
Ports: 22/tcp, 7860/http
```

**For low VRAM (8GB or less):**

```bash
./webui.sh --listen --medvram --xformers
```

**For API access:**

```bash
./webui.sh --listen --xformers --api
```

***

### ComfyUI

**Node-based workflow for advanced users.**

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
Environment: CLI_ARGS=--listen 0.0.0.0
```

**Alternative images:**

```
# With common extensions
Image: ai-dock/comfyui:latest

# Minimal
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Manual setup command:**

```bash
git clone https://github.com/comfyanonymous/ComfyUI && cd ComfyUI && pip install -r requirements.txt && python main.py --listen 0.0.0.0
```

***

### Fooocus

**Simplified SD interface, Midjourney-like.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp, 7865/http
Command: git clone https://github.com/lllyasviel/Fooocus && cd Fooocus && pip install -r requirements.txt && python launch.py --listen
```

***

### FLUX

**Latest high-quality image generation.**

Use ComfyUI with FLUX nodes:

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
```

Or via Diffusers:

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```python
# After SSH
pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell")
pipe.enable_model_cpu_offload()
image = pipe("A cat", num_inference_steps=4).images[0]
image.save("output.png")
EOF
```

***

## Video Generation

### Stable Video Diffusion

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    variant="fp16"
)
pipe.to("cuda")
image = load_image("input.png")
frames = pipe(image, num_frames=25).frames[0]
export_to_video(frames, "output.mp4", fps=7)
EOF
```

***

### AnimateDiff

Use with ComfyUI:

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
```

Install AnimateDiff nodes via ComfyUI Manager.

***

## Audio & Voice

### Whisper (Transcription)

```
Image: onerahmet/openai-whisper-asr-webservice:latest
Ports: 22/tcp, 9000/http
Environment: ASR_MODEL=large-v3
```

**API usage:**

```bash
curl -X POST "http://localhost:9000/asr" \
    -F "audio_file=@audio.mp3" \
    -F "task=transcribe"
```

***

### Bark (Text-to-Speech)

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install bark
python << 'EOF'
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
preload_models()
audio = generate_audio("Hello, this is a test.")
write_wav("output.wav", SAMPLE_RATE, audio)
EOF
```

***

### Stable Audio

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install stable-audio-tools
# Requires HF token for model access
```

***

## Vision Models

### LLaVA

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install llava
python -m llava.serve.cli --model-path liuhaotian/llava-v1.6-34b
```

***

### Llama 3.2 Vision

Use Ollama:

```
Image: ollama/ollama
Ports: 22/tcp, 11434/http
```

```bash
ollama pull llama3.2-vision
ollama run llama3.2-vision "describe this image" --images photo.jpg
```

***

## Development & Training

### PyTorch Base

**For custom setups and training.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

Includes: CUDA 12.1, cuDNN 8, PyTorch 2.1

***

### Jupyter Lab

**Interactive notebooks for ML.**

```
Image: jupyter/pytorch-notebook:cuda12-pytorch-2.1
Ports: 22/tcp, 8888/http
```

Or use PyTorch base with Jupyter:

```bash
pip install jupyterlab
jupyter lab --ip=0.0.0.0 --allow-root --no-browser
```

***

### Kohya Training

**For LoRA and model fine-tuning.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
git clone https://github.com/kohya-ss/sd-scripts
cd sd-scripts
pip install -r requirements.txt
# Use training scripts
```

***

## Base Images Reference

### NVIDIA Official

| Image                                    | CUDA | Use Case             |
| ---------------------------------------- | ---- | -------------------- |
| `nvidia/cuda:12.1.0-devel-ubuntu22.04`   | 12.1 | CUDA development     |
| `nvidia/cuda:12.1.0-runtime-ubuntu22.04` | 12.1 | CUDA runtime only    |
| `nvidia/cuda:11.8.0-devel-ubuntu22.04`   | 11.8 | Legacy compatibility |

### PyTorch Official

| Image                                          | PyTorch | CUDA |
| ---------------------------------------------- | ------- | ---- |
| `pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel`  | 2.5     | 12.4 |
| `pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel`  | 2.0     | 11.7 |
| `pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel` | 1.13    | 11.6 |

### HuggingFace

| Image                                           | Purpose                |
| ----------------------------------------------- | ---------------------- |
| `huggingface/transformers-pytorch-gpu`          | Transformers + PyTorch |
| `ghcr.io/huggingface/text-generation-inference` | TGI server             |

***

## Environment Variables

### Common Variables

| Variable                 | Description                   | Example        |
| ------------------------ | ----------------------------- | -------------- |
| `HUGGING_FACE_HUB_TOKEN` | HF API token for gated models | `hf_xxx`       |
| `CUDA_VISIBLE_DEVICES`   | GPU selection                 | `0,1`          |
| `TRANSFORMERS_CACHE`     | Model cache directory         | `/root/.cache` |

### Ollama Variables

| Variable              | Description       | Default            |
| --------------------- | ----------------- | ------------------ |
| `OLLAMA_HOST`         | Bind address      | `127.0.0.1`        |
| `OLLAMA_MODELS`       | Models directory  | `~/.ollama/models` |
| `OLLAMA_NUM_PARALLEL` | Parallel requests | `1`                |

### vLLM Variables

| Variable                 | Description                  |
| ------------------------ | ---------------------------- |
| `VLLM_ATTENTION_BACKEND` | Attention implementation     |
| `VLLM_USE_MODELSCOPE`    | Use ModelScope instead of HF |

***

## Port Reference

| Port  | Protocol | Service                    |
| ----- | -------- | -------------------------- |
| 22    | TCP      | SSH                        |
| 7860  | HTTP     | Gradio (SD WebUI, Fooocus) |
| 7865  | HTTP     | Fooocus alternative        |
| 8000  | HTTP     | vLLM API                   |
| 8080  | HTTP     | Open WebUI, TGI            |
| 8188  | HTTP     | ComfyUI                    |
| 8888  | HTTP     | Jupyter                    |
| 9000  | HTTP     | Whisper API                |
| 11434 | TCP      | Ollama API                 |

***

## Tips

### Persistent Storage

Mount volumes to keep data between restarts:

```bash
docker run -v /data/models:/root/.cache/huggingface ...
```

### GPU Selection

For multi-GPU systems:

```bash
docker run --gpus '"device=0,1"' ...
# or
CUDA_VISIBLE_DEVICES=0,1
```

### Memory Management

If running out of VRAM:

1. Use smaller models
2. Enable CPU offload
3. Reduce batch size
4. Use quantized models (GGUF Q4)

## Next Steps

* [GPU Comparison](/guides/getting-started/gpu-comparison.md) - Choose the right GPU
* [Model Compatibility](/guides/getting-started/model-compatibility.md) - What runs where
* [Quickstart Guide](/guides/quickstart.md) - Get started in 5 minutes


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.clore.ai/guides/getting-started/docker-images.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
