> For the complete documentation index, see [llms.txt](https://docs.clore.ai/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.clore.ai/guides/guides_v2-de/erste-schritte/docker-images.md).

# Docker-Images

Bereit zum Bereitstellen: Docker-Images für KI-Workloads auf CLORE.AI.

{% hint style="success" %}
Bereitstellen Sie diese Images direkt unter [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Schnellbereitstellungsreferenz

### Beliebteste

| Aufgabe                             | Image                                | Ports     |
| ----------------------------------- | ------------------------------------ | --------- |
| Mit KI chatten                      | `ollama/ollama`                      | 22, 11434 |
| ChatGPT-ähnliche Benutzeroberfläche | `ghcr.io/open-webui/open-webui`      | 22, 8080  |
| Bildgenerierung                     | `universonic/stable-diffusion-webui` | 22, 7860  |
| Knotenbasierte Bildgenerierung      | `yanwk/comfyui-boot`                 | 22, 8188  |
| LLM-API-Server                      | `vllm/vllm-openai`                   | 22, 8000  |

***

## Sprachmodelle

### Ollama

**Universeller LLM-Runner – der einfachste Weg, jedes Modell auszuführen.**

```
Image: ollama/ollama
Ports: 22/tcp, 11434/http
Befehl: ollama serve
```

**Nach der Bereitstellung:**

```bash
# Per SSH in den Server einloggen
ssh -p <port> root@<proxy>

# Modell herunterladen und ausführen
ollama pull llama3.2
ollama run llama3.2
```

**Umgebungsvariablen:**

```
OLLAMA_HOST=0.0.0.0
OLLAMA_MODELS=/root/.ollama/models
```

***

### WebUI öffnen

**ChatGPT-ähnliche Oberfläche für Ollama.**

```
Image: ghcr.io/open-webui/open-webui:ollama
Ports: 22/tcp, 8080/http
```

Enthält Ollama integriert. Zugriff über HTTP-Port.

**Standalone (mit vorhandenem Ollama verbinden):**

```
Image: ghcr.io/open-webui/open-webui:main
Ports: 22/tcp, 8080/http
Umgebung: OLLAMA_BASE_URL=http://localhost:11434
```

***

### vLLM

**Leistungsstarkes LLM-Serving mit OpenAI-kompatibler API.**

```
Image: vllm/vllm-openai:latest
Ports: 22/tcp, 8000/http
Befehl: python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct --host 0.0.0.0
```

**Für größere Modelle (Multi-GPU):**

```bash
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3.1-70B-Instruct \
    --tensor-parallel-size 2 \
    --host 0.0.0.0
```

**Umgebungsvariablen:**

```
HUGGING_FACE_HUB_TOKEN=<dein-token>  # Für gesperrte Modelle
```

***

### Text Generation Inference (TGI)

**HuggingFace's Produktions-LLM-Server.**

```
Image: ghcr.io/huggingface/text-generation-inference:latest
Ports: 22/tcp, 8080/http
Befehl: --model-id meta-llama/Meta-Llama-3.1-8B-Instruct
```

**Umgebungsvariablen:**

```
HUGGING_FACE_HUB_TOKEN=<dein-token>
MAX_INPUT_LENGTH=4096
MAX_TOTAL_TOKENS=8192
```

***

## Bildgenerierung

### Stable Diffusion WebUI (AUTOMATIC1111)

**Beliebteste SD-Oberfläche mit Erweiterungen.**

```
Image: universonic/stable-diffusion-webui:latest
Ports: 22/tcp, 7860/http
```

**Für wenig VRAM (8GB oder weniger):**

```bash
./webui.sh --listen --medvram --xformers
```

**Für API-Zugriff:**

```bash
./webui.sh --listen --xformers --api
```

***

### ComfyUI

**Knotenbasierter Workflow für fortgeschrittene Benutzer.**

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
Umgebung: CLI_ARGS=--listen 0.0.0.0
```

**Alternative Images:**

```
# Mit häufigen Erweiterungen
Image: ai-dock/comfyui:latest

# Minimal
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Manueller Setup-Befehl:**

```bash
git clone https://github.com/comfyanonymous/ComfyUI && cd ComfyUI && pip install -r requirements.txt && python main.py --listen 0.0.0.0
```

***

### Fooocus

**Vereinfachte SD-Oberfläche, Midjourney-ähnlich.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp, 7865/http
Befehl: git clone https://github.com/lllyasviel/Fooocus && cd Fooocus && pip install -r requirements.txt && python launch.py --listen
```

***

### FLUX

**Neueste hochwertige Bildgenerierung.**

ComfyUI mit FLUX-Nodes verwenden:

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
```

Oder über Diffusers:

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```python
# Nach SSH
pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell")
pipe.enable_model_cpu_offload()
image = pipe("A cat", num_inference_steps=4).images[0]
image.save("output.png")
EOF
```

***

## Videogenerierung

### Stable Video Diffusion

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    variant="fp16"
)
pipe.to("cuda")
image = load_image("input.png")
frames = pipe(image, num_frames=25).frames[0]
export_to_video(frames, "output.mp4", fps=7)
EOF
```

***

### AnimateDiff

Verwendung mit ComfyUI:

```
Image: yanwk/comfyui-boot:cu126-slim
Ports: 22/tcp, 8188/http
```

Installiere AnimateDiff-Nodes über den ComfyUI Manager.

***

## Audio & Stimme

### Whisper (Transkription)

```
Image: onerahmet/openai-whisper-asr-webservice:latest
Ports: 22/tcp, 9000/http
Umgebung: ASR_MODEL=large-v3
```

**API-Nutzung:**

```bash
curl -X POST "http://localhost:9000/asr" \
    -F "audio_file=@audio.mp3" \
    -F "task=transcribe"
```

***

### Bark (Text-to-Speech)

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install bark
python << 'EOF'
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
preload_models()
audio = generate_audio("Hello, this is a test.")
write_wav("output.wav", SAMPLE_RATE, audio)
EOF
```

***

### Stable Audio

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install stable-audio-tools
# Erfordert HF-Token für den Modellzugriff
```

***

## Visionsmodelle

### LLaVA

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
pip install llava
python -m llava.serve.cli --model-path liuhaotian/llava-v1.6-34b
```

***

### Llama 3.2 Vision

Ollama verwenden:

```
Image: ollama/ollama
Ports: 22/tcp, 11434/http
```

```bash
ollama pull llama3.2-vision
ollama run llama3.2-vision "describe this image" --images photo.jpg
```

***

## Entwicklung & Training

### PyTorch-Basis

**Für benutzerdefinierte Setups und Training.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

Enthält: CUDA 12.1, cuDNN 8, PyTorch 2.1

***

### Jupyter Lab

**Interaktive Notebooks für ML.**

```
Image: jupyter/pytorch-notebook:cuda12-pytorch-2.1
Ports: 22/tcp, 8888/http
```

Oder verwende die PyTorch-Basis mit Jupyter:

```bash
pip install jupyterlab
jupyter lab --ip=0.0.0.0 --allow-root --no-browser
```

***

### Kohya Training

**Für LoRA und Feinabstimmung von Modellen.**

```
Image: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Ports: 22/tcp
```

```bash
git clone https://github.com/kohya-ss/sd-scripts
cd sd-scripts
pip install -r requirements.txt
# Trainingsskripte verwenden
```

***

## Referenz für Basis-Images

### NVIDIA Offiziell

| Image                                    | CUDA | Einsatzgebiet         |
| ---------------------------------------- | ---- | --------------------- |
| `nvidia/cuda:12.1.0-devel-ubuntu22.04`   | 12.1 | CUDA-Entwicklung      |
| `nvidia/cuda:12.1.0-runtime-ubuntu22.04` | 12.1 | Nur CUDA-Laufzeit     |
| `nvidia/cuda:11.8.0-devel-ubuntu22.04`   | 11.8 | Legacy-Kompatibilität |

### PyTorch Offiziell

| Image                                          | PyTorch | CUDA |
| ---------------------------------------------- | ------- | ---- |
| `pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel`  | 2.5     | 12.4 |
| `pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel`  | 2.0     | 11.7 |
| `pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel` | 1.13    | 11.6 |

### HuggingFace

| Image                                           | Zweck                  |
| ----------------------------------------------- | ---------------------- |
| `huggingface/transformers-pytorch-gpu`          | Transformers + PyTorch |
| `ghcr.io/huggingface/text-generation-inference` | TGI-Server             |

***

## Umgebungsvariablen

### Häufige Variablen

| Variable                 | Beschreibung                       | Beispiel       |
| ------------------------ | ---------------------------------- | -------------- |
| `HUGGING_FACE_HUB_TOKEN` | HF-API-Token für gesperrte Modelle | `hf_xxx`       |
| `CUDA_VISIBLE_DEVICES`   | GPU-Auswahl                        | `0,1`          |
| `TRANSFORMERS_CACHE`     | Modell-Cache-Verzeichnis           | `/root/.cache` |

### Ollama-Variablen

| Variable              | Beschreibung        | Standard           |
| --------------------- | ------------------- | ------------------ |
| `OLLAMA_HOST`         | Bind-Adresse        | `127.0.0.1`        |
| `OLLAMA_MODELS`       | Modelle-Verzeichnis | `~/.ollama/models` |
| `OLLAMA_NUM_PARALLEL` | Parallele Anfragen  | `1`                |

### vLLM-Variablen

| Variable                 | Beschreibung                  |
| ------------------------ | ----------------------------- |
| `VLLM_ATTENTION_BACKEND` | Attention-Implementierung     |
| `VLLM_USE_MODELSCOPE`    | ModelScope statt HF verwenden |

***

## Portreferenz

| Port  | Protokoll | Dienst                     |
| ----- | --------- | -------------------------- |
| 22    | TCP       | SSH                        |
| 7860  | HTTP      | Gradio (SD WebUI, Fooocus) |
| 7865  | HTTP      | Fooocus-Alternative        |
| 8000  | HTTP      | vLLM-API                   |
| 8080  | HTTP      | Open WebUI, TGI            |
| 8188  | HTTP      | ComfyUI                    |
| 8888  | HTTP      | Jupyter                    |
| 9000  | HTTP      | Whisper-API                |
| 11434 | TCP       | Ollama-API                 |

***

## Tipps

### Persistenter Speicher

Volumes einbinden, um Daten zwischen Neustarts zu behalten:

```bash
docker run -v /data/models:/root/.cache/huggingface ...
```

### GPU-Auswahl

Für Multi-GPU-Systeme:

```bash
docker run --gpus '"device=0,1"' ...
# oder
CUDA_VISIBLE_DEVICES=0,1
```

### Speicherverwaltung

Wenn der VRAM knapp wird:

1. Verwende kleinere Modelle
2. CPU-Offload aktivieren
3. Batch-Größe reduzieren
4. Verwende quantisierte Modelle (GGUF Q4)

## Nächste Schritte

* [GPU-Vergleich](/guides/guides_v2-de/erste-schritte/gpu-comparison.md) - Wähle die richtige GPU
* [Modellkompatibilität](/guides/guides_v2-de/erste-schritte/model-compatibility.md) - Was wo läuft
* [Schnellstart-Anleitung](/guides/guides_v2-de/quickstart.md) - Starte in 5 Minuten


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.clore.ai/guides/guides_v2-de/erste-schritte/docker-images.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
