Imágenes Docker

Imágenes Docker listas para desplegar cargas de trabajo de IA en Clore.ai

Imágenes Docker listas para desplegar cargas de trabajo de IA en CLORE.AI.

Despliega estas imágenes directamente en CLORE.AI Marketplace.

Referencia de Despliegue Rápido

Más Popular

Tarea

Imagen

Puertos

Chatear con IA

ollama/ollama

22, 11434

Interfaz tipo ChatGPT

ghcr.io/open-webui/open-webui

22, 8080

Generación de imágenes

universonic/stable-diffusion-webui

22, 7860

Generación de Imagen basada en Nodos

yanwk/comfyui-boot

22, 8188

Servidor API LLM

vllm/vllm-openai

22, 8000

Modelos de Lenguaje

Ollama

Ejecutor universal de LLM: la forma más fácil de ejecutar cualquier modelo.

Imagen: ollama/ollama
Puertos: 22/tcp, 11434/http
Comando: ollama serve

Después del despliegue:

# Conectarse por SSH al servidor
ssh -p <port> root@<proxy>

# Descargar y ejecutar un modelo
ollama pull llama3.2
ollama run llama3.2

Variables de entorno:

OLLAMA_HOST=0.0.0.0
OLLAMA_MODELS=/root/.ollama/models

Abrir WebUI

Interfaz tipo ChatGPT para Ollama.

Imagen: ghcr.io/open-webui/open-webui:ollama
Puertos: 22/tcp, 8080/http

Incluye Ollama incorporado. Acceso vía puerto HTTP.

Independiente (conectar a un Ollama existente):

Imagen: ghcr.io/open-webui/open-webui:main
Puertos: 22/tcp, 8080/http
Entorno: OLLAMA_BASE_URL=http://localhost:11434

vLLM

Servicio LLM de alto rendimiento con API compatible con OpenAI.

Imagen: vllm/vllm-openai:latest
Puertos: 22/tcp, 8000/http
Comando: python -m vllm.entrypoints.openai.api_server --model meta-llama/Meta-Llama-3.1-8B-Instruct --host 0.0.0.0

Para modelos más grandes (multi-GPU):

python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3.1-70B-Instruct \
    --tensor-parallel-size 2 \
    --host 0.0.0.0

Variables de entorno:

HUGGING_FACE_HUB_TOKEN=<tu-token>  # Para modelos restringidos

Text Generation Inference (TGI)

Servidor LLM de producción de HuggingFace.

Imagen: ghcr.io/huggingface/text-generation-inference:latest
Puertos: 22/tcp, 8080/http
Comando: --model-id meta-llama/Meta-Llama-3.1-8B-Instruct

Variables de entorno:

HUGGING_FACE_HUB_TOKEN=<tu-token>
MAX_INPUT_LENGTH=4096
MAX_TOTAL_TOKENS=8192

Generación de imágenes

Stable Diffusion WebUI (AUTOMATIC1111)

Interfaz SD más popular con extensiones.

Imagen: universonic/stable-diffusion-webui:latest
Puertos: 22/tcp, 7860/http

Para poca VRAM (8GB o menos):

./webui.sh --listen --medvram --xformers

Para acceso vía API:

./webui.sh --listen --xformers --api

ComfyUI

Flujo de trabajo basado en nodos para usuarios avanzados.

Imagen: yanwk/comfyui-boot:cu126-slim
Puertos: 22/tcp, 8188/http
Entorno: CLI_ARGS=--listen 0.0.0.0

Imágenes alternativas:

# Con extensiones comunes
Imagen: ai-dock/comfyui:latest

# Minimal
Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel

Comando de configuración manual:

git clone https://github.com/comfyanonymous/ComfyUI && cd ComfyUI && pip install -r requirements.txt && python main.py --listen 0.0.0.0

Fooocus

Interfaz SD simplificada, similar a Midjourney.

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp, 7865/http
Comando: git clone https://github.com/lllyasviel/Fooocus && cd Fooocus && pip install -r requirements.txt && python launch.py --listen

FLUX

Generación de imágenes de alta calidad y última generación.

Usa ComfyUI con nodos FLUX:

Imagen: yanwk/comfyui-boot:cu126-slim
Puertos: 22/tcp, 8188/http

O vía Diffusers:

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

# Después de SSH
pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import FluxPipeline
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell")
pipe.enable_model_cpu_offload()
image = pipe("A cat", num_inference_steps=4).images[0]
image.save("output.png")
EOF

Generación de Video

Stable Video Diffusion

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

pip install diffusers transformers accelerate
python << 'EOF'
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image, export_to_video
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    variant="fp16"
)
pipe.to("cuda")
image = load_image("input.png")
frames = pipe(image, num_frames=25).frames[0]
export_to_video(frames, "output.mp4", fps=7)
EOF

AnimateDiff

Usar con ComfyUI:

Imagen: yanwk/comfyui-boot:cu126-slim
Puertos: 22/tcp, 8188/http

Instala nodos AnimateDiff vía ComfyUI Manager.

Audio y Voz

Whisper (Transcripción)

Imagen: onerahmet/openai-whisper-asr-webservice:latest
Puertos: 22/tcp, 9000/http
Entorno: ASR_MODEL=large-v3

Uso de la API:

curl -X POST "http://localhost:9000/asr" \
    -F "[email protected]" \
    -F "task=transcribe"

Bark (Texto a Voz)

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

pip install bark
python << 'EOF'
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
preload_models()
audio = generate_audio("Hello, this is a test.")
write_wav("output.wav", SAMPLE_RATE, audio)
EOF

Stable Audio

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

pip install stable-audio-tools
# Requiere token HF para acceso al modelo

Modelos de Visión

LLaVA

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

pip install llava
python -m llava.serve.cli --model-path liuhaotian/llava-v1.6-34b

Llama 3.2 Vision

Usar Ollama:

Imagen: ollama/ollama
Puertos: 22/tcp, 11434/http

ollama pull llama3.2-vision
ollama run llama3.2-vision "describe this image" --images photo.jpg

Desarrollo y Entrenamiento

Base PyTorch

Para configuraciones personalizadas y entrenamiento.

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

Incluye: CUDA 12.1, cuDNN 8, PyTorch 2.1

Jupyter Lab

Cuadernos interactivos para ML.

Imagen: jupyter/pytorch-notebook:cuda12-pytorch-2.1
Puertos: 22/tcp, 8888/http

O usa la base PyTorch con Jupyter:

pip install jupyterlab
jupyter lab --ip=0.0.0.0 --allow-root --no-browser

Entrenamiento Kohya

Para LoRA y ajuste fino de modelos.

Imagen: pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
Puertos: 22/tcp

git clone https://github.com/kohya-ss/sd-scripts
cd sd-scripts
pip install -r requirements.txt
# Usar scripts de entrenamiento

Referencia de Imágenes Base

Oficial de NVIDIA

Imagen

CUDA

Caso de uso

nvidia/cuda:12.1.0-devel-ubuntu22.04

12.1

Desarrollo CUDA

nvidia/cuda:12.1.0-runtime-ubuntu22.04

12.1

Solo runtime CUDA

nvidia/cuda:11.8.0-devel-ubuntu22.04

11.8

Compatibilidad heredada

Oficial de PyTorch

Imagen

PyTorch

CUDA

pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel

2.5

12.4

pytorch/pytorch:2.0.1-cuda11.7-cudnn8-devel

2.0

11.7

pytorch/pytorch:1.13.1-cuda11.6-cudnn8-devel

1.13

11.6

HuggingFace

Imagen

Propósito

huggingface/transformers-pytorch-gpu

Transformers + PyTorch

ghcr.io/huggingface/text-generation-inference

Servidor TGI

Variables de entorno

Variables Comunes

Variable

Descripción

Ejemplo

HUGGING_FACE_HUB_TOKEN

Token de la API de HF para modelos restringidos

hf_xxx

CUDA_VISIBLE_DEVICES

Selección de GPU

0,1

TRANSFORMERS_CACHE

Directorio de caché de modelos

/root/.cache

Variables de Ollama

Variable

Descripción

Valor por defecto

OLLAMA_HOST

Dirección de enlace

127.0.0.1

OLLAMA_MODELS

Directorio de modelos

~/.ollama/models

OLLAMA_NUM_PARALLEL

Peticiones en paralelo

1

Variables de vLLM

Variable

Descripción

VLLM_ATTENTION_BACKEND

Implementación de atención

VLLM_USE_MODELSCOPE

Usar ModelScope en lugar de HF

Referencia de Puertos

Puerto

Protocolo

Servicio

TCP

SSH

7860

HTTP

Gradio (SD WebUI, Fooocus)

7865

HTTP

Alternativa a Fooocus

8000

HTTP

API vLLM

8080

HTTP

Open WebUI, TGI

8188

HTTP

ComfyUI

8888

HTTP

Jupyter

9000

HTTP

API Whisper

11434

TCP

API de Ollama

Consejos

Almacenamiento Persistente

Monta volúmenes para conservar datos entre reinicios:

docker run -v /data/models:/root/.cache/huggingface ...

Selección de GPU

Para sistemas multi-GPU:

docker run --gpus '"device=0,1"' ...
# o
CUDA_VISIBLE_DEVICES=0,1

Gestión de memoria

Si te quedas sin VRAM:

Usa modelos más pequeños
Habilitar descarga a CPU
Reducir el tamaño del lote
Usa modelos cuantizados (GGUF Q4)

Próximos pasos

Comparación de GPU - Elige la GPU adecuada
Compatibilidad de Modelos - Qué se ejecuta dónde
Guía de inicio rápido - Comienza en 5 minutos

AnteriorCalculadora de costos SiguientePrecios de GPU

Última actualización hace 1 día

¿Te fue útil?

hashtagReferencia de Despliegue Rápido

hashtagMás Popular

hashtagModelos de Lenguaje

hashtagOllama

hashtagAbrir WebUI

hashtagvLLM

hashtagText Generation Inference (TGI)

hashtagGeneración de imágenes

hashtagStable Diffusion WebUI (AUTOMATIC1111)

hashtagComfyUI

hashtagFooocus

hashtagFLUX

hashtagGeneración de Video

hashtagStable Video Diffusion

hashtagAnimateDiff

hashtagAudio y Voz

hashtagWhisper (Transcripción)

hashtagBark (Texto a Voz)

hashtagStable Audio

hashtagModelos de Visión

hashtagLLaVA

hashtagLlama 3.2 Vision

hashtagDesarrollo y Entrenamiento

hashtagBase PyTorch

hashtagJupyter Lab

hashtagEntrenamiento Kohya

hashtagReferencia de Imágenes Base

hashtagOficial de NVIDIA

hashtagOficial de PyTorch

hashtagHuggingFace

hashtagVariables de entorno

hashtagVariables Comunes

hashtagVariables de Ollama

hashtagVariables de vLLM

hashtagReferencia de Puertos

hashtagConsejos

hashtagAlmacenamiento Persistente

hashtagSelección de GPU

hashtagGestión de memoria

hashtagPróximos pasos

Referencia de Despliegue Rápido

Más Popular

Modelos de Lenguaje

Ollama

Abrir WebUI

vLLM

Text Generation Inference (TGI)

Generación de imágenes

Stable Diffusion WebUI (AUTOMATIC1111)

ComfyUI

Fooocus

FLUX

Generación de Video

Stable Video Diffusion

AnimateDiff

Audio y Voz

Whisper (Transcripción)

Bark (Texto a Voz)

Stable Audio

Modelos de Visión

LLaVA

Llama 3.2 Vision

Desarrollo y Entrenamiento

Base PyTorch

Jupyter Lab

Entrenamiento Kohya

Referencia de Imágenes Base

Oficial de NVIDIA

Oficial de PyTorch

HuggingFace

Variables de entorno

Variables Comunes

Variables de Ollama

Variables de vLLM

Referencia de Puertos

Consejos

Almacenamiento Persistente

Selección de GPU

Gestión de memoria

Próximos pasos