Open WebUI

Beautiful ChatGPT-like interface for running LLMs on CLORE.AI GPUs.

Why Open WebUI?

  • ChatGPT-like UI - Familiar, polished interface

  • Multi-model - Switch between models easily

  • RAG built-in - Upload documents for context

  • User management - Multi-user support

  • History - Conversation persistence

  • Ollama integration - Works out of the box

Quick Deploy on CLORE.AI

Docker Image:

ghcr.io/open-webui/open-webui:cuda

Ports:

22/tcp
8080/http

Command:
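If you deploy through the CLORE template, the image's default entrypoint is normally enough and no extra command is needed. For reference, running the same image manually with Docker looks roughly like this (a sketch, not the exact template command; the volume name and port mapping are assumptions):

```bash
# Standalone Open WebUI with CUDA support; expects Ollama to be reachable separately
docker run -d \
  --name open-webui \
  --gpus all \
  -p 8080:8080 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:cuda
```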

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.

Verify It's Working
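Once the order is running, a quick request to the health endpoint (using the http_pub URL from the section above) confirms the UI is reachable:

```bash
# Replace YOUR_HTTP_PUB_URL with the URL from My Orders
curl https://YOUR_HTTP_PUB_URL/health
```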

Response: true

Installation

All-in-One (Bundled Ollama)
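The :ollama image bundles Ollama together with the UI, so a single container serves both. A typical run command looks like this (a sketch; the volume names are assumptions):

```bash
# Bundled Ollama + Open WebUI in one container
docker run -d \
  --name open-webui \
  --gpus all \
  -p 8080:8080 \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:ollama
```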

First Setup

  1. Open https://YOUR_HTTP_PUB_URL (or http://your-server:8080 if connecting directly)

  2. Create admin account (first user becomes admin)

  3. Go to Settings → Models → Pull a model

  4. Start chatting!

Features

Chat Interface

  • Markdown rendering

  • Code highlighting

  • Image generation (with compatible models)

  • Voice input/output

  • File attachments

Model Management

  • Pull models directly from UI

  • Create custom models

  • Set default model

  • Model-specific settings

RAG (Document Chat)

  1. Click "+" in chat

  2. Upload PDF, TXT, or other documents

  3. Ask questions about the content

User Management

  • Multiple users

  • Role-based access

  • API key management

  • Usage tracking

Configuration

Environment Variables

Key Settings

| Variable | Description | Default |
|---|---|---|
| OLLAMA_BASE_URL | Ollama API URL | http://localhost:11434 |
| WEBUI_AUTH | Enable authentication | True |
| WEBUI_NAME | Instance name | Open WebUI |
| DEFAULT_MODELS | Default model | - |
| ENABLE_RAG_WEB_SEARCH | Web search in RAG | False |
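
As an illustration, a few of these variables passed to the container at startup (the Ollama host, instance name, and default model below are placeholders, not required values):

```bash
docker run -d \
  --name open-webui \
  -p 8080:8080 \
  -e OLLAMA_BASE_URL=http://YOUR_OLLAMA_HOST:11434 \
  -e WEBUI_NAME="CLORE Chat" \
  -e DEFAULT_MODELS="llama3.1:8b" \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:cuda
```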

Connect to Remote Ollama
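To point the UI at an Ollama instance running elsewhere (for example on another CLORE order), set OLLAMA_BASE_URL to that server's address. A sketch, with REMOTE_OLLAMA_HOST as a placeholder:

```bash
docker run -d \
  --name open-webui \
  -p 8080:8080 \
  -e OLLAMA_BASE_URL=http://REMOTE_OLLAMA_HOST:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:cuda
```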

Docker Compose
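A minimal compose file that runs Ollama and Open WebUI side by side might look like this (a sketch; service and volume names are assumptions):

```yaml
services:
  ollama:
    image: ollama/ollama:latest
    volumes:
      - ollama:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:cuda
    ports:
      - "8080:8080"
    environment:
      # The service name "ollama" resolves inside the compose network
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  open-webui:
```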

API Reference

Open WebUI provides several API endpoints:

| Endpoint | Method | Description |
|---|---|---|
| /health | GET | Health check |
| /api/version | GET | Get Open WebUI version |
| /api/config | GET | Get configuration |
| /ollama/api/tags | GET | List Ollama models (proxied) |
| /ollama/api/chat | POST | Chat with Ollama (proxied) |

Check Health
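```bash
curl https://YOUR_HTTP_PUB_URL/health
```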

Response: true

Get Version
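```bash
curl https://YOUR_HTTP_PUB_URL/api/version
```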

Response: a JSON object with the installed version, e.g. {"version": "..."}

List Models (via Ollama proxy)
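The proxied Ollama endpoints require an API key (see the note below); YOUR_API_KEY is a placeholder for a key created in the web UI:

```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://YOUR_HTTP_PUB_URL/ollama/api/tags
```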

Note: Most API operations require authentication. Use the web UI to create an account and manage API keys.

Tips

Faster Responses

  1. Use quantized models (Q4_K_M)

  2. Enable streaming in settings

  3. Reduce context length if needed

Better Quality

  1. Use larger models (13B+)

  2. Use Q8 quantization

  3. Adjust temperature in model settings

Save Resources

  1. Set OLLAMA_KEEP_ALIVE=5m

  2. Unload unused models

  3. Use smaller models for testing

GPU Requirements

Same as Ollama.

Open WebUI itself uses minimal resources (~500MB RAM).

Troubleshooting

Can't connect to Ollama
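A few quick checks help narrow this down (a sketch; the container name and the Ollama URL depend on how you deployed):

```bash
# Can the host reach Ollama at all?
curl http://localhost:11434/api/version

# Look for connection errors in the Open WebUI container logs
docker logs open-webui --tail 50

# Confirm which Ollama URL the container was actually configured with
docker exec open-webui printenv OLLAMA_BASE_URL
```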

Models not showing

  1. Check Ollama connection in Settings

  2. Refresh model list

  3. Pull models via CLI: ollama pull modelname

Slow performance

  1. Check GPU is being used: nvidia-smi

  2. Try smaller/quantized models

  3. Reduce concurrent users

Cost Estimate

| Setup | GPU | Hourly |
|---|---|---|
| Basic (7B) | RTX 3060 | ~$0.03 |
| Standard (13B) | RTX 3090 | ~$0.06 |
| Advanced (34B) | RTX 4090 | ~$0.10 |
| Enterprise (70B) | A100 | ~$0.17 |

Next Steps
