# Open WebUI

Beautiful ChatGPT-like interface for running LLMs on CLORE.AI GPUs.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Why Open WebUI?

* **ChatGPT-like UI** - Familiar, polished interface
* **Multi-model** - Switch between models easily
* **RAG built-in** - Upload documents for context
* **User management** - Multi-user support
* **History** - Conversation persistence
* **Ollama integration** - Works out of the box

## Quick Deploy on CLORE.AI

**Docker Image:**

```
ghcr.io/open-webui/open-webui:cuda
```

**Ports:**

```
22/tcp
8080/http
```

**Command:**

```bash
# Start Ollama in the background and pull a model
# (the `ollama` CLI is only available if the image bundles Ollama,
# e.g. the :ollama tag, or if Ollama is installed separately)
ollama serve &
sleep 5
ollama pull llama3.2

# Open WebUI itself starts automatically via the image entrypoint
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.

### Verify It's Working

```bash
# Check health
curl https://your-http-pub.clorecloud.net/health

# Get version
curl https://your-http-pub.clorecloud.net/api/version
```

Response:

```json
{"version": "0.7.2"}
```

{% hint style="warning" %}
If you get HTTP 502, wait 1-2 minutes - the service is still starting.
{% endhint %}
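
If you script deployments, you can wait out that 502 window automatically by polling `/health`. A minimal sketch (the URL is a placeholder; `wait_until_ready` and `health_ok` are illustrative helpers, not part of Open WebUI):

```python
import time
import urllib.request

def wait_until_ready(check, attempts=30, delay=5):
    """Poll `check()` until it returns True or attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False

def health_ok(url):
    """Return True if the Open WebUI /health endpoint answers 200."""
    try:
        with urllib.request.urlopen(f"{url}/health", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

# Example (replace with your http_pub URL):
# ready = wait_until_ready(lambda: health_ok("https://abc123.clorecloud.net"))
```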

## Installation

### With Ollama (Recommended)

```bash
# Start Ollama first
docker run -d --gpus all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull a model
docker exec -it ollama ollama pull llama3.2

# Start Open WebUI
docker run -d -p 8080:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
```

### All-in-One (Bundled Ollama)

```bash
docker run -d -p 8080:8080 \
  --gpus all \
  -v ollama:/root/.ollama \
  -v open-webui:/app/backend/data \
  --name open-webui \
  ghcr.io/open-webui/open-webui:ollama
```
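
With the bundled image, Ollama runs inside the same container, so models can be pulled with `docker exec` (container name as above):

```shell
# Pull a model into the bundled Ollama
docker exec -it open-webui ollama pull llama3.2

# List installed models
docker exec -it open-webui ollama list
```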

## First Setup

1. Open `http://your-server:8080` (on CLORE, use your `http_pub` URL)
2. Create admin account (first user becomes admin)
3. Go to Settings → Models → Pull a model
4. Start chatting!

## Features

### Chat Interface

* Markdown rendering
* Code highlighting
* Image generation (with compatible models)
* Voice input/output
* File attachments

### Model Management

* Pull models directly from UI
* Create custom models
* Set default model
* Model-specific settings

### RAG (Document Chat)

1. Click "+" in chat
2. Upload PDF, TXT, or other documents
3. Ask questions about the content

### User Management

* Multiple users
* Role-based access
* API key management
* Usage tracking

## Configuration

### Environment Variables

```bash
docker run -d \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -e WEBUI_AUTH=True \
  -e WEBUI_NAME="My AI Chat" \
  -e DEFAULT_MODELS="llama3.2" \
  ghcr.io/open-webui/open-webui:main
```

### Key Settings

| Variable                | Description           | Default                  |
| ----------------------- | --------------------- | ------------------------ |
| `OLLAMA_BASE_URL`       | Ollama API URL        | `http://localhost:11434` |
| `WEBUI_AUTH`            | Enable authentication | `True`                   |
| `WEBUI_NAME`            | Instance name         | `Open WebUI`             |
| `DEFAULT_MODELS`        | Default model(s), comma-separated | -            |
| `ENABLE_RAG_WEB_SEARCH` | Web search in RAG     | `False`                  |

### Connect to Remote Ollama

```bash
docker run -d -p 8080:8080 \
  -e OLLAMA_BASE_URL=http://remote-server:11434 \
  ghcr.io/open-webui/open-webui:main
```

## Docker Compose

```yaml
version: '3.8'

services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama:
  open-webui:
```

```bash
docker compose up -d   # or `docker-compose up -d` on older installs
```
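
Once the stack is up, you can check container status and tail logs (service names as in the compose file above):

```shell
# Show container status
docker compose ps

# Follow Open WebUI logs while it starts
docker compose logs -f open-webui

# Confirm Ollama is reachable from the host
curl http://localhost:11434/api/tags
```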

## API Reference

Open WebUI provides several API endpoints:

| Endpoint           | Method | Description                  |
| ------------------ | ------ | ---------------------------- |
| `/health`          | GET    | Health check                 |
| `/api/version`     | GET    | Get Open WebUI version       |
| `/api/config`      | GET    | Get configuration            |
| `/ollama/api/tags` | GET    | List Ollama models (proxied) |
| `/ollama/api/chat` | POST   | Chat with Ollama (proxied)   |

### Check Health

```bash
curl https://your-http-pub.clorecloud.net/health
```

Response: `true`

### Get Version

```bash
curl https://your-http-pub.clorecloud.net/api/version
```

Response:

```json
{"version": "0.7.2"}
```

### List Models (via Ollama proxy)

```bash
curl https://your-http-pub.clorecloud.net/ollama/api/tags
```
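
The `/api/tags` response is JSON with a `models` array; a small sketch for pulling out the model names (the sample payload is illustrative):

```python
import json

def model_names(tags_json):
    """Extract model names from an Ollama /api/tags response."""
    data = json.loads(tags_json) if isinstance(tags_json, str) else tags_json
    return [m["name"] for m in data.get("models", [])]

# Illustrative payload in the shape Ollama returns
sample = '{"models": [{"name": "llama3.2:latest"}, {"name": "mistral:7b"}]}'
print(model_names(sample))  # ['llama3.2:latest', 'mistral:7b']
```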

{% hint style="info" %}
Most API operations require authentication. Use the web UI to create an account and manage API keys.
{% endhint %}
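
Once you have an API key (Settings → Account in the UI), requests to the OpenAI-compatible `/api/chat/completions` endpoint carry it as a Bearer token. A sketch of building such a request (the helper name is ours; sending it requires a live instance):

```python
import json

def chat_request(api_key, model, prompt):
    """Build headers and JSON body for Open WebUI's
    OpenAI-compatible chat completions endpoint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return headers, body

# Example:
# headers, body = chat_request("sk-...", "llama3.2", "Hello!")
# POST to https://your-http-pub.clorecloud.net/api/chat/completions
```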

## Tips

### Faster Responses

1. Use quantized models (Q4\_K\_M)
2. Enable streaming in settings
3. Reduce context length if needed

### Better Quality

1. Use larger models (13B+)
2. Use Q8 quantization
3. Adjust temperature in model settings

### Save Resources

1. Set `OLLAMA_KEEP_ALIVE=5m`
2. Unload unused models
3. Use smaller models for testing
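
`keep_alive` can also be set per request against the Ollama API; sending `keep_alive: 0` unloads a model from VRAM immediately (run on the server; the model name is an example):

```shell
# Unload llama3.2 from VRAM right away
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "keep_alive": 0}'

# Check which models are still loaded
curl http://localhost:11434/api/ps
```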

## GPU Requirements

Same as [Ollama](https://docs.clore.ai/guides/ollama#gpu-requirements).

Open WebUI itself uses minimal resources (\~500MB RAM).

## Troubleshooting

### Can't connect to Ollama

```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# If using Docker, use host networking or correct URL
docker run --network=host ghcr.io/open-webui/open-webui:main
```

### Models not showing

1. Check Ollama connection in Settings
2. Refresh model list
3. Pull models via CLI: `ollama pull modelname`

### Slow performance

1. Check GPU is being used: `nvidia-smi`
2. Try smaller/quantized models
3. Reduce concurrent users

## Cost Estimate

| Setup            | GPU      | Hourly  |
| ---------------- | -------- | ------- |
| Basic (7B)       | RTX 3060 | \~$0.03 |
| Standard (13B)   | RTX 3090 | \~$0.06 |
| Advanced (34B)   | RTX 4090 | \~$0.10 |
| Enterprise (70B) | A100     | \~$0.17 |

## Next Steps

* [Ollama](https://docs.clore.ai/guides/language-models/ollama) - CLI usage
* [LocalAI](https://docs.clore.ai/guides/language-models/localai-openai-compatible) - More backends
* [RAG + LangChain](https://docs.clore.ai/guides/training/finetune-llm) - Advanced RAG
