LobeChat AI Assistant

Deploy LobeChat on Clore.ai — a stunning, feature-rich AI chat interface with multi-provider support, knowledge base, and plugins running on affordable GPU-backed cloud servers.

Overview

LobeChat is a modern, open-source AI chat framework with 55K+ GitHub stars, known for its polished UI and extensive feature set. It supports virtually every major LLM provider (OpenAI, Anthropic Claude, Google Gemini, Mistral, and local models via Ollama) from a single, self-hosted interface.

Why run LobeChat on Clore.ai?

  • No GPU required — LobeChat itself is a lightweight web app. Clore.ai CPU-only or minimal-GPU instances are perfectly sufficient for the interface.

  • Pair with local LLMs — Spin up Ollama or vLLM on the same Clore.ai server and point LobeChat at it for fully local, private inference.

  • Affordable hosting — A basic Clore.ai instance costs a fraction of traditional VPS providers, and you can shut it down when not in use.

  • Full data ownership — Database mode stores all conversations, files, and embeddings in your own PostgreSQL instance on the server.

LobeChat operates in two distinct modes:

| Mode | Description | Best For |
| --- | --- | --- |
| Standalone | Single Docker container, settings stored in browser | Quick testing, personal use |
| Database | Full stack (PostgreSQL + MinIO + Auth + App) | Teams, persistent history, knowledge base |


Requirements

Server Specifications

| Component | Minimum | Recommended | Notes |
| --- | --- | --- | --- |
| GPU | None required | RTX 3090 (if running local LLMs) | Only needed for Ollama/vLLM backend |
| VRAM | None | 24 GB (RTX 3090) | For local model inference |
| CPU | 2 vCPU | 4+ vCPU | LobeChat itself is lightweight |
| RAM | 2 GB | 8 GB | 4+ GB if using database mode |
| Storage | 10 GB | 50+ GB | More if storing uploaded files or models |

Clore.ai Pricing Reference

| Server Type | Approx. Cost | Use Case |
| --- | --- | --- |
| CPU-only instance | ~$0.05–0.10/hr | Standalone LobeChat only |
| RTX 3090 (24 GB VRAM) | ~$0.20/hr | LobeChat + Ollama local LLMs |
| RTX 4090 (24 GB VRAM) | ~$0.35/hr | LobeChat + faster local inference |
| A100 80 GB | ~$1.10/hr | LobeChat + large models (70B+) |

💡 Tip: For API-only use (connecting to OpenAI, Anthropic, etc.), any small instance works. A GPU server only makes sense if you want to also run local LLMs. See GPU Comparison Guide for details.

Prerequisites

  • Clore.ai account with a deployed server

  • SSH access to your server

  • Docker and Docker Compose (pre-installed on Clore.ai servers)

  • NVIDIA drivers (pre-installed; only relevant if using local LLM backend)

  • At least one API key (OpenAI, Anthropic, etc.) or a local Ollama instance


Quick Start

Standalone mode runs LobeChat as a single container. Settings and conversation history are stored in the browser's local storage, so no database is required.

Option A: Basic Standalone

Step 1: Connect to your Clore.ai server
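
Use the SSH command shown on your Clore.ai dashboard; the port, user, and key file vary per rental, so the values below are placeholders:

```bash
ssh -p <ssh-port> root@<your-server-ip>
```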

Step 2: Pull and run LobeChat
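
A minimal sketch with a single OpenAI key and an access code (replace the placeholder values with your own):

```bash
docker pull lobehub/lobe-chat

docker run -d \
  --name lobe-chat \
  --restart unless-stopped \
  -p 3210:3210 \
  -e OPENAI_API_KEY=sk-xxxx \
  -e ACCESS_CODE=choose-a-strong-password \
  lobehub/lobe-chat
```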

Step 3: Verify it's running
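
Confirm the container is up and the web server answers on port 3210:

```bash
docker ps --filter name=lobe-chat   # STATUS should show "Up"
curl -I http://localhost:3210       # any HTTP response means the app is serving
docker logs lobe-chat --tail 20     # check for startup errors
```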

Step 4: Access the interface

Open your browser and navigate to:
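
```
http://<your-server-ip>:3210
```

3210 is LobeChat's default port; substitute the public IP shown in your Clore.ai dashboard, and adjust the port if you mapped a different one in the docker run command.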

⚠️ Security Note: Clore.ai servers are publicly accessible. Consider setting ACCESS_CODE to password-protect your instance (see Configuration section below).


Option B: Standalone with Multiple Providers

Pass multiple API keys to support different providers simultaneously:
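
For example (any subset of the keys from the Configuration reference below works; omit providers you don't use):

```bash
docker run -d \
  --name lobe-chat \
  --restart unless-stopped \
  -p 3210:3210 \
  -e OPENAI_API_KEY=sk-xxxx \
  -e ANTHROPIC_API_KEY=sk-ant-xxxx \
  -e GOOGLE_API_KEY=xxxx \
  -e MISTRAL_API_KEY=xxxx \
  -e ACCESS_CODE=choose-a-strong-password \
  lobehub/lobe-chat
```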


Option C: With Local Ollama Backend

If you have Ollama running on the same Clore.ai server (see Ollama Guide):
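
A sketch assuming Ollama is listening on its default port 11434 on the host:

```bash
docker run -d \
  --name lobe-chat \
  --restart unless-stopped \
  -p 3210:3210 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_PROXY_URL=http://host.docker.internal:11434 \
  -e ACCESS_CODE=choose-a-strong-password \
  lobehub/lobe-chat
```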

If your Docker version does not support the host-gateway alias, point OLLAMA_PROXY_URL at the Docker bridge IP instead (typically 172.17.0.1):
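
```bash
# Assumes the default docker0 bridge; confirm the IP with: ip addr show docker0
docker run -d \
  --name lobe-chat \
  --restart unless-stopped \
  -p 3210:3210 \
  -e OLLAMA_PROXY_URL=http://172.17.0.1:11434 \
  -e ACCESS_CODE=choose-a-strong-password \
  lobehub/lobe-chat
```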


Option D: Database Mode (Docker Compose)

Database mode enables persistent conversation history, multi-user support, file uploads to S3-compatible storage, and a full knowledge base.

Step 1: Create project directory
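
For example (the directory name is only a placeholder):

```bash
mkdir -p ~/lobe-chat-db && cd ~/lobe-chat-db
```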

Step 2: Create docker-compose.yml
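
The official database-mode stack (PostgreSQL with pgvector, MinIO, an auth provider, and the lobehub/lobe-chat-database image) ships with a maintained docker-compose.yml in the LobeChat repository, and that file is the recommended starting point. The minimal sketch below covers only the core services, omits the auth provider, and uses placeholder passwords, secrets, and bucket names that you must change:

```yaml
services:
  postgres:
    image: pgvector/pgvector:pg16          # PostgreSQL with the pgvector extension
    restart: always
    environment:
      POSTGRES_DB: lobechat
      POSTGRES_PASSWORD: change-me
    volumes:
      - pg_data:/var/lib/postgresql/data

  minio:
    image: minio/minio
    restart: always
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: change-me-too
    ports:
      - "9000:9000"   # S3 API
      - "9001:9001"   # web console
    volumes:
      - minio_data:/data

  lobe-chat:
    image: lobehub/lobe-chat-database
    restart: always
    ports:
      - "3210:3210"
    depends_on:
      - postgres
      - minio
    environment:
      APP_URL: http://<your-server-ip>:3210
      DATABASE_URL: postgresql://postgres:change-me@postgres:5432/lobechat
      KEY_VAULTS_SECRET: <random-string>   # e.g. generate with: openssl rand -base64 32
      OPENAI_API_KEY: sk-xxxx
      ACCESS_CODE: choose-a-strong-password
      S3_ENDPOINT: http://<your-server-ip>:9000
      S3_BUCKET: lobe
      S3_ACCESS_KEY_ID: minioadmin
      S3_SECRET_ACCESS_KEY: change-me-too

volumes:
  pg_data:
  minio_data:
```

Database mode also expects an authentication provider (the official stack bundles one); without it you may not be able to log in, so consult the LobeChat database-mode docs before relying on this sketch for anything beyond testing.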

Step 3: Start the stack
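
```bash
docker compose up -d
docker compose ps                    # all services should report "running"
docker compose logs -f lobe-chat     # follow the app logs during first startup
```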

Step 4: Create MinIO bucket
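
You can create the bucket from the MinIO console at http://<your-server-ip>:9001 (log in with the root user and password from the compose file), or from the command line with the mc client. The bucket name must match the S3_BUCKET value used above:

```bash
docker run --rm --network host \
  -e MC_HOST_local=http://minioadmin:change-me-too@localhost:9000 \
  minio/mc mb --ignore-existing local/lobe
```

Depending on your LobeChat version you may also need to relax the bucket's access policy so uploaded images can be displayed; the database-mode docs describe the exact policy.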


Configuration

Environment Variables Reference

| Variable | Description | Default |
| --- | --- | --- |
| OPENAI_API_KEY | OpenAI API key | (none) |
| OPENAI_PROXY_URL | Custom OpenAI-compatible endpoint | https://api.openai.com/v1 |
| ANTHROPIC_API_KEY | Anthropic Claude API key | (none) |
| GOOGLE_API_KEY | Google Gemini API key | (none) |
| MISTRAL_API_KEY | Mistral AI API key | (none) |
| OLLAMA_PROXY_URL | URL to local Ollama instance | (none) |
| ACCESS_CODE | Password to protect the interface | (none) |
| DEFAULT_AGENT_CONFIG | Configuration string for default assistant behavior | (none) |
| FEATURE_FLAGS | Enable/disable specific features | (none) |

Enabling Specific Features

Enable web search plugin:
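
Recent LobeChat releases can use a self-hosted SearXNG instance for built-in web search. The SEARXNG_URL variable below reflects that feature but may differ in older builds, so treat it as an assumption and check the docs for your version:

```bash
docker run -d --name lobe-chat -p 3210:3210 \
  -e OPENAI_API_KEY=sk-xxxx \
  -e SEARXNG_URL=http://<your-searxng-host>:8080 \
  lobehub/lobe-chat
```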

Enable text-to-speech:

Set custom system prompt for default agent:
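
LobeChat documents DEFAULT_AGENT_CONFIG as a semicolon-delimited key=value string (nested keys use dots). The systemRole key below is an assumption for setting the system prompt, so verify it against the environment-variable reference for your version:

```bash
docker run -d --name lobe-chat -p 3210:3210 \
  -e OPENAI_API_KEY=sk-xxxx \
  -e DEFAULT_AGENT_CONFIG='model=gpt-4o-mini;params.temperature=0.7;systemRole=You are a concise technical assistant.' \
  lobehub/lobe-chat
```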

Updating LobeChat
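
For a standalone container, pull the new image and recreate the container; settings live in the browser, so nothing on the server needs migrating:

```bash
docker pull lobehub/lobe-chat
docker stop lobe-chat && docker rm lobe-chat
# then repeat the same docker run command you used originally
```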

For Docker Compose:
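
```bash
# Run from the directory that contains docker-compose.yml (e.g. ~/lobe-chat-db)
docker compose pull
docker compose up -d    # recreates only the containers whose images changed
```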


GPU Acceleration

LobeChat itself does not require a GPU. However, when paired with a GPU-accelerated backend on Clore.ai, you get local, private LLM inference:

Pairing with vLLM (High-Performance Inference)

See the vLLM Guide for full setup. Quick integration:
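
A sketch assuming vLLM's OpenAI-compatible server is already running on port 8000 on the same host (model name and port are placeholders):

```bash
# Example vLLM backend (see the vLLM Guide):
#   vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000
docker run -d \
  --name lobe-chat \
  --restart unless-stopped \
  -p 3210:3210 \
  --add-host=host.docker.internal:host-gateway \
  -e OPENAI_PROXY_URL=http://host.docker.internal:8000/v1 \
  -e OPENAI_API_KEY=placeholder-key \
  -e ACCESS_CODE=choose-a-strong-password \
  lobehub/lobe-chat
```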

Resource Usage

| Backend | GPU VRAM Used | Approximate Throughput |
| --- | --- | --- |
| Ollama (Llama 3.2 3B) | ~2 GB | 50–80 tokens/sec on RTX 3090 |
| Ollama (Llama 3.1 8B) | ~6 GB | 40–60 tokens/sec on RTX 3090 |
| vLLM (Llama 3.1 8B) | ~16 GB | 80–150 tokens/sec on RTX 3090 |
| vLLM (Llama 3.1 70B) | ~80 GB | 20–40 tokens/sec on A100 80 GB |


Tips & Best Practices

Cost Optimization

  • Stop your server when idle. Clore.ai charges by the hour — use the dashboard to pause instances you're not actively using.

  • Standalone mode for personal use. Unless you need multi-user support or persistent server-side history, standalone mode avoids the overhead of PostgreSQL and MinIO.

  • Use API providers for large models. Routing Claude or GPT-4 requests through external APIs is cheaper than renting an H100 for occasional queries.

Security

  • Never expose LobeChat on a public IP without setting an ACCESS_CODE.

  • Consider putting an Nginx reverse proxy with HTTPS in front of the instance if you run it long-term.

  • Rotate API keys if you suspect exposure.

Performance

  • For database mode with 10+ concurrent users, ensure at least 8 GB RAM on the host.

  • MinIO performs better with SSD-backed storage (Clore.ai NVMe instances).

Persistence Between Clore.ai Sessions

Since Clore.ai servers can be terminated:

Regularly export conversations from Settings → Data Export in the UI.
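
For database mode, also snapshot PostgreSQL and the MinIO data before releasing the server. A sketch, assuming the compose setup above (service and volume names will differ if you changed them):

```bash
cd ~/lobe-chat-db

# Dump the PostgreSQL database
docker compose exec postgres pg_dump -U postgres lobechat > lobechat-$(date +%F).sql

# Archive the MinIO volume (prefix is the compose project name; check `docker volume ls`)
docker run --rm -v lobe-chat-db_minio_data:/data -v "$PWD":/backup alpine \
  tar czf /backup/minio-$(date +%F).tar.gz -C /data .

# Copy both archives off the server (e.g. with scp) before terminating the instance
```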


Troubleshooting

Container fails to start
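
Check the logs and the usual causes (port already in use, malformed environment variable):

```bash
docker logs lobe-chat --tail 100       # startup errors
docker ps -a --filter name=lobe-chat   # exit code / restart loop?
ss -tlnp | grep 3210                   # is another process already bound to the port?
```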

Cannot connect to Ollama from LobeChat
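
Ollama binds to 127.0.0.1 by default, which containers cannot reach. Verify connectivity from both the host and a container (the exact steps for changing the bind address depend on how Ollama was installed):

```bash
# From the host: Ollama should answer
curl http://localhost:11434/api/tags

# From a container, the same way LobeChat reaches it:
docker run --rm --add-host=host.docker.internal:host-gateway \
  curlimages/curl -s http://host.docker.internal:11434/api/tags

# If the second check fails, make Ollama listen on all interfaces, e.g.
#   OLLAMA_HOST=0.0.0.0 ollama serve
# (or set OLLAMA_HOST in its systemd unit) and restart Ollama.
```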

Database connection errors (database mode)
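
Usually DATABASE_URL does not match the PostgreSQL credentials, or the database container is not ready yet. With the compose sketch above:

```bash
docker compose logs postgres --tail 50                              # is PostgreSQL accepting connections?
docker compose exec postgres psql -U postgres -d lobechat -c '\l'   # can we log in with the same credentials?
docker compose config | grep DATABASE_URL                           # does the URL match user/password/host/db?
```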

Images/files not uploading (database mode)
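
Uploads go to the S3-compatible store (MinIO here), and the browser must be able to reach that endpoint directly. Check that the bucket exists and that S3_ENDPOINT uses the server's public address rather than an internal hostname (variable names follow the compose sketch above):

```bash
docker compose logs minio --tail 50                       # MinIO errors?
docker compose config | grep 'S3_'                        # endpoint, bucket, and credentials as expected?
curl -I http://<your-server-ip>:9000/minio/health/live    # MinIO reachable from outside?
```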

Out of memory errors
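
Determine whether system RAM (PostgreSQL, MinIO, the app) or GPU VRAM (local models) is exhausted:

```bash
free -h                     # system RAM; database mode wants 4+ GB available
docker stats --no-stream    # per-container memory usage
nvidia-smi                  # VRAM usage if Ollama/vLLM runs on the same server
```

If VRAM is the bottleneck, switch to a smaller or quantized model (see the Resource Usage table above); if system RAM is the limit, resize the Clore.ai instance or fall back to standalone mode.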


Further Reading
