LibreChat Multi-Provider

Deploy LibreChat on Clore.ai — a powerful, self-hosted ChatGPT alternative with multi-provider AI support, conversation branching, and plugin ecosystem on affordable GPU cloud infrastructure.

Overview

LibreChat is an enhanced, open-source ChatGPT-like interface with 22K+ GitHub stars. It faithfully recreates the ChatGPT experience while adding features the original lacks — multi-provider switching within the same conversation, conversation branching/forking, a rich plugin system, file uploads with vision, and a code interpreter sandbox.

Why run LibreChat on Clore.ai?

  • True multi-provider in one UI — Switch between GPT-4, Claude 3.5, Gemini Pro, Mistral, and local Ollama models mid-session.

  • No GPU needed for the app — LibreChat is a Node.js application; it only needs compute for inference if you attach a local LLM backend.

  • Cost-effective self-hosting — Clore.ai pricing starts at fractions of a cent per minute, ideal for running a personal AI hub.

  • Persistent conversations — MongoDB stores your full chat history server-side, unlike browser-local solutions.

  • Team-friendly — Multi-user support with individual API key management.

Key Features

| Feature | Description |
| --- | --- |
| Multi-provider | OpenAI, Anthropic, Google, Azure, Mistral, Ollama, OpenRouter |
| Conversation branching | Fork and explore alternative responses |
| Plugins | Bing search, Zapier, WolframAlpha, custom tools |
| File uploads | Images, PDFs, documents with vision analysis |
| Code interpreter | Execute Python in an isolated sandbox |
| Artifacts | Render HTML, React, and Markdown outputs |
| Presets | Save and share custom model configurations |


Requirements

Server Specifications

| Component | Minimum | Recommended | Notes |
| --- | --- | --- | --- |
| GPU | None required | RTX 3090 (if adding Ollama) | Only for local LLM inference |
| VRAM | None required | 24 GB | For local models via Ollama |
| CPU | 2 vCPU | 4 vCPU | Node.js + MongoDB |
| RAM | 4 GB | 8 GB | MongoDB benefits from more RAM |
| Storage | 20 GB | 50+ GB | File uploads, model cache if local |

Clore.ai Pricing Reference

| Server Type | Approx. Cost | Use Case |
| --- | --- | --- |
| CPU-focused (4 vCPU, 8 GB RAM) | ~$0.05–0.10/hr | LibreChat + external API providers |
| RTX 3090 (24 GB VRAM) | ~$0.20/hr | LibreChat + Ollama local inference |
| RTX 4090 (24 GB VRAM) | ~$0.35/hr | LibreChat + faster Ollama/vLLM |
| A100 80 GB | ~$1.10/hr | LibreChat + large 70B+ models |

💡 Cost tip: If you only use LibreChat to route API calls to OpenAI/Anthropic/Google, you only pay for the Clore.ai server compute (cheap), not the inference hardware. Budget ~$0.05–0.15/hr for a reliable LibreChat host.

Prerequisites

  • Clore.ai server with SSH access

  • Docker + Docker Compose (pre-installed on Clore.ai)

  • Git (pre-installed on Clore.ai)

  • At least one LLM API key or a local Ollama/vLLM backend


Quick Start

LibreChat's official deployment uses Docker Compose with MongoDB and MeiliSearch for full functionality.

Step 1: Connect to your Clore.ai server
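
Connect using the SSH details from your Clore.ai dashboard (the IP and port placeholders below must be replaced with your server's values):

```shell
# SSH into the rented server (placeholders: substitute your own values)
ssh root@<SERVER_IP> -p <SSH_PORT>
```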

Step 2: Clone the repository
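
Clone the official LibreChat repository and enter the project directory:

```shell
git clone https://github.com/danny-avila/LibreChat.git
cd LibreChat
```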

Step 3: Configure environment
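
Copy the example environment file shipped with the repository, then edit it:

```shell
cp .env.example .env
nano .env
```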

Set at minimum:
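
A minimal `.env` sketch (variable names follow LibreChat's `.env.example`; every placeholder value must be replaced with your own):

```bash
MONGO_URI=mongodb://mongodb:27017/LibreChat
JWT_SECRET=replace-with-64-char-hex
JWT_REFRESH_SECRET=replace-with-64-char-hex
CREDS_KEY=replace-with-64-char-hex
CREDS_IV=replace-with-32-char-hex
OPENAI_API_KEY=sk-your-key-here
```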

Generate secrets quickly:
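
`openssl` can produce the random hex strings the secrets above expect:

```shell
# 64 hex characters — suitable for JWT_SECRET, JWT_REFRESH_SECRET, CREDS_KEY
openssl rand -hex 32
# 32 hex characters — suitable for CREDS_IV
openssl rand -hex 16
```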

Step 4: Start the stack
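
Bring everything up in the background with Docker Compose (the app service is named `api` in the official compose file):

```shell
docker compose up -d
# Follow the app logs until it reports it is listening on port 3080
docker compose logs -f api
```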

This starts:

  • LibreChat — main application on port 3080

  • MongoDB — conversation and user storage

  • MeiliSearch — fast conversation search

Step 5: Verify and access
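
Check that all three containers are running and the app answers locally:

```shell
docker compose ps
curl -I http://localhost:3080
```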

Open in browser: http://&lt;SERVER_IP&gt;:3080

Register a new account on the login page.


Method 2: Pre-built Docker Image (Fastest)

If you want to skip building from source:
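
One way to do this is a compose override that swaps the build step for a published image (the image tag here is an assumption — confirm the current tag on LibreChat's GitHub Packages page):

```yaml
# docker-compose.override.yml
services:
  api:
    image: ghcr.io/danny-avila/librechat:latest
```

Then start the stack as in Method 1 with `docker compose up -d`; Compose will pull the image instead of building from source.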


Method 3: Single-Container Quick Test

For a rapid proof-of-concept running only the app container (you must point it at a MongoDB instance you already have running elsewhere):
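
A sketch of the single-container launch (image tag and MongoDB host are placeholders/assumptions):

```shell
docker run -d -p 3080:3080 \
  -e MONGO_URI="mongodb://<MONGO_HOST>:27017/LibreChat" \
  -e OPENAI_API_KEY="sk-your-key-here" \
  ghcr.io/danny-avila/librechat:latest
```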

⚠️ This method requires a separate MongoDB instance. Use Method 1 for a complete setup.


Configuration

Adding AI Providers

Edit librechat.yaml (create it in the project root) for advanced provider configuration:
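
A sketch of a custom OpenAI-compatible endpoint (field names follow LibreChat's custom-config schema; the OpenRouter values are an example — verify the schema version against your release):

```yaml
version: 1.1.4   # config schema version; match your LibreChat release
cache: true
endpoints:
  custom:
    - name: "OpenRouter"
      apiKey: "${OPENROUTER_KEY}"       # read from the .env file
      baseURL: "https://openrouter.ai/api/v1"
      models:
        default: ["openai/gpt-4o"]
        fetch: true                      # also list models from the API
      titleConvo: true
```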

Mount this file in your docker-compose.yml:
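
For example, via a compose override (assuming the app service is named `api`, as in the official compose file):

```yaml
# docker-compose.override.yml
services:
  api:
    volumes:
      - ./librechat.yaml:/app/librechat.yaml
```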

Environment Variables Reference

| Variable | Description | Example |
| --- | --- | --- |
| MONGO_URI | MongoDB connection string | mongodb://mongodb:27017/LibreChat |
| JWT_SECRET | JWT signing secret (64+ chars) | Random hex string |
| OPENAI_API_KEY | OpenAI key | sk-... |
| ANTHROPIC_API_KEY | Anthropic key | sk-ant-... |
| GOOGLE_KEY | Google Gemini key | AI... |
| ALLOW_REGISTRATION | Enable public signup | true / false |
| ALLOW_EMAIL_LOGIN | Enable email/password login | true |
| DEBUG_LOGGING | Verbose logs | true |
| SEARCH | Enable MeiliSearch | true |
| MEILI_MASTER_KEY | MeiliSearch API key | Random string |

Restricting Registration

For private use, disable public registration after creating your account:
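
In `.env`:

```bash
ALLOW_REGISTRATION=false
ALLOW_EMAIL_LOGIN=true
```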

Then restart the app service (named api in the official compose file): docker compose restart api

Enabling Code Interpreter

The code interpreter runs Python in an isolated Docker container. Ensure Docker socket is accessible.
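
One way to expose the Docker socket to the app container is a compose override like the sketch below. This is an assumption about your setup — recent LibreChat releases instead use a hosted Code Interpreter API configured with an API key, in which case no socket mount is needed; check the docs for your version.

```yaml
# docker-compose.override.yml — lets the app container talk to the host Docker
services:
  api:
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
```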

File Upload Configuration
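
Upload limits live in librechat.yaml under fileConfig. A sketch (field names follow the custom-config schema as I understand it; verify against the current LibreChat docs):

```yaml
# librechat.yaml
fileConfig:
  serverFileSizeLimit: 100   # MB, hard cap across all endpoints
  endpoints:
    default:
      fileLimit: 5           # files per request
      fileSizeLimit: 10      # MB per file
```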


GPU Acceleration

LibreChat does not use GPU directly — it's a routing layer. GPU acceleration applies to any local inference backend you connect to it.

Connecting to Ollama (Same Server)

If running Ollama on the same Clore.ai server (see Ollama Guide):
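
Ollama can be added as a custom endpoint in librechat.yaml, assuming Ollama is listening on its default port 11434 on the same host:

```yaml
endpoints:
  custom:
    - name: "Ollama"
      apiKey: "ollama"       # Ollama ignores the key, but the field is required
      baseURL: "http://host.docker.internal:11434/v1"
      models:
        default: ["llama3.1:8b"]   # example model; use whatever you have pulled
        fetch: true
```

On Linux, `host.docker.internal` only resolves if you add `extra_hosts: ["host.docker.internal:host-gateway"]` to the `api` service; alternatively, use the Docker bridge IP.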

Connecting to vLLM (High Throughput)

For high-concurrency deployments (see vLLM Guide):
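
Start vLLM's OpenAI-compatible server on the same machine (the model name is an example; `vllm serve` is available in recent vLLM releases):

```shell
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000
```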

In librechat.yaml:
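
A custom-endpoint sketch pointing at that vLLM server (baseURL and model name are example values matching a local vLLM on port 8000):

```yaml
endpoints:
  custom:
    - name: "vLLM"
      apiKey: "none"          # vLLM does not check the key by default
      baseURL: "http://host.docker.internal:8000/v1"
      models:
        default: ["meta-llama/Llama-3.1-8B-Instruct"]
        fetch: true
```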

GPU Sizing for Local Models

| Model Size | Min VRAM | Recommended Clore GPU | Approx. Cost |
| --- | --- | --- | --- |
| 7–8B (Q4) | 6 GB | RTX 3090 | ~$0.20/hr |
| 13B (Q4) | 10 GB | RTX 3090 | ~$0.20/hr |
| 34B (Q4) | 24 GB | RTX 4090 | ~$0.35/hr |
| 70B (Q4) | 48 GB | 2× RTX 3090 | ~$0.40/hr |
| 70B (FP16) | 80 GB | A100 80GB | ~$1.10/hr |


Tips & Best Practices

Cost Management on Clore.ai
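
The cheapest setup is a CPU-only server routing all inference to external APIs. When you are not using the host, you can stop the stack without losing data, since conversations live in Docker volumes (note that to stop paying for the server itself you must end the Clore.ai rental; back up MongoDB first):

```shell
docker compose stop    # pause all containers; volumes (chat history) persist
docker compose start   # resume later where you left off
```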

Backup Strategy
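
A sketch of dumping the database from the running MongoDB container (the container name `chat-mongodb` is an assumption based on the official compose file; adjust if yours differs):

```shell
docker exec chat-mongodb mongodump --db LibreChat --archive > librechat-backup.archive
```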

Restoring from Backup
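
To restore that archive into the running container (same container-name assumption as above):

```shell
docker exec -i chat-mongodb mongorestore --archive < librechat-backup.archive
```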

Securing LibreChat

  • Always set strong, unique values for JWT_SECRET and CREDS_KEY

  • Disable registration after initial user creation: ALLOW_REGISTRATION=false

  • Use a reverse proxy (nginx/Caddy) with HTTPS for production

  • Regularly update the Docker image: docker compose pull && docker compose up -d

Nginx Reverse Proxy (Optional)
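
A minimal nginx server block proxying to LibreChat on port 3080 (replace chat.example.com with your domain and add TLS, e.g. via certbot; the Upgrade/Connection headers keep WebSocket streaming working):

```nginx
server {
    listen 80;
    server_name chat.example.com;

    location / {
        proxy_pass http://127.0.0.1:3080;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```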


Troubleshooting

Port 3080 not accessible
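
Work through the usual suspects in order:

```shell
docker compose ps                  # is the api container up, with 3080 mapped?
docker compose logs api --tail=50  # did the app crash on startup?
sudo ufw status                    # is a host firewall blocking the port?
```

On Clore.ai, also confirm that port 3080 is among the forwarded ports for your rental, if applicable.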

MongoDB connection refused
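
Check the database logs and the connection string — inside the Compose network, the host must be the service name, not localhost:

```shell
docker compose logs mongodb   # MongoDB startup errors
grep MONGO_URI .env           # expected: mongodb://mongodb:27017/LibreChat
```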

JWT / Authentication errors
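
A common fix is to regenerate the signing secrets, update `.env`, and recreate the containers (this logs out all existing sessions):

```shell
openssl rand -hex 32   # new JWT_SECRET
openssl rand -hex 32   # new JWT_REFRESH_SECRET
docker compose up -d --force-recreate
```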

Ollama models not appearing
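
First confirm Ollama is reachable and actually has models pulled:

```shell
curl http://localhost:11434/api/tags   # does Ollama list any models?
ollama list                            # if empty, pull one, e.g.: ollama pull llama3.1:8b
```

If models show here but not in LibreChat, the baseURL in librechat.yaml is likely not reachable from inside the app container (see the host.docker.internal note above).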

Out of disk space
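
Find what is consuming space before deleting anything:

```shell
df -h                    # which filesystem is full?
docker system df         # how much space Docker images/volumes use
docker system prune -a   # remove unused images/containers (asks for confirmation)
```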

Update to latest version
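
From the project directory, pull the latest code and images, then restart:

```shell
git pull
docker compose pull
docker compose up -d
```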


Further Reading
