Jan.ai Offline Assistant
Deploy Jan.ai Server on Clore.ai — a fully offline, OpenAI-compatible LLM server with model hub, conversation management, and GPU-accelerated inference powered by the Cortex engine.
Overview
Requirements
Hardware Requirements
| Tier | GPU | VRAM | RAM | Storage | Clore.ai Price |
| ---- | --- | ---- | --- | ------- | -------------- |
Model VRAM Reference
| Model | VRAM Required | Recommended GPU |
| ----- | ------------- | --------------- |
Software Prerequisites
Quick Start
Step 1 — Rent a GPU Server on Clore.ai
Step 2 — Connect to Your Server
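Clore.ai shows the SSH connection details (IP, port, credentials) on the order page for your rented server. A minimal connection sketch, with placeholders you substitute from your own order:

```shell
# Connect using the SSH details from your Clore.ai order page.
# <SSH_PORT> and <SERVER_IP> are placeholders -- substitute your own values.
ssh -p <SSH_PORT> root@<SERVER_IP>

# Once connected, confirm the host sees the GPU before installing anything:
nvidia-smi
```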
Step 3 — Install Docker Compose (if not present)
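Many Clore.ai images ship with Docker preinstalled, so check before installing. The sketch below assumes a Debian/Ubuntu base image with Docker's apt repository already configured; the package name may differ on other distributions:

```shell
# Check what is already present:
docker --version
docker compose version

# If the compose plugin is missing (Debian/Ubuntu with Docker's apt repo):
apt-get update && apt-get install -y docker-compose-plugin
```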
Step 4 — Deploy Jan Server with Docker Compose
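A minimal compose sketch for the server. The image tag, container port (1337), and in-container models path are assumptions here; check the Jan/Cortex documentation for the current published image name and defaults before deploying:

```yaml
# docker-compose.yml -- sketch; image tag, port, and paths are assumptions.
services:
  jan-server:
    image: <jan-server-image>     # use the image name from Jan's docs
    restart: unless-stopped
    ports:
      - "1337:1337"               # OpenAI-compatible API (assumed default port)
    volumes:
      - ./models:/app/models      # persist downloaded models across restarts
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Then bring the stack up in the background with `docker compose up -d`.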
Step 5 — Verify the Server is Running
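A quick liveness check against the OpenAI-compatible API, assuming the port mapping from the compose file above (1337 is an assumption; match whatever you mapped):

```shell
# List available models -- a JSON response means the API is up:
curl http://localhost:1337/v1/models

# Tail the server logs if the request fails:
docker compose logs -f jan-server
```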
Step 6 — Pull Your First Model
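A sketch of pulling a model over the API. The `/v1/models/pull` route is an assumption based on the Cortex engine's API surface, and the model id is only an example; verify both against the Jan/Cortex docs for your version:

```shell
# Pull a model from the hub (endpoint and model id are assumptions):
curl -X POST http://localhost:1337/v1/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b-gguf"}'
```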
Step 7 — Start the Model & Chat
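Starting the model and sending a first prompt can be sketched as below. The `/v1/models/start` route is assumed from the Cortex engine's API; the chat call itself uses the standard OpenAI-style `/v1/chat/completions` route:

```shell
# Start the model (start endpoint is an assumption -- check the Cortex docs):
curl -X POST http://localhost:1337/v1/models/start \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b-gguf"}'

# Chat through the OpenAI-compatible completions route:
curl http://localhost:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3.2:3b-gguf",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```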
Configuration
Environment Variables
| Variable | Default | Description |
| -------- | ------- | ----------- |
Multi-GPU Configuration
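On multi-GPU rentals you can pin the container to specific devices using Docker Compose's standard GPU reservation syntax, selecting GPUs by index:

```yaml
# Expose only selected GPUs to the container (device_ids picks by index):
services:
  jan-server:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1"]   # or use `count: all` for every GPU
              capabilities: [gpu]
```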
Custom Model Configuration
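To serve a model file you downloaded yourself (e.g. a GGUF quantization), mount it into the container's models directory. The in-container path below is an assumption; match your image's actual models directory:

```yaml
# Sketch: mount a local GGUF so the server can load it without pulling
# from the hub. /app/models is an assumed path -- match your image.
services:
  jan-server:
    volumes:
      - ./models/my-model.gguf:/app/models/my-model.gguf
```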
Securing the API with a Token
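Clore.ai servers are reachable from the public internet, so the API should not be exposed unauthenticated. One generic approach, independent of any built-in auth the server may offer, is a reverse proxy that requires a static bearer token. A minimal nginx sketch, assuming the API listens on port 1337:

```nginx
# Reject any request without the expected bearer token.
# Replace CHANGE_ME with a long random secret; 1337 is the assumed API port.
server {
    listen 8080;
    location / {
        if ($http_authorization != "Bearer CHANGE_ME") {
            return 401;
        }
        proxy_pass http://127.0.0.1:1337;
    }
}
```

Clients then send `Authorization: Bearer CHANGE_ME` with every request, exactly as they would with the real OpenAI API.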
GPU Acceleration
Verifying CUDA Acceleration
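Two quick checks that inference is actually running on the GPU rather than silently falling back to CPU:

```shell
# The container itself should see the GPU:
docker compose exec jan-server nvidia-smi

# Watch GPU utilization on the host while sending a test prompt;
# near-zero utilization during generation suggests CPU fallback.
nvidia-smi dmon -s u
```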
Switching Inference Backends
Context Window and Batch Size Tuning
| Parameter | Description | Recommendation |
| --------- | ----------- | -------------- |
Tips & Best Practices
🎯 Model Selection for Clore.ai Budgets
💾 Persistent Model Storage
🔗 Using Jan Server as OpenAI Drop-in
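Because the server speaks the OpenAI chat-completions protocol, any OpenAI client can target it by swapping the base URL. A standard-library-only sketch; the port (1337) is an assumption to match your deployment, and `build_chat_request` is a helper defined here for illustration, not part of any SDK:

```python
"""Use the Jan server as a drop-in OpenAI-compatible backend."""
import json
import urllib.request

BASE_URL = "http://localhost:1337/v1"  # assumed default port -- match yours


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style /chat/completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_chat_request("llama3.2:3b-gguf", "Hello!")
    # With a running server, send it and read the reply:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
    print(req.full_url)
```

The official `openai` Python SDK works the same way: point its `base_url` at `http://<server>:1337/v1` and use any placeholder API key (or the token your reverse proxy expects).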
📊 Monitoring Resource Usage
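Two commands cover most monitoring needs on a rented GPU box: container CPU/RAM from Docker, and GPU utilization/VRAM from the NVIDIA driver:

```shell
# One-shot container CPU and memory usage:
docker stats --no-stream

# GPU utilization and VRAM, refreshed every 5 seconds:
nvidia-smi --query-gpu=utilization.gpu,memory.used,memory.total \
  --format=csv -l 5
```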
Troubleshooting
Container fails to start — GPU not found
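When the container cannot see a GPU, the usual culprit is a missing or broken NVIDIA container toolkit. A checklist sketch (the CUDA image tag is only an example):

```shell
# 1. Driver works on the host?
nvidia-smi

# 2. Docker can pass the GPU through? (image tag is an example)
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

# 3. If step 2 fails, (re)install the toolkit and restart Docker:
apt-get install -y nvidia-container-toolkit
systemctl restart docker
```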
Model download stuck or fails
Out of VRAM (CUDA out of memory)
Cannot connect to API from outside the container
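If the API answers on `localhost` inside the server but not from your machine, check that the port is actually published on all interfaces and that Clore.ai forwards it for your rental. A compose sketch (1337 is the assumed API port):

```yaml
# Bind explicitly to all interfaces, not just the container-internal address:
services:
  jan-server:
    ports:
      - "0.0.0.0:1337:1337"
```

Then connect from outside using the public IP and the forwarded port shown on your Clore.ai order page, which may differ from 1337.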
Slow inference (CPU fallback)
Further Reading