# Language Models

- [Overview](/guides/language-models/language-models.md): Introduction to running LLMs on Clore.ai GPUs
- [Ollama](/guides/language-models/ollama.md): Run LLMs locally with Ollama on Clore.ai GPUs
- [Open WebUI](/guides/language-models/open-webui.md): ChatGPT-like interface for running LLMs on Clore.ai GPUs
- [vLLM](/guides/language-models/vllm.md): High-throughput LLM inference with vLLM on Clore.ai GPUs
- [Llama.cpp Server](/guides/language-models/llamacpp-server.md): Efficient LLM inference with llama.cpp server on Clore.ai GPUs
- [Text Generation WebUI](/guides/language-models/text-generation-webui.md): Run text-generation-webui for LLM inference on Clore.ai GPUs
- [ExLlamaV2](/guides/language-models/exllamav2-fast.md): Maximum speed LLM inference with ExLlamaV2 on Clore.ai GPUs
- [LocalAI](/guides/language-models/localai-openai-compatible.md): Self-hosted OpenAI-compatible API with LocalAI on Clore.ai
- [Llama 3.3 70B](/guides/language-models/llama33.md): Run Meta's Llama 3.3 70B model on Clore.ai GPUs
- [Mistral & Mixtral](/guides/language-models/mistral-mixtral.md): Run Mistral and Mixtral models on Clore.ai GPUs
- [DeepSeek Coder](/guides/language-models/deepseek-coder.md): Best-in-class code generation with DeepSeek Coder on Clore.ai
- [DeepSeek-V3](/guides/language-models/deepseek-v3.md): Run DeepSeek-V3 with exceptional reasoning on Clore.ai GPUs
- [DeepSeek-R1 Reasoning Model](/guides/language-models/deepseek-r1.md): Run DeepSeek-R1 open-source reasoning model on Clore.ai GPUs
- [Qwen2.5](/guides/language-models/qwen25.md): Run Alibaba's Qwen2.5 multilingual LLMs on Clore.ai GPUs
- [CodeLlama](/guides/language-models/codellama.md): Generate, complete, and explain code with CodeLlama on Clore.ai
- [Gemma 2](/guides/language-models/gemma2.md): Run Google's Gemma 2 models efficiently on Clore.ai GPUs
- [Phi-4](/guides/language-models/phi4.md): Run Microsoft's Phi-4 small language model on Clore.ai GPUs
- [Llama 4 (Scout & Maverick)](/guides/language-models/llama4.md): Run Meta Llama 4 Scout & Maverick MoE models on Clore.ai GPUs
- [Gemma 3](/guides/language-models/gemma3.md): Run Google Gemma 3 multimodal models on Clore.ai — beats Llama-405B at 15x smaller size
- [Mistral Small 3.1](/guides/language-models/mistral-small.md): Deploy Mistral Small 3.1 (24B) on Clore.ai — the ideal single-GPU production model
- [Qwen3.5](/guides/language-models/qwen35.md): Run Alibaba Qwen3.5 on Clore.ai — the latest frontier model (Feb 2026)
- [GLM-5](/guides/language-models/glm5.md): Deploy GLM-5 (744B MoE) by Zhipu AI on Clore.ai — API access and self-hosting with vLLM
- [GLM-4.7-Flash](/guides/language-models/glm-47-flash.md): Deploy GLM-4.7-Flash (30B MoE) by Zhipu AI on Clore.ai — efficient language model with 59.2% SWE-bench performance
- [Kimi K2.5](/guides/language-models/kimi-k2.md): Deploy Kimi K2.5 (1T MoE multimodal) by Moonshot AI on Clore.ai GPUs
- [Mistral Large 3 (675B MoE)](/guides/language-models/mistral-large3.md): Run Mistral Large 3 — a 675B MoE frontier model with 41B active parameters on Clore.ai GPUs
- [MiMo-V2-Flash](/guides/language-models/mimo-v2-flash.md): Deploy MiMo-V2-Flash (309B MoE) with speculative decoding on Clore.ai — ultra-fast inference with 150+ tok/s
- [Ling-2.5-1T (1 Trillion Parameters)](/guides/language-models/ling25.md): Run Ling-2.5-1T — Ant Group's 1 trillion parameter open-source LLM with hybrid linear attention on Clore.ai GPUs
- [LFM2-24B-A2B](/guides/language-models/lfm2-24b.md): Deploy LFM2-24B-A2B by Liquid AI on Clore.ai — hybrid SSM+Attention architecture with 24B total / 2B active parameters
- [DeepSeek V4 (1T MoE, Multimodal)](/guides/language-models/deepseek-v4.md): Deploy DeepSeek V4 — the trillion-parameter multimodal open-weight model — on Clore.ai GPU servers
- [TGI (Text Generation Inference)](/guides/language-models/tgi.md): Run HuggingFace Text Generation Inference (TGI) for production LLM serving on Clore.ai GPUs
- [SGLang](/guides/language-models/sglang.md): Deploy SGLang for high-performance LLM serving with RadixAttention on Clore.ai GPUs
- [Aphrodite Engine](/guides/language-models/aphrodite-engine.md): Run Aphrodite Engine for LLM inference on legacy and modern GPUs on Clore.ai
- [LiteLLM AI Gateway](/guides/language-models/litellm.md): Deploy LiteLLM as an AI Gateway proxy for 100+ LLMs on Clore.ai GPUs
- [MLC-LLM](/guides/language-models/mlc-llm.md): Compiler-accelerated LLM inference with MLC-LLM on Clore.ai GPUs
- [PowerInfer](/guides/language-models/powerinfer.md): Fast LLM inference with GPU-CPU hybrid offloading via PowerInfer on Clore.ai
- [LMDeploy](/guides/language-models/lmdeploy.md): Serve and quantize LLMs with LMDeploy's TurboMind engine on Clore.ai GPUs
- [Mistral.rs](/guides/language-models/mistral-rs.md): Fast Rust-based LLM inference with Mistral.rs on Clore.ai GPUs
