Text Generation WebUI

Run Text Generation WebUI, one of the most popular LLM interfaces, with support for all major model formats.

Renting on CLORE.AI

  1. Filter by GPU type, VRAM, and price

  2. Choose On-Demand (fixed rate) or Spot (bid price)

  3. Configure your order:

    • Select Docker image

    • Set ports (TCP for SSH, HTTP for web UIs)

    • Add environment variables if needed

    • Enter startup command

  4. Select payment: CLORE, BTC, or USDT/USDC

  5. Create order and wait for deployment

Access Your Server

  • Find connection details in My Orders

  • Web interfaces: Use the HTTP port URL

  • SSH: ssh -p <port> root@<proxy-address>

Why Text Generation WebUI?

  • Supports GGUF, GPTQ, AWQ, EXL2, HF formats

  • Built-in chat, notebook, and API modes

  • Extensions: voice, characters, multimodal

  • Fine-tuning support

  • Model switching on the fly

Requirements

| Model Size | Min VRAM | Recommended GPU |
| --- | --- | --- |
| 7B (Q4) | 6GB | RTX 3060 |
| 13B (Q4) | 10GB | RTX 3080 |
| 30B (Q4) | 20GB | RTX 4090 |
| 70B (Q4) | 40GB | A100 |

Quick Deploy

Docker Image:

Ports:

Environment:
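The original field values were lost from this page. As a hedged sketch only (the image name and environment variable below are assumptions, not CLORE-verified values; the ports follow the SSH/HTTP setup described above), an order might look like:

```text
Docker Image:  atinoda/text-generation-webui:default   # assumed community image
Ports:         7860 (HTTP, web UI), 22 (TCP, SSH)
Environment:   EXTRA_LAUNCH_ARGS="--listen --api"      # assumed image-specific variable
```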

Manual Installation

Image:

Ports:

Command:
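The image and command values were lost from this page. As a hedged sketch (these are standard text-generation-webui launch flags, but verify with `python server.py --help` on your image), the startup command typically looks like:

```text
python server.py --listen --listen-port 7860 --api
```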

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.

Access WebUI

  1. Wait for deployment

  2. Find port 7860 mapping in My Orders

  3. Open: http://<proxy>:<port>

Download Models

From HuggingFace (in WebUI)

  1. Go to Model tab

  2. Enter model name: bartowski/Meta-Llama-3.1-8B-Instruct-GGUF

  3. Click Download

Via Command Line

For Chat:

For Coding:

For Roleplay:
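The download commands above were lost in this page's formatting. As a minimal sketch, GGUF files can be fetched directly via Hugging Face's `resolve` endpoint; the filename below is an illustrative assumption, so check the repo's "Files" tab for exact names:

```python
def hf_file_url(repo_id: str, filename: str) -> str:
    """Build a direct-download URL for a file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"

# Example: a Q4_K_M quantization of the chat model mentioned above
# (filename is illustrative; confirm it exists in the repo first).
url = hf_file_url(
    "bartowski/Meta-Llama-3.1-8B-Instruct-GGUF",
    "Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf",
)
```

Download the resulting URL with `wget` into `text-generation-webui/models/` inside your instance.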

Loading Models

GGUF (llama.cpp)

  1. Model tab → Select model folder

  2. Model loader: llama.cpp

  3. Set n-gpu-layers:

    • RTX 3090: 35-40

    • RTX 4090: 45-50

    • A100: 80+

  4. Click Load
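The per-GPU numbers above boil down to "offload as many layers as fit in free VRAM." A rough heuristic, with assumed illustrative constants (the ~0.55 GB-per-layer and 33-layer figures roughly fit an 8B Q4 model, and are not measured values):

```python
def estimate_gpu_layers(free_vram_gb: float, total_layers: int = 33,
                        gb_per_layer: float = 0.55, reserve_gb: float = 1.5) -> int:
    """Rough n-gpu-layers estimate: offload as many layers as fit,
    keeping some VRAM in reserve for the KV cache and activations."""
    usable = max(free_vram_gb - reserve_gb, 0.0)
    return min(total_layers, int(usable / gb_per_layer))

# A 24 GB RTX 3090/4090 fits all layers of an 8B Q4 model:
estimate_gpu_layers(24.0)   # -> 33
# An 8 GB card gets a partial offload:
estimate_gpu_layers(8.0)    # -> 11
```

Check actual usage with `nvidia-smi` after loading and adjust; larger models have more (and bigger) layers, so tune the constants per model.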

GPTQ (Fast, quantized)

  1. Download GPTQ model

  2. Model loader: ExLlama_HF or AutoGPTQ

  3. Load model

EXL2 (Best speed)

  1. Download EXL2 model

  2. Model loader: ExLlamav2_HF

  3. Load

Chat Configuration

Character Setup

  1. Go to Parameters → Character

  2. Create or load character card

  3. Set:

    • Name

    • Context/persona

    • Example dialogue
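As an illustrative sketch, a minimal character card covering those three fields (text-generation-webui loads YAML cards; the persona below is invented, and field names should be checked against a card shipped with your install):

```yaml
name: Ada
context: |
  Ada is a patient senior software engineer who answers
  questions with short, practical code examples.
greeting: Hi! What are we building today?
example_dialogue: |
  You: How do I reverse a list in Python?
  Ada: Use `my_list[::-1]` -- it returns a reversed copy.
```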

Instruct Mode

For instruction-tuned models:

  1. Parameters → Instruction template

  2. Select template matching your model:

    • Llama-2-chat

    • Mistral

    • ChatML

    • Alpaca
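For reference, each template defines how turns are wrapped. The ChatML template listed above, for example, marks each role explicitly:

```text
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```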

API Usage

Enable API

Start with the `--api` flag (default port: 5000)

OpenAI-compatible API

Native API
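The example requests were lost from this page. As a minimal stdlib-only sketch against the OpenAI-compatible endpoint (replace the host with your http_pub URL or proxy:port; the path follows the standard OpenAI chat-completions convention):

```python
import json
from urllib import request

API_BASE = "http://localhost:5000/v1"  # replace with your http_pub URL

def build_payload(messages, max_tokens=200):
    """JSON body for an OpenAI-style chat-completion request."""
    return json.dumps({"messages": messages, "max_tokens": max_tokens})

def chat(messages):
    """POST to the OpenAI-compatible endpoint (requires a running instance)."""
    req = request.Request(
        f"{API_BASE}/chat/completions",
        data=build_payload(messages).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# Usage (with a live instance):
# chat([{"role": "user", "content": "Hello!"}])
```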

Extensions

Installing Extensions

Enable Extensions

  1. Session tab → Extensions

  2. Check boxes for desired extensions

  3. Click Apply and restart

| Extension | Purpose |
| --- | --- |
| silero_tts | Voice output |
| whisper_stt | Voice input |
| superbooga | Document Q&A |
| sd_api_pictures | Image generation |
| multimodal | Image understanding |

Performance Tuning

GGUF Settings

Memory Optimization

For limited VRAM:
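The original flag list was lost from this page. As a hedged sketch, commonly used text-generation-webui launch options for tight VRAM (verify each against `python server.py --help` on your image):

```text
--load-in-4bit       # quantize on load (Transformers loader)
--n_ctx 2048         # shrink the context window (llama.cpp loader)
--n-gpu-layers 20    # offload only part of the model to GPU
```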

Speed Optimization

Fine-tuning (LoRA)

Training Tab

  1. Go to Training tab

  2. Load base model

  3. Upload dataset (JSON format)

  4. Configure:

    • LoRA rank: 8-32

    • Learning rate: 1e-4

    • Epochs: 3-5

  5. Start training

Dataset Format
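The format example was lost from this page. As a sketch, the Training tab accepts JSON records in alpaca style; the field names below assume the alpaca format template, and the content is invented:

```json
[
  {
    "instruction": "Summarize the following text.",
    "input": "CLORE.AI is a GPU rental marketplace...",
    "output": "CLORE.AI lets users rent GPUs on demand."
  }
]
```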

Saving Your Work

Troubleshooting

Model won't load

  • Check VRAM usage: nvidia-smi

  • Reduce n_gpu_layers

  • Use smaller quantization (Q4_K_M → Q4_K_S)

Slow generation

  • Increase n_gpu_layers

  • Use EXL2 instead of GGUF

  • Enable --no-mmap

Out of memory during generation

  • Reduce `n_ctx` (context length)

  • Use `--n-gpu-layers 0` for CPU-only

  • Try smaller model

Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU | Hourly Rate | Daily Rate | 4-Hour Session |
| --- | --- | --- | --- |
| RTX 3060 | ~$0.03 | ~$0.70 | ~$0.12 |
| RTX 3090 | ~$0.06 | ~$1.50 | ~$0.25 |
| RTX 4090 | ~$0.10 | ~$2.30 | ~$0.40 |
| A100 40GB | ~$0.17 | ~$4.00 | ~$0.70 |
| A100 80GB | ~$0.25 | ~$6.00 | ~$1.00 |
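The session figures in the table follow directly from the hourly rate; a one-liner for budgeting a run (the rates used are the approximate 2024 figures above):

```python
def session_cost(hourly_rate: float, hours: float) -> float:
    """Estimated cost of a rental session, rounded to cents."""
    return round(hourly_rate * hours, 2)

session_cost(0.10, 4)    # RTX 4090, 4 hours -> 0.4
session_cost(0.17, 24)   # A100 40GB, a full day -> 4.08
```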

Prices vary by provider and demand. Check the CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for flexible workloads (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Compare prices across different providers
