# Troubleshooting

Common issues and solutions when renting GPU servers on the CLORE.AI marketplace.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

{% hint style="info" %}
This guide is based on CLORE.AI platform technical documentation.
{% endhint %}

## Table of Contents

* [Order Creation Issues](#order-creation-issues)
* [Connection Issues](#connection-issues)
* [Container Issues](#container-issues)
* [GPU Issues](#gpu-issues)
* [Payment Issues](#payment-issues)
* [Platform Limits](#platform-limits)

***

## Order Creation Issues

### Order fails: "Insufficient balance"

**Cause:** Not enough funds to cover the creation fee and minimum deposit.

**Solution:**

* Check your balance in the selected currency (CLORE, BTC, or USDT/USDC)
* The creation fee is charged when the order is created
* Top up your balance with enough for several hours of rental

### Order fails: "Server not available"

**Cause:** Server is already rented or offline.

**Solution:**

* Refresh the marketplace page
* Check server status (online/offline indicator)
* For Spot rentals - you may have been outbid

### Order stuck in "Creating" status

**Cause:** Container is deploying or an error occurred.

**Solution:**

1. Wait 2-5 minutes (Docker image is being pulled)
2. Check logs in **My Orders**
3. Large images (10 GB+) take longer to download
4. If stuck for more than 10 minutes - cancel and retry

***

## Connection Issues

### Cannot connect via SSH

**Cause:** Port not configured or container not ready.

**Checklist:**

1. Port 22 must be set as **TCP** (not HTTP)
2. Container status must be **Active** (not Creating)
3. Use the correct mapped port from **My Orders**

**Correct SSH command:**

```bash
ssh -p <port> root@<proxy-address>
```

Where `<port>` is the public (mapped) port shown in **My Orders** (e.g., 45678), NOT port 22, and `<proxy-address>` is the proxy address shown there.

### SSH works but web interface doesn't open

**Cause:** Port set as TCP instead of HTTP, or service not running.

**Solution:**

1. Web interface ports must be set as **HTTP** (not TCP)
2. Service must listen on `0.0.0.0`, not `localhost`
3. Check logs - service may have crashed on startup

**Correct port configuration:**

```
22/tcp    - SSH access
7860/http - Gradio/WebUI interface
8000/http - API server
```

### "Connection refused" error

**Cause:** Service inside the container is not running or is listening on the wrong address.

**Solution:**

1. SSH into the container and check service status:

   ```bash
   ps aux | grep python
   netstat -tlnp
   ```
2. Service must listen on `0.0.0.0`, not `127.0.0.1`:

   ```bash
   # Wrong:
   python app.py --host 127.0.0.1

   # Correct:
   python app.py --host 0.0.0.0
   ```

### "Connection timed out" error

**Cause:** Wrong address/port or network issues.

**Checklist:**

1. Use the Proxy address from **My Orders** (not the server IP!)
2. Use the Mapped port (public port, not container port)
3. Use the correct protocol (`http://` for HTTP ports)

***

## Container Issues

### Container keeps restarting

**Cause:** Error in startup command or insufficient resources.

**Solution:**

1. Check logs in **My Orders**
2. Simplify the startup command:

   ```bash
   # Bad - long command may fail:
   apt update && apt install -y ... && pip install ... && python ...

   # Better - start with simple command:
   sleep infinity
   ```
3. Then SSH in and configure manually

### Cannot reset container

**Cause:** Cooldown period between resets.

**Fact:** Reset container has a **120 second** cooldown.

**Solution:** Wait 2 minutes between reset attempts.

### Data lost after restart

**Cause:** Data not in persistent storage.

**Important:**

* Data inside the container is **preserved** on Reset Container
* Data is **lost** when the order is cancelled or expires
* Always download results before ending the rental:

```bash
scp -P <port> root@<proxy-address>:/workspace/results.tar.gz ./
```

### Startup command not executing

**Cause:** Syntax error or image issue.

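One syntax problem that is easy to miss is whitespace after a line-continuation backslash: the backslash then escapes the space instead of the newline, so the shell treats the next line as a separate command. The following is a minimal sketch in Python for spotting this before pasting a multi-line startup command. The helper name `find_broken_continuations` is hypothetical and not part of any CLORE.AI tooling, and the check is deliberately simple (it does not understand quoting or escaped backslashes).

```python
import re

# Hypothetical helper, not part of CLORE.AI tooling.
# A backslash continues a line only when it is the very last character;
# a backslash followed by spaces or tabs escapes the whitespace instead.
_BROKEN_CONTINUATION = re.compile(r"\\[ \t]+$")


def find_broken_continuations(script: str) -> list[int]:
    """Return 1-based numbers of lines that end in a backslash plus whitespace."""
    return [
        number
        for number, line in enumerate(script.splitlines(), start=1)
        if _BROKEN_CONTINUATION.search(line)
    ]


if __name__ == "__main__":
    # The first line ends with "\ " (backslash, space); the second ends with "\".
    startup = "apt update && \\ \napt install -y git && \\\npython app.py\n"
    for number in find_broken_continuations(startup):
        print(f"line {number}: whitespace after trailing backslash")
```

Running the check on a startup command before saving it in the order form is cheaper than waiting for the container to fail and reading the logs.
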
**Common mistakes:**

```bash
# Error: a space after the trailing \ breaks the line continuation
apt update && \ 
apt install -y git

# Correct: \ is the last character on each continued line
apt update && \
apt install -y git && \
python app.py
```

**Solution:**

1. Use a simple startup command: `bash` or `sleep infinity`
2. Configure everything via SSH
3. Or create a custom Docker image with pre-installed software

***

## GPU Issues

### GPU not visible in container

**Check:**

```bash
nvidia-smi
```

**If command not found:**

* Docker image must support CUDA
* Use CUDA-enabled images: `pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime`

**If GPU not displayed:**

* Verify the server has a GPU (check marketplace listing)
* Contact the server provider

### CUDA version mismatch

**Error:** `CUDA driver version is insufficient for CUDA runtime version`

**Cause:** CUDA version in the image is incompatible with the server driver.

**Solution:**

* Check the driver version and the highest CUDA version it supports: `nvidia-smi` (top right corner of the output)
* Use an image with a compatible CUDA version
* Safe choices: CUDA 11.8, CUDA 12.1

### Out of GPU memory

**Error:** `CUDA out of memory`

**Solutions:**

1. Use a smaller model or quantization
2. Add memory optimization flags:
   * Stable Diffusion: `--medvram` or `--lowvram`
   * LLMs: `load_in_4bit=True` or `load_in_8bit=True`
3. Clear cached memory: `torch.cuda.empty_cache()`
4. Rent a server with more VRAM

***

## Payment Issues

### Supported currencies

CLORE.AI supports three currencies:

* **CLORE** - platform's native token
* **BTC** - Bitcoin
* **USD** - stablecoins (USDT/USDC, if enabled by provider)

### Order cancelled: "Outbid"

**Cause:** Someone offered a higher price on the Spot market.

**Solution:**

* Use **On-Demand** for guaranteed rental
* Or increase your Spot bid price

### Balance charged but order not created

**Cause:** Creation fee is charged even if the order fails.

**Solution:**

* Creation fee is usually minimal
* Check the cancellation reason in history
* Contact support for recurring issues

***

## Platform Limits

Verified from CLORE.AI codebase:

| Parameter                   | Limit                         |
| --------------------------- | ----------------------------- |
| Ports per order             | **5**                         |
| Total environment variables | **12,288 characters** (12 KB) |
| Single env var name         | 128 characters                |
| Single env var value        | 1,536 characters              |
| SSH key                     | **3,072 characters**          |
| SSH password                | **32 characters**             |
| Jupyter token               | **32 characters**             |
| Container reset cooldown    | **120 seconds**               |
| Port range                  | 1-65535                       |
| Port protocols              | TCP or HTTP only              |

***

## Environment Variables

Use environment variables for SSH and Jupyter access:

| Variable        | Purpose                | Max Length  |
| --------------- | ---------------------- | ----------- |
| `SSH_KEY`       | Your public SSH key    | 3,072 chars |
| `SSH_PASSWORD`  | SSH password           | 32 chars    |
| `JUPYTER_TOKEN` | Jupyter notebook token | 32 chars    |

**Example configuration:**

```
SSH_PASSWORD=mypassword123
JUPYTER_TOKEN=mysecrettoken
```

***

## Diagnostic Commands

```bash
# Check GPU
nvidia-smi

# Check memory usage
free -h

# Check disk space
df -h

# Check running processes
ps aux | grep python

# Check open ports
netstat -tlnp

# Check recent error logs
dmesg | tail -50

# Clear cached GPU memory (run in the Python session that holds it, not in the shell)
import torch
torch.cuda.empty_cache()
```

***

## Getting Help

If the issue persists:

1. Check [CLORE.AI Documentation](https://docs.clore.ai/)
2. Describe the issue with logs and screenshots
3. Include the order ID and server ID

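The limits in the Platform Limits table can be checked locally before an order is submitted. The following is a minimal sketch in Python, not CLORE.AI's own validation: the function name `check_order` and its input shapes are hypothetical, and two points the table does not specify are treated as assumptions (the total environment size is counted as the sum of name and value lengths, and `SSH_KEY`, `SSH_PASSWORD`, and `JUPYTER_TOKEN` use their own limits instead of the general value limit).

```python
# Limits copied from the Platform Limits table.
MAX_PORTS = 5
PORT_RANGE = range(1, 65536)
ALLOWED_PROTOCOLS = {"tcp", "http"}
MAX_ENV_TOTAL_CHARS = 12_288
MAX_ENV_NAME_CHARS = 128
MAX_ENV_VALUE_CHARS = 1_536
# Assumption: these variables use their own limit instead of the general one.
SPECIAL_VALUE_LIMITS = {"SSH_KEY": 3_072, "SSH_PASSWORD": 32, "JUPYTER_TOKEN": 32}


def check_order(ports: dict[int, str], env: dict[str, str]) -> list[str]:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if len(ports) > MAX_PORTS:
        problems.append(f"{len(ports)} ports requested, limit is {MAX_PORTS}")
    for port, protocol in ports.items():
        if port not in PORT_RANGE:
            problems.append(f"port {port} is outside 1-65535")
        if protocol.lower() not in ALLOWED_PROTOCOLS:
            problems.append(f"port {port}: protocol {protocol!r} is not TCP or HTTP")
    # Assumption: the total is the sum of all name and value lengths.
    total = sum(len(name) + len(value) for name, value in env.items())
    if total > MAX_ENV_TOTAL_CHARS:
        problems.append(f"environment is {total} characters, limit is {MAX_ENV_TOTAL_CHARS}")
    for name, value in env.items():
        if len(name) > MAX_ENV_NAME_CHARS:
            problems.append(f"variable name {name[:20]!r}... exceeds {MAX_ENV_NAME_CHARS} characters")
        limit = SPECIAL_VALUE_LIMITS.get(name, MAX_ENV_VALUE_CHARS)
        if len(value) > limit:
            problems.append(f"{name} value is {len(value)} characters, limit is {limit}")
    return problems


if __name__ == "__main__":
    ports = {22: "tcp", 7860: "http", 8000: "http"}
    env = {"SSH_PASSWORD": "mypassword123", "JUPYTER_TOKEN": "mysecrettoken"}
    for problem in check_order(ports, env) or ["no limit violations found"]:
        print(problem)
```

Returning a list of messages instead of raising on the first violation lets all problems in a configuration be fixed in one pass.
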