TabbyML Code Completion
Self-host TabbyML as a private GitHub Copilot alternative on Clore.ai
TabbyML is a self-hosted AI code completion server — a drop-in replacement for GitHub Copilot that keeps your code entirely on your own infrastructure. Licensed under Apache 2.0, it runs on Clore.ai GPUs and connects to VS Code, JetBrains, and Vim/Neovim via official extensions. Models range from StarCoder2-1B (fits on 4 GB VRAM) to StarCoder2-15B and DeepSeek-Coder for maximum quality.
All examples run on GPU servers rented through the CLORE.AI Marketplace.
Key Features
Self-hosted Copilot alternative — your code never leaves your server
Apache 2.0 license — free for commercial use, no restrictions
IDE extensions — VS Code, JetBrains (IntelliJ, PyCharm, WebStorm), Vim/Neovim
Multiple models — StarCoder2 (1B/3B/7B/15B), DeepSeek-Coder, CodeLlama
Repository context — RAG-powered code retrieval for project-aware completions
Docker deployment — single command to launch with GPU support
Admin dashboard — usage analytics, model management, user management
Chat interface — ask coding questions beyond autocompletion
Requirements
| Component | Minimum | Recommended |
| --- | --- | --- |
| GPU | RTX 3060 12 GB | RTX 3080 10 GB+ |
| VRAM | 4 GB | 10 GB |
| RAM | 8 GB | 16 GB |
| Disk | 20 GB | 50 GB |
| CUDA | 11.8 | 12.1+ |
Clore.ai pricing: RTX 3080 ≈ $0.3–1/day · RTX 3060 ≈ $0.15–0.3/day
TabbyML is lightweight — even an RTX 3060 runs StarCoder2-7B with fast inference.
Quick Start
1. Deploy with Docker
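A minimal launch command, assuming the official `tabbyml/tabby` image; the model identifier must match an entry in Tabby's model registry (StarCoder2-7B is used here as the recommended default from the table below — adjust if your registry spells it differently):

```shell
# Launch Tabby with GPU support; the container is named "tabby" so logs are easy to find.
# The model identifier must match Tabby's model registry; adjust if needed.
docker run -d --name tabby --gpus all \
  -p 8080:8080 \
  -v $HOME/.tabby:/data \
  tabbyml/tabby serve --model StarCoder2-7B --device cuda
```

The server listens on port 8080; follow startup with `docker logs -f tabby` (the first run downloads the model weights, which can take several minutes).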
2. Choose a Model
| Model | VRAM | Speed | Quality | Best for |
| --- | --- | --- | --- | --- |
| StarCoder2-1B | ~3 GB | Fastest | Basic | RTX 3060, fast drafts |
| StarCoder2-3B | ~5 GB | Fast | Good | General development |
| StarCoder2-7B | ~8 GB | Medium | High | Recommended default |
| StarCoder2-15B | ~16 GB | Slower | Best | Complex codebases |
| DeepSeek-Coder-6.7B | ~8 GB | Medium | High | Python, JS, TypeScript |
| CodeLlama-7B | ~8 GB | Medium | Good | General purpose |
Switch models by changing the --model flag:
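For example, to swap the running server over to DeepSeek-Coder (the exact registry identifier is an assumption; check Tabby's model list for the current spelling):

```shell
# Stop the running server and relaunch with a different model.
docker rm -f tabby
docker run -d --name tabby --gpus all -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby serve --model DeepseekCoder-6.7B --device cuda
```

Previously downloaded weights stay cached in the `$HOME/.tabby` volume, so switching back is fast.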
3. Install IDE Extensions
VS Code:
Open Extensions (Ctrl+Shift+X)
Search "Tabby" and install the official extension
Open Settings → search "Tabby"
Set the server endpoint:
http://<your-clore-ip>:8080
JetBrains (IntelliJ, PyCharm, WebStorm):
Settings → Plugins → Marketplace
Search "Tabby" and install
Settings → Tools → Tabby → Server endpoint:
http://<your-clore-ip>:8080
Vim/Neovim:
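One way to set this up is with the vim-plug plugin manager; the plugin repository name and config path below reflect the Tabby docs at time of writing, so verify them against the current README:

```vim
" In your .vimrc / init.vim, with vim-plug installed.
" The tabby-agent behind the plugin requires Node.js.
Plug 'TabbyML/vim-tabby'
```

Then point the plugin at your server by setting `endpoint = "http://<your-clore-ip>:8080"` under the `[server]` section of the agent config (commonly `~/.tabby-client/agent/config.toml`; the path is an assumption, check the plugin README).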
4. Access the Admin Dashboard
Open http://<your-clore-ip>:8080 in a browser. The dashboard provides:
Completion usage statistics
Model status and performance metrics
User and API token management
Repository indexing configuration
Usage Examples
Add Repository Context (RAG)
Index your repository for project-aware completions:
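One way to do this is via Tabby's config file in the data volume. The `[[repositories]]` schema below reflects older Tabby releases (newer releases also let you add git providers from the admin dashboard); the name and URL are placeholders:

```toml
# ~/.tabby/config.toml (mounted into the container at /data/config.toml)
[[repositories]]
name = "my-project"                                    # hypothetical name
git_url = "https://github.com/your-org/your-repo.git"  # placeholder URL
```

After editing, restart the container (or trigger indexing from the dashboard) so the repository gets indexed and its context becomes available to completions.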
Use the Chat API
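A sketch using curl, assuming a recent Tabby build that exposes an OpenAI-compatible chat endpoint (older releases used `/v1beta/chat/completions`; adjust the path to your version):

```shell
# Ask the chat model a question; replace <your-clore-ip> with your server's IP.
curl -s http://<your-clore-ip>:8080/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a Python function that reverses a string"}
    ]
  }'
```

Note that chat requires the server to be launched with a chat model (the `--chat-model` flag in recent versions); a code-completion model alone will not answer chat requests.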
Run with Authentication
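Tabby manages users from the admin dashboard: the first account you register becomes the admin, and you can generate per-developer tokens there. API calls then pass the token as a bearer header; the health route is used below as a simple connectivity check, and the token value is a placeholder:

```shell
# Authenticated request; <your-token> comes from the admin dashboard.
curl -s http://<your-clore-ip>:8080/v1/health \
  -H 'Authorization: Bearer <your-token>'
```

IDE extensions prompt for the same token when connecting to an auth-enabled server.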
Run Without Docker (Direct Install)
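Tabby also ships pre-built binaries. A sketch, assuming you have already downloaded the CUDA build for your platform from the TabbyML/tabby GitHub releases page and have CUDA 11.8+ installed:

```shell
# Make the downloaded binary executable and start the server on port 8080.
chmod +x tabby
./tabby serve --model StarCoder2-7B --device cuda --port 8080
```

Running without Docker avoids the container layer but leaves CUDA driver/toolkit setup to you; on Clore.ai rentals the Docker route is usually simpler.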
Cost Comparison
| Option | Cost | Hosting | Typical latency |
| --- | --- | --- | --- |
| GitHub Copilot | $19/user/mo | ❌ Cloud | ~200 ms |
| TabbyML on RTX 3060 | ~$5–9/mo | ✅ Self-hosted | ~50 ms |
| TabbyML on RTX 3080 | ~$9–30/mo | ✅ Self-hosted | ~30 ms |
| TabbyML on RTX 4090 | ~$15–60/mo | ✅ Self-hosted | ~15 ms |
For a small team (3–5 developers), a single RTX 3080 on Clore.ai replaces multiple Copilot subscriptions at a fraction of the cost.
Tips
StarCoder2-7B is the sweet spot — best quality-to-VRAM ratio for most teams
Enable repository context — RAG indexing dramatically improves completion relevance for large codebases
Expose port 8080 securely — use SSH tunneling or a reverse proxy with TLS for production deployments
Monitor VRAM usage — run nvidia-smi to ensure the model fits with headroom for inference batching
Use the completion API for CI/CD integration — automate code review suggestions
Tabby supports multiple users — the admin dashboard lets you create API tokens per developer
Latency matters — choose a Clore.ai server geographically close to your team for the fastest completions
Troubleshooting
| Problem | Solution |
| --- | --- |
| Docker container exits immediately | Check logs with `docker logs tabby`; the model likely needs more VRAM than is available |
| IDE extension not connecting | Verify the endpoint URL; check firewall/port forwarding on Clore.ai |
| Slow completions | Use a smaller model, or ensure the GPU is not shared with other tasks |
| CUDA out of memory | Switch to a smaller model (StarCoder2-3B or 1B) |
| Repository indexing stuck | Check disk space and ensure the git repo is accessible |
| Auth token rejected | Regenerate the token in the admin dashboard and update the IDE extension |
| High latency from remote IDE | Use an SSH tunnel: `ssh -L 8080:localhost:8080 root@<clore-ip>` |