For the complete documentation index, see llms.txt. This page is also available as Markdown.

CodeLlama

Generate, complete, and explain code with CodeLlama on Clore.ai

Newer alternatives! For coding tasks, consider Qwen2.5-Coder (32B, state-of-the-art code gen) or DeepSeek-R1 (reasoning + coding). CodeLlama is still useful for lightweight deployments.

Generate, complete, and explain code with Meta's CodeLlama.

Renting on CLORE.AI

  1. Filter by GPU type, VRAM, and price

  2. Choose On-Demand (fixed rate) or Spot (bid price)

  3. Configure your order:

    • Select Docker image

    • Set ports (TCP for SSH, HTTP for web UIs)

    • Add environment variables if needed

    • Enter startup command

  4. Select payment: CLORE, BTC, or USDT/USDC

  5. Create order and wait for deployment

Access Your Server

  • Find connection details in My Orders

  • Web interfaces: Use the HTTP port URL

  • SSH: ssh -p <port> root@<proxy-address>

Model Variants

Model
Size
VRAM
Best For

CodeLlama-7B

7B

8GB

Fast completion

CodeLlama-13B

13B

16GB

Balanced

CodeLlama-34B

34B

40GB

Best quality

CodeLlama-70B

70B

80GB+

Maximum quality

Variants

  • Base: Code completion

  • Instruct: Follow instructions

  • Python: Python-specialized

Quick Deploy

Docker Image:

Ports:

Command:

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.

Installation

Using Ollama

Using Transformers

Code Completion

Instruct Model

For following coding instructions:

Fill-in-the-Middle (FIM)

Python-Specialized Model

vLLM Server

API Usage

Code Explanation

Bug Fixing

Code Translation

Gradio Interface

Batch Processing

Use with Continue (VSCode)

Configure Continue extension:

Performance

Model
GPU
Tokens/sec

CodeLlama-7B

RTX 3090

~90

CodeLlama-7B

RTX 4090

~130

CodeLlama-13B

RTX 4090

~70

CodeLlama-34B

A100

~50

Troubleshooting

Poor Code Quality

  • Lower temperature (0.1-0.3)

  • Use Instruct variant

  • Larger model if possible

Incomplete Output

  • Increase max_new_tokens

  • Check context length

Slow Generation

  • Use vLLM

  • Quantize model

  • Use smaller variant

Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

GPU
Hourly Rate
Daily Rate
4-Hour Session

RTX 3060

~$0.03

~$0.70

~$0.12

RTX 3090

~$0.06

~$1.50

~$0.25

RTX 4090

~$0.10

~$2.30

~$0.40

A100 40GB

~$0.17

~$4.00

~$0.70

A100 80GB

~$0.25

~$6.00

~$1.00

Prices vary by provider and demand. Check CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for flexible workloads (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Compare prices across different providers

Next Steps

  • Open Interpreter - Execute code

  • vLLM Inference - Production serving

  • Mistral/Mixtral - Alternative models

Last updated

Was this helpful?