CodeLlama

Generate, complete, and explain code with Meta's CodeLlama.

Renting on CLORE.AI

  1. Filter by GPU type, VRAM, and price

  2. Choose On-Demand (fixed rate) or Spot (bid price)

  3. Configure your order:

    • Select Docker image

    • Set ports (TCP for SSH, HTTP for web UIs)

    • Add environment variables if needed

    • Enter startup command

  4. Select payment: CLORE, BTC, or USDT/USDC

  5. Create order and wait for deployment

Access Your Server

  • Find connection details in My Orders

  • Web interfaces: Use the HTTP port URL

  • SSH: ssh -p <port> root@<proxy-address>

Model Variants

| Model | Size | VRAM | Best For |
| --- | --- | --- | --- |
| CodeLlama-7B | 7B | 8GB | Fast completion |
| CodeLlama-13B | 13B | 16GB | Balanced |
| CodeLlama-34B | 34B | 40GB | Best quality |
| CodeLlama-70B | 70B | 80GB+ | Maximum quality |

Variants

  • Base: Code completion

  • Instruct: Follow instructions

  • Python: Python-specialized

Quick Deploy

Docker Image:

Ports:

Command:
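
One reasonable way to fill these fields, assuming an Ollama-based deployment (the image, port, and command below are illustrative, not required values):

  • Docker Image: ollama/ollama (the official Ollama image; any image with your preferred runtime works)

  • Ports: 11434 (HTTP, Ollama's default API port) and 22 (TCP, for SSH)

  • Command: ollama serve (the official image already runs this by default, so the field can usually stay empty)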

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.

Installation

Using Ollama
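
A minimal sketch of pulling and querying CodeLlama through the Ollama Python client (`pip install ollama`). The host and model tag are assumptions; replace the host with your own http_pub URL from My Orders.

```python
# Minimal sketch using the Ollama Python client (pip install ollama).
import ollama

client = ollama.Client(host="https://YOUR_HTTP_PUB_URL")

# Download the model on the rented server (runs once, may take a few minutes).
client.pull("codellama:7b-instruct")

# Ask for code and print the completion.
response = client.generate(
    model="codellama:7b-instruct",
    prompt="Write a Python function that checks whether a string is a palindrome.",
)
print(response["response"])
```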

Using Transformers
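
A minimal sketch with Hugging Face Transformers, assuming the 7B base checkpoint `codellama/CodeLlama-7b-hf` and a GPU with enough VRAM:

```python
# Minimal sketch: load CodeLlama-7B with Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"  # base model; swap for -Instruct-hf or -Python-hf

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights are ~14GB; quantize (e.g., 4-bit) for smaller GPUs
    device_map="auto",
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```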

Code Completion
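
Raw completion just continues whatever prefix you send to a base (non-Instruct) model. A hedged, self-contained sketch via Ollama; the `codellama:7b-code` tag is an assumption about which tag you pulled:

```python
# Hedged sketch: raw code completion with the base (non-instruct) model via Ollama.
import ollama

client = ollama.Client(host="https://YOUR_HTTP_PUB_URL")  # your http_pub URL
prefix = "# Binary search over a sorted list\ndef binary_search(items, target):"

response = client.generate(
    model="codellama:7b-code",  # assumed tag for the base completion model
    prompt=prefix,
    options={"temperature": 0.2, "num_predict": 128},
)
print(prefix + response["response"])
```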

Instruct Model

For following coding instructions:
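
A hedged sketch using the Instruct checkpoint with the Llama-2-style `[INST] ... [/INST]` prompt wrapper it was fine-tuned with; the instruction text is just an example:

```python
# Hedged sketch: instruction following with CodeLlama-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-Instruct-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

instruction = "Write a Python function that merges two sorted lists in O(n) time."
prompt = f"[INST] {instruction} [/INST]"  # Llama-2 chat format used by the Instruct variants

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.2, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```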

Fill-in-the-Middle (FIM)
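
Infilling lets the model complete code between a prefix and a suffix. A hedged sketch using the `<FILL_ME>` placeholder that the CodeLlama tokenizer in Transformers expands into the infilling format; note that infilling was trained into the 7B and 13B base/Instruct checkpoints, not the Python variants:

```python
# Hedged sketch: fill-in-the-middle via the <FILL_ME> placeholder handled by the CodeLlama tokenizer.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

prompt = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''

input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
generated = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens and splice them into the placeholder.
filling = tokenizer.batch_decode(generated[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(prompt.replace("<FILL_ME>", filling))
```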

Python-Specialized Model
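
The Python variants are used exactly like the base model; only the checkpoint changes. Swap the model id to `codellama/CodeLlama-7b-Python-hf` in the Transformers examples above (or pull a `codellama:7b-python`-style tag in Ollama). Keep in mind the Python checkpoints do not support fill-in-the-middle.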

vLLM Server
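
vLLM serves CodeLlama behind an OpenAI-compatible HTTP API. A typical launch on the rented server (an assumption about your setup, not the only option) is `python -m vllm.entrypoints.openai.api_server --model codellama/CodeLlama-7b-Instruct-hf --port 8000`. Expose port 8000 as an HTTP port in your order so the endpoint is reachable through your http_pub URL, then call it as shown under API Usage below.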

API Usage
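
A hedged sketch calling the vLLM endpoint with the `openai` Python client (any HTTP client works, since the API is OpenAI-compatible); the base_url and model name are assumptions following the setup above:

```python
# Hedged sketch: call the vLLM OpenAI-compatible endpoint from any machine.
# Replace the base_url with your http_pub URL (or http://localhost:8000 on the server itself).
from openai import OpenAI

client = OpenAI(base_url="https://YOUR_HTTP_PUB_URL/v1", api_key="not-needed")

completion = client.completions.create(
    model="codellama/CodeLlama-7b-Instruct-hf",
    prompt="# Write a SQL query that returns the ten most recent orders\n",
    max_tokens=128,
    temperature=0.2,
)
print(completion.choices[0].text)
```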

Code Explanation
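
A hedged sketch that sends a snippet to the Instruct model through Ollama's chat endpoint and prints the explanation:

```python
# Hedged sketch: ask the Instruct model to explain a snippet.
import ollama

client = ollama.Client(host="https://YOUR_HTTP_PUB_URL")

snippet = "def f(xs): return [x for x in xs if x == x]"
response = client.chat(
    model="codellama:7b-instruct",
    messages=[{"role": "user", "content": f"Explain what this Python function does:\n\n{snippet}"}],
)
print(response["message"]["content"])
```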

Bug Fixing
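
Bug fixing follows the same chat pattern; only the prompt changes. A short sketch reusing the `client` from the explanation example above:

```python
# Reuses `client` from the code-explanation sketch above.
buggy = "def average(xs):\n    return sum(xs) / len(xs)  # crashes on empty input"
fix = client.chat(
    model="codellama:7b-instruct",
    messages=[{"role": "user", "content": f"Find and fix the bug in this function:\n\n{buggy}"}],
)
print(fix["message"]["content"])
```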

Code Translation
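
Translation between languages is again just a prompt; another short sketch reusing the same `client`:

```python
# Reuses `client` from the sketches above.
js = "const evens = arr => arr.filter(x => x % 2 === 0);"
translated = client.chat(
    model="codellama:7b-instruct",
    messages=[{"role": "user", "content": f"Translate this JavaScript to idiomatic Python:\n\n{js}"}],
)
print(translated["message"]["content"])
```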

Gradio Interface
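
A hedged sketch of a minimal Gradio front end that forwards instructions to the Ollama endpoint; expose port 7860 as an HTTP port in your order to reach the UI:

```python
# Hedged sketch: a tiny Gradio UI in front of the Ollama endpoint.
import gradio as gr
import ollama

client = ollama.Client(host="https://YOUR_HTTP_PUB_URL")

def ask(instruction: str) -> str:
    response = client.chat(
        model="codellama:7b-instruct",
        messages=[{"role": "user", "content": instruction}],
    )
    return response["message"]["content"]

demo = gr.Interface(
    fn=ask,
    inputs=gr.Textbox(lines=6, label="Instruction"),
    outputs=gr.Textbox(lines=20, label="Response"),
    title="CodeLlama",
)
demo.launch(server_name="0.0.0.0", server_port=7860)  # expose 7860 as an HTTP port in your order
```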

Batch Processing
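
For many prompts at once, vLLM's offline `LLM` class batches generation in a single call. A hedged sketch, run directly on the GPU server:

```python
# Hedged sketch: offline batch generation with vLLM (runs on the GPU server itself).
from vllm import LLM, SamplingParams

llm = LLM(model="codellama/CodeLlama-7b-hf", dtype="float16")
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = [
    "def quicksort(arr):",
    "def is_prime(n):",
    "class LRUCache:",
]

# vLLM schedules the whole list in one call and batches it efficiently on the GPU.
for output in llm.generate(prompts, params):
    print(output.prompt)
    print(output.outputs[0].text)
    print("-" * 40)
```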

Use with Continue (VSCode)

Configure Continue extension:
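
Continue can point at either the Ollama endpoint or an OpenAI-compatible server. In the extension's config (a `config.json` or `config.yaml` under `~/.continue`, depending on the release), add a model entry whose provider is `ollama` (or `openai`), whose model is your tag (for example `codellama:7b-instruct`), and whose API base is your http_pub URL. Exact field names vary between Continue versions, so treat this as a sketch and check the extension's current docs.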

Performance

| Model | GPU | Tokens/sec |
| --- | --- | --- |
| CodeLlama-7B | RTX 3090 | ~90 |
| CodeLlama-7B | RTX 4090 | ~130 |
| CodeLlama-13B | RTX 4090 | ~70 |
| CodeLlama-34B | A100 | ~50 |

Troubleshooting

Poor Code Quality

  • Lower temperature (0.1-0.3)

  • Use Instruct variant

  • Use a larger model if possible

Incomplete Output

  • Increase max_new_tokens

  • Check context length

Slow Generation

  • Use vLLM

  • Quantize model

  • Use smaller variant

Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU | Hourly Rate | Daily Rate | 4-Hour Session |
| --- | --- | --- | --- |
| RTX 3060 | ~$0.03 | ~$0.70 | ~$0.12 |
| RTX 3090 | ~$0.06 | ~$1.50 | ~$0.25 |
| RTX 4090 | ~$0.10 | ~$2.30 | ~$0.40 |
| A100 40GB | ~$0.17 | ~$4.00 | ~$0.70 |
| A100 80GB | ~$0.25 | ~$6.00 | ~$1.00 |

Prices vary by provider and demand. Check the CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for flexible workloads (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Compare prices across different providers

Next Steps

  • Open Interpreter - Execute code

  • vLLM Inference - Production serving

  • Mistral/Mixtral - Alternative models
