DeepSeek-V3

Run DeepSeek-V3 with exceptional reasoning on Clore.ai GPUs

Run DeepSeek-V3, the state-of-the-art open-source LLM with exceptional reasoning capabilities, on CLORE.AI GPUs.


Updated: DeepSeek-V3-0324 (March 2025) — The latest revision of DeepSeek-V3 brings significant improvements in code generation, mathematical reasoning, and general problem-solving. See the What's New section below for details.

Why DeepSeek-V3?

  • State-of-the-art - Competes with GPT-4o and Claude 3.5 Sonnet

  • 671B MoE - 671B total params, 37B active per token (efficient inference)

  • Improved reasoning - DeepSeek-V3-0324 is significantly better at math and code

  • Efficient - MoE architecture reduces compute costs vs dense models

  • Open source - Fully open weights under MIT license

  • Long context - 128K token context window

What's New in DeepSeek-V3-0324

DeepSeek-V3-0324 (March 2025 revision) introduces meaningful improvements across key domains:

Code Generation

  • +8-12% on HumanEval compared to original V3

  • Better at multi-file codebases and complex refactoring tasks

  • Improved understanding of modern frameworks (FastAPI, Pydantic v2, LangChain v0.3)

  • More reliable at generating complete, runnable code without omissions

Mathematical Reasoning

  • +5.5 points on the MATH-500 benchmark

  • Better step-by-step proof construction

  • Improved numerical accuracy for multi-step problems

  • Enhanced ability to identify and correct mistakes mid-solution

General Reasoning

  • Stronger logical deduction and causal inference

  • Better at multi-step planning tasks

  • More consistent performance on edge cases and ambiguous prompts

  • Improved instruction following on complex, multi-constraint requests

Quick Deploy on CLORE.AI

Docker Image:

Ports:

Command (Multi-GPU Required):
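A minimal deployment sketch, assuming the public vllm/vllm-openai image, port 8000 exposed, and the deepseek-ai/DeepSeek-V3-0324 weights (adjust image tag, volume path, and flags to your CLORE.AI template):

```shell
# Serve DeepSeek-V3-0324 with vLLM across 8 GPUs
docker run --gpus all --ipc=host -p 8000:8000 \
  -v /data/models:/root/.cache/huggingface \
  vllm/vllm-openai:latest \
  --model deepseek-ai/DeepSeek-V3-0324 \
  --tensor-parallel-size 8 \
  --max-model-len 32768 \
  --trust-remote-code
```

The first start downloads several hundred GB of weights, so mount a persistent volume for the Hugging Face cache as shown.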

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.

Verify It's Working

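Assuming the vLLM defaults above, the quickest check is to list the served models on the OpenAI-compatible endpoint:

```shell
# Should return a JSON model list containing "deepseek-ai/DeepSeek-V3-0324"
curl https://YOUR_HTTP_PUB_URL/v1/models
```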

Model Variants

| Model | Parameters | Active | VRAM Required | HuggingFace |
|---|---|---|---|---|
| DeepSeek-V3-0324 | 671B | 37B | 8x 80GB | deepseek-ai/DeepSeek-V3-0324 |
| DeepSeek-V2.5 | 236B | 21B | 4x 80GB | deepseek-ai/DeepSeek-V2.5 |
| DeepSeek-V2-Lite | 15.7B | 2.4B | 24GB | deepseek-ai/DeepSeek-V2-Lite |

Hardware Requirements

Full Precision

| Model | Minimum | Recommended |
|---|---|---|
| DeepSeek-V3-0324 | 8x A100 80GB | 8x H100 80GB |
| DeepSeek-V2.5 | 4x A100 80GB | 4x H100 80GB |
| DeepSeek-V2-Lite | RTX 4090 24GB | A100 40GB |

Quantized (AWQ/GPTQ)

| Model | Quantization | VRAM |
|---|---|---|
| DeepSeek-V3-0324 | INT4 | 4x 80GB |
| DeepSeek-V2.5 | INT4 | 2x 80GB |
| DeepSeek-V2-Lite | INT4 | 8GB |

Installation

Using Transformers
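A sketch of loading DeepSeek weights with Hugging Face Transformers. The full 671B V3-0324 needs 8x 80GB GPUs even at native precision; for a single GPU, point MODEL_ID at deepseek-ai/DeepSeek-V2-Lite-Chat instead.

```python
MODEL_ID = "deepseek-ai/DeepSeek-V3-0324"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and answer one prompt (downloads the full weights)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy import

    # DeepSeek repos ship custom modeling code, so trust_remote_code is required
    tok = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        trust_remote_code=True,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # shard across all visible GPUs
    )
    messages = [{"role": "user", "content": prompt}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Example call: `print(generate("Write a Python quicksort."))`.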

Using Ollama
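Ollama hosts a quantized DeepSeek build; assuming the deepseek-v3 tag in the Ollama library (check ollama.com/library for the current name), the flow is:

```shell
# Pull and chat with the quantized build — still roughly 400GB for the 671B model
ollama pull deepseek-v3
ollama run deepseek-v3 "Explain MoE routing in two sentences."
```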

API Usage

OpenAI-Compatible API (vLLM)
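A dependency-free sketch against vLLM's OpenAI-compatible endpoint, using only the standard library; replace the BASE_URL placeholder with your http_pub hostname:

```python
import json
from urllib import request

BASE_URL = "https://YOUR_HTTP_PUB_URL"  # your http_pub URL from My Orders

def chat_request(prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style /v1/chat/completions payload."""
    return {
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.3,
        "max_tokens": 512,
        "stream": stream,
    }

def chat(prompt: str) -> str:
    """POST the payload and return the assistant's reply."""
    req = request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]
```

The official `openai` Python client also works unchanged: point its `base_url` at `https://YOUR_HTTP_PUB_URL/v1`.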

Streaming
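With `"stream": true`, vLLM returns server-sent events, one JSON chunk per `data:` line. A stdlib-only sketch of consuming that stream:

```python
import json
from urllib import request

def parse_sse_line(line: str):
    """Extract delta text from one server-sent-events line (None otherwise)."""
    line = line.strip()
    if not line.startswith("data: ") or line == "data: [DONE]":
        return None
    chunk = json.loads(line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content")

def stream_chat(base_url: str, prompt: str):
    """Yield tokens as vLLM streams them back."""
    payload = {
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    req = request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        for raw in resp:
            token = parse_sse_line(raw.decode())
            if token:
                yield token
```

Usage: `for tok in stream_chat("https://YOUR_HTTP_PUB_URL", "Hi"): print(tok, end="", flush=True)`.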

cURL
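The same endpoint from the command line (standard OpenAI-style request body):

```shell
curl https://YOUR_HTTP_PUB_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3-0324",
    "messages": [{"role": "user", "content": "What is 17 * 24?"}],
    "max_tokens": 128
  }'
```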

DeepSeek-V2-Lite (Single GPU)

For users with limited hardware:
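A sketch of serving V2-Lite (15.7B total, 2.4B active) on one 24GB card with vLLM; flags are standard vLLM options:

```shell
vllm serve deepseek-ai/DeepSeek-V2-Lite-Chat \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.90 \
  --trust-remote-code
```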

Code Generation

DeepSeek-V3-0324 is best-in-class for code:

Advanced code tasks where V3-0324 excels:
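As an illustration, a request payload tuned for code generation. The payload shape is standard OpenAI-style; the system prompt and sampling parameters below are suggested values, not CLORE or DeepSeek defaults:

```python
def code_request(task: str) -> dict:
    """OpenAI-style payload tuned for code generation (low temperature)."""
    return {
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [
            {"role": "system",
             "content": "You are an expert programmer. Return complete, runnable code."},
            {"role": "user", "content": task},
        ],
        "temperature": 0.1,   # keep code output near-deterministic
        "max_tokens": 2048,
    }

# An advanced task of the kind V3-0324 handles well:
payload = code_request(
    "Write a FastAPI endpoint that validates a signup form with Pydantic v2."
)
```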

Math & Reasoning
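For math, greedy decoding plus an explicit step-by-step instruction tends to give the most reliable answers. A suggested payload (parameter values are our assumptions, not model defaults):

```python
def math_request(problem: str) -> dict:
    """OpenAI-style payload for math: temperature 0, step-by-step prompt."""
    return {
        "model": "deepseek-ai/DeepSeek-V3-0324",
        "messages": [
            {"role": "user",
             "content": f"Solve step by step, then state the final answer:\n{problem}"},
        ],
        "temperature": 0.0,   # greedy decoding for numerical accuracy
        "max_tokens": 1024,
    }
```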

Multi-GPU Configuration

8x GPU (Full Model — V3-0324)
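A sketch of the 8-way tensor-parallel launch with vLLM's standard flags:

```shell
vllm serve deepseek-ai/DeepSeek-V3-0324 \
  --tensor-parallel-size 8 \
  --max-model-len 32768 \
  --trust-remote-code
```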

4x GPU (V2.5)
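The same pattern for V2.5 on four GPUs:

```shell
vllm serve deepseek-ai/DeepSeek-V2.5 \
  --tensor-parallel-size 4 \
  --max-model-len 16384 \
  --trust-remote-code
```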

Performance

Throughput (tokens/sec)

| Model | GPUs | Context | Tokens/sec |
|---|---|---|---|
| DeepSeek-V3-0324 | 8x H100 | 32K | ~85 |
| DeepSeek-V3-0324 | 8x A100 80GB | 32K | ~52 |
| DeepSeek-V3-0324 INT4 | 4x A100 80GB | 16K | ~38 |
| DeepSeek-V2.5 | 4x A100 80GB | 16K | ~70 |
| DeepSeek-V2.5 | 2x A100 80GB | 8K | ~45 |
| DeepSeek-V2-Lite | RTX 4090 | 8K | ~40 |
| DeepSeek-V2-Lite | RTX 3090 | 4K | ~25 |

Time to First Token (TTFT)

| Model | Configuration | TTFT |
|---|---|---|
| DeepSeek-V3-0324 | 8x H100 | ~750ms |
| DeepSeek-V3-0324 | 8x A100 | ~1100ms |
| DeepSeek-V2.5 | 4x A100 | ~500ms |
| DeepSeek-V2-Lite | RTX 4090 | ~150ms |

Memory Usage

| Model | Precision | VRAM Required |
|---|---|---|
| DeepSeek-V3-0324 | FP16 | 8x 80GB |
| DeepSeek-V3-0324 | INT4 | 4x 80GB |
| DeepSeek-V2.5 | FP16 | 4x 80GB |
| DeepSeek-V2.5 | INT4 | 2x 80GB |
| DeepSeek-V2-Lite | FP16 | 20GB |
| DeepSeek-V2-Lite | INT4 | 10GB |

Benchmarks

DeepSeek-V3-0324 vs Competition

| Benchmark | V3-0324 | V3 (original) | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|---|
| MMLU | 88.5% | 87.1% | 88.7% | 88.3% |
| HumanEval | 90.2% | 82.6% | 90.2% | 92.0% |
| MATH-500 | 67.1% | 61.6% | 76.6% | 71.1% |
| GSM8K | 92.1% | 89.3% | 95.8% | 96.4% |
| LiveCodeBench | 72.4% | 65.9% | 71.3% | 73.8% |
| Codeforces Rating | 1850 | 1720 | 1780 | 1790 |

Note: MATH-500 improvement from V3 → V3-0324 is +5.5 percentage points.

Docker Compose
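A compose sketch equivalent to the 8-GPU docker run deployment; the image, flags, and volume path are assumptions to adapt to your setup:

```yaml
services:
  deepseek:
    image: vllm/vllm-openai:latest
    command: >
      --model deepseek-ai/DeepSeek-V3-0324
      --tensor-parallel-size 8
      --max-model-len 32768
      --trust-remote-code
    ports:
      - "8000:8000"
    volumes:
      - ./models:/root/.cache/huggingface
    ipc: host
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```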

GPU Requirements Summary

| Use Case | Recommended Setup | Cost/Hour |
|---|---|---|
| Full DeepSeek-V3-0324 | 8x A100 80GB | ~$2.00 |
| DeepSeek-V2.5 | 4x A100 80GB | ~$1.00 |
| Development/Testing | RTX 4090 (V2-Lite) | ~$0.10 |
| Production API | 8x H100 80GB | ~$3.00 |

Cost Estimate

Typical CLORE.AI marketplace rates:

| GPU Configuration | Hourly Rate | Daily Rate |
|---|---|---|
| RTX 4090 24GB | ~$0.10 | ~$2.30 |
| A100 40GB | ~$0.17 | ~$4.00 |
| A100 80GB | ~$0.25 | ~$6.00 |
| 4x A100 80GB | ~$1.00 | ~$24.00 |
| 8x A100 80GB | ~$2.00 | ~$48.00 |

Prices vary by provider. Check the CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for development (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Use DeepSeek-V2-Lite for testing before scaling up

Troubleshooting

Out of Memory
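The usual first fixes, assuming a vLLM deployment: shorten the context window, lower the KV-cache budget, or switch to the INT4 quantized variant (or V2.5):

```shell
vllm serve deepseek-ai/DeepSeek-V3-0324 \
  --tensor-parallel-size 8 \
  --max-model-len 8192 \
  --gpu-memory-utilization 0.85 \
  --trust-remote-code
```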

Model Download Slow
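Enabling hf_transfer typically speeds up Hugging Face Hub downloads substantially, and pre-fetching the weights avoids timing out on first launch:

```shell
pip install hf_transfer
HF_HUB_ENABLE_HF_TRANSFER=1 \
  huggingface-cli download deepseek-ai/DeepSeek-V3-0324
```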

trust_remote_code Error
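DeepSeek repos ship custom modeling code, and Transformers refuses to execute it unless you opt in explicitly; that opt-in is what the error is asking for:

```python
def safe_load_kwargs() -> dict:
    """Kwargs that resolve the trust_remote_code error for DeepSeek repos."""
    return dict(trust_remote_code=True, torch_dtype="auto", device_map="auto")

def load(model_id: str = "deepseek-ai/DeepSeek-V3-0324"):
    from transformers import AutoModelForCausalLM  # deferred heavy import
    return AutoModelForCausalLM.from_pretrained(model_id, **safe_load_kwargs())
```

Pass `trust_remote_code=True` to the tokenizer's `from_pretrained` call as well.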

Multi-GPU Not Working
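First confirm every GPU is visible, then rerun with NCCL logging enabled to see where tensor-parallel initialization stalls:

```shell
nvidia-smi   # all GPUs should appear here
NCCL_DEBUG=INFO vllm serve deepseek-ai/DeepSeek-V3-0324 \
  --tensor-parallel-size 8 --trust-remote-code
# If peer-to-peer transfers hang (common on consumer boards), try:
# NCCL_P2P_DISABLE=1
```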

DeepSeek vs Others

| Feature | DeepSeek-V3-0324 | Llama 3.1 405B | Mixtral 8x22B |
|---|---|---|---|
| Parameters | 671B (37B active) | 405B | 141B (39B active) |
| Context | 128K | 128K | 64K |
| Code | Excellent | Great | Good |
| Math | Excellent | Good | Good |
| Min VRAM | 8x 80GB | 8x 80GB | 2x 80GB |
| License | MIT | Llama 3.1 Community | Apache 2.0 |

Use DeepSeek-V3 when:

  • Best reasoning performance needed

  • Code generation is primary use

  • Math/logic tasks are important

  • Have multi-GPU setup available

  • Want fully open-source weights (MIT license)

Next Steps
