GroundingDINO

Detect any object using text descriptions with GroundingDINO.

All examples in this guide can be run on GPU servers rented through the CLORE.AI Marketplace.

Renting on CLORE.AI

  1. Filter by GPU type, VRAM, and price

  2. Choose On-Demand (fixed rate) or Spot (bid price)

  3. Configure your order:

    • Select Docker image

    • Set ports (TCP for SSH, HTTP for web UIs)

    • Add environment variables if needed

    • Enter startup command

  4. Select payment: CLORE, BTC, or USDT/USDC

  5. Create order and wait for deployment

Access Your Server

  • Find connection details in My Orders

  • Web interfaces: Use the HTTP port URL

  • SSH: ssh -p <port> root@<proxy-address>

What is GroundingDINO?

GroundingDINO, developed by IDEA-Research, provides:

  • Zero-shot object detection driven by text prompts

  • Detection of arbitrary object categories without task-specific training

  • High-accuracy bounding box localization

  • Integration with SAM for automatic segmentation

Resources

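| Component | Minimum | Recommended | Optimal |
|-----------|---------|-------------|---------|
| GPU | RTX 3060 12GB | RTX 4080 16GB | RTX 4090 24GB |
| VRAM | 6GB | 12GB | 16GB |
| CPU | 4 cores | 8 cores | 16 cores |
| RAM | 16GB | 32GB | 64GB |
| Storage | 20GB SSD | 50GB NVMe | 100GB NVMe |
| Internet | 100 Mbps | 500 Mbps | 1 Gbps |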

Quick Deploy on CLORE.AI

Docker Image:

Ports:

Command:

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.
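As a small illustration of swapping localhost for the public address, the helper below builds an HTTPS URL from an http_pub host. The function name `service_url` and the `/predict` path are hypothetical and not part of any CLORE.AI API; this is only a sketch of the string handling.

```python
def service_url(http_pub_host: str, path: str = "/") -> str:
    """Build an HTTPS URL for a service exposed through an http_pub host."""
    host = http_pub_host.strip()
    # Tolerate a pasted scheme so the result never contains it twice.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    host = host.rstrip("/")
    return f"https://{host}/{path.lstrip('/')}"


# "/predict" is a made-up endpoint used only for illustration.
print(service_url("abc123.clorecloud.net", "/predict"))
```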

Installation
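The upstream IDEA-Research/GroundingDINO README describes installing the package from a clone of the repository with `pip install -e .` and downloading a checkpoint separately. Once that is done, a quick check like the sketch below can confirm that the expected packages are importable before running the examples. The package list is an assumption about a typical setup, not an official requirement list.

```python
import importlib.util

# Assumed top-level packages for a typical GroundingDINO setup.
REQUIRED = ["torch", "torchvision", "groundingdino"]


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```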

What You Can Create

Automated Labeling

  • Auto-annotate datasets for ML training

  • Generate bounding boxes from descriptions

  • Speed up data labeling pipelines

  • Find specific objects in image databases

  • Content moderation systems

  • Product recognition in retail
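For the auto-annotation use case, detections can be written out in a label format that training tools understand. The sketch below assumes boxes arrive as normalized (cx, cy, w, h) values, which is how the upstream `predict` helper returns them, and emits YOLO-style label lines. The function name and the rule of skipping phrases that are not in the class list are illustrative choices.

```python
def to_yolo_lines(boxes, phrases, class_names):
    """Convert normalized (cx, cy, w, h) boxes to YOLO label lines.

    Detections whose phrase is not an exact match for a known class are skipped.
    """
    lines = []
    for (cx, cy, w, h), phrase in zip(boxes, phrases):
        if phrase not in class_names:
            continue
        class_id = class_names.index(phrase)
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines


labels = to_yolo_lines(
    boxes=[(0.5, 0.5, 0.2, 0.4), (0.1, 0.2, 0.05, 0.05)],
    phrases=["dog", "unknown thing"],
    class_names=["cat", "dog"],
)
print("\n".join(labels))
```

Each returned line can be written to a `.txt` file that shares the image's base name. It is worth reviewing a sample of the generated labels by hand before training on them.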

Robotics & Automation

  • Object localization for robot arms

  • Inventory management systems

  • Quality control inspection

Creative Applications

  • Auto-crop subjects from photos

  • Generate object masks with SAM

  • Content-aware image editing
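Auto-cropping comes down to turning a detected box into a pixel rectangle. The following sketch assumes a normalized (cx, cy, w, h) box, adds optional padding around the subject, and clamps the result to the image bounds. The function name and the padding parameter are illustrative.

```python
def crop_rect(box, width, height, pad=0.05):
    """Return an integer (left, top, right, bottom) crop rectangle.

    `box` is normalized (cx, cy, w, h); `pad` is extra margin per side,
    expressed as a fraction of the box size.
    """
    cx, cy, w, h = box
    w = w * (1 + 2 * pad)
    h = h * (1 + 2 * pad)
    left = max(0, int(round((cx - w / 2) * width)))
    top = max(0, int(round((cy - h / 2) * height)))
    right = min(width, int(round((cx + w / 2) * width)))
    bottom = min(height, int(round((cy + h / 2) * height)))
    return left, top, right, bottom


print(crop_rect((0.5, 0.5, 0.5, 0.5), width=200, height=100, pad=0.0))
```

The resulting tuple has the same order that Pillow's `Image.crop` expects.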

Analytics

  • Count objects in images

  • Track inventory from photos

  • Wildlife monitoring
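Counting is a thin layer over the detector output. A minimal sketch, assuming a list of phrases and a matching list of confidence scores per image (the function name and the 0.35 cutoff are illustrative):

```python
from collections import Counter


def count_objects(phrases, scores, min_score=0.35):
    """Count detections per phrase, ignoring those below `min_score`."""
    return Counter(
        phrase for phrase, score in zip(phrases, scores) if float(score) >= min_score
    )


counts = count_objects(
    phrases=["car", "car", "person", "car"],
    scores=[0.8, 0.5, 0.6, 0.2],
)
print(dict(counts))
```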

Basic Usage
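A minimal sketch of single-image detection using the helper functions from the upstream repository's `groundingdino.util.inference` module. The config and checkpoint paths below are assumptions about where the files sit on your server, and the threshold defaults follow the values commonly used in the upstream demo; adjust all of them to your setup.

```python
# Assumed local paths; change them to match your installation.
CONFIG_PATH = "groundingdino/config/GroundingDINO_SwinT_OGC.py"
WEIGHTS_PATH = "weights/groundingdino_swint_ogc.pth"


def detect(image_path, caption, box_threshold=0.35, text_threshold=0.25,
           config_path=CONFIG_PATH, weights_path=WEIGHTS_PATH, device="cuda"):
    """Run GroundingDINO on one image and return (boxes, logits, phrases)."""
    # Imported lazily so this file can be loaded without the GPU stack installed.
    from groundingdino.util.inference import load_image, load_model, predict

    model = load_model(config_path, weights_path, device=device)
    _, image = load_image(image_path)
    return predict(
        model=model,
        image=image,
        caption=caption,
        box_threshold=box_threshold,
        text_threshold=text_threshold,
        device=device,
    )


def summarize(phrases, logits):
    """Format detections as 'phrase: score' strings for quick inspection."""
    return [f"{phrase}: {float(score):.2f}" for phrase, score in zip(phrases, logits)]


# Example call (requires a GPU server with GroundingDINO installed):
# boxes, logits, phrases = detect("photo.jpg", "dog . person .")
# print(summarize(phrases, logits))
```

The returned boxes are normalized (cx, cy, w, h) values, so they need scaling by the image size before drawing or cropping.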

GroundingDINO + SAM (Grounded-SAM)

Combine detection with segmentation:
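One way to wire the two models together is to convert each GroundingDINO box to pixel coordinates and pass it to SAM as a box prompt. The sketch below assumes the `segment_anything` package and a downloaded SAM checkpoint; the checkpoint path, the `vit_h` model type, and the function names are assumptions, not a fixed recipe.

```python
def to_pixel_xyxy(box, width, height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (
        (cx - w / 2) * width,
        (cy - h / 2) * height,
        (cx + w / 2) * width,
        (cy + h / 2) * height,
    )


def segment_boxes(image_rgb, boxes, sam_checkpoint, model_type="vit_h", device="cuda"):
    """Return one boolean mask per box, using each box as a SAM prompt.

    `image_rgb` is an HxWx3 uint8 RGB array; `boxes` are normalized cxcywh.
    """
    # Imported lazily so the conversion helper works without SAM installed.
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=sam_checkpoint)
    sam.to(device=device)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)

    height, width = image_rgb.shape[:2]
    masks = []
    for box in boxes:
        xyxy = np.array(to_pixel_xyxy([float(v) for v in box], width, height))
        mask, _, _ = predictor.predict(box=xyxy, multimask_output=False)
        masks.append(mask[0])
    return masks


print(to_pixel_xyxy((0.5, 0.5, 0.5, 0.5), width=200, height=100))
```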

Batch Processing
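A simple approach to processing a folder is to load the model once and loop over the images. The sketch below does exactly that with the upstream inference helpers; it runs images sequentially rather than as a true tensor batch. The function names, the suffix list, and the result layout are illustrative.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def is_image_file(path):
    """Return True if the path has a common image file extension."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def detect_folder(folder, caption, config_path, weights_path,
                  box_threshold=0.35, text_threshold=0.25, device="cuda"):
    """Run detection on every image in `folder`; return {filename: detections}."""
    # Imported lazily so the file helpers work without GroundingDINO installed.
    from groundingdino.util.inference import load_image, load_model, predict

    model = load_model(config_path, weights_path, device=device)  # load once
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if not is_image_file(path):
            continue
        _, image = load_image(str(path))
        boxes, logits, phrases = predict(
            model=model, image=image, caption=caption,
            box_threshold=box_threshold, text_threshold=text_threshold,
            device=device,
        )
        # Store plain Python values so results can be serialized to JSON.
        results[path.name] = list(zip(phrases, logits.tolist(), boxes.tolist()))
    return results
```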

Custom Detection Pipeline
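One possible shape for a custom pipeline is a small class that owns the label list, builds the prompt, and post-filters the results. In the sketch below the detector is passed in as a function, so the same class works with GroundingDINO or with a stub during testing. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # normalized (cx, cy, w, h)


@dataclass
class Detection:
    label: str
    score: float
    box: Box


class DetectionPipeline:
    """Build a prompt from labels, run a detector, and filter its output."""

    def __init__(self, predict_fn: Callable, labels: Sequence[str],
                 min_score: float = 0.35, min_area: float = 0.0):
        # predict_fn(image, caption) must return (boxes, scores, phrases).
        self.predict_fn = predict_fn
        self.labels = [label.strip().lower() for label in labels]
        self.min_score = min_score
        self.min_area = min_area

    @property
    def caption(self) -> str:
        return " . ".join(self.labels) + " ."

    def run(self, image) -> List[Detection]:
        boxes, scores, phrases = self.predict_fn(image, self.caption)
        detections = [
            Detection(phrase, float(score), tuple(float(v) for v in box))
            for box, score, phrase in zip(boxes, scores, phrases)
        ]
        kept = [
            d for d in detections
            if d.score >= self.min_score and d.box[2] * d.box[3] >= self.min_area
        ]
        return sorted(kept, key=lambda d: d.score, reverse=True)


# Stub detector standing in for a real model call.
def fake_predict(image, caption):
    return [(0.5, 0.5, 0.2, 0.2), (0.3, 0.3, 0.1, 0.1)], [0.4, 0.9], ["cat", "dog"]


pipeline = DetectionPipeline(fake_predict, ["Cat", "dog"], min_score=0.5)
print(pipeline.caption, [d.label for d in pipeline.run(image=None)])
```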

Gradio Interface
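A minimal web UI sketch, assuming the `gradio` package and the upstream inference helpers are installed. The port, widget labels, and function names are illustrative, and the config and checkpoint paths must be supplied for your own server.

```python
def labels_to_caption(text):
    """Turn 'cat, dog' or 'cat . dog' into the 'cat . dog .' prompt format."""
    parts = [p.strip().lower() for p in text.replace(",", ".").split(".")]
    parts = [p for p in parts if p]
    if not parts:
        return ""
    return " . ".join(parts) + " ."


def build_demo(config_path, weights_path, device="cuda"):
    """Create a Gradio interface that draws detections on an uploaded image."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import gradio as gr
    import numpy as np
    from groundingdino.util.inference import annotate, load_image, load_model, predict

    model = load_model(config_path, weights_path, device=device)

    def run(image_path, text, box_threshold, text_threshold):
        image_source, image = load_image(image_path)
        boxes, logits, phrases = predict(
            model=model, image=image, caption=labels_to_caption(text),
            box_threshold=box_threshold, text_threshold=text_threshold,
            device=device,
        )
        annotated_bgr = annotate(image_source=image_source, boxes=boxes,
                                 logits=logits, phrases=phrases)
        # annotate() returns BGR; flip channels to RGB for display.
        return np.ascontiguousarray(annotated_bgr[:, :, ::-1])

    return gr.Interface(
        fn=run,
        inputs=[
            gr.Image(type="filepath", label="Image"),
            gr.Textbox(label="Objects (comma-separated)"),
            gr.Slider(minimum=0.05, maximum=0.9, value=0.35, label="Box threshold"),
            gr.Slider(minimum=0.05, maximum=0.9, value=0.25, label="Text threshold"),
        ],
        outputs=gr.Image(label="Detections"),
    )


# To serve it on the rented machine (port 7860 is an assumption):
# build_demo(CONFIG, WEIGHTS).launch(server_name="0.0.0.0", server_port=7860)
```

Binding to `0.0.0.0` makes the interface reachable through the HTTP port configured for the order.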

Performance

| Task | Resolution | GPU | Speed |
|------|------------|-----|-------|
| Single image | 800x600 | RTX 3090 | 120ms |
| Single image | 800x600 | RTX 4090 | 80ms |
| Single image | 1920x1080 | RTX 4090 | 150ms |
| Batch (10 images) | 800x600 | RTX 4090 | 600ms |
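Latency depends on the server, driver, and image content, so it is worth measuring on your own instance. The helper below is a generic timing sketch (the name `benchmark` and its defaults are illustrative): it runs a few warm-up calls, then reports mean and median latency in milliseconds.

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=10):
    """Time `fn()` over several runs and return latency statistics in ms.

    For GPU code, `fn` should wait for the device to finish (for example by
    calling torch.cuda.synchronize()) or the timings will be too optimistic.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
    }


# Stand-in workload; replace the lambda with a call to your detection function.
stats = benchmark(lambda: sum(range(10_000)), warmup=1, runs=5)
print(stats)
```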

Common Problems & Solutions

Low Detection Accuracy

Problem: Objects not being detected

Solutions:

  • Lower box_threshold to 0.2-0.3

  • Lower text_threshold to 0.15-0.2

  • Use more specific object descriptions

  • Separate objects with " . " not commas
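The suggestions above can be combined into a small retry loop: build the prompt with " . " separators, then lower the box threshold step by step until something is detected. This is a sketch with hypothetical names; `detect_fn` stands for whatever function wraps your model call and returns a list of detections.

```python
def build_caption(labels):
    """Join labels into the ' . '-separated, lower-case prompt format."""
    cleaned = [label.strip().lower() for label in labels if label.strip()]
    if not cleaned:
        return ""
    return " . ".join(cleaned) + " ."


def sweep_thresholds(detect_fn, caption,
                     box_thresholds=(0.35, 0.3, 0.25, 0.2), text_threshold=0.2):
    """Try progressively lower box thresholds; stop at the first hit.

    detect_fn(caption, box_threshold, text_threshold) must return a list.
    Returns (threshold_used, detections), or (None, []) if nothing was found.
    """
    for box_threshold in box_thresholds:
        detections = detect_fn(caption, box_threshold, text_threshold)
        if detections:
            return box_threshold, detections
    return None, []


# Stub that only "finds" the object once the threshold is low enough.
def fake_detect(caption, box_threshold, text_threshold):
    return ["red backpack"] if box_threshold <= 0.25 else []


caption = build_caption(["Red backpack", " bicycle "])
print(caption, sweep_thresholds(fake_detect, caption))
```

Lower thresholds raise recall at the cost of more false positives, so results found only at the lowest settings deserve a manual check.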

Out of Memory

Problem: CUDA OOM on large images

Solutions:
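One common mitigation is to cap the image size before it enters the pipeline. This helps most when full-resolution pixels are passed on to the model or to a later step such as segmentation. The sketch below assumes Pillow is installed; the function names and the 1024-pixel cap are illustrative.

```python
def fit_within(width, height, max_side=1024):
    """Scale (width, height) so the longer side is at most `max_side`."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale(image_path, out_path, max_side=1024):
    """Save a resized copy of the image and return its new (width, height)."""
    # Imported lazily so fit_within() can be used without Pillow installed.
    from PIL import Image

    with Image.open(image_path) as img:
        new_size = fit_within(img.width, img.height, max_side=max_side)
        img.convert("RGB").resize(new_size, Image.LANCZOS).save(out_path)
    return new_size


print(fit_within(4000, 3000, max_side=1000))
```

Because the detector returns normalized box coordinates, boxes found on the smaller copy can be scaled back to the original image dimensions.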

Slow Inference

Problem: Detection takes too long

Solutions:

  • Use smaller input images

  • Batch process multiple images

  • Use FP16 inference

  • Rent a faster GPU (RTX 4090, A100)

False Positives

Problem: Detecting wrong objects

Solutions:

  • Increase box_threshold to 0.4-0.5

  • Be more specific in prompts

  • Use negative prompts (filter results post-detection)
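Since the negative-prompt step is a filter applied after detection, it can be sketched as plain list filtering. The example below assumes detections are (phrase, score, box) tuples; the function name, the substring matching rule, and the 0.4 cutoff are illustrative choices.

```python
def filter_detections(detections, exclude=(), min_score=0.4):
    """Drop low-score detections and those whose phrase contains an excluded term.

    `detections` is an iterable of (phrase, score, box) tuples.
    """
    excluded = [term.lower() for term in exclude]
    kept = []
    for phrase, score, box in detections:
        if score < min_score:
            continue
        if any(term in phrase.lower() for term in excluded):
            continue
        kept.append((phrase, score, box))
    return kept


box = (0.5, 0.5, 0.2, 0.2)
raw = [("dog", 0.8, box), ("toy dog", 0.7, box), ("cat", 0.3, box)]
print(filter_detections(raw, exclude=["toy"]))
```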

Troubleshooting

Objects not detected

  • Use more specific text descriptions

  • Try different phrasings

  • Lower confidence threshold

Bounding boxes wrong

  • Be more specific in text prompt

  • Use "." to separate multiple objects

  • Check image quality

Out of memory

  • Reduce image resolution

  • Process images one at a time

  • Use smaller model variant

Slow inference

  • Use TensorRT for speedup

  • Batch similar-sized images

  • Enable FP16 inference

Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

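| GPU | Hourly Rate | Daily Rate | 4-Hour Session |
|-----|-------------|------------|----------------|
| RTX 3060 | ~$0.03 | ~$0.70 | ~$0.12 |
| RTX 3090 | ~$0.06 | ~$1.50 | ~$0.25 |
| RTX 4090 | ~$0.10 | ~$2.30 | ~$0.40 |
| A100 40GB | ~$0.17 | ~$4.00 | ~$0.70 |
| A100 80GB | ~$0.25 | ~$6.00 | ~$1.00 |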

Prices vary by provider and demand. Check the CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for flexible workloads (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Compare prices across different providers
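A rough budget can be worked out as hourly rate times hours, optionally reduced by a Spot discount. The sketch below uses the approximate hourly rates from the table in this section; the function name and the discount parameter are illustrative, and real prices should be taken from the marketplace.

```python
# Approximate hourly rates in USD, copied from the table in this section.
HOURLY_RATES_USD = {
    "RTX 3060": 0.03,
    "RTX 3090": 0.06,
    "RTX 4090": 0.10,
    "A100 40GB": 0.17,
    "A100 80GB": 0.25,
}


def estimate_cost(gpu, hours, spot_discount=0.0):
    """Estimate the rental cost in USD.

    `spot_discount` is a fraction, e.g. 0.3 for a 30% cheaper Spot price.
    """
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    return round(HOURLY_RATES_USD[gpu] * hours * (1.0 - spot_discount), 2)


print(estimate_cost("RTX 4090", hours=4))
print(estimate_cost("RTX 4090", hours=10, spot_discount=0.3))
```

Because the table's daily and session figures are rounded, values computed this way can differ from them by a few cents.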

Next Steps

  • SAM2 - Segment detected objects

  • Florence-2 - More vision tasks

  • YOLO - Faster detection for known classes
