Monitoring with Prometheus + Grafana

What We're Building

A complete monitoring stack built on Prometheus and Grafana that tracks your Clore.ai GPU usage, costs, and performance metrics, with Grafana dashboards and alerting on top.

Key Features:

  • GPU utilization and memory metrics

  • Cost tracking per workload

  • Order status monitoring

  • Price history visualization

  • Alert rules for cost thresholds

  • Beautiful Grafana dashboards

Prerequisites

  • Clore.ai account with API key

  • Docker and Docker Compose

  • Python 3.10+

pip install prometheus_client requests flask

Architecture Overview

Step 1: Prometheus Exporter for Clore.ai
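
The exporter code itself does not appear in this section. As a rough sketch of what such an exporter might look like, the following reduces already-fetched account data to the five metrics listed under Key Metrics to Monitor and publishes them with prometheus_client. The input field names (status, rented, price_usd_per_day), the fetch callback, and port 9101 are assumptions made for illustration; they are not the Clore.ai API's actual response format.

```python
"""Sketch of a Clore.ai exporter. All input field names are hypothetical."""
import time


def summarize(orders, offers, wallet_balance):
    """Reduce raw account data to a dict of metric name -> value.

    Assumed (hypothetical) shapes:
      orders: [{"status": "active", "price_usd_per_day": 4.8}, ...]
      offers: [{"rented": False, "price_usd_per_day": 3.1}, ...]
    """
    active = [o for o in orders if o.get("status") == "active"]
    free = [o for o in offers if not o.get("rented", False)]
    prices = [o["price_usd_per_day"] for o in free]
    return {
        "clore_daily_cost_usd": sum(o["price_usd_per_day"] for o in active),
        "clore_orders_active_total": len(active),
        # Assumption: one unrented offer is counted as one available GPU.
        "clore_gpu_available_total": len(free),
        # Cheapest unrented offer; NaN when nothing is available.
        "clore_gpu_price_spot_usd": min(prices) if prices else float("nan"),
        "clore_wallet_balance": wallet_balance,
    }


def serve(fetch, port=9101, interval_s=60):
    """Expose the metrics over HTTP and refresh them in a loop.

    `fetch` is a caller-supplied function returning (orders, offers, balance).
    """
    # Imported here so summarize() stays usable without the dependency.
    from prometheus_client import Gauge, start_http_server

    gauges = {}
    start_http_server(port)  # serves the default registry at /metrics
    while True:
        for name, value in summarize(*fetch()).items():
            if name not in gauges:
                gauges[name] = Gauge(name, f"Clore.ai metric {name}")
            gauges[name].set(value)
        time.sleep(interval_s)
```

Keeping the arithmetic in summarize() separate from the serving loop means the metric logic can be checked without network access or a running Prometheus.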

Step 2: Prometheus Configuration

Step 3: Alert Rules

Step 4: Grafana Dashboard
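
The dashboard definition does not appear in this section. One possible starting point is to generate a minimal dashboard JSON from the metric names under Key Metrics to Monitor, one time-series panel per metric. The dashboard title, panel labels, layout, refresh interval, and time range below are assumptions, and the JSON keys follow Grafana's dashboard JSON model as commonly documented; verify them against your Grafana version before importing.

```python
import json

# Metric names come from the Key Metrics table; the labels are illustrative.
METRICS = [
    ("clore_daily_cost_usd", "Estimated daily spend (USD)"),
    ("clore_orders_active_total", "Running orders"),
    ("clore_gpu_available_total", "Available GPUs"),
    ("clore_gpu_price_spot_usd", "Spot price (USD)"),
    ("clore_wallet_balance", "Wallet balance (CLORE)"),
]


def build_dashboard(title="Clore.ai Overview"):
    """Return a dashboard dict with one time-series panel per metric."""
    panels = []
    for i, (metric, label) in enumerate(METRICS):
        panels.append({
            "id": i + 1,
            "type": "timeseries",
            "title": label,
            # Two panels per row on Grafana's 24-column grid.
            "gridPos": {"h": 8, "w": 12, "x": (i % 2) * 12, "y": (i // 2) * 8},
            "targets": [{"expr": metric, "refId": "A"}],
        })
    return {
        "title": title,
        "refresh": "1m",
        "time": {"from": "now-24h", "to": "now"},
        "panels": panels,
    }


if __name__ == "__main__":
    # Print JSON that can be saved to a file and imported into Grafana.
    print(json.dumps(build_dashboard(), indent=2))
```

Generating the JSON from a list keeps panel queries in sync with the exporter's metric names when a metric is renamed.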

Step 5: Docker Compose Stack

Alertmanager Configuration

Running the Stack
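
Once the containers are started, a quick health check can confirm that each service answers. The sketch below assumes the services are reachable on localhost at their default ports (Prometheus 9090, Alertmanager 9093, Grafana 3000); adjust the URLs if your Compose file maps them differently.

```python
import urllib.request

# Default ports and health endpoints; the localhost mapping is an assumption.
ENDPOINTS = {
    "prometheus": "http://localhost:9090/-/healthy",
    "alertmanager": "http://localhost:9093/-/healthy",
    "grafana": "http://localhost:3000/api/health",
}


def is_up(url, timeout=3.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, timeouts, and refused connections
        return False


if __name__ == "__main__":
    for name, url in ENDPOINTS.items():
        print(f"{name:13s} {'up' if is_up(url) else 'DOWN'}")
```

The opener parameter exists only so the function can be exercised without a live stack.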

Key Metrics to Monitor

Metric                      Description             Alert Threshold
clore_daily_cost_usd        Estimated daily spend   > $50
clore_orders_active_total   Running orders          N/A
clore_gpu_available_total   Available GPUs          < 5
clore_gpu_price_spot_usd    Current spot price      Price drops
clore_wallet_balance        Wallet balance          < 10 CLORE
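
The numeric thresholds listed above can be tried out locally before committing them to alert rules. The following is a minimal sketch of such a check against a snapshot of metric values; it is not how Prometheus evaluates alerts, and the snapshot dict is a hypothetical input. The spot-price row is left out because the source gives no numeric threshold for a price drop.

```python
import operator

# (metric, comparison, limit) for the rows with a numeric threshold.
THRESHOLDS = [
    ("clore_daily_cost_usd", operator.gt, 50.0),    # > $50
    ("clore_gpu_available_total", operator.lt, 5),  # < 5
    ("clore_wallet_balance", operator.lt, 10.0),    # < 10 CLORE
]


def breached(snapshot):
    """Return the names of metrics whose value crosses its threshold.

    Metrics missing from the snapshot are skipped rather than flagged.
    """
    return [
        name
        for name, compare, limit in THRESHOLDS
        if name in snapshot and compare(snapshot[name], limit)
    ]
```

For example, a snapshot with a daily cost of 62.0, 8 available GPUs, and a wallet balance of 3.5 flags the cost and wallet metrics but not GPU availability.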

Next Steps
