Model Training Scheduler (Auto-Rent on Price Drop)

What We're Building

This guide builds a training scheduler that monitors Clore.ai GPU prices and automatically provisions servers when prices drop below your target. You queue training jobs, and each one starts once a suitable server is priced below that job's target, so your ML budget goes further.

Key Features:

  • Real-time price monitoring with configurable thresholds

  • Job queue with priority scheduling

  • Automatic provisioning when prices drop

  • Multi-job batch execution

  • Email/Slack notifications for price alerts

  • Cost tracking and budgeting

  • Graceful job preemption handling

Prerequisites

pip install requests schedule redis

Architecture Overview

Step 1: Price Monitor
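
A minimal sketch of the price-monitoring step, assuming offers arrive as simple records with a GPU model and an hourly price. The Offer shape, the PriceMonitor class, and the injected fetch_offers callable are all hypothetical names used for illustration; fetch_offers stands in for whatever marketplace query you use and is not a Clore.ai API call. The GPU names and prices are sample data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Offer:
    """One rentable server as seen by the monitor (hypothetical shape)."""
    server_id: int
    gpu_model: str
    price_per_hour: float  # USD per hour


class PriceMonitor:
    """Finds the cheapest offer per GPU model and checks it against a target."""

    def __init__(self, fetch_offers: Callable[[], Iterable[Offer]]) -> None:
        # fetch_offers is injected so the marketplace client stays swappable;
        # it is a placeholder, not a real marketplace API call.
        self._fetch_offers = fetch_offers

    def cheapest(self, gpu_model: str) -> Optional[Offer]:
        """Return the lowest-priced offer for gpu_model, or None if none exist."""
        matches = [o for o in self._fetch_offers() if o.gpu_model == gpu_model]
        return min(matches, key=lambda o: o.price_per_hour, default=None)

    def below_target(self, gpu_model: str, target_price: float) -> Optional[Offer]:
        """Return the cheapest offer only if its price is below target_price."""
        offer = self.cheapest(gpu_model)
        if offer is not None and offer.price_per_hour < target_price:
            return offer
        return None


def sample_offers() -> List[Offer]:
    # Static data standing in for a live marketplace query.
    return [
        Offer(1, "RTX 4090", 0.38),
        Offer(2, "RTX 4090", 0.32),
        Offer(3, "A100", 1.05),
    ]


monitor = PriceMonitor(sample_offers)
hit = monitor.below_target("RTX 4090", target_price=0.35)   # server 2 qualifies
miss = monitor.below_target("A100", target_price=1.00)      # nothing cheap enough
```

Passing the fetch function in, rather than hard-coding an HTTP call, keeps the threshold logic testable without network access.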

Step 2: Job Queue
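
A minimal sketch of a priority job queue, assuming that a lower priority number means a more urgent job and that ties are served in submission order. It uses an in-memory heap from the standard library rather than Redis, so nothing here persists across restarts. TrainingJob, JobQueue, and their fields are hypothetical names chosen for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(order=True)
class TrainingJob:
    # Ordering uses (priority, seq): lower priority number runs first,
    # and seq keeps equal-priority jobs in submission order.
    priority: int
    seq: int
    name: str = field(compare=False)
    gpu_model: str = field(compare=False)
    target_price: float = field(compare=False)  # USD per hour


class JobQueue:
    """In-memory priority queue of jobs waiting for a price drop."""

    def __init__(self) -> None:
        self._heap: List[TrainingJob] = []
        self._counter = itertools.count()

    def submit(self, name: str, gpu_model: str, target_price: float,
               priority: int = 10) -> TrainingJob:
        job = TrainingJob(priority, next(self._counter), name, gpu_model, target_price)
        heapq.heappush(self._heap, job)
        return job

    def pop_runnable(self, gpu_model: str, price: float) -> Optional[TrainingJob]:
        """Remove and return the most urgent job whose target exceeds price."""
        candidates = [
            j for j in self._heap
            if j.gpu_model == gpu_model and price < j.target_price
        ]
        if not candidates:
            return None
        job = min(candidates)
        self._heap.remove(job)
        heapq.heapify(self._heap)  # restore the heap invariant after removal
        return job

    def __len__(self) -> int:
        return len(self._heap)


queue = JobQueue()
queue.submit("bert-finetune", "RTX 4090", target_price=0.35, priority=5)
queue.submit("rl-agent", "RTX 4090", target_price=0.30, priority=1)
queue.submit("sd-training", "A100", target_price=1.20, priority=3)

first = queue.pop_runnable("RTX 4090", price=0.28)   # both RTX jobs qualify
second = queue.pop_runnable("RTX 4090", price=0.32)  # only bert-finetune is left
third = queue.pop_runnable("A100", price=1.25)       # price is above target
```

At $0.28/hr both RTX 4090 jobs are eligible, so the queue hands back the one with the more urgent priority first.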

Step 3: Training Scheduler
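
A minimal sketch of the scheduling pass that ties price checks, provisioning, and budgeting together. It assumes two injected callables: get_price returns the current hourly price for a GPU model, and provision rents a server for a job. Both are hypothetical placeholders, as are PendingJob, TrainingScheduler, and the worst-case budget rule (current price times max_hours); none of them is a Clore.ai API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PendingJob:
    name: str
    gpu_model: str
    target_price: float  # USD per hour
    max_hours: float     # upper bound on runtime, used for the budget check


class TrainingScheduler:
    """Starts queued jobs when the price is below target and budget allows."""

    def __init__(
        self,
        get_price: Callable[[str], float],
        provision: Callable[[PendingJob, float], None],
        budget_usd: float,
    ) -> None:
        self._get_price = get_price
        self._provision = provision
        self.budget_usd = budget_usd
        self.committed_usd = 0.0  # worst-case cost of jobs already started
        self.pending: List[PendingJob] = []

    def add(self, job: PendingJob) -> None:
        self.pending.append(job)

    def tick(self) -> List[str]:
        """Run one polling pass and return the names of jobs launched."""
        launched: List[str] = []
        waiting: List[PendingJob] = []
        for job in self.pending:
            price = self._get_price(job.gpu_model)
            worst_case = price * job.max_hours
            affordable = self.committed_usd + worst_case <= self.budget_usd
            if price < job.target_price and affordable:
                self._provision(job, price)
                self.committed_usd += worst_case
                launched.append(job.name)
            else:
                waiting.append(job)
        self.pending = waiting
        return launched


# Illustrative prices and a stub provisioner that only records the rental.
prices = {"RTX 4090": 0.32, "A100": 1.25}
rentals = []
scheduler = TrainingScheduler(
    get_price=lambda gpu: prices[gpu],
    provision=lambda job, price: rentals.append((job.name, price)),
    budget_usd=10.0,
)
scheduler.add(PendingJob("bert", "RTX 4090", target_price=0.35, max_hours=8))
scheduler.add(PendingJob("sd", "A100", target_price=1.20, max_hours=4))

first_pass = scheduler.tick()   # only the RTX 4090 price is below target
prices["A100"] = 1.05           # simulate a price drop
second_pass = scheduler.tick()  # the A100 job can start now
```

In a long-running process, tick() would be called on a timer, for example with the schedule package listed under Prerequisites.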

Full Script: Production Scheduler

Usage Examples

Cost Savings Example

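| Job           | Target Price | Waited | Actual Price | Savings |
|---------------|--------------|--------|--------------|---------|
| BERT Training | $0.35/hr     | 2h     | $0.32/hr     | 9%      |
| SD Training   | $1.20/hr     | 4h     | $1.05/hr     | 12%     |
| RL Training   | $0.30/hr     | 1h     | $0.28/hr     | 7%      |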

Average savings: about 9% across these three jobs (7-12% per job, measured against the target price) by waiting for prices to drop
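
The Savings figures for the three jobs are consistent with measuring each actual price against its target price, which is an assumption about how that column was derived. Under that assumption, a short sketch of the arithmetic (savings_pct is a hypothetical helper name):

```python
def savings_pct(target_price: float, actual_price: float) -> float:
    """Percentage saved by paying actual_price instead of target_price."""
    return (target_price - actual_price) / target_price * 100


# (job, target USD/hr, actual USD/hr) taken from the cost savings example.
rows = [
    ("BERT Training", 0.35, 0.32),
    ("SD Training", 1.20, 1.05),
    ("RL Training", 0.30, 0.28),
]

per_job = {name: savings_pct(target, actual) for name, target, actual in rows}
mean_savings = sum(per_job.values()) / len(per_job)

for name, pct in per_job.items():
    print(f"{name}: {pct:.1f}%")
print(f"Mean: {mean_savings:.1f}%")
```

The same function can be applied to your own job history to check whether waiting for a price drop is paying off.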

Next Steps
