Batch Processing

Process large workloads efficiently on CLORE.AI GPUs.


When to Use Batch Processing

  • Processing hundreds/thousands of items

  • Converting large datasets

  • Generating many images/videos

  • Bulk transcription

  • Training data preparation


LLM Batch Processing

vLLM Batch API

vLLM groups concurrent requests automatically via continuous batching, so there is no separate batch API to call:

```python
from openai import OpenAI

client = OpenAI(base_url="http://server:8000/v1", api_key="dummy")

# Synchronous batch: simple, but only one request is in flight at a time,
# so the server cannot group your requests together
def process_batch_sync(prompts):
    results = []
    for prompt in prompts:
        response = client.chat.completions.create(
            model="meta-llama/Llama-3.1-8B-Instruct",
            messages=[{"role": "user", "content": prompt}]
        )
        results.append(response.choices[0].message.content)
    return results

# Process 100 prompts
prompts = [f"Summarize topic {i}" for i in range(100)]
results = process_batch_sync(prompts)
```

Async Batch Processing (Faster)
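The async variant keeps many requests in flight so vLLM's continuous batching can actually group them. A minimal sketch using a semaphore to cap concurrency; the `AsyncOpenAI` worker shown in comments assumes the same hypothetical server URL as above:

```python
import asyncio

async def process_batch_async(prompts, worker, max_concurrent=16):
    """Run `worker(prompt)` for every prompt, at most `max_concurrent` at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(prompt):
        async with semaphore:
            return await worker(prompt)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(bounded(p) for p in prompts))

# Real use (hypothetical server URL), with the openai package's async client:
# from openai import AsyncOpenAI
# client = AsyncOpenAI(base_url="http://server:8000/v1", api_key="dummy")
# async def worker(prompt):
#     r = await client.chat.completions.create(
#         model="meta-llama/Llama-3.1-8B-Instruct",
#         messages=[{"role": "user", "content": prompt}])
#     return r.choices[0].message.content
# results = asyncio.run(process_batch_async(prompts, worker))
```

Tune `max_concurrent` to your server's capacity; too high just queues requests on the server side.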

Batch with Progress Tracking
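For long runs you want to see throughput and a rough ETA. A stdlib-only sketch (no tqdm assumed) that reports every N items:

```python
import time

def process_with_progress(items, fn, report_every=10):
    """Apply `fn` to each item, printing progress and a rough ETA."""
    results = []
    start = time.time()
    total = len(items)
    for i, item in enumerate(items, 1):
        results.append(fn(item))
        if i % report_every == 0 or i == total:
            elapsed = time.time() - start
            rate = i / elapsed if elapsed > 0 else 0.0
            remaining = (total - i) / rate if rate > 0 else 0.0
            print(f"{i}/{total} done, ~{remaining:.0f}s remaining")
    return results
```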

Save Progress for Long Batches
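If a multi-hour batch dies at item 900 of 1000, you should not redo the first 900. A minimal checkpointing sketch: append each result to a JSONL file as it completes, and skip already-done indices on restart (the file path is whatever you choose):

```python
import json
import os

def process_with_checkpoint(items, fn, checkpoint_path):
    """Append each result to a JSONL file; on restart, skip items already done."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            for line in f:
                rec = json.loads(line)
                done[rec["index"]] = rec["result"]
    with open(checkpoint_path, "a") as f:
        for i, item in enumerate(items):
            if i in done:
                continue  # already processed in a previous run
            result = fn(item)
            f.write(json.dumps({"index": i, "result": result}) + "\n")
            f.flush()  # survive an interruption mid-batch
            done[i] = result
    return [done[i] for i in range(len(items))]
```

Results must be JSON-serializable; for images or audio, write the artifact to disk and checkpoint its path instead.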


Image Generation Batch

SD WebUI Batch
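A sketch against the Stable Diffusion WebUI (AUTOMATIC1111) REST API, which must be started with the `--api` flag; the server URL is a placeholder. One call with `batch_size` > 1 generates several images per prompt:

```python
import base64
import json
from urllib import request

def txt2img_payload(prompt, batch_size=4, steps=25, width=1024, height=1024):
    """Build a txt2img request body; `batch_size` images are generated per call."""
    return {"prompt": prompt, "batch_size": batch_size,
            "steps": steps, "width": width, "height": height}

def generate_batch(prompts, base_url="http://server:7860"):
    """POST each prompt to /sdapi/v1/txt2img and decode the returned images."""
    images = []
    for prompt in prompts:
        body = json.dumps(txt2img_payload(prompt)).encode()
        req = request.Request(f"{base_url}/sdapi/v1/txt2img", data=body,
                              headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            data = json.loads(resp.read())
        # the API returns images as base64 strings
        images.extend(base64.b64decode(img) for img in data["images"])
    return images
```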

ComfyUI Batch with Queue
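ComfyUI exposes a queue over HTTP: POST a workflow (exported with "Save (API Format)") to `/prompt` and it is queued server-side. A sketch, assuming a placeholder server URL and that node id `"6"` holds your prompt text (check your own workflow's node ids):

```python
import json
import uuid
from urllib import request

def patch_prompt_text(workflow, node_id, text):
    """Return a copy of the workflow with one node's text input replaced."""
    patched = json.loads(json.dumps(workflow))  # deep copy via JSON round-trip
    patched[node_id]["inputs"]["text"] = text
    return patched

def queue_workflow(workflow, base_url="http://server:8188", client_id=None):
    """Queue one workflow on ComfyUI's /prompt endpoint; returns the prompt id."""
    body = json.dumps({"prompt": workflow,
                       "client_id": client_id or uuid.uuid4().hex}).encode()
    req = request.Request(f"{base_url}/prompt", data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["prompt_id"]

# Batch: queue one job per prompt, then let ComfyUI work through the queue
# workflow = json.load(open("workflow_api.json"))
# for prompt in prompts:
#     queue_workflow(patch_prompt_text(workflow, "6", prompt))
```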

FLUX Batch Processing


Audio Batch Processing

Whisper Batch Transcription
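A resume-friendly sketch for transcribing a directory: the transcription function is injected so the loop stays testable, and files with an existing `.txt` are skipped on restart. The commented usage assumes the openai-whisper package:

```python
from pathlib import Path

def transcribe_dir(input_dir, output_dir, transcribe,
                   exts=(".mp3", ".wav", ".m4a")):
    """Transcribe every audio file, writing <name>.txt and skipping done files."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for audio in sorted(Path(input_dir).iterdir()):
        if audio.suffix.lower() not in exts:
            continue
        target = out / (audio.stem + ".txt")
        if target.exists():
            continue  # resume-friendly: skip already-transcribed files
        target.write_text(transcribe(str(audio)))
        written.append(target.name)
    return written

# Real use (assumes the openai-whisper package and a GPU):
# import whisper
# model = whisper.load_model("large-v3")
# transcribe_dir("audio/", "transcripts/",
#                lambda path: model.transcribe(path)["text"])
```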

Parallel Whisper (Multiple GPUs)
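On a multi-GPU server, pin one worker process per GPU via `CUDA_VISIBLE_DEVICES` and shard the file list between them. A sketch; `whisper_worker.py` is a hypothetical script of yours that transcribes the paths it receives as arguments:

```python
import os
import subprocess
import sys

def shard(files, n):
    """Split files round-robin into n roughly equal shards, one per GPU."""
    return [files[i::n] for i in range(n)]

def launch_workers(files, n_gpus, worker_script="whisper_worker.py"):
    """Start one worker per GPU, each pinned via CUDA_VISIBLE_DEVICES."""
    procs = []
    for gpu, gpu_files in enumerate(shard(files, n_gpus)):
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
        procs.append(subprocess.Popen(
            [sys.executable, worker_script, *gpu_files], env=env))
    for p in procs:
        p.wait()  # block until every shard is finished
```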


Video Batch Processing

Batch Video Generation (SVD)


Data Pipeline Patterns

Producer-Consumer Pattern
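The pattern: a producer feeds a bounded queue while worker threads drain it, so loading and processing overlap and memory stays capped. A minimal stdlib sketch:

```python
import queue
import threading

def run_pipeline(items, work, n_workers=4):
    """Producer fills a bounded task queue; worker threads drain it in parallel."""
    tasks = queue.Queue(maxsize=32)   # bounded: producer blocks if workers lag
    results = queue.Queue()
    _stop = object()                  # sentinel telling a worker to exit

    def worker():
        while True:
            item = tasks.get()
            if item is _stop:
                break
            results.put(work(item))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for item in items:                # producer
        tasks.put(item)
    for _ in threads:                 # one sentinel per worker
        tasks.put(_stop)
    for t in threads:
        t.join()
    out = []
    while not results.empty():
        out.append(results.get())
    return out                        # completion order, not input order
```

Threads suit I/O-bound work (API calls, disk); for CPU-bound steps use `multiprocessing` with the same queue structure.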

Map-Reduce Pattern
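The pattern: map a function over chunks of the dataset in parallel, then fold the partial results into one answer. A sketch using a thread pool (suitable when the map step is I/O-bound, e.g. API calls):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def chunked(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def map_reduce(items, map_fn, reduce_fn, chunk_size=100, max_workers=8):
    """Map over chunks in parallel, then fold the partial results together."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(map_fn, chunked(items, chunk_size)))
    return reduce(reduce_fn, partials)

# Example: total word count across many documents
# total = map_reduce(docs,
#                    map_fn=lambda chunk: sum(len(d.split()) for d in chunk),
#                    reduce_fn=lambda a, b: a + b)
```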


Optimization Tips

1. Right-Size Concurrency

2. Batch Size Tuning

3. Memory Management

4. Save Intermediate Results


Cost Optimization

Estimate Before Running
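Before launching a large batch, multiply out items x seconds-per-item against the hourly rate. A back-of-envelope helper; the numbers in the test of the trade-off below are illustrative, not CLORE.AI prices:

```python
def estimate_batch_cost(n_items, seconds_per_item, price_per_hour,
                        concurrency=1, overhead_hours=0.1):
    """Rough wall-clock and cost estimate for a batch job.

    overhead_hours covers instance setup and model download time.
    """
    hours = (n_items * seconds_per_item) / 3600 / concurrency + overhead_hours
    return {"hours": round(hours, 2), "cost": round(hours * price_per_hour, 2)}
```

Time a handful of representative items first to get `seconds_per_item`; a bad estimate there dominates everything else.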

Use Spot Instances

  • 30-50% cheaper

  • Good for batch jobs (interruptible)

  • Save checkpoints frequently

Off-Peak Processing

  • Queue jobs during low-demand hours

  • Often better GPU availability

  • Potentially lower spot prices


Next Steps
