Integrate GPU testing and deployment into your CI/CD pipelines. This chapter covers GitHub Actions, GitLab CI, Docker, and secrets management, with full working configs.
Secrets Management
Before configuring any pipeline, store your Clore API key securely.
GitHub Actions
Go to your repo → Settings → Secrets and variables → Actions
Click New repository secret
Name: CLORE_API_KEY, Value: your API key
GitLab CI
Go to your project → Settings → CI/CD → Variables
Add variable: Key = CLORE_API_KEY, Value = your API key
Check Mask variable and Protect variable
General Rules
Never hardcode API keys in source code or CI configs
Use environment variables or secrets managers
Rotate keys periodically
Restrict key scope: use a dedicated API key for CI (not your main account key)
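The rules above can be backed by a small fail-fast check at the start of any CI script. The following is a minimal sketch, not part of the SDK: the helper names `load_api_key` and `redact` are hypothetical, and whether `CloreAI()` reads `CLORE_API_KEY` from the environment on its own is an assumption this chapter does not confirm.

```python
import os


def load_api_key(env=os.environ, name="CLORE_API_KEY"):
    """Return the API key from the environment, or raise if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; add it as a CI secret")
    return key


def redact(secret, visible=4):
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Calling `load_api_key()` before any rental means a misconfigured secret fails the job immediately instead of partway through. If a key must appear in logs at all, print `redact(key)` rather than the key itself.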
GitHub Actions
Basic: GPU Smoke Test
Run nvidia-smi on a Clore GPU on every push to main.
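A smoke test is more useful when the job checks what `nvidia-smi` reports rather than only its exit code. The sketch below assumes the job captures the remote command's stdout as a string. The `--query-gpu` and `--format` flags are standard `nvidia-smi` options, while `parse_gpus` and `smoke_check` are hypothetical helper names.

```python
import csv
import io

# Standard nvidia-smi flags; the command itself runs on the rented server over SSH.
SMOKE_COMMAND = "nvidia-smi --query-gpu=name,memory.total --format=csv,noheader"


def parse_gpus(output):
    """Parse CSV lines such as 'NVIDIA GeForce RTX 4090, 24564 MiB'."""
    gpus = []
    for row in csv.reader(io.StringIO(output), skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank lines and anything unexpected
        gpus.append({"name": row[0].strip(), "memory": row[1].strip()})
    return gpus


def smoke_check(output, expected_model):
    """Return True if at least one reported GPU name contains the expected model."""
    wanted = expected_model.lower()
    return any(wanted in gpu["name"].lower() for gpu in parse_gpus(output))
```

The job can then fail when `smoke_check(stdout, "RTX 4090")` is false, which catches a server that started without the GPU you paid for.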
Advanced: Matrix GPU Testing
Test your code on multiple GPU types in parallel.
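In CI the fan-out normally comes from the platform's own matrix feature. To reproduce it locally, the same idea can be sketched in Python: one invocation of `ci/run_gpu_test.py` per GPU type, run in parallel. The `--gpu`, `--max-price`, and `--script` flags come from that script. `build_command` and `run_matrix` are hypothetical names, and the GPU labels in the usage note are placeholders.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(gpu, script, max_price=1.0):
    """Build the argv for one invocation of ci/run_gpu_test.py."""
    return [
        sys.executable, "ci/run_gpu_test.py",
        "--gpu", gpu,
        "--max-price", str(max_price),
        "--script", script,
    ]


def run_matrix(gpus, script, runner=subprocess.call):
    """Run one job per GPU type in parallel and map each GPU to its exit code."""
    commands = [build_command(gpu, script) for gpu in gpus]
    # One worker per job; each worker mostly waits on a subprocess.
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        codes = list(pool.map(runner, commands))
    return dict(zip(gpus, codes))
```

For example, `run_matrix(["RTX 4090", "RTX 3090"], "python tests/gpu_test.py")` returns a dict of exit codes keyed by GPU type. The `runner` parameter exists so the fan-out logic can be exercised without renting anything.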
Supporting script ci/run_gpu_test.py:
GitLab CI
Basic Pipeline
Parallel GPU Jobs
Docker
SDK Script Container
Package your SDK automation scripts in a Docker image.
Docker Compose for Local Development
Run:
Multi-Stage Build for Production
Cleanup & Safety
Always Cancel Orders in CI
Every CI job must cancel its orders in a finally block or a post-job step:
#!/usr/bin/env python3
"""Run a test script on a rented Clore GPU."""
import argparse
import subprocess
import sys
import time

from clore_ai import CloreAI
from clore_ai.exceptions import CloreAPIError


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpu", required=True)
    parser.add_argument("--max-price", type=float, default=1.0)
    parser.add_argument("--script", required=True)
    parser.add_argument("--image", default="cloreai/pytorch")
    parser.add_argument("--timeout", type=int, default=600)
    args = parser.parse_args()

    client = CloreAI()
    order_id = None  # set once an order exists, so the finally block knows what to cancel

    try:
        # Find the cheapest matching server
        servers = client.marketplace(gpu=args.gpu, max_price_usd=args.max_price)
        if not servers:
            # No capacity is reported as a warning, not a failure
            print(f"::warning::No {args.gpu} servers available under ${args.max_price}")
            sys.exit(0)
        servers.sort(key=lambda s: s.price_usd or float("inf"))
        best = servers[0]
        print(f"Server {best.id}: {best.gpu_count}x {best.gpu_model} @ ${best.price_usd:.4f}/h")

        # Create order
        order = client.create_order(
            server_id=best.id,
            image=args.image,
            type="on-demand",
            currency="bitcoin",
            ssh_password="CIMatrix123",  # placeholder; prefer injecting this from a CI secret
            ports={"22": "tcp"},
        )
        order_id = order.id
        print(f"Order {order_id} created, waiting for SSH...")

        # Wait for SSH (fixed delay, then a single status check)
        time.sleep(30)
        orders = client.my_orders()
        active = next((o for o in orders if o.id == order_id), None)
        if not active or not active.pub_cluster:
            print("::error::Instance did not start")
            sys.exit(1)

        host = active.pub_cluster
        port = 22
        if active.tcp_ports and "22" in active.tcp_ports:
            port = active.tcp_ports["22"]

        # Run the test script.
        # Note: ssh does not accept a password non-interactively, so this assumes
        # key-based auth is set up or that the call is wrapped in a tool such as sshpass.
        ssh_cmd = [
            "ssh", "-o", "StrictHostKeyChecking=no",
            "-p", str(port), f"root@{host}",
            args.script,
        ]
        result = subprocess.run(ssh_cmd, timeout=args.timeout)
        sys.exit(result.returncode)

    except subprocess.TimeoutExpired:
        print(f"::error::Test script exceeded {args.timeout}s")
        sys.exit(1)
    except CloreAPIError as e:
        print(f"::error::API error: {e}")
        sys.exit(1)
    finally:
        # Runs on every path, including sys.exit(), so the rental never outlives the job
        if order_id:
            try:
                client.cancel_order(order_id, issue="CI complete")
            except Exception as e:
                # Do not mask the job's real exit status, but make the leaked order visible
                print(f"::warning::Failed to cancel order {order_id}: {e}")


if __name__ == "__main__":
    main()
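If several CI scripts rent servers, the try/finally pattern from the script above can be factored into a context manager so it cannot be forgotten. This is a minimal sketch rather than an SDK feature: `rented_order` is a hypothetical name, and it assumes only the `create_order` and `cancel_order` calls already used in the script.

```python
from contextlib import contextmanager


@contextmanager
def rented_order(client, **order_kwargs):
    """Create an order and always cancel it on exit, even when the body raises."""
    order = client.create_order(**order_kwargs)
    try:
        yield order
    finally:
        try:
            client.cancel_order(order.id, issue="CI complete")
        except Exception as exc:
            # Keep the original error (if any), but report the possible leak
            print(f"::warning::Failed to cancel order {order.id}: {exc}")
```

Usage looks like `with rented_order(client, server_id=best.id, image=args.image, ...) as order:`, with the SSH steps inside the block. A post-job step in the pipeline is still worth keeping as a second line of defense, since a context manager cannot run if the runner itself is killed.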