# Jupyter ML Training

Set up JupyterLab with GPU support for machine learning experiments and model training.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Server Requirements

| Parameter    | Minimum     | Recommended |
| ------------ | ----------- | ----------- |
| RAM          | 16GB        | 32GB+       |
| VRAM         | 8GB         | 16GB+       |
| Network      | 200Mbps     | 500Mbps+    |
| Startup Time | 2-3 minutes | -           |

{% hint style="info" %}
JupyterLab itself is lightweight. Choose GPU and RAM based on your training workload requirements.
{% endhint %}

## Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime
```

**Ports** (22 for SSH, 8888 for JupyterLab, 6006 for TensorBoard):

```
22/tcp
8888/http
6006/http
```

**Environment:**

```
JUPYTER_TOKEN=your_secure_token_here
```

**Command:**

```bash
pip install jupyterlab tensorboard && \
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --NotebookApp.token='your_secure_token_here'
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.

### Verify It's Working

```bash
# Check if JupyterLab is accessible
curl https://your-http-pub.clorecloud.net/

# Access with token
# https://your-http-pub.clorecloud.net/?token=your_secure_token_here
```

{% hint style="warning" %}
If you get HTTP 502, wait 2-3 minutes: the service is still installing dependencies.
{% endhint %}

## Renting on CLORE.AI

1. Visit [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter startup command
5. Select payment: **CLORE**, **BTC**, or **USDT/USDC**
6. Create order and wait for deployment

### Access Your Server

* Find connection details in **My Orders**
* Web interfaces: Use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## Access Jupyter

1. Wait for deployment
2. Find port 8888 mapping
3. Open: `http://<proxy>:<port>?token=your_secure_token_here`

## Pre-configured ML Image

For a full ML environment, use a pre-built image:

**Image:**

```
jupyter/pytorch-notebook:cuda12-pytorch-2.1.0
```

Or build a custom image:

```dockerfile
FROM pytorch/pytorch:2.5.1-cuda12.4-cudnn9-runtime

RUN pip install --no-cache-dir \
    jupyterlab \
    numpy pandas matplotlib seaborn \
    scikit-learn \
    transformers datasets accelerate \
    tensorboard wandb \
    opencv-python pillow \
    tqdm rich

EXPOSE 8888 6006

CMD ["jupyter", "lab", "--ip=0.0.0.0", "--allow-root"]
```

## Essential Libraries

### Install in Jupyter

```python
!pip install transformers datasets accelerate bitsandbytes
!pip install wandb tensorboard
!pip install scikit-learn xgboost lightgbm
!pip install opencv-python albumentations
```

### Create requirements.txt

```
# ML Frameworks
torch>=2.1.0
torchvision
torchaudio

# NLP
transformers>=4.36.0
datasets
tokenizers
sentencepiece

# Training
accelerate
bitsandbytes
peft
trl

# Monitoring
wandb
tensorboard

# Data
numpy
pandas
matplotlib
seaborn
scikit-learn

# Computer Vision
opencv-python
pillow
albumentations
```
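
Once the file is in your workspace (for example, uploaded through the JupyterLab file browser), everything can be installed in a single cell:

```python
# Install all dependencies listed in the uploaded requirements.txt
!pip install -r requirements.txt
```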

## Training Examples

### PyTorch Image Classification

```python
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms
from torch.utils.data import DataLoader

# Check GPU
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

# Load data
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

train_data = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True, transform=transform
)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True, num_workers=4)

# Model
model = torchvision.models.resnet18(weights=torchvision.models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(512, 10)
model = model.cuda()

# Training
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

for epoch in range(10):
    model.train()
    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")

# Save model
torch.save(model.state_dict(), 'model.pth')
```
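
To sanity-check the saved weights, here is a minimal evaluation sketch on the CIFAR-10 test split. It assumes the `transform` defined in the training cell above is still in scope:

```python
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader

# Rebuild the architecture and load the trained weights
model = torchvision.models.resnet18()
model.fc = nn.Linear(512, 10)
model.load_state_dict(torch.load('model.pth'))
model = model.cuda().eval()

test_data = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True, transform=transform
)
test_loader = DataLoader(test_data, batch_size=256)

correct = total = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images.cuda()).argmax(dim=1)
        correct += (preds == labels.cuda()).sum().item()
        total += labels.size(0)

print(f"Test accuracy: {correct / total:.2%}")
```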

### HuggingFace Text Classification

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import TrainingArguments, Trainer
from datasets import load_dataset
import numpy as np

# Load dataset
dataset = load_dataset("imdb")

# Load model
model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Tokenize
def tokenize(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, batched=True)

# Training
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=100,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)

trainer.train()
trainer.save_model("./best_model")
```
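
The Trainer above evaluates every epoch but only reports loss. To also track accuracy, pass a `compute_metrics` function; a minimal sketch:

```python
import numpy as np

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels) for each evaluation run
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"accuracy": (preds == labels).mean()}

# Add compute_metrics=compute_metrics to the Trainer(...) call above
```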

### LLM Fine-tuning with LoRA

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset
from trl import SFTTrainer
import torch

# Load model with 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.pad_token = tokenizer.eos_token

# Configure LoRA
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)

# Load dataset
dataset = load_dataset("timdettmers/openassistant-guanaco")

# Train
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    dataset_text_field="text",
    max_seq_length=512,
    tokenizer=tokenizer,
    args=TrainingArguments(
        output_dir="./lora_output",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        gradient_accumulation_steps=4,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        save_steps=100,
    ),
)

trainer.train()
trainer.save_model("./final_lora")
```
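
To reuse the adapter later (for example, in a new session), it can be loaded back on top of the base model with `PeftModel`. A minimal sketch, assuming the adapter was saved to `./final_lora` as above; the prompt format is just an illustration:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Reload the base model (quantization is optional for inference)
base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")

# Attach the LoRA weights saved by trainer.save_model()
model = PeftModel.from_pretrained(base, "./final_lora")

prompt = "### Human: Explain LoRA in one sentence.### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```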

## TensorBoard Integration

### Start TensorBoard

```python
%load_ext tensorboard
%tensorboard --logdir ./logs --port 6006 --bind_all
```

Or via terminal:

```bash
tensorboard --logdir ./logs --port 6006 --bind_all &
```

### Log Training Metrics

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('./logs')

for epoch in range(epochs):
    # ... training loop ...
    writer.add_scalar('Loss/train', train_loss, epoch)
    writer.add_scalar('Loss/val', val_loss, epoch)
    writer.add_scalar('Accuracy/val', accuracy, epoch)

writer.close()
```

## Weights & Biases Integration

```python
import wandb

wandb.init(project="my-project", name="experiment-1")

# Log metrics
wandb.log({"loss": loss, "accuracy": acc})

# Log model
wandb.save("model.pth")

# Finish
wandb.finish()
```
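
On a freshly rented server the client is not authenticated yet, so set your API key before calling `wandb.init()`. A small sketch, assuming the key is exported as `WANDB_API_KEY`:

```python
import os
import wandb

# Either export WANDB_API_KEY in the container environment, or log in explicitly:
wandb.login(key=os.environ["WANDB_API_KEY"])
```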

## Data Management

### Download Datasets

```python
# HuggingFace datasets
from datasets import load_dataset
dataset = load_dataset("squad")

# Kaggle datasets
!pip install kaggle
!kaggle datasets download -d username/dataset-name

# Direct download
!wget https://example.com/data.zip
!unzip data.zip
```

### Mount Cloud Storage

```python
# S3
!pip install boto3
import boto3
s3 = boto3.client('s3')
s3.download_file('bucket', 'key', 'local_path')

# Google Cloud
!pip install google-cloud-storage
from google.cloud import storage
client = storage.Client()
bucket = client.bucket('my-bucket')
blob = bucket.blob('data.zip')
blob.download_to_filename('data.zip')
```

## Saving Work

### Save to External Storage

```python
# Save model to S3
import boto3
s3 = boto3.client('s3',
    aws_access_key_id='YOUR_KEY',
    aws_secret_access_key='YOUR_SECRET'
)
s3.upload_file('model.pth', 'my-bucket', 'models/model.pth')
```

### Before Ending Session

```bash
# Download important files
scp -P <port> root@<host>:/workspace/model.pth ./
scp -P <port> -r root@<host>:/workspace/results/ ./results/
```

## Multi-GPU Training

```python
import os
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel

# Check GPUs
print(f"Available GPUs: {torch.cuda.device_count()}")

# DataParallel (simple, single process)
model = nn.DataParallel(model)

# DistributedDataParallel (better scaling)
# Launch with: torchrun --nproc_per_node=4 train.py
dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)
model = DistributedDataParallel(model.cuda(), device_ids=[local_rank])
```
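
Under DDP each process should also read a different shard of the data. A minimal sketch with `DistributedSampler`, assuming `train_data` from the classification example above:

```python
from torch.utils.data import DataLoader
from torch.utils.data.distributed import DistributedSampler

# Each rank sees a distinct shard; call sampler.set_epoch(epoch) every epoch to reshuffle
sampler = DistributedSampler(train_data)
train_loader = DataLoader(train_data, batch_size=64, sampler=sampler, num_workers=4)
```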

## Performance Tips

### Memory Optimization

```python
# Gradient checkpointing (trades compute for memory; this method is available on HuggingFace Transformers models)
model.gradient_checkpointing_enable()

# Mixed precision
from torch.cuda.amp import autocast, GradScaler
scaler = GradScaler()

with autocast():
    output = model(input)
    loss = criterion(output, target)

scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```
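
Gradient accumulation is another way to train with a larger effective batch size than fits in VRAM (the LoRA example above uses the same idea via `gradient_accumulation_steps`). A minimal sketch, assuming `model`, `criterion`, `optimizer`, and `train_loader` from earlier:

```python
accum_steps = 4  # effective batch size = batch_size * accum_steps

optimizer.zero_grad()
for step, (images, labels) in enumerate(train_loader):
    outputs = model(images.cuda())
    loss = criterion(outputs, labels.cuda()) / accum_steps  # scale so gradients average correctly
    loss.backward()
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```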

### Data Loading

```python
# Faster data loading
loader = DataLoader(
    dataset,
    batch_size=64,
    num_workers=8,      # Use multiple workers
    pin_memory=True,    # Faster GPU transfer
    prefetch_factor=2   # Prefetch batches
)
```

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*

**Save money:**

* Use **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across different providers

