# YOLOv9/v10 Detection

> **State-of-the-art real-time object detection: train and deploy the latest YOLO models on GPU**

YOLO (You Only Look Once) remains the gold standard for real-time object detection. YOLOv9 introduced Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN), while YOLOv10 brought NMS-free detection through dual label assignments. Both offer strong accuracy/speed trade-offs on NVIDIA GPUs.

* **YOLOv9 GitHub:** [WongKinYiu/yolov9](https://github.com/WongKinYiu/yolov9), 8K+ ⭐
* **YOLOv10 GitHub:** [THU-MIG/yolov10](https://github.com/THU-MIG/yolov10), 10K+ ⭐
* **Ultralytics (unified):** [ultralytics/ultralytics](https://github.com/ultralytics/ultralytics), 32K+ ⭐

***

## YOLOv9 vs YOLOv10 vs YOLOv8: Quick Comparison

| Model    | mAP50-95 | Latency (A100) | Parameters | NMS              |
| -------- | -------- | -------------- | ---------- | ---------------- |
| YOLOv8x  | 53.9     | 14.2ms         | 68.2M      | Required         |
| YOLOv9e  | 55.6     | 16.8ms         | 57.3M      | Required         |
| YOLOv10x | 54.4     | 10.7ms         | 29.5M      | **Not required** |
| YOLOv10b | 53.0     | 8.8ms          | 19.1M      | **Not required** |
| YOLOv10s | 46.8     | 4.2ms          | 7.2M       | **Not required** |

{% hint style="success" %}
**YOLOv10 is NMS-free**: there is no Non-Maximum Suppression post-processing step. This enables end-to-end deployment and is especially useful for edge/embedded scenarios and TensorRT deployment.
{% endhint %}

***

## Use Cases

* **Security & surveillance**: real-time detection of people, vehicles, and objects
* **Autonomous vehicles**: pedestrian and obstacle detection
* **Manufacturing QC**: defect detection on production lines
* **Retail analytics**: customer flow and product detection
* **Medical imaging**: anomaly detection in X-rays and scans
* **Sports analytics**: player and ball tracking
* **Agriculture**: detection of plant diseases and pests

***

## Prerequisites

* Clore.ai account with a GPU rental
* Training data (for custom model training), or COCO-pretrained weights
* Basic Python and command-line knowledge

***

## Step 1: Rent a GPU on Clore.ai

1. Go to [clore.ai](https://clore.ai) → **Marketplace**
2. Choose a GPU based on your task:
   * **Inference only:** RTX 3080/3090 or RTX 4080, for excellent price/performance
   * **Training small models:** RTX 4090 24GB
   * **Training large models (YOLOv9e/YOLOv10x):** A100 40/80GB

{% hint style="info" %}
**For real-time inference** (video streams), an RTX 3090 or RTX 4090 delivers 100–500 FPS depending on the model variant. The smallest model, YOLOv10n, runs at 1000+ FPS with TensorRT on a 4090.
{% endhint %}

***

## Step 2: Deploy the Ultralytics Container

The official Ultralytics Docker image supports YOLOv8, YOLOv9, and YOLOv10 through a unified API:

**Docker image:**

```
ultralytics/ultralytics:latest
```

**Ports:**

```
22
8000
```

**Environment variables:**

```
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,utility
```

**Disk:** At least 20 GB (pretrained weights + your dataset)

***

## Step 3: Connect and Verify

```bash
# Replace <host> and <port> with the values shown for your rented server
ssh root@<host> -p <port>

# Check the GPU
nvidia-smi

# Check the Ultralytics installation
python3 -c "import ultralytics; ultralytics.checks()"
# Should print environment details such as the GPU and CUDA version
```

***

## Step 4: Quick Inference with Pretrained Models

### YOLOv10 Inference (NMS-free)

```python
from ultralytics import YOLO

# Load a YOLOv10 model (downloaded automatically if missing)
model = YOLO("yolov10x.pt")  # Options: n, s, m, b, l, x

# Run inference on an image
results = model("https://ultralytics.com/images/bus.jpg")

# Print the results
for result in results:
    boxes = result.boxes
    print(f"Detected {len(boxes)} objects")
    for box in boxes:
        cls = int(box.cls[0])
        conf = float(box.conf[0])
        xyxy = box.xyxy[0].tolist()
        print(f"  {model.names[cls]}: {conf:.2f} at {[int(x) for x in xyxy]}")

# Save the annotated image
results[0].save("output.jpg")
```

### YOLOv9 Inference

```python
from ultralytics import YOLO

# Load a YOLOv9 model
model = YOLO("yolov9e.pt")  # Options: t, s, m, c, e

# Batch inference for maximum throughput
results = model(
    source=[
        "image1.jpg",
        "image2.jpg",
        "image3.jpg",
    ],
    batch=8,        # Process up to 8 images in parallel
    device="cuda",
    conf=0.25,      # Confidence threshold
    iou=0.45,       # NMS IoU threshold (not needed for v10)
    imgsz=640,
    half=True       # FP16 inference, up to about 2x faster
)
```

### Real-Time Video Stream Inference

```python
from ultralytics import YOLO
import cv2

model = YOLO("yolov10s.pt")

# Open a video file (pass 0 instead of a path to read from a webcam)
cap = cv2.VideoCapture("input_video.mp4")

# Read the video properties
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Output writer
out = cv2.VideoWriter(
    "output_video.mp4",
    cv2.VideoWriter_fourcc(*"mp4v"),
    fps,
    (width, height)
)

frame_count = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    results = model(frame, conf=0.25, verbose=False)
    annotated = results[0].plot()  # Frame with boxes and labels drawn
    out.write(annotated)

    frame_count += 1
    if frame_count % 100 == 0:
        print(f"Processed {frame_count} frames")

cap.release()
out.release()
print("Done! Output saved to output_video.mp4")
```

***

## Step 5: Train a Custom Model

### Prepare Your Dataset

YOLO expects a specific directory structure and label format:

```
dataset/
├── images/
│   ├── train/   # Training images (.jpg/.png)
│   ├── val/     # Validation images
│   └── test/    # Test images (optional)
└── labels/
    ├── train/   # Label files (.txt)
    ├── val/
    └── test/
```

Each label file has the same name as its image, with a `.txt` extension. It contains one line per object in the form `class_id center_x center_y width height`, with all coordinates normalized to 0-1:

```
0 0.512 0.334 0.256 0.412
1 0.123 0.654 0.089 0.123
```

### Create the Dataset Configuration

```bash
cat > /workspace/custom_dataset.yaml << 'EOF'
# Dataset configuration
path: /workspace/dataset
train: images/train
val: images/val
test: images/test

# Number of classes
nc: 3

# Class names
names:
  0: person
  1: car
  2: bicycle
EOF
```

### Import from Roboflow (recommended)

```python
# Install the client first (run in a shell): pip install roboflow

from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("your-project")
version = project.version(1)
dataset = version.download("yolov9")
# The dataset is now located under ./your-project-1/
```

### Train YOLOv10

```python
from ultralytics import YOLO

# Load a pretrained YOLOv10 model (transfer
# learning, starting from pretrained weights)
model = YOLO("yolov10m.pt")  # Medium variant: a good trade-off

results = model.train(
    data="/workspace/custom_dataset.yaml",
    epochs=100,
    imgsz=640,
    batch=16,           # Adjust to your GPU VRAM
    device="cuda",
    workers=8,
    project="/workspace/runs",
    name="yolov10_custom",
    patience=50,        # Early stopping after 50 epochs without improvement
    save=True,
    save_period=10,     # Save a checkpoint every 10 epochs
    plots=True,
    val=True,
    # Data augmentation hyperparameters
    degrees=10.0,
    flipud=0.0,
    fliplr=0.5,
    mosaic=1.0,
    mixup=0.1,
    copy_paste=0.1,
    # Optimizer hyperparameters
    lr0=0.01,
    lrf=0.01,
    momentum=0.937,
    weight_decay=0.0005,
    warmup_epochs=3.0,
    amp=True            # Automatic Mixed Precision (FP16)
)

print(f"Training complete! Best mAP50-95: {results.results_dict['metrics/mAP50-95(B)']:.3f}")
```

### Train YOLOv9

```python
from ultralytics import YOLO

model = YOLO("yolov9e.pt")

results = model.train(
    data="/workspace/custom_dataset.yaml",
    epochs=100,
    imgsz=640,
    batch=8,            # v9e is larger and needs a smaller batch
    device="cuda",
    workers=8,
    project="/workspace/runs",
    name="yolov9_custom",
    amp=True,
    optimizer="SGD",
    momentum=0.937,
    weight_decay=0.0005
)
```

{% hint style="info" %}
**Training tips:**

* **Batch size:** Start with `batch=16` on an RTX 4090 and `batch=32` on an A100 40GB
* **Image size:** `imgsz=640` is the default; use 1280 for high-resolution tasks
* **Epochs:** 100 epochs is typical for fine-tuning, 300+ for training from scratch
* **AMP (mixed precision):** Keep `amp=True` enabled for a 1.5–2x speedup
{% endhint %}

***

## Step 6: Export to TensorRT for Maximum Speed

```python
from ultralytics import YOLO

# Load the trained model
model = YOLO("/workspace/runs/yolov10_custom/weights/best.pt")

# Export to TensorRT (FP16 for the best speed/accuracy balance)
model.export(
    format="engine",    # TensorRT engine
    device="cuda",
    half=True,          # FP16
    dynamic=False,      # Static shapes for maximum TensorRT optimization
    batch=1,            # Optimized for batch size 1 (real time)
    imgsz=640,
    workspace=4,        # TensorRT workspace limit
    # (specified in GB)
)
# The engine is written next to the source weights as best.engine

# Load and run the TensorRT engine
trt_model = YOLO("/workspace/runs/yolov10_custom/weights/best.engine")
results = trt_model("image.jpg")
```

### Export to ONNX

```python
# Continues from the TensorRT example above (reuses `model`)
# Export to ONNX for deployment flexibility
# Note: some Ultralytics versions reject half=True combined with dynamic=True;
# if the export fails, drop one of the two.
model.export(
    format="onnx",
    opset=17,
    half=True,       # FP16 weights
    dynamic=True,    # Dynamic batch size
    simplify=True
)
```

***

## Step 7: Deploy as a REST API

```bash
pip install fastapi uvicorn python-multipart

cat > /workspace/yolo_api.py << 'EOF'
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import JSONResponse, FileResponse
from ultralytics import YOLO
from PIL import Image
import io
import uuid

app = FastAPI(title="YOLOv10 Detection API")
model = YOLO("yolov10x.pt")

@app.get("/health")
async def health():
    return {"status": "ok", "model": "yolov10x", "device": "cuda"}

@app.post("/detect")
async def detect(
    file: UploadFile = File(...),
    conf: float = 0.25,
    iou: float = 0.45,
    return_image: bool = False
):
    # Read the uploaded image
    image_data = await file.read()
    img = Image.open(io.BytesIO(image_data)).convert("RGB")

    # Run detection
    results = model(img, conf=conf, iou=iou, verbose=False)
    result = results[0]

    # Build the response
    detections = []
    for box in result.boxes:
        detections.append({
            "class": model.names[int(box.cls[0])],
            "confidence": round(float(box.conf[0]), 4),
            "bbox": [round(x, 2) for x in box.xyxy[0].tolist()],
            "class_id": int(box.cls[0])
        })

    response = {
        "count": len(detections),
        "detections": detections,
        "image_size": list(result.orig_shape)
    }

    if return_image:
        # Return the annotated image instead of JSON
        output_path = f"/tmp/{uuid.uuid4()}.jpg"
        result.save(filename=output_path)
        return FileResponse(output_path, media_type="image/jpeg")

    return JSONResponse(response)

@app.post("/detect/batch")
async def detect_batch(files: list[UploadFile] = File(...)):
    results = []
    for file in files:
        data = await file.read()
        img = Image.open(io.BytesIO(data)).convert("RGB")
        res = model(img, verbose=False)[0]
        results.append({
            "filename": file.filename,
            "count": len(res.boxes),
            "detections": [
                {"class": model.names[int(b.cls[0])], "conf": float(b.conf[0])}
                for b in res.boxes
            ]
        })
    return JSONResponse({"results": results})

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
EOF

python3 /workspace/yolo_api.py &

# Test the API (wait until the model has finished loading)
curl -X POST "http://localhost:8000/detect" \
  -F "file=@test_image.jpg" | python3 -m json.tool
```

***

## Step 8: Validate and Benchmark Your Model

```python
from ultralytics import YOLO

model = YOLO("yolov10x.pt")

# Validate on the COCO dataset (downloaded automatically on first use)
metrics = model.val(
    data="coco.yaml",
    imgsz=640,
    batch=32,
    device="cuda",
    half=True
)

print(f"mAP50: {metrics.box.map50:.3f}")
print(f"mAP50-95: {metrics.box.map:.3f}")
print(f"Precision: {metrics.box.mp:.3f}")
print(f"Recall: {metrics.box.mr:.3f}")

# Benchmark speed
model.benchmark(
    format="engine",   # Limits the run to TensorRT; drop this argument to compare export formats
    imgsz=640,
    half=True,
    device="cuda"
)
```

***

## Download Results

```bash
# From your local machine (replace <host> and <port> with your server's values):
scp -P <port> root@<host>:/workspace/runs/yolov10_custom/weights/best.pt ./
scp -P <port> root@<host>:/workspace/output_video.mp4 ./

# Download the entire training run
rsync -avz -e "ssh -p <port>" \
    root@<host>:/workspace/runs/ \
    ./yolo_training_runs/
```

***

## Troubleshooting

### CUDA Out of Memory During Training

```python
# Assumes `model` was created as in Step 5, e.g. model = YOLO("yolov10m.pt")

# Reduce the batch size
model.train(data="data.yaml", batch=4, imgsz=640)

# Or keep dataset caching off so images are not held in memory
# (this saves host RAM, not GPU memory)
model.train(data="data.yaml", batch=8, imgsz=640, cache=False)
```

### Slow Training Speed

```python
# Enable caching (keeps the dataset in RAM or on disk)
model.train(data="data.yaml", cache=True)    # Cache in RAM
model.train(data="data.yaml", cache="disk")  # Cache on disk

# Increase the number of dataloader workers (too many can slow training down)
model.train(data="data.yaml", workers=8)
```

### Low mAP / Poor Detection

```bash
# Check that the labels are valid (normalized, within
# the 0-1 range); this call validates the dataset YAML and its paths
python3 -c "
from ultralytics.data.utils import check_det_dataset
check_det_dataset('/workspace/custom_dataset.yaml')
"

# Visualize training samples
python3 -c "
from ultralytics import YOLO
model = YOLO('yolov10m.pt')
model.train(data='data.yaml', epochs=1, batch=4, plots=True)
# Inspect train_batch*.jpg in the run directory printed at the end of training
"
```

***

## Performance Reference (Clore.ai GPUs)

| Model        | GPU      | Batch | FPS (inference) | mAP50-95 |
| ------------ | -------- | ----- | --------------- | -------- |
| YOLOv10n     | RTX 3090 | 1     | 1,200           | 38.5     |
| YOLOv10s     | RTX 3090 | 1     | 780             | 46.8     |
| YOLOv10m     | RTX 4090 | 1     | 950             | 51.3     |
| YOLOv10x     | RTX 4090 | 1     | 380             | 54.4     |
| YOLOv9e      | A100 40G | 1     | 720             | 55.6     |
| YOLOv10x TRT | RTX 4090 | 1     | 920             | 54.2     |

***

## Further Resources

* [Ultralytics documentation](https://docs.ultralytics.com/)
* [YOLOv9 paper](https://arxiv.org/abs/2402.13616)
* [YOLOv10 paper](https://arxiv.org/abs/2405.14458)
* [Roboflow Universe](https://universe.roboflow.com/): 100K+ public datasets
* [Ultralytics HUB](https://hub.ultralytics.com/): cloud training platform
* [COCO dataset](https://cocodataset.org/): standard benchmark dataset

***

*YOLOv9 and YOLOv10 on Clore.ai GPU rentals offer an affordable way to train custom object detection models and deploy real-time inference pipelines, without the overhead of AWS SageMaker or Google Vertex AI.*

***

## Clore.ai GPU Recommendations

| Use case                    | Recommended GPU | Estimated cost on Clore.ai |
| --------------------------- | --------------- | -------------------------- |
| Development/testing         | RTX 3090 (24GB) | \~$0.12/gpu/hr             |
| Production inference        | RTX 4090 (24GB) | \~$0.70/gpu/hr             |
| Training with large batches | A100 80GB       | \~$1.20/gpu/hr             |

> 💡 All examples in this guide can be deployed on [Clore.ai](https://clore.ai/marketplace) GPU servers.
> Browse available GPUs and rent by the hour: no commitments, full root access.
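
***

## Appendix: Checking YOLO Label Lines

Step 5 describes the label format as `class_id center_x center_y width height` with coordinates normalized to 0-1, and the troubleshooting section recommends checking labels when mAP is low. The sketch below shows one way such a check might look in plain Python, along with the conversion from a normalized box to pixel corner coordinates. The function names `parse_yolo_label_line` and `to_pixel_xyxy` are hypothetical helpers written for this illustration; they are not part of Ultralytics.

```python
from typing import List, Tuple


def parse_yolo_label_line(line: str, num_classes: int) -> Tuple[int, float, float, float, float]:
    """Parse one 'class_id cx cy w h' label line and validate its ranges.

    Hypothetical helper for illustration; not an Ultralytics function.
    """
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(p) for p in parts[1:])
    if not 0 <= class_id < num_classes:
        raise ValueError(f"class_id {class_id} outside 0..{num_classes - 1}")
    for name, value in (("center_x", cx), ("center_y", cy), ("width", w), ("height", h)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is not normalized to 0-1")
    return class_id, cx, cy, w, h


def to_pixel_xyxy(cx: float, cy: float, w: float, h: float, img_w: int, img_h: int) -> List[int]:
    """Convert a normalized center/size box to pixel [x1, y1, x2, y2] corners."""
    x1 = (cx - w / 2) * img_w
    y1 = (cy - h / 2) * img_h
    x2 = (cx + w / 2) * img_w
    y2 = (cy + h / 2) * img_h
    return [round(x1), round(y1), round(x2), round(y2)]


if __name__ == "__main__":
    # First example label line from Step 5, checked against nc=3 from the dataset config
    class_id, cx, cy, w, h = parse_yolo_label_line("0 0.512 0.334 0.256 0.412", num_classes=3)
    # The 640x480 image size is an assumed value for this example
    print(class_id, to_pixel_xyxy(cx, cy, w, h, img_w=640, img_h=480))
```

Running a check like this over every file in `labels/train/` and `labels/val/` before training can surface malformed lines early. The `num_classes` argument should match `nc` in the dataset YAML.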