Segment Anything

Precise image segmentation with Meta's SAM on Clore.ai GPUs

Use Meta's Segment Anything Model (SAM) for precise image segmentation on the GPU.

All examples can be run on GPU servers rented through the CLORE.AI Marketplace.

Renting on CLORE.AI

Visit the CLORE.AI Marketplace
Filter by GPU type, VRAM, and price
Choose On-Demand (fixed price) or Spot (bid price)
Configure your order:
- Select a Docker image
- Set ports (TCP for SSH, HTTP for web UIs)
- Add environment variables if needed
- Enter the startup command
Select a payment method: CLORE, BTC, or USDT/USDC
Create the order and wait for deployment

Accessing your server

Find the connection details under My Orders
Web interfaces: use the HTTP port URL
SSH: ssh -p <port> root@<proxy-address>

What is SAM?

The Segment Anything Model (SAM) can:

Segment any object in an image
Work with prompts (points, boxes, or a prior mask; the text prompts described in the SAM paper are not part of the released code)
Generate masks automatically
Handle a wide range of image types

Model variants

| Model | VRAM | Quality | Speed |
|---|---|---|---|
| SAM-H (huge) | 8 GB | Best | Slow |
| SAM-L (large) | 6 GB | Great | Medium |
| SAM-B (base) | 4 GB | Good | Fast |
| SAM2 | 8 GB+ | Best | Medium |
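The table can be turned into a small lookup that picks the largest original SAM variant fitting a given amount of free VRAM. This is only a sketch: the VRAM figures are the approximate values from the table, the checkpoint file names are the ones used in the download commands on this page, and `pick_sam_variant` is a hypothetical helper, not part of the segment-anything package.

```python
# VRAM figures are the approximate values from the table above.
SAM_VARIANTS = [
    # (registry key, checkpoint file, approx. VRAM in GB)
    ("vit_h", "sam_vit_h_4b8939.pth", 8),
    ("vit_l", "sam_vit_l_0b3195.pth", 6),
    ("vit_b", "sam_vit_b_01ec64.pth", 4),
]


def pick_sam_variant(free_vram_gb: float) -> tuple[str, str]:
    """Return (registry key, checkpoint file) of the largest variant that fits."""
    for key, checkpoint, required_gb in SAM_VARIANTS:
        if free_vram_gb >= required_gb:
            return key, checkpoint
    raise ValueError(f"No SAM variant fits into {free_vram_gb} GB of VRAM")


print(pick_sam_variant(6))  # ('vit_l', 'sam_vit_l_0b3195.pth')
```

The returned key can be passed to `sam_model_registry[...]` and the file name to its `checkpoint` argument.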

Quick deployment

Docker image:

pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel

Ports:

22/tcp
7860/http

Command:

pip install segment-anything gradio opencv-python && \
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth && \
python -c "
import gradio as gr
import numpy as np
from segment_anything import sam_model_registry, SamPredictor
import cv2

sam = sam_model_registry['vit_h'](checkpoint='sam_vit_h_4b8939.pth').cuda()
predictor = SamPredictor(sam)

def segment(image, evt: gr.SelectData):
    # evt.index holds the x, y pixel that was clicked
    predictor.set_image(image)
    point = np.array([[evt.index[0], evt.index[1]]])
    masks, scores, _ = predictor.predict(point_coords=point, point_labels=np.array([1]))
    mask = masks[np.argmax(scores)]
    colored = np.zeros_like(image)
    colored[mask] = [255, 0, 0]
    return cv2.addWeighted(image, 0.7, colored, 0.3, 0)

# Click events need gr.Blocks; gr.Interface only calls fn on submit
with gr.Blocks(title='Click to Segment') as demo:
    with gr.Row():
        inp = gr.Image(type='numpy', label='Click an object')
        out = gr.Image(label='Result')
    inp.select(segment, inputs=inp, outputs=out)
demo.launch(server_name='0.0.0.0', server_port=7860)
"

Accessing your service

After deployment, find your http_pub URL under My Orders:

Go to the My Orders page
Click your order
Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in the examples below.

Installation

pip install segment-anything opencv-python

Download models

# SAM-H (best quality)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

# SAM-L (balanced)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth

# SAM-B (fast)
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth

Python API

Basic segmentation with points

from segment_anything import sam_model_registry, SamPredictor
import cv2
import numpy as np

# Load the model
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")

predictor = SamPredictor(sam)

# Load the image (OpenCV reads BGR, SAM expects RGB)
image = cv2.imread("photo.jpg")
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Set the image (computes the image embedding once)
predictor.set_image(image_rgb)

# Segment with a point prompt
input_point = np.array([[500, 375]])  # x, y coordinates
input_label = np.array([1])  # 1 = foreground, 0 = background

masks, scores, logits = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    multimask_output=True
)

# Select the mask with the highest score
best_mask = masks[np.argmax(scores)]

# Save the mask
cv2.imwrite("mask.png", best_mask.astype(np.uint8) * 255)
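A saved black-and-white mask is hard to judge on its own, so it often helps to blend it over the source image. The following is a minimal sketch using only NumPy; `overlay_mask` is a hypothetical helper name, and the tiny synthetic image stands in for `image_rgb` and `best_mask` from the example above.

```python
import numpy as np


def overlay_mask(image_rgb, mask, color=(255, 0, 0), alpha=0.4):
    """Blend `color` into `image_rgb` wherever the boolean `mask` is True."""
    out = image_rgb.astype(np.float32)
    out[mask] = (1.0 - alpha) * out[mask] + alpha * np.array(color, dtype=np.float32)
    return out.round().astype(np.uint8)


# Tiny synthetic example: a 4x4 black image with a 2x2 mask in the centre
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

blended = overlay_mask(image, mask)
print(blended[1, 1], blended[0, 0])  # [102   0   0] [0 0 0]
```

Pixels outside the mask are left untouched, so the result can be written with `cv2.imwrite` after converting RGB back to BGR.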

Box prompt

# Segment with a bounding box
input_box = np.array([100, 100, 400, 400])  # x1, y1, x2, y2

masks, scores, _ = predictor.predict(
    box=input_box,
    multimask_output=False
)

Multiple points

# Several foreground and background points
input_points = np.array([
    [500, 375],   # foreground point 1
    [550, 400],   # foreground point 2
    [100, 100],   # background point
])
input_labels = np.array([1, 1, 0])  # 1 = foreground, 0 = background

masks, scores, _ = predictor.predict(
    point_coords=input_points,
    point_labels=input_labels,
    multimask_output=True
)
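When clicks come from a UI, foreground and background points usually arrive as two separate lists. A small helper can assemble the two arrays that `predict` expects. This is a sketch under that assumption; `build_point_prompt` is a hypothetical name, not part of segment-anything.

```python
import numpy as np


def build_point_prompt(foreground, background=()):
    """Return (point_coords, point_labels) arrays from two lists of (x, y) points."""
    points = list(foreground) + list(background)
    if not points:
        raise ValueError("At least one point is required")
    coords = np.array(points, dtype=np.float32).reshape(-1, 2)
    # 1 marks foreground points, 0 marks background points
    labels = np.array([1] * len(foreground) + [0] * len(background), dtype=np.int32)
    return coords, labels


coords, labels = build_point_prompt([(500, 375), (550, 400)], [(100, 100)])
print(coords.shape, labels)  # (3, 2) [1 1 0]
```

The two arrays can be passed directly as `point_coords` and `point_labels`.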

Combined box + point

masks, scores, _ = predictor.predict(
    point_coords=input_point,
    point_labels=input_label,
    box=input_box,
    multimask_output=False
)

Automatic mask generation

Generate masks for everything in the image:

from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
import cv2

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")

mask_generator = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=32,
    pred_iou_thresh=0.86,
    stability_score_thresh=0.92,
    crop_n_layers=1,
    crop_n_points_downscale_factor=2,
    min_mask_region_area=100
)

image = cv2.imread("photo.jpg")
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

masks = mask_generator.generate(image_rgb)

# Each mask is a dict containing:
# - 'segmentation': binary mask
# - 'area': mask area in pixels
# - 'bbox': bounding box in XYWH format
# - 'predicted_iou': quality score
# - 'stability_score': stability score
print(f"Found {len(masks)} masks")
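The generator reports each `bbox` as `[x, y, width, height]`, while the box prompt of `SamPredictor.predict` takes `[x1, y1, x2, y2]`. The sketch below converts between the two and filters the result list by area and predicted IoU. Both helper names are hypothetical, and the records are hand-written stand-ins for real generator output.

```python
def xywh_to_xyxy(bbox):
    """Convert an [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def filter_masks(masks, min_area=0, min_iou=0.0):
    """Keep records that pass both thresholds, largest area first."""
    kept = [m for m in masks if m["area"] >= min_area and m["predicted_iou"] >= min_iou]
    return sorted(kept, key=lambda m: m["area"], reverse=True)


# Hand-written records standing in for mask_generator.generate() output
example = [
    {"area": 5000, "bbox": [10, 20, 100, 50], "predicted_iou": 0.95},
    {"area": 80, "bbox": [0, 0, 8, 10], "predicted_iou": 0.99},
    {"area": 1200, "bbox": [30, 30, 40, 30], "predicted_iou": 0.70},
]

for m in filter_masks(example, min_area=100, min_iou=0.9):
    print(m["area"], xywh_to_xyxy(m["bbox"]))  # 5000 [10, 20, 110, 70]
```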

Visualize all masks

import matplotlib.pyplot as plt
import numpy as np

def show_masks(image, masks):
    plt.figure(figsize=(20, 20))
    plt.imshow(image)

    sorted_masks = sorted(masks, key=lambda x: x['area'], reverse=True)

    for mask in sorted_masks:
        m = mask['segmentation']
        color = np.random.random(3)
        colored = np.zeros((*m.shape, 4))
        colored[m] = [*color, 0.5]
        plt.imshow(colored)

    plt.axis('off')
    plt.savefig('all_masks.png')

show_masks(image_rgb, masks)

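SAM 2 (latest version)

pip install sam2

import cv2
import numpy as np
import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-large")

# Load the image as RGB and define a single foreground click
image = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2RGB)
points = np.array([[500, 375]])
labels = np.array([1])

with torch.inference_mode():
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(
        point_coords=points,
        point_labels=labels
    )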

Remove background

from segment_anything import sam_model_registry, SamPredictor
import cv2
import numpy as np

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")
predictor = SamPredictor(sam)

def remove_background(image_path, point):
    image = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    predictor.set_image(image_rgb)

    masks, scores, _ = predictor.predict(
        point_coords=np.array([point]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    best_mask = masks[np.argmax(scores)]

    # Create a BGRA image and use the mask as its alpha channel
    result = cv2.cvtColor(image, cv2.COLOR_BGR2BGRA)
    result[:, :, 3] = best_mask.astype(np.uint8) * 255

    return result

# Pass a point on the object that should be kept
result = remove_background("photo.jpg", [400, 300])
cv2.imwrite("no_background.png", result)

Extract object

def extract_object(image_path, point):
    image = cv2.imread(image_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    predictor.set_image(image_rgb)

    masks, scores, _ = predictor.predict(
        point_coords=np.array([point]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    best_mask = masks[np.argmax(scores)]

    # Compute the bounding box of the mask
    rows = np.any(best_mask, axis=1)
    cols = np.any(best_mask, axis=0)
    y1, y2 = np.where(rows)[0][[0, -1]]
    x1, x2 = np.where(cols)[0][[0, -1]]

    # Crop image and mask to the bounding box
    cropped = image[y1:y2+1, x1:x2+1]
    mask_cropped = best_mask[y1:y2+1, x1:x2+1]

    # Apply the mask as the alpha channel
    result = cv2.cvtColor(cropped, cv2.COLOR_BGR2BGRA)
    result[:, :, 3] = mask_cropped.astype(np.uint8) * 255

    return result
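A tight crop can cut off soft edges such as hair or shadows, so you may want a small margin around the bounding box. The following sketch grows an inclusive pixel box and clamps it to the image bounds; `pad_box` is a hypothetical helper, and the padding size is an arbitrary choice.

```python
def pad_box(x1, y1, x2, y2, pad, width, height):
    """Grow an inclusive pixel box by `pad` on every side, clamped to the image."""
    return (
        max(x1 - pad, 0),
        max(y1 - pad, 0),
        min(x2 + pad, width - 1),
        min(y2 + pad, height - 1),
    )


print(pad_box(5, 10, 90, 95, pad=10, width=100, height=100))   # (0, 0, 99, 99)
print(pad_box(40, 40, 60, 60, pad=10, width=100, height=100))  # (30, 30, 70, 70)
```

Inside `extract_object`, the padded values would replace `x1, y1, x2, y2` before cropping, with `width` and `height` taken from `image.shape[1]` and `image.shape[0]`.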

Batch processing

import os
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
import cv2
import json
import numpy as np

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")

mask_generator = SamAutomaticMaskGenerator(sam)

input_dir = "./images"
output_dir = "./segmented"
os.makedirs(output_dir, exist_ok=True)

for filename in os.listdir(input_dir):
    if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
        image = cv2.imread(os.path.join(input_dir, filename))
        if image is None:
            # Skip files that OpenCV cannot decode
            continue
        image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        masks = mask_generator.generate(image_rgb)

        # Collect mask metadata for the JSON file
        mask_data = []
        for i, mask in enumerate(masks):
            mask_data.append({
                'id': i,
                'area': int(mask['area']),
                'bbox': mask['bbox'],
                'score': float(mask['predicted_iou'])
            })

            # Save each mask as its own PNG
            cv2.imwrite(
                os.path.join(output_dir, f"{filename}_mask_{i}.png"),
                mask['segmentation'].astype(np.uint8) * 255
            )

        with open(os.path.join(output_dir, f"{filename}_masks.json"), 'w') as f:
            json.dump(mask_data, f)
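After a batch run it is useful to get a quick overview of what was written. The sketch below reads the `*_masks.json` files produced by the loop above and reports the number of masks and the largest mask area per image. `summarize_masks` is a hypothetical helper, and the demo uses a hand-written file in a temporary directory rather than real output.

```python
import json
import tempfile
from pathlib import Path


def summarize_masks(output_dir):
    """Map each *_masks.json file name to (mask count, largest mask area)."""
    summary = {}
    for path in sorted(Path(output_dir).glob("*_masks.json")):
        records = json.loads(path.read_text())
        largest = max((r["area"] for r in records), default=0)
        summary[path.name] = (len(records), largest)
    return summary


# Self-contained demo with a hand-written file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo = [
        {"id": 0, "area": 5000, "bbox": [10, 20, 100, 50], "score": 0.95},
        {"id": 1, "area": 1200, "bbox": [30, 30, 40, 30], "score": 0.91},
    ]
    Path(tmp, "photo.jpg_masks.json").write_text(json.dumps(demo))
    result = summarize_masks(tmp)
    print(result)  # {'photo.jpg_masks.json': (2, 5000)}
```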

API server

# File uploads in FastAPI require the python-multipart package
from fastapi import FastAPI, UploadFile
from fastapi.responses import Response
from segment_anything import sam_model_registry, SamPredictor
import cv2
import numpy as np

app = FastAPI()

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")
predictor = SamPredictor(sam)

@app.post("/segment")
async def segment(file: UploadFile, x: int, y: int):
    contents = await file.read()
    nparr = np.frombuffer(contents, np.uint8)
    image = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    predictor.set_image(image_rgb)

    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),
        point_labels=np.array([1]),
        multimask_output=True
    )

    best_mask = masks[np.argmax(scores)]

    _, encoded = cv2.imencode('.png', best_mask.astype(np.uint8) * 255)
    return Response(content=encoded.tobytes(), media_type="image/png")
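A client for this endpoint sends the image as a multipart field named `file` and the click position as the query parameters `x` and `y`. The sketch below assumes the server is reachable over HTTPS at your http_pub host and that the `requests` package is installed on the client. Both function names are hypothetical.

```python
from urllib.parse import urlencode


def segment_url(host, x, y):
    """Build the /segment URL; x and y travel as query parameters."""
    return f"https://{host}/segment?{urlencode({'x': x, 'y': y})}"


def request_mask(host, image_path, x, y, out_path="mask.png"):
    """Upload an image and save the returned PNG mask (needs the requests package)."""
    import requests  # imported lazily so segment_url works without it

    with open(image_path, "rb") as f:
        response = requests.post(segment_url(host, x, y), files={"file": f}, timeout=60)
    response.raise_for_status()
    with open(out_path, "wb") as out:
        out.write(response.content)
    return out_path


print(segment_url("abc123.clorecloud.net", 400, 300))
# https://abc123.clorecloud.net/segment?x=400&y=300
```

Calling `request_mask("abc123.clorecloud.net", "photo.jpg", 400, 300)` would then write the mask to `mask.png`, with the host replaced by your own http_pub URL.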

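Integration with Stable Diffusion

Use SAM masks for inpainting:

from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

# Generate a mask with SAM (image is an RGB NumPy array; point and label as in the examples above)
predictor.set_image(image)
masks, scores, _ = predictor.predict(point_coords=point, point_labels=label)
mask = masks[np.argmax(scores)]

# The pipeline expects PIL images; white mask pixels are repainted
init_image = Image.fromarray(image).resize((512, 512))
mask_image = Image.fromarray(mask.astype(np.uint8) * 255).resize((512, 512))

pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting")
pipe.to("cuda")

result = pipe(
    prompt="a red sports car",
    image=init_image,
    mask_image=mask_image
).images[0]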

Performance

| Model | Image size | GPU | Time |
|---|---|---|---|
| SAM-H | 1024x1024 | RTX 3090 | ~0.5 s |
| SAM-L | 1024x1024 | RTX 3090 | ~0.3 s |
| SAM-B | 1024x1024 | RTX 3090 | ~0.2 s |
| SAM2 | 1024x1024 | RTX 4090 | ~0.3 s |

Memory optimization

# For limited VRAM, use a smaller model variant
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

# Or reduce the number of sampling points for automatic generation
mask_generator = SamAutomaticMaskGenerator(
    model=sam,
    points_per_side=16,  # Reduced from the default of 32
)

Troubleshooting

CUDA out of memory

Use SAM-B instead of SAM-H
Reduce the image size before processing
Clear the cache: torch.cuda.empty_cache()
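If you downscale an image before passing it to SAM, any point or box prompts must be scaled by the same factor. The sketch below computes a target size whose longer side does not exceed a limit and maps prompt points accordingly. The helper names and the 1024-pixel default are assumptions for illustration.

```python
def fit_within(width, height, max_side=1024):
    """Return (new_width, new_height, scale) so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))
    return round(width * scale), round(height * scale), scale


def scale_points(points, scale):
    """Map (x, y) prompt points from the original image to the resized one."""
    return [(x * scale, y * scale) for x, y in points]


new_w, new_h, scale = fit_within(2048, 1536, max_side=1024)
print(new_w, new_h, scale)                  # 1024 768 0.5
print(scale_points([(1000, 600)], scale))   # [(500.0, 300.0)]
```

The new size can be passed to `cv2.resize(image, (new_w, new_h))`. Images that are already small enough keep a scale of 1.0.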

Poor segmentation

Add more points (foreground + background)
Use box prompts for better guidance
Try multimask_output=True and pick the best mask

Cost estimate

Typical CLORE.AI Marketplace rates (as of 2024):

| GPU | Hourly rate | Daily rate | 4-hour session |
|---|---|---|---|
| RTX 3060 | ~$0.03 | ~$0.70 | ~$0.12 |
| RTX 3090 | ~$0.06 | ~$1.50 | ~$0.25 |
| RTX 4090 | ~$0.10 | ~$2.30 | ~$0.40 |
| A100 40GB | ~$0.17 | ~$4.00 | ~$0.70 |
| A100 80GB | ~$0.25 | ~$6.00 | ~$1.00 |

Prices vary by provider and demand. Check the CLORE.AI Marketplace for current prices.
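To relate the hourly rates to a concrete workload, the per-image timings from the performance section can be combined with the rental price. The sketch below is a rough estimate only: it assumes a steady per-image time and ignores model loading, data transfer, and idle time. `cost_per_1000_images` is a hypothetical helper.

```python
def cost_per_1000_images(seconds_per_image, hourly_rate_usd):
    """Rental cost in USD for 1,000 images at a steady per-image time."""
    hours = 1000 * seconds_per_image / 3600
    return hours * hourly_rate_usd


# SAM-H on an RTX 3090: ~0.5 s per image, ~$0.06 per hour (figures from this page)
print(f"${cost_per_1000_images(0.5, 0.06):.4f}")  # $0.0083
```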

Saving money:

Use the Spot market for flexible workloads (often 30–50% cheaper)
Pay with CLORE tokens
Compare prices across providers

Next steps

Stable Diffusion Inpainting
ControlNet guide
Real-ESRGAN upscaling


Last updated 24 days ago
