# InstantID

Generate images with any face identity using just one reference photo.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter startup command
5. Select payment: **CLORE**, **BTC**, or **USDT/USDC**
6. Create order and wait for deployment

### Access Your Server

* Find connection details in **My Orders**
* Web interfaces: Use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`
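
If you prefer not to expose the web UI publicly, you can instead tunnel the Gradio port over SSH (values are placeholders, as above):

```shell
# Forward local port 7860 to the Gradio app running on the server
ssh -p <port> -L 7860:localhost:7860 root@<proxy-address>
# Then open http://localhost:7860 in your local browser
```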

## What is InstantID?

InstantID preserves facial identity from a single reference photo:

* Works from one reference face image - no dataset needed
* Zero-shot: no per-identity training or fine-tuning
* Combines with any style or prompt
* Comparable fidelity to a trained LoRA, without the training time

## Requirements

| Mode         | VRAM  | Recommended |
| ------------ | ----- | ----------- |
| Basic        | 12GB  | RTX 4080    |
| High Quality | 16GB  | RTX 4090    |
| With Pose    | 16GB+ | RTX 4090    |

## Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate gradio opencv-python insightface onnxruntime-gpu && \
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints && \
python instantid_app.py  # save the Gradio script from this guide as instantid_app.py first
```

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.

## Installation

```bash
pip install diffusers transformers accelerate gradio
pip install opencv-python insightface onnxruntime-gpu
pip install huggingface_hub

# Download the InstantID checkpoints (ControlNet + ip-adapter.bin)
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints

# The SDXL InstantID pipeline class ships with the InstantID repo, not with diffusers
git clone https://github.com/instantX-research/InstantID
cp InstantID/pipeline_stable_diffusion_xl_instantid.py .

# Note: insightface's antelopev2 models may need a manual download
# into ./models/antelopev2 (see the InstantID README)
```

## Basic Usage

```python
import torch
import cv2
import numpy as np
from PIL import Image
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis

# From the InstantID repo (pipeline_stable_diffusion_xl_instantid.py);
# this pipeline class is not part of the diffusers package itself
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

# Initialize face analyzer (expects models under ./models/antelopev2)
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# Load the identity ControlNet, then build the SDXL pipeline around it
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)

pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Load the InstantID face IP-Adapter
pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

# Process the reference face: identity embedding + keypoint image
face_image = cv2.imread("reference_face.jpg")
faces = app.get(face_image)
face_info = sorted(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))[-1]  # largest face
face_emb = face_info.normed_embedding
face_kps = draw_kps(Image.fromarray(cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)), face_info.kps)

# Generate with the face identity
image = pipe(
    prompt="portrait of a person as an astronaut, space background",
    negative_prompt="ugly, blurry, low quality",
    image_embeds=face_emb,
    image=face_kps,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
    guidance_scale=7.5
).images[0]

image.save("output.png")
```

## Using Diffusers Pipeline

The `StableDiffusionXLInstantIDPipeline` class builds on diffusers but ships with the InstantID repository, so make sure `pipeline_stable_diffusion_xl_instantid.py` is on your Python path:

```python
import cv2
import torch
from PIL import Image
from diffusers import DDIMScheduler
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

# Load face analyzer
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# Load pipeline (the ControlNet must be passed as a model object, not a path)
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

# Get face embedding and keypoint image
face_image = cv2.imread("face.jpg")
face_info = app.get(face_image)[0]

face_emb = face_info.normed_embedding
face_kps = draw_kps(Image.fromarray(cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)), face_info.kps)

# Generate
image = pipe(
    prompt="watercolor portrait painting, artistic",
    image_embeds=face_emb,
    image=face_kps,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30
).images[0]

image.save("portrait.png")
```

## Style Examples

### Professional Headshot

```python
prompt = "professional corporate headshot, studio lighting, gray background, business attire"
negative = "cartoon, anime, illustration, blurry"
```

### Artistic Portrait

```python
prompt = "oil painting portrait in the style of Rembrandt, dramatic lighting, museum quality"
negative = "photo, realistic, modern"
```

### Fantasy Character

```python
prompt = "fantasy elf character, pointed ears, magical forest background, ethereal lighting"
negative = "human ears, modern clothing, realistic"
```

### Anime Style

```python
prompt = "anime character portrait, studio ghibli style, detailed, beautiful"
negative = "realistic, photo, 3d render"
```

## With Pose Control

InstantID's ControlNet is conditioned on facial keypoints, so the pose reference controls head position and orientation. Extract keypoints from a second image and pass them as the conditioning image:

```python
# Keypoints come from the pose reference, identity from the original face
pose_cv = cv2.imread("pose_reference.jpg")
pose_info = app.get(pose_cv)[0]
pose_kps = draw_kps(Image.fromarray(cv2.cvtColor(pose_cv, cv2.COLOR_BGR2RGB)), pose_info.kps)

# Generate with face identity AND pose
image = pipe(
    prompt="person in action pose, dynamic, high quality",
    image_embeds=face_emb,   # identity from the reference face
    image=pose_kps,          # head pose from the pose reference
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30
).images[0]
```

## Gradio Interface

```python
import gradio as gr
import torch
import cv2
import numpy as np
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

def generate(face_image, prompt, negative_prompt, strength, steps):
    # Convert the PIL input to OpenCV's BGR format for detection
    face_cv = cv2.cvtColor(np.array(face_image), cv2.COLOR_RGB2BGR)

    # Get face info (pick the largest face if several are present)
    faces = app.get(face_cv)
    if len(faces) == 0:
        return None, "No face detected!"

    face_info = sorted(faces, key=lambda f: (f.bbox[2] - f.bbox[0]) * (f.bbox[3] - f.bbox[1]))[-1]
    face_emb = face_info.normed_embedding
    face_kps = draw_kps(face_image, face_info.kps)

    # Apply identity strength, then generate
    pipe.set_ip_adapter_scale(strength)
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        image_embeds=face_emb,
        image=face_kps,
        num_inference_steps=int(steps)
    ).images[0]

    return image, "Success!"

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Image(type="pil", label="Reference Face"),
        gr.Textbox(label="Prompt", value="professional portrait"),
        gr.Textbox(label="Negative Prompt", value="ugly, blurry"),
        gr.Slider(0.1, 1.0, value=0.8, label="Identity Strength"),
        gr.Slider(10, 50, value=30, step=1, label="Steps")
    ],
    outputs=[
        gr.Image(label="Generated Image"),
        gr.Textbox(label="Status")
    ],
    title="InstantID - Identity Preserving Generation"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```

## Batch Face Swap

```python
import os

def batch_generate(face_image_path, prompts, output_dir):
    # Load the reference face once and reuse its embedding for every prompt
    face_cv = cv2.imread(face_image_path)
    face_info = app.get(face_cv)[0]
    face_emb = face_info.normed_embedding
    face_kps = draw_kps(Image.fromarray(cv2.cvtColor(face_cv, cv2.COLOR_BGR2RGB)), face_info.kps)

    os.makedirs(output_dir, exist_ok=True)

    for i, prompt in enumerate(prompts):
        print(f"Generating {i+1}/{len(prompts)}: {prompt[:50]}...")

        image = pipe(
            prompt=prompt,
            negative_prompt="ugly, blurry, deformed",
            image_embeds=face_emb,
            image=face_kps,
            num_inference_steps=30
        ).images[0]

        image.save(f"{output_dir}/output_{i:03d}.png")

# Usage
prompts = [
    "astronaut in space suit, Earth background",
    "medieval knight in armor",
    "scientist in laboratory",
    "chef in restaurant kitchen",
    "athlete on sports field"
]

batch_generate("my_face.jpg", prompts, "./outputs")
```

## Identity Strength Control

```python

# Low strength - more style freedom, weaker identity
pipe.set_ip_adapter_scale(0.4)
image_stylized = pipe(
    prompt=prompt,
    image_embeds=face_emb,
    image=face_kps,
    num_inference_steps=30
).images[0]

# High strength - stronger identity, less stylization
pipe.set_ip_adapter_scale(0.9)
image_faithful = pipe(
    prompt=prompt,
    image_embeds=face_emb,
    image=face_kps,
    num_inference_steps=30
).images[0]
```

## Memory Optimization

```python

# Enable optimizations (call these instead of .to("cuda"))
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

# Or use sequential offload for very low VRAM (slower; don't combine with the above)
pipe.enable_sequential_cpu_offload()
```

## Performance

| Mode      | GPU      | Time per Image |
| --------- | -------- | -------------- |
| Basic     | RTX 4090 | \~8s           |
| With Pose | RTX 4090 | \~12s          |
| Basic     | RTX 3090 | \~15s          |
| Basic     | A100     | \~5s           |
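
Combining these timings with marketplace hourly rates gives a rough per-image cost; the numbers below are illustrative, not quoted prices:

```python
def cost_per_image(seconds_per_image, hourly_rate_usd):
    """Rough USD cost per generated image, ignoring setup and idle time."""
    return hourly_rate_usd * seconds_per_image / 3600

# RTX 4090: ~8 s/image at ~$0.10/h -> well under a cent per image
print(f"${cost_per_image(8, 0.10):.5f}")
```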

## Troubleshooting

### No Face Detected

* Make sure the face is clearly visible and takes up a reasonable share of the frame
* Use a well-lit reference image
* Prefer a front-facing photo; profile shots often fail detection

### Identity Not Preserved

* Increase the identity strength (IP-Adapter scale)
* Use a sharper, higher-resolution reference photo
* Avoid extreme head angles in the reference

### Style Not Applied

* Decrease the identity strength (IP-Adapter scale)
* Use a more descriptive style prompt
* Increase guidance\_scale
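
For the "No Face Detected" case in particular, it can help to screen the reference image before running detection. This is a heuristic sketch: the inputs are values you would get from e.g. OpenCV (`img.shape`, `img.mean()`), and the thresholds are assumptions, not InstantID requirements:

```python
def check_reference(width, height, mean_brightness, min_side=640, min_brightness=40):
    """Flag common reasons a face detector struggles (thresholds are assumptions)."""
    issues = []
    if min(width, height) < min_side:
        issues.append("image is small; the face detector may miss the face")
    if mean_brightness < min_brightness:
        issues.append("image is dark; use better lighting in the reference photo")
    return issues

print(check_reference(320, 240, 25))  # flags both size and brightness
```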

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*

**Save money:**

* Use **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across different providers
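
To sanity-check a budget before renting, the table's rates can be plugged into a simple session-cost estimate (the rates and the spot discount are illustrative):

```python
def session_cost(hourly_rate_usd, hours, spot_discount=0.0):
    """Estimated rental cost in USD; spot_discount models cheaper spot pricing."""
    return hourly_rate_usd * hours * (1.0 - spot_discount)

# RTX 4090 at ~$0.10/h for a 4-hour session
print(round(session_cost(0.10, 4), 2))                     # on-demand: 0.4
print(round(session_cost(0.10, 4, spot_discount=0.4), 2))  # spot at ~40% off: 0.24
```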

## Next Steps

* [IP-Adapter](https://docs.clore.ai/guides/face-and-identity/ip-adapter) - Image prompting
* Stable Diffusion WebUI - InstantID extension
* [ControlNet](https://docs.clore.ai/guides/image-processing/controlnet-advanced) - Pose control
