# InstantID

Generate images with any face identity using just one reference photo.

{% hint style="success" %}
All examples can be run on GPU servers rented through [CLORE.AI Marketplace](https://clore.ai/marketplace).
{% endhint %}

## Renting on CLORE.AI

1. Visit [CLORE.AI Marketplace](https://clore.ai/marketplace)
2. Filter by GPU type, VRAM, and price
3. Choose **On-Demand** (fixed rate) or **Spot** (bid price)
4. Configure your order:
   * Select Docker image
   * Set ports (TCP for SSH, HTTP for web UIs)
   * Add environment variables if needed
   * Enter startup command
5. Select payment: **CLORE**, **BTC**, or **USDT/USDC**
6. Create order and wait for deployment

### Access Your Server

* Find connection details in **My Orders**
* Web interfaces: Use the HTTP port URL
* SSH: `ssh -p <port> root@<proxy-address>`

## What is InstantID?

InstantID generates images that preserve a subject's facial identity from a single reference photo:

* Works with any reference face
* Zero-shot: no per-identity training or fine-tuning
* Combines with any style or prompt
* Far quicker to apply than training a LoRA for each identity

## Requirements

| Mode         | VRAM  | Recommended |
| ------------ | ----- | ----------- |
| Basic        | 12GB  | RTX 4080    |
| High Quality | 16GB  | RTX 4090    |
| With Pose    | 16GB+ | RTX 4090    |

## Quick Deploy

**Docker Image:**

```
pytorch/pytorch:2.5.1-cuda12.4-cudnn9-devel
```

**Ports:**

```
22/tcp
7860/http
```

**Command:**

```bash
pip install diffusers transformers accelerate opencv-python insightface onnxruntime-gpu gradio && \
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints && \
python instantid_app.py
```
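
Here `instantid_app.py` is a placeholder for your own launch script, for example the Gradio app shown later on this page saved under that name.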

## Accessing Your Service

After deployment, find your `http_pub` URL in **My Orders**:

1. Go to **My Orders** page
2. Click on your order
3. Find the `http_pub` URL (e.g., `abc123.clorecloud.net`)

Use `https://YOUR_HTTP_PUB_URL` instead of `localhost` in examples below.
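
Once the app is running, a quick reachability check against the public URL should return the Gradio page (the hostname below is the placeholder from above; substitute your own `http_pub` value):

```python
import requests

# Placeholder hostname; substitute your http_pub URL from My Orders
resp = requests.get("https://abc123.clorecloud.net", timeout=10)
print(resp.status_code)  # expect 200 once the app is up
```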

## Installation

```bash
pip install diffusers transformers accelerate
pip install opencv-python insightface onnxruntime-gpu
pip install huggingface_hub gradio

# Download InstantID weights (ControlNet + face IP-Adapter)
huggingface-cli download InstantX/InstantID --local-dir ./checkpoints
```
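
The examples also rely on insightface's `antelopev2` model pack, which insightface may not download automatically; the InstantID project places it under `./models/antelopev2`. Below is a minimal sanity check that everything landed where the examples on this page expect it (paths are this guide's defaults):

```python
from pathlib import Path

# Quick sanity check that the weights landed where this guide's
# examples expect them (paths match the commands above)
expected = [
    "checkpoints/ip-adapter.bin",
    "checkpoints/ControlNetModel/config.json",
    "models/antelopev2",  # insightface detection/recognition pack
]
for path in expected:
    status = "OK" if Path(path).exists() else "MISSING"
    print(f"{path}: {status}")
```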

## Basic Usage

The `StableDiffusionXLInstantIDPipeline` class and the `draw_kps` helper live in the project's own `pipeline_stable_diffusion_xl_instantid.py`, not in diffusers core; download that file from the official InstantID repository and keep it next to your script.

```python
import torch
import cv2
from PIL import Image
from diffusers import ControlNetModel
from insightface.app import FaceAnalysis

# From the InstantID repo (pipeline_stable_diffusion_xl_instantid.py)
from pipeline_stable_diffusion_xl_instantid import (
    StableDiffusionXLInstantIDPipeline,
    draw_kps,
)

# Initialize face analyzer (expects antelopev2 under ./models)
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# Load the InstantID ControlNet (conditions generation on face keypoints)
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel",
    torch_dtype=torch.float16
)

# Load pipeline with the ControlNet attached
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

# Load the InstantID face IP-Adapter
pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

# Process reference face: pick the largest detected face
face_image = cv2.imread("reference_face.jpg")
faces = app.get(face_image)
if not faces:
    raise RuntimeError("No face detected in reference_face.jpg")
face_info = max(
    faces,
    key=lambda f: (f['bbox'][2] - f['bbox'][0]) * (f['bbox'][3] - f['bbox'][1])
)
face_emb = face_info['embedding']
face_kps = draw_kps(
    Image.fromarray(cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)),
    face_info['kps']
)

# Generate with face identity
image = pipe(
    prompt="portrait of a person as an astronaut, space background",
    negative_prompt="ugly, blurry, low quality",
    image_embeds=face_emb,
    image=face_kps,
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30,
    guidance_scale=7.5
).images[0]

image.save("output.png")
```

## Using the InstantID Pipeline

The same flow, condensed into one script. The remaining examples on this page reuse the `app`, `pipe`, `face_emb`, and `face_kps` defined here.

```python
import torch
import cv2
from PIL import Image
from diffusers import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import (
    StableDiffusionXLInstantIDPipeline,
    draw_kps,
)

# Load face analyzer
app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

# Load pipeline with the InstantID ControlNet attached
controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel", torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

# Get face embedding and keypoint image
face_image = cv2.imread("face.jpg")
face_info = app.get(face_image)[0]

face_emb = face_info['embedding']
face_kps = draw_kps(
    Image.fromarray(cv2.cvtColor(face_image, cv2.COLOR_BGR2RGB)),
    face_info['kps']
)

# Generate
image = pipe(
    prompt="watercolor portrait painting, artistic",
    image_embeds=face_emb,
    image=face_kps,
    num_inference_steps=30
).images[0]

image.save("portrait.png")
```

## Style Examples

### Professional Headshot

```python
prompt = "professional corporate headshot, studio lighting, gray background, business attire"
negative = "cartoon, anime, illustration, blurry"
```

### Artistic Portrait

```python
prompt = "oil painting portrait in the style of Rembrandt, dramatic lighting, museum quality"
negative = "photo, realistic, modern"
```

### Fantasy Character

```python
prompt = "fantasy elf character, pointed ears, magical forest background, ethereal lighting"
negative = "human ears, modern clothing, realistic"
```

### Anime Style

```python
prompt = "anime character portrait, studio ghibli style, detailed, beautiful"
negative = "realistic, photo, 3d render"
```
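
Any of these prompt/negative pairs can be dropped straight into the generation call from the sections above, for example:

```python
# Reuses pipe, face_emb and face_kps from the setup sections above
image = pipe(
    prompt=prompt,
    negative_prompt=negative,
    image_embeds=face_emb,
    image=face_kps,
    num_inference_steps=30
).images[0]
image.save("styled.png")
```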

## With Pose Control

InstantID steers pose through the keypoint image the ControlNet consumes: extract keypoints from a separate pose reference and pass them as `image`, while the identity embedding still comes from your reference face.

```python
# Extract keypoints from a pose reference instead of the identity photo
pose_cv = cv2.imread("pose_reference.jpg")
pose_info = app.get(pose_cv)[0]
pose_kps = draw_kps(
    Image.fromarray(cv2.cvtColor(pose_cv, cv2.COLOR_BGR2RGB)),
    pose_info['kps']
)

# Generate with the reference identity AND the target pose
image = pipe(
    prompt="person in action pose, dynamic, high quality",
    image_embeds=face_emb,  # identity from the reference face
    image=pose_kps,         # pose from the pose reference
    controlnet_conditioning_scale=0.8,
    num_inference_steps=30
).images[0]
```
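
Higher `controlnet_conditioning_scale` values follow the reference keypoints more strictly; lower values leave the prompt more freedom. Note that the keypoints cover the face only, so this controls head position and orientation rather than full body pose.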

## Gradio Interface

```python
import gradio as gr
import torch
import cv2
import numpy as np
from diffusers import ControlNetModel
from insightface.app import FaceAnalysis

# From the InstantID repo (pipeline_stable_diffusion_xl_instantid.py)
from pipeline_stable_diffusion_xl_instantid import (
    StableDiffusionXLInstantIDPipeline,
    draw_kps,
)

app = FaceAnalysis(name='antelopev2', root='./', providers=['CUDAExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

controlnet = ControlNetModel.from_pretrained(
    "./checkpoints/ControlNetModel", torch_dtype=torch.float16
)
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter_instantid("./checkpoints/ip-adapter.bin")

def generate(face_image, prompt, negative_prompt, strength, steps):
    # Convert the PIL input to the BGR array insightface expects
    face_cv = cv2.cvtColor(np.array(face_image), cv2.COLOR_RGB2BGR)

    # Get face info
    faces = app.get(face_cv)
    if len(faces) == 0:
        return None, "No face detected!"

    face_info = faces[0]
    face_emb = face_info['embedding']
    face_kps = draw_kps(face_image, face_info['kps'])

    # Generate
    image = pipe(
        prompt=prompt,
        negative_prompt=negative_prompt,
        image_embeds=face_emb,
        image=face_kps,
        ip_adapter_scale=strength,
        num_inference_steps=int(steps)
    ).images[0]

    return image, "Success!"

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Image(type="pil", label="Reference Face"),
        gr.Textbox(label="Prompt", value="professional portrait"),
        gr.Textbox(label="Negative Prompt", value="ugly, blurry"),
        gr.Slider(0.1, 1.0, value=0.8, label="Identity Strength"),
        gr.Slider(10, 50, value=30, step=1, label="Steps")
    ],
    outputs=[
        gr.Image(label="Generated Image"),
        gr.Textbox(label="Status")
    ],
    title="InstantID - Identity Preserving Generation"
)

demo.launch(server_name="0.0.0.0", server_port=7860)
```
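
`server_name="0.0.0.0"` binds the app to all interfaces, and port `7860` matches the HTTP port set at deploy time, so the UI is reachable at your `http_pub` URL.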

## Batch Face Swap

```python
import os

def batch_generate(face_image_path, prompts, output_dir):
    # Extract the identity once; it is reused for every prompt
    face_cv = cv2.imread(face_image_path)
    face_info = app.get(face_cv)[0]
    face_emb = face_info['embedding']
    face_kps = draw_kps(
        Image.fromarray(cv2.cvtColor(face_cv, cv2.COLOR_BGR2RGB)),
        face_info['kps']
    )

    os.makedirs(output_dir, exist_ok=True)

    for i, prompt in enumerate(prompts):
        print(f"Generating {i+1}/{len(prompts)}: {prompt[:50]}...")

        image = pipe(
            prompt=prompt,
            negative_prompt="ugly, blurry, deformed",
            image_embeds=face_emb,
            image=face_kps,
            num_inference_steps=30
        ).images[0]

        image.save(f"{output_dir}/output_{i:03d}.png")

# Usage
prompts = [
    "astronaut in space suit, Earth background",
    "medieval knight in armor",
    "scientist in laboratory",
    "chef in restaurant kitchen",
    "athlete on sports field"
]

batch_generate("my_face.jpg", prompts, "./outputs")
```
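
Because the embedding and keypoints are computed once up front, each additional prompt costs only a single diffusion run.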

## Identity Strength Control

```python
# Low strength - more style freedom, looser likeness
image_stylized = pipe(
    prompt=prompt,
    image_embeds=face_emb,
    image=face_kps,
    ip_adapter_scale=0.4,  # Low
    num_inference_steps=30
).images[0]

# High strength - stronger likeness, less stylization
image_faithful = pipe(
    prompt=prompt,
    image_embeds=face_emb,
    image=face_kps,
    ip_adapter_scale=0.9,  # High
    num_inference_steps=30
).images[0]
```
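
A quick way to pick a value is to sweep the scale and compare the results side by side; a minimal sketch reusing `pipe`, `face_emb`, and `face_kps` from above:

```python
# Sweep identity strength and save one image per setting for comparison
for scale in (0.3, 0.5, 0.7, 0.9):
    img = pipe(
        prompt=prompt,
        image_embeds=face_emb,
        image=face_kps,
        ip_adapter_scale=scale,
        num_inference_steps=30
    ).images[0]
    img.save(f"strength_{scale:.1f}.png")
```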

## Memory Optimization

```python
# Call these instead of pipe.to("cuda"); offloading manages device placement
pipe.enable_model_cpu_offload()
pipe.enable_vae_slicing()

# Or use sequential offload for very low VRAM (slower per image)
pipe.enable_sequential_cpu_offload()
```
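
If the `xformers` package is installed, memory-efficient attention can reduce VRAM usage further:

```python
# Optional: requires the xformers package to be installed
pipe.enable_xformers_memory_efficient_attention()
```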

## Performance

| Mode      | GPU      | Time per Image |
| --------- | -------- | -------------- |
| Basic     | RTX 4090 | \~8s           |
| With Pose | RTX 4090 | \~12s          |
| Basic     | RTX 3090 | \~15s          |
| Basic     | A100     | \~5s           |

## Troubleshooting

### No Face Detected

* Ensure the face is clearly visible
* Use good lighting in the reference image
* Prefer a front-facing reference
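
If detection keeps failing on a valid photo, retrying with a larger detection input and a lower threshold sometimes helps; a sketch using insightface's `prepare` parameters:

```python
# Retry detection with a larger input size and a more permissive threshold
app.prepare(ctx_id=0, det_thresh=0.3, det_size=(1024, 1024))
faces = app.get(face_cv)
```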

### Identity Not Preserved

* Increase `ip_adapter_scale`
* Use a clearer, higher-resolution reference photo
* Avoid extreme angles in the reference

### Style Not Applied

* Decrease `ip_adapter_scale`
* Write a more descriptive prompt
* Increase `guidance_scale`

## Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU       | Hourly Rate | Daily Rate | 4-Hour Session |
| --------- | ----------- | ---------- | -------------- |
| RTX 3060  | \~$0.03     | \~$0.70    | \~$0.12        |
| RTX 3090  | \~$0.06     | \~$1.50    | \~$0.25        |
| RTX 4090  | \~$0.10     | \~$2.30    | \~$0.40        |
| A100 40GB | \~$0.17     | \~$4.00    | \~$0.70        |
| A100 80GB | \~$0.25     | \~$6.00    | \~$1.00        |

*Prices vary by provider and demand. Check* [*CLORE.AI Marketplace*](https://clore.ai/marketplace) *for current rates.*

**Save money:**

* Use **Spot** market for flexible workloads (often 30-50% cheaper)
* Pay with **CLORE** tokens
* Compare prices across different providers

## Next Steps

* [IP-Adapter](https://docs.clore.ai/guides/face-and-identity/ip-adapter) - Image prompting
* Stable Diffusion WebUI - InstantID extension
* [ControlNet](https://docs.clore.ai/guides/image-processing/controlnet-advanced) - Pose control


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.clore.ai/guides/face-and-identity/instantid.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
