Wav2Lip

Sync lips to any audio with Wav2Lip.

Renting on CLORE.AI

  1. Filter by GPU type, VRAM, and price

  2. Choose On-Demand (fixed rate) or Spot (bid price)

  3. Configure your order:

    • Select Docker image

    • Set ports (TCP for SSH, HTTP for web UIs)

    • Add environment variables if needed

    • Enter startup command

  4. Select payment: CLORE, BTC, or USDT/USDC

  5. Create order and wait for deployment

Access Your Server

  • Find connection details in My Orders

  • Web interfaces: Use the HTTP port URL

  • SSH: ssh -p <port> root@<proxy-address>

What is Wav2Lip?

Wav2Lip provides:

  • Accurate lip-sync for any face

  • Works with any audio

  • Video or image input

  • Fast inference (roughly 1.5-3x the clip length on the GPUs listed under Performance)

Requirements

| Mode | VRAM | Recommended GPU |
| --- | --- | --- |
| Basic | 4 GB | RTX 3060 |
| High Quality | 6 GB | RTX 3080 |
| HD | 8 GB | RTX 4080 |
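
The requirements above can be read as a simple lookup. A minimal sketch, assuming the VRAM figures are minimums; `pick_mode` is a hypothetical helper name, not part of Wav2Lip:

```python
from typing import Optional

# Minimum VRAM (GB) per mode, copied from the Requirements section.
MODE_MIN_VRAM_GB = {"Basic": 4, "High Quality": 6, "HD": 8}


def pick_mode(vram_gb: float) -> Optional[str]:
    """Return the most demanding mode that fits in vram_gb, or None."""
    fitting = [m for m, need in MODE_MIN_VRAM_GB.items() if vram_gb >= need]
    # Sort by requirement so the last entry is the highest mode that fits.
    fitting.sort(key=MODE_MIN_VRAM_GB.get)
    return fitting[-1] if fitting else None


print(pick_mode(6))
```

This is useful when filtering marketplace offers by VRAM before placing an order.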

Quick Deploy

Docker Image:

Ports:

Command:

Accessing Your Service

After deployment, find your http_pub URL in My Orders:

  1. Go to My Orders page

  2. Click on your order

  3. Find the http_pub URL (e.g., abc123.clorecloud.net)

Use https://YOUR_HTTP_PUB_URL instead of localhost in examples below.
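
If you are adapting a script that targets localhost, the substitution can be done programmatically. A minimal sketch, assuming the http_pub host is served over standard HTTPS with no explicit port; `to_public_url` is a hypothetical helper, and port 7860 in the example is only an illustration:

```python
from urllib.parse import urlsplit, urlunsplit


def to_public_url(local_url: str, http_pub_host: str) -> str:
    """Swap a localhost URL for the order's http_pub host, forcing https."""
    parts = urlsplit(local_url)
    if parts.hostname not in ("localhost", "127.0.0.1"):
        return local_url  # already points somewhere else; leave untouched
    # Drop the local port (assumption: the public host uses default https).
    return urlunsplit(
        ("https", http_pub_host, parts.path, parts.query, parts.fragment)
    )


print(to_public_url("http://localhost:7860/api/predict", "abc123.clorecloud.net"))
```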

Installation

Basic Usage

Command Line
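
A minimal sketch of a command-line run, assuming the upstream Wav2Lip repository layout (`inference.py` in the working directory and the checkpoint under `checkpoints/`). The input and output file names are placeholders, and `wav2lip_command` is a hypothetical helper that only builds the argument list:

```python
import shlex
from typing import List


def wav2lip_command(face: str, audio: str,
                    checkpoint: str = "checkpoints/wav2lip.pth",
                    outfile: str = "results/result_voice.mp4") -> List[str]:
    """Build the argv list for one inference.py run (nothing is executed)."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,
        "--face", face,      # video, or a still image
        "--audio", audio,    # speech track to sync to
        "--outfile", outfile,
    ]


cmd = wav2lip_command("input.mp4", "speech.wav")
print(shlex.join(cmd))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)` from the repository root. The same command shape applies to image input: point `--face` at a `.jpg` or `.png` file instead of a video.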

With Image Input

Python API

Quality Options

Standard Quality (Faster)

High Quality (GAN)

Parameters

Padding Tips

| Face Position | Recommended Pads (top bottom left right) |
| --- | --- |
| Centered | 0 10 0 0 |
| Close-up | 0 15 0 0 |
| Far | 0 5 0 0 |
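
The padding values above map directly onto the `--pads` argument of upstream `inference.py`, which takes four integers in the order top, bottom, left, right. A small sketch; `pads_args` and the lowercase keys are hypothetical names chosen for illustration:

```python
from typing import List

# Values copied from the Padding Tips section; order is top, bottom, left, right.
RECOMMENDED_PADS = {
    "centered": (0, 10, 0, 0),
    "close-up": (0, 15, 0, 0),
    "far": (0, 5, 0, 0),
}


def pads_args(face_position: str) -> List[str]:
    """Return the --pads arguments for a face position listed above."""
    pads = RECOMMENDED_PADS[face_position.lower()]
    return ["--pads", *map(str, pads)]


print(pads_args("Close-up"))  # ['--pads', '0', '15', '0', '0']
```

Append the returned list to the rest of the inference command. Treat these values as starting points and increase the bottom pad if the chin is cut off.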

Batch Processing
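
One way to batch several videos against the same audio track is to generate one inference command per input file. A sketch under stated assumptions: the upstream `inference.py` interface, a hypothetical `plan_batch` helper, an illustrative set of video extensions, and a made-up `_synced` output suffix:

```python
from pathlib import Path
from typing import List

VIDEO_EXTS = {".mp4", ".avi", ".mov"}  # assumption: extensions treated as video


def plan_batch(face_dir: str, audio: str, out_dir: str,
               checkpoint: str = "checkpoints/wav2lip.pth") -> List[List[str]]:
    """Return one inference.py argv per video in face_dir (nothing is run)."""
    jobs = []
    for face in sorted(Path(face_dir).iterdir()):
        if face.suffix.lower() not in VIDEO_EXTS:
            continue  # skip non-video files
        outfile = Path(out_dir) / f"{face.stem}_synced.mp4"
        jobs.append([
            "python", "inference.py",
            "--checkpoint_path", checkpoint,
            "--face", str(face),
            "--audio", audio,
            "--outfile", str(outfile),
        ])
    return jobs
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in turn. Running jobs sequentially keeps VRAM usage predictable on a single GPU.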

Gradio Interface

API Server

TTS + Wav2Lip Pipeline

Complete text-to-video:

Post-Processing

Upscale Result

Add Audio Back
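
Frame-based post-processing such as upscaling often produces a video without its audio track. A minimal sketch of re-attaching it with ffmpeg, assuming ffmpeg is installed on the server; `mux_audio_command` and the file names are hypothetical:

```python
import shlex
from typing import List


def mux_audio_command(video: str, audio: str, output: str) -> List[str]:
    """Build an ffmpeg argv that copies the video stream and attaches audio."""
    return [
        "ffmpeg", "-y",
        "-i", video,       # input 0: processed video without sound
        "-i", audio,       # input 1: audio track to attach
        "-map", "0:v:0",   # take video from input 0
        "-map", "1:a:0",   # take audio from input 1
        "-c:v", "copy",    # do not re-encode the video stream
        "-c:a", "aac",     # encode audio as AAC for the MP4 container
        "-shortest",       # stop at the end of the shorter stream
        output,
    ]


print(shlex.join(mux_audio_command("upscaled.mp4", "speech.wav", "final.mp4")))
```

Copying the video stream avoids a second lossy encode after upscaling.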

Troubleshooting

Face Not Detected

  • Make sure the face is clearly visible in every frame

  • Use well-lit footage

  • Prefer front-facing shots

  • Use higher-resolution input

Poor Sync Quality

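  • Use wav2lip.pth for the most accurate sync; wav2lip_gan.pth gives better visual quality at slightly lower sync accuracy

  • Adjust padding so the detected face box includes the chin

  • Check audio sample rate (16 kHz recommended)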
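
The sample-rate check can be scripted. A minimal sketch using Python's standard `wave` module, which reads PCM WAV files only, plus an ffmpeg argument list for converting other inputs. The helper names are hypothetical, and downmixing to mono (`-ac 1`) is an assumption rather than a stated requirement:

```python
import wave
from typing import List

TARGET_RATE = 16000  # 16 kHz, the recommended rate


def wav_sample_rate(path: str) -> int:
    """Read the sample rate from a PCM WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def resample_command(src: str, dst: str, rate: int = TARGET_RATE) -> List[str]:
    """Build an ffmpeg argv that converts src to mono WAV at `rate` Hz."""
    return ["ffmpeg", "-y", "-i", src, "-ar", str(rate), "-ac", "1", dst]
```

Call `wav_sample_rate("speech.wav")` first, and only run the ffmpeg command when the result differs from `TARGET_RATE`.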

Choppy Output

  • Increase resize_factor to downscale high-resolution input

  • Leave nosmooth disabled so face detections are smoothed across frames

  • Use higher quality input video

Performance

| Input | GPU | Processing Time |
| --- | --- | --- |
| 10s video | RTX 3060 | ~30s |
| 10s video | RTX 4090 | ~15s |
| 30s video | RTX 4090 | ~45s |
| Image + 10s audio | RTX 3090 | ~20s |
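
To compare these figures across clip lengths, it helps to express them as processing time per second of output. A short sketch using only the approximate numbers listed above; actual timings will vary with resolution and settings:

```python
# (clip seconds, GPU, approximate processing seconds) from the Performance section.
BENCHMARKS = [
    (10, "RTX 3060", 30),
    (10, "RTX 4090", 15),
    (30, "RTX 4090", 45),
    (10, "RTX 3090", 20),
]

for clip_s, gpu, proc_s in BENCHMARKS:
    factor = proc_s / clip_s  # seconds of processing per second of output
    print(f"{gpu}: {clip_s}s clip -> ~{proc_s}s ({factor:.1f}x clip length)")
```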

Comparison with SadTalker

| Feature | Wav2Lip | SadTalker |
| --- | --- | --- |
| Lip accuracy | Excellent | Good |
| Head movement | None | Natural |
| Expression | None | Controllable |
| Speed | Faster | Slower |
| Best for | Dubbing | Avatars |

Cost Estimate

Typical CLORE.AI marketplace rates (as of 2024):

| GPU | Hourly Rate | Daily Rate | 4-Hour Session |
| --- | --- | --- | --- |
| RTX 3060 | ~$0.03 | ~$0.70 | ~$0.12 |
| RTX 3090 | ~$0.06 | ~$1.50 | ~$0.25 |
| RTX 4090 | ~$0.10 | ~$2.30 | ~$0.40 |
| A100 40GB | ~$0.17 | ~$4.00 | ~$0.70 |
| A100 80GB | ~$0.25 | ~$6.00 | ~$1.00 |

Prices vary by provider and demand. Check CLORE.AI Marketplace for current rates.

Save money:

  • Use Spot market for flexible workloads (often 30-50% cheaper)

  • Pay with CLORE tokens

  • Compare prices across different providers
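
A session budget can be estimated from the hourly rates listed above. A minimal sketch, assuming the approximate 2024 rates and treating the Spot saving as a simple percentage discount; `session_cost` is a hypothetical helper:

```python
# Approximate hourly rates (USD) copied from the Cost Estimate section.
HOURLY_USD = {
    "RTX 3060": 0.03,
    "RTX 3090": 0.06,
    "RTX 4090": 0.10,
    "A100 40GB": 0.17,
    "A100 80GB": 0.25,
}


def session_cost(gpu: str, hours: float, spot_discount: float = 0.0) -> float:
    """Estimated cost in USD; spot_discount is a fraction such as 0.3 for 30%."""
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    return round(HOURLY_USD[gpu] * hours * (1.0 - spot_discount), 2)


print(session_cost("RTX 4090", 4))       # on-demand, 4 hours
print(session_cost("RTX 4090", 4, 0.3))  # Spot at the low end of the 30-50% range
```

Results can differ by a few cents from the listed daily and 4-hour figures, which are rounded approximations.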

Next Steps
