# ACE-Step Music Generation

ACE-Step 1.5 is an open-source music generation model that produces **complete songs with vocals and instruments** from text prompts. It covers the same ground as commercial services like Suno, but runs locally on your GPU under an **MIT license**. Its standout feature is the low memory footprint: it needs **less than 4GB VRAM**, so it runs on budget GPUs. On an RTX 4090, a full track takes 2–8 seconds to generate.

## Key Features

* **Full song generation**: Vocals + instruments + effects in one pass
* **< 4GB VRAM**: Runs on budget GPUs such as the RTX 3060 or GTX 1060 6GB
* **2–8 seconds per track**: Near-instant generation on modern GPUs
* **MIT license**: Commercial use is permitted; keep the copyright and license notice when redistributing
* **Lyrics support**: Write your own lyrics with verse/chorus structure
* **Style control**: Genre tags, mood, tempo, instrumentation
* **ComfyUI integration**: Node-based workflow for complex music pipelines

## Requirements

| Component | Minimum           | Recommended        |
| --------- | ----------------- | ------------------ |
| GPU       | Any with 4GB VRAM | RTX 3060 or better |
| VRAM      | 4GB               | 6GB+               |
| RAM       | 8GB               | 16GB               |
| Disk      | 10GB              | 15GB               |
| Python    | 3.10+             | 3.11               |

**Recommended Clore.ai GPU**: RTX 3060 12GB (\~$0.15–0.3/day). The cheapest tier is enough for this workload.
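
Before committing to a rental, it can help to confirm how much VRAM the machine actually exposes. The following is a minimal sketch that shells out to `nvidia-smi` and compares the result with the 4GB minimum from the table above. The helper names (`parse_gpu_lines`, `list_gpus`, `meets_minimum`) are illustrative and not part of ACE-Step, and the reported total can be slightly below a card's nominal size, so treat the threshold as approximate.

```python
import subprocess

MIN_VRAM_MIB = 4 * 1024  # 4GB minimum from the requirements table


def parse_gpu_lines(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory_mib' lines as printed by nvidia-smi in CSV mode."""
    gpus = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so GPU names containing commas survive.
        name, memory = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory.strip())))
    return gpus


def list_gpus() -> list[tuple[str, int]]:
    """Query installed NVIDIA GPUs; requires nvidia-smi on PATH."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_lines(result.stdout)


def meets_minimum(memory_mib: int, minimum_mib: int = MIN_VRAM_MIB) -> bool:
    """True when the GPU reports at least the minimum amount of VRAM."""
    return memory_mib >= minimum_mib
```

On a machine with an NVIDIA driver installed, `for name, mib in list_gpus(): print(name, mib, meets_minimum(mib))` prints one line per GPU.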

### Speed Reference

| GPU           | Generation Time (30s track) |
| ------------- | --------------------------- |
| GTX 1060 6GB  | \~15–20 sec                 |
| RTX 3060 12GB | \~6–10 sec                  |
| RTX 3080 10GB | \~4–6 sec                   |
| RTX 4090 24GB | \~2–3 sec                   |

## Installation

### Standalone

```bash
git clone https://github.com/ace-step/ACE-Step.git
cd ACE-Step
# Editable install from the cloned source
pip install -e .
```

### ComfyUI Integration

```bash
cd ComfyUI/custom_nodes
git clone https://github.com/ace-step/ComfyUI-ACE-Step
pip install -r ComfyUI-ACE-Step/requirements.txt
# Restart ComfyUI — ACE-Step nodes will appear
```

## Quick Start

### Installation

ACE-Step 1.5 ships as a Gradio web app rather than a PyPI package, so install it from the Git repository:

```bash
# Clone and set up
git clone https://github.com/ACE-Step/ACE-Step-1.5.git
cd ACE-Step-1.5

# Option A: uv (recommended)
pip install uv
uv sync

# Option B: pip
pip install -r requirements.txt
```

### Launch Web UI

```bash
# Start Gradio interface
python app.py --port 7860 --share

# For low VRAM (< 6GB):
python app.py --port 7860 --half
```

Open `http://localhost:7860` in your browser. The UI has:

1. **Prompt field** — describe the style: "upbeat electronic pop, 120 BPM"
2. **Lyrics field** — write verses with `[Verse]`, `[Chorus]` tags
3. **Duration slider** — 15–120 seconds
4. **Generate button** — click and wait 2–8 seconds

### Generate with Lyrics (Web UI)

Enter in the lyrics field:

```
[Verse 1]
I rent the GPUs late at night
The servers humming, screens so bright
Training models, chasing dreams
Nothing's ever what it seems

[Chorus]
We're building something new today
The future's just a prompt away
With every token, every line
The code and music intertwine
```

Set prompt to: `indie rock ballad, acoustic guitar, emotional, male vocal`
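
If you keep lyrics in a script or generate them programmatically, the tagged layout above is easy to assemble. Here is a small sketch; `format_lyrics` is a hypothetical helper, and the only thing taken from the guide is the `[Verse]` / `[Chorus]` tag format with a blank line between sections.

```python
def format_lyrics(sections: list[tuple[str, list[str]]]) -> str:
    """Join (tag, lines) pairs into tagged lyrics separated by blank lines."""
    blocks = []
    for tag, lines in sections:
        # Each section starts with its tag in square brackets.
        blocks.append("\n".join([f"[{tag}]", *lines]))
    return "\n\n".join(blocks)


lyrics = format_lyrics([
    ("Verse 1", [
        "I rent the GPUs late at night",
        "The servers humming, screens so bright",
    ]),
    ("Chorus", [
        "We're building something new today",
        "The future's just a prompt away",
    ]),
])
print(lyrics)
```

The printed text can be pasted straight into the lyrics field.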

### CLI / Pipeline Usage

```bash
# Generate from command line using the pipeline script directly
cd ACE-Step-1.5
python acestep/acestep_v15_pipeline.py \
  --prompt "lo-fi hip hop, chill, rainy day, piano, soft drums" \
  --lyrics "" \
  --duration 30 \
  --output output.wav
```
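
For more than a handful of tracks, it may be easier to build the command lines in Python. The sketch below only constructs and prints the commands; the script path and the `--prompt`, `--lyrics`, `--duration`, and `--output` flags are copied from the command above and have not been verified independently, and `build_command` is an illustrative helper.

```python
import shlex
from pathlib import Path

# Script path as given in the CLI example above (run from inside ACE-Step-1.5).
PIPELINE_SCRIPT = "acestep/acestep_v15_pipeline.py"


def build_command(prompt: str, output: Path, duration: int = 30,
                  lyrics: str = "") -> list[str]:
    """Build one pipeline invocation as an argument list (no shell quoting needed)."""
    return [
        "python", PIPELINE_SCRIPT,
        "--prompt", prompt,
        "--lyrics", lyrics,
        "--duration", str(duration),
        "--output", str(output),
    ]


prompts = [
    "lo-fi hip hop, chill, rainy day, piano, soft drums",
    "synthwave, energetic, synth, 120 BPM",
]
commands = [
    build_command(prompt, Path("outputs") / f"track_{index:03d}.wav")
    for index, prompt in enumerate(prompts, start=1)
]
for command in commands:
    # shlex.join shows the command as it would be typed in a shell.
    print(shlex.join(command))
```

To actually run the batch, create the `outputs` directory and pass each list to `subprocess.run(command, check=True)` from inside the `ACE-Step-1.5` directory.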

### ComfyUI Integration (Batch Workflow)

ComfyUI nodes let you batch-generate multiple tracks with different prompts in a visual workflow.

### Style Tags

Control generation with style tags:

```python
# Genre tags
GENRE_TAGS = ["pop", "rock", "electronic", "hip-hop", "jazz", "classical", "metal",
              "lo-fi", "synthwave", "ambient", "folk", "R&B", "country"]

# Mood tags
MOOD_TAGS = ["happy", "sad", "energetic", "chill", "dark", "epic", "romantic"]

# Instrument tags
INSTRUMENT_TAGS = ["piano", "guitar", "drums", "bass", "synth", "strings", "violin"]

# Vocal tags
VOCAL_TAGS = ["male vocal", "female vocal", "choir", "no vocals", "humming"]

# Technical tags
TECHNICAL_TAGS = ["120 BPM", "minor key", "major key", "4/4 time"]
```
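
The prompts used elsewhere in this guide are comma-separated combinations of these tags. As a rough sketch, a helper like the one below can assemble a prompt from the categories; `build_prompt` is hypothetical, and the ordering (genre first, technical tags last) is a convention chosen here, not something ACE-Step is known to require.

```python
def build_prompt(genres=(), moods=(), instruments=(), vocals=(), technical=()) -> str:
    """Combine tags from each category into one comma-separated prompt."""
    seen = set()
    unique = []
    for tag in [*genres, *moods, *instruments, *vocals, *technical]:
        key = tag.lower()
        # Skip case-insensitive duplicates while preserving the original order.
        if key not in seen:
            seen.add(key)
            unique.append(tag)
    return ", ".join(unique)


prompt = build_prompt(
    genres=["rock"],
    moods=["sad"],
    instruments=["guitar"],
    vocals=["male vocal"],
    technical=["120 BPM"],
)
print(prompt)  # rock, sad, guitar, male vocal, 120 BPM
```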

## Web UI

```bash
cd ACE-Step-1.5
python app.py --port 7860
# Open http://localhost:7860
```

The web UI provides:

* Text prompt input with style presets
* Lyrics editor with verse/chorus formatting
* Duration and quality sliders
* Real-time waveform preview
* Download as WAV or MP3
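
To sanity-check a downloaded WAV (for example, that its duration matches the slider), Python's standard `wave` module is enough. This is a sketch under one assumption: that the file is PCM-encoded, since `wave` raises `wave.Error` for other encodings such as floating-point WAV. The demo writes one second of silence as a stand-in for a real track, and `wav_info` and `write_silence` are illustrative names.

```python
import os
import tempfile
import wave


def wav_info(path: str) -> dict:
    """Return basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "duration_sec": frames / rate,
        }


def write_silence(path: str, seconds: int = 1, sample_rate: int = 44100) -> None:
    """Write a mono 16-bit silent WAV, used here only as demo input."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)


demo_path = os.path.join(tempfile.gettempdir(), "acestep_demo.wav")
write_silence(demo_path)
info = wav_info(demo_path)
print(info)
```

Replace `demo_path` with the path of a track downloaded from the web UI to inspect real output.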

## Use Cases on Clore.ai

| Use Case                    | Setup                    | Cost        |
| --------------------------- | ------------------------ | ----------- |
| Background music for videos | RTX 3060, batch generate | \~$0.15/day |
| Song prototyping / demos    | RTX 3080, real-time      | \~$0.3/day  |
| Music production pipeline   | RTX 4090 + ComfyUI       | \~$1/day    |
| Podcast intros/outros       | Any GPU, one-shot        | \~$0.15/day |

## Tips for Clore.ai Users

* **Cheapest AI workload possible**: At $0.15/day for RTX 3060, generate hundreds of tracks for pennies
* **Batch overnight**: Rent a GPU for 8 hours ($0.05–0.1), generate 500+ tracks
* **ComfyUI for pipelines**: Chain with image generation for album art workflows
* **Export quality**: Generate at highest quality, then process in a DAW if needed
* **Style mixing**: Combine genres in prompts: "lo-fi jazz hip hop with vinyl crackle" works surprisingly well
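
The cost figures above follow from simple arithmetic. The sketch below reproduces it using the daily prices and per-track generation times quoted in this guide; the track counts are an upper bound, since they ignore model loading, file I/O, and idle time, and the function names are illustrative.

```python
def rental_cost(hours: float, price_per_day: float) -> float:
    """Cost of renting for a number of hours at a given daily price."""
    return price_per_day * hours / 24


def max_tracks(hours: float, seconds_per_track: float) -> int:
    """Upper bound on tracks generated back to back in the rental window."""
    return int(hours * 3600 // seconds_per_track)


hours = 8
# (GPU, price per day in USD, seconds per 30s track) taken from the tables in this guide,
# using the slower end of each speed range.
for gpu, price_per_day, seconds_per_track in [("RTX 3060", 0.15, 10), ("RTX 4090", 1.00, 3)]:
    cost = rental_cost(hours, price_per_day)
    tracks = max_tracks(hours, seconds_per_track)
    print(f"{gpu}: ${cost:.2f} for {hours}h, up to {tracks} tracks")
```

Even at a fraction of that upper bound, the 500+ tracks mentioned above fit comfortably in an 8-hour rental.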

## Troubleshooting

| Issue                  | Solution                                                                                                      |
| ---------------------- | ------------------------------------------------------------------------------------------------------------- |
| CUDA not found         | Ensure PyTorch is installed with CUDA: `pip install torch --index-url https://download.pytorch.org/whl/cu121` |
| Model download slow    | Install `hf_transfer` (`pip install hf_transfer`) and set `HF_HUB_ENABLE_HF_TRANSFER=1` for faster downloads  |
| Audio sounds distorted | Try lower temperature (0.7) or fewer inference steps                                                          |
| Out of memory on 4GB   | Launch with `--half`, reduce duration to 15 seconds, or move to a 6GB+ GPU                                    |
| ComfyUI nodes missing  | Restart ComfyUI after installing the custom nodes                                                             |

## ACE-Step vs Suno vs AudioCraft

| Feature           | ACE-Step 1.5  | Suno v4     | AudioCraft                 |
| ----------------- | ------------- | ----------- | -------------------------- |
| Full songs        | ✅             | ✅           | ❌ (instrumental only)      |
| Vocals            | ✅             | ✅           | ❌                          |
| Local/self-hosted | ✅             | ❌ (cloud)   | ✅                          |
| License           | MIT           | Proprietary | MIT code, CC-BY-NC weights |
| Min VRAM          | 4GB           | N/A         | 16GB                       |
| Speed (30s)       | 2–8 sec       | \~30 sec    | \~60 sec                   |
| Cost              | $0.15/day GPU | $10/mo sub  | $0.3/day GPU               |

## Further Reading

* [GitHub Repository](https://github.com/ace-step/ACE-Step)
* [ComfyUI Nodes](https://github.com/ace-step/ComfyUI-ACE-Step)
* [AudioCraft Guide](https://docs.clore.ai/guides/audio-and-voice/audiocraft-music) — for instrumental-only music
