A full-featured Suno AI-like web studio for the open-source HeartMuLa music generation model. Generate full songs with vocals from lyrics and style tags — all running locally on your GPU.
Getting Started • Features • Architecture • API Reference • Configuration
| Feature | Description |
|---|---|
| Lyrics Editor | Write lyrics with structure tag insertion ([Verse], [Chorus], [Bridge], etc.) |
| Style Tag Picker | Categorized, searchable tags across genre, mood, instruments, vocals, and tempo |
| Generation Queue | Submit multiple songs — real-time progress via Server-Sent Events (SSE) |
| Track Library | Search, sort, filter, and manage your generated tracks |
| Waveform Player | Persistent mini player with WaveSurfer.js waveform visualization |
| Audio Transcription | Upload existing audio to extract lyrics via HeartTranscriptor |
| GPU Monitoring | Live VRAM usage display and model state indicators |
| Crash Recovery | SQLite WAL-mode database — stale jobs auto-recover on restart |
| Dark Theme | Full dark mode UI built with shadcn/ui components |
Screenshots coming soon — run
make devand see for yourself!
| Tool | Version | Purpose |
|---|---|---|
| Python | 3.10+ | Backend runtime |
| Node.js | 18+ | Frontend runtime |
| uv | latest | Python package manager |
| pnpm | 8+ | Node.js package manager |
| NVIDIA GPU | 12GB+ VRAM | Model inference (optional for mock mode) |
# Clone the repository
git clone https://github.com/rustyorb/heartmula-studio.git
cd heartmula-studio
# Install dependencies
make setup-backend
make setup-frontend
# Start both servers (backend on :8000, frontend on :3000)
make devThen open http://localhost:3000 — you'll land on the Studio page.
Note
The app runs in mock mode by default — no GPU or HeartMuLa model installation required. Mock generation creates silent MP3 files with simulated progress to test the full pipeline.
Backend only
cd backend
uv sync # Install Python dependencies
uv run alembic upgrade head # Run database migrations
uv run uvicorn app.main:app --reload --port 8000Frontend only
cd frontend
pnpm install # Install Node dependencies
pnpm dev # Start dev server on :3000With real HeartMuLa model
# Install HeartMuLa (requires PyTorch + CUDA)
pip install heartlib
# The model downloads automatically on first generation (~6GB)
# For GPUs with < 14GB VRAM, mmgp mode is auto-enabledheartmula-studio/
├── backend/ # FastAPI + Python
│ ├── app/
│ │ ├── main.py # App factory, lifespan, router registration
│ │ ├── config.py # pydantic-settings configuration
│ │ ├── database.py # Async SQLite + WAL mode
│ │ ├── dependencies.py # FastAPI dependency injection
│ │ ├── models/ # SQLAlchemy ORM models
│ │ │ ├── job.py # GenerationJob
│ │ │ ├── track.py # Track
│ │ │ └── settings.py # UserSettings (singleton)
│ │ ├── schemas/ # Pydantic request/response types
│ │ ├── routers/ # API endpoints
│ │ │ ├── generation.py # POST /generate, GET/DELETE /jobs
│ │ │ ├── tracks.py # CRUD /tracks
│ │ │ ├── transcription.py# POST /transcribe
│ │ │ ├── events.py # GET /events (SSE stream)
│ │ │ ├── system.py # GET /health, /gpu
│ │ │ └── settings.py # GET/PUT /settings
│ │ ├── services/ # Business logic
│ │ │ ├── pipeline_manager.py # HeartMuLa model lifecycle
│ │ │ ├── job_queue.py # SQLite-backed persistent queue
│ │ │ ├── generation_worker.py # Background generation loop
│ │ │ ├── event_broadcaster.py # SSE fan-out to clients
│ │ │ ├── gpu_manager.py # GPU detection + VRAM monitoring
│ │ │ ├── storage_service.py # File I/O for outputs/uploads
│ │ │ └── transcription_service.py
│ │ └── utils/
│ │ ├── progress_hook.py # tqdm monkey-patch for progress
│ │ └── audio.py # Duration detection
│ ├── alembic/ # Database migrations
│ ├── data/ # Runtime: SQLite DB, outputs, uploads
│ └── pyproject.toml
│
├── frontend/ # Next.js 15 + React 19
│ ├── src/
│ │ ├── app/ # Pages (App Router)
│ │ │ ├── studio/ # Lyrics editor + queue panel
│ │ │ ├── library/ # Track grid with filters
│ │ │ ├── transcribe/ # Audio upload + transcript view
│ │ │ └── settings/ # Defaults + system info
│ │ ├── components/
│ │ │ ├── layout/ # Sidebar, Header, AppShell, SSEProvider
│ │ │ ├── studio/ # LyricsEditor, TagPicker, GenerationParams
│ │ │ ├── queue/ # JobCard, QueuePanel
│ │ │ ├── library/ # TrackCard, TrackGrid, TrackFilters
│ │ │ ├── player/ # MiniPlayer (WaveSurfer.js)
│ │ │ ├── transcription/ # UploadZone, TranscriptView
│ │ │ ├── shared/ # GpuStatus, ModelStatus, ProgressBar
│ │ │ └── ui/ # 14 shadcn/ui primitives
│ │ ├── stores/ # Zustand state management
│ │ │ ├── useStudioStore # Lyrics, tags, params, submit
│ │ │ ├── useQueueStore # Job queue, SSE-driven updates
│ │ │ ├── usePlayerStore # Audio playback state
│ │ │ ├── useLibraryStore # Tracks, search, pagination
│ │ │ ├── useSystemStore # Connection, GPU, model state
│ │ │ └── useSettingsStore# User preferences
│ │ ├── hooks/ # useSSE, useAudioPlayer, useDebounce
│ │ ├── lib/ # API client, SSE manager, utils
│ │ └── types/ # TypeScript interfaces
│ └── package.json
│
└── Makefile # dev, setup, migrate commands
|
Backend
|
Frontend
|
sequenceDiagram
participant U as Browser
participant F as Next.js Frontend
participant B as FastAPI Backend
participant Q as Job Queue (SQLite)
participant W as Generation Worker
participant M as HeartMuLa Model
U->>F: Write lyrics + select tags
F->>B: POST /api/generate
B->>Q: Enqueue job (pending)
B-->>F: SSE: job:queued
B->>U: { job_id, queue_position }
W->>Q: Dequeue job → processing
W-->>F: SSE: job:started
W->>M: Generate audio
loop Every ~50 frames
M-->>W: Progress update
W-->>F: SSE: job:progress
end
M->>W: Audio file saved
W->>Q: Mark completed
W-->>F: SSE: job:completed { output_url }
U->>F: Click Play
F->>B: GET /outputs/...mp3
F->>U: WaveSurfer renders waveform
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/generate |
Submit a generation job |
GET |
/api/jobs |
List all jobs (filterable by status) |
GET |
/api/jobs/{id} |
Get job details |
DELETE |
/api/jobs/{id} |
Cancel a pending job |
POST /api/generate — Request body
{
"lyrics": "[Verse]\nHello world\n\n[Chorus]\nSing along",
"tags": "pop, electronic, female vocal",
"max_length_ms": 240000,
"temperature": 1.0,
"topk": 50,
"cfg_scale": 1.5
}| Field | Type | Default | Range | Description |
|---|---|---|---|---|
lyrics |
string | required | — | Song lyrics with optional [Section] markers |
tags |
string | required | — | Comma-separated style tags |
max_length_ms |
int | 240000 |
30,000–360,000 | Maximum audio duration |
temperature |
float | 1.0 |
0.1–2.0 | Sampling temperature |
topk |
int | 50 |
1–500 | Top-K sampling |
cfg_scale |
float | 1.5 |
1.0–10.0 | Classifier-free guidance scale |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/tracks |
List tracks with search, sort, pagination |
GET |
/api/tracks/{id} |
Get track details |
PATCH |
/api/tracks/{id} |
Update title, tags, or favorite |
DELETE |
/api/tracks/{id} |
Delete track and audio file |
| Method | Endpoint | Description |
|---|---|---|
POST |
/api/transcribe |
Upload audio → extract lyrics (SSE result) |
| Method | Endpoint | Description |
|---|---|---|
GET |
/api/health |
Health check with model state |
GET |
/api/gpu |
GPU/VRAM status |
GET |
/api/events |
SSE stream (real-time updates) |
GET |
/api/settings |
Get user preferences |
PUT |
/api/settings |
Update user preferences |
| Event | Payload | Description |
|---|---|---|
system:connected |
{} |
Initial connection established |
model:loading |
{ progress, message } |
Model loading progress |
model:ready |
{ state } |
Model loaded and ready |
gpu:status |
{ name, vram_total_gb, ... } |
GPU status update |
job:queued |
{ job_id, status, ... } |
New job added to queue |
job:started |
{ job_id } |
Job dequeued, generation starting |
job:progress |
{ job_id, step, total_steps, progress } |
Frame-by-frame progress |
job:completed |
{ job_id, track_id, output_url, duration_ms } |
Generation finished |
job:failed |
{ job_id, error } |
Generation error |
job:cancelled |
{ job_id } |
Job cancelled by user |
transcription:completed |
{ job_id, lyrics } |
Transcription finished |
transcription:failed |
{ job_id, error } |
Transcription error |
| Variable | Default | Description |
|---|---|---|
HEARTMULA_DATABASE_URL |
sqlite+aiosqlite:///data/heartmula.db |
Database connection string |
HEARTMULA_OUTPUT_DIR |
data/outputs |
Audio output directory |
HEARTMULA_UPLOAD_DIR |
data/uploads |
Upload temp directory |
HEARTMULA_MODEL_PATH |
(auto-download) | Path to HeartMuLa model weights |
NEXT_PUBLIC_API_URL |
(empty — uses proxy) | Backend URL override for frontend |
The tag picker includes 72 curated tags across 5 categories:
| Category | Tags |
|---|---|
| Genre | pop, rock, hip-hop, r&b, jazz, classical, electronic, country, folk, metal, punk, blues, soul, reggae, latin, indie |
| Mood | happy, sad, energetic, calm, romantic, dark, uplifting, melancholic, aggressive, dreamy, nostalgic, epic |
| Instruments | piano, guitar, drums, bass, synthesizer, violin, trumpet, saxophone, flute, organ, ukulele, cello |
| Vocals | male vocal, female vocal, duet, choir, rap, whisper, falsetto, opera |
| Tempo | slow, medium tempo, fast, uptempo |
erDiagram
generation_jobs {
string id PK
string status
text lyrics
text tags
int max_length_ms
float temperature
int topk
float cfg_scale
string output_path
int duration_ms
text error
datetime created_at
datetime started_at
datetime completed_at
}
tracks {
string id PK
string job_id FK
string title
text tags
text lyrics
string output_url
int duration_ms
int file_size_bytes
boolean favorite
datetime created_at
datetime updated_at
}
user_settings {
int id PK
float default_temperature
int default_topk
float default_cfg_scale
int default_max_length_ms
string theme
boolean auto_save_tracks
}
generation_jobs ||--o| tracks : "produces"
| Command | Description |
|---|---|
make dev |
Start both backend and frontend in parallel |
make dev-backend |
Start backend only (with hot reload) |
make dev-frontend |
Start frontend only |
make setup-backend |
Install Python deps + run migrations |
make setup-frontend |
Install Node.js dependencies |
make migrate |
Run pending database migrations |
make new-migration msg="description" |
Generate a new Alembic migration |
| Metric | Value |
|---|---|
| Total files | 110 |
| Lines of code | ~11,000 |
| Frontend components | 22 |
| Zustand stores | 6 |
| API endpoints | 12 |
| SSE event types | 12 |
| shadcn/ui primitives | 14 |
- Connect real HeartMuLa pipeline (replace mock)
- Waveform preview in track cards
- Batch generation (multiple songs from one prompt)
- Audio post-processing (normalize, fade in/out)
- Export playlist / zip download
- Docker Compose deployment
- Light theme support
- Keyboard shortcuts
Built on top of the incredible work by the HeartMuLa team — a 3-billion parameter open-source music generation model that creates full songs with vocals from lyrics and style tags.
HeartMuLa Studio — Generate music, locally.