HeartMuLa Studio

AI Music Generation Web UI

A full-featured Suno AI-like web studio for the open-source HeartMuLa music generation model. Generate full songs with vocals from lyrics and style tags — all running locally on your GPU.

Getting Started • Features • Architecture • API Reference • Configuration

Features

Feature	Description
Lyrics Editor	Write lyrics with structure tag insertion (`[Verse]`, `[Chorus]`, `[Bridge]`, etc.)
Style Tag Picker	Categorized, searchable tags across genre, mood, instruments, vocals, and tempo
Generation Queue	Submit multiple songs — real-time progress via Server-Sent Events (SSE)
Track Library	Search, sort, filter, and manage your generated tracks
Waveform Player	Persistent mini player with WaveSurfer.js waveform visualization
Audio Transcription	Upload existing audio to extract lyrics via HeartTranscriptor
GPU Monitoring	Live VRAM usage display and model state indicators
Crash Recovery	SQLite WAL-mode database — stale jobs auto-recover on restart
Dark Theme	Full dark mode UI built with shadcn/ui components

Screenshots

Screenshots coming soon — run make dev and see for yourself!

Getting Started

Prerequisites

Tool	Version	Purpose
Python	3.10+	Backend runtime
Node.js	18+	Frontend runtime
uv	latest	Python package manager
pnpm	8+	Node.js package manager
NVIDIA GPU	12GB+ VRAM	Model inference (optional for mock mode)

Quick Start

# Clone the repository
git clone https://github.com/rustyorb/heartmula-studio.git
cd heartmula-studio

# Install dependencies
make setup-backend
make setup-frontend

# Start both servers (backend on :8000, frontend on :3000)
make dev

Then open http://localhost:3000 — you'll land on the Studio page.

Note

The app runs in mock mode by default — no GPU or HeartMuLa model installation required. Mock generation creates silent MP3 files with simulated progress to test the full pipeline.

Manual Setup

Backend only

cd backend
uv sync                              # Install Python dependencies
uv run alembic upgrade head          # Run database migrations
uv run uvicorn app.main:app --reload --port 8000

Frontend only

cd frontend
pnpm install                         # Install Node dependencies
pnpm dev                             # Start dev server on :3000

With real HeartMuLa model

# Install HeartMuLa (requires PyTorch + CUDA)
pip install heartlib

# The model downloads automatically on first generation (~6GB)
# For GPUs with < 14GB VRAM, mmgp mode is auto-enabled

Architecture

heartmula-studio/
├── backend/                    # FastAPI + Python
│   ├── app/
│   │   ├── main.py             # App factory, lifespan, router registration
│   │   ├── config.py           # pydantic-settings configuration
│   │   ├── database.py         # Async SQLite + WAL mode
│   │   ├── dependencies.py     # FastAPI dependency injection
│   │   ├── models/             # SQLAlchemy ORM models
│   │   │   ├── job.py          #   GenerationJob
│   │   │   ├── track.py        #   Track
│   │   │   └── settings.py     #   UserSettings (singleton)
│   │   ├── schemas/            # Pydantic request/response types
│   │   ├── routers/            # API endpoints
│   │   │   ├── generation.py   #   POST /generate, GET/DELETE /jobs
│   │   │   ├── tracks.py       #   CRUD /tracks
│   │   │   ├── transcription.py#   POST /transcribe
│   │   │   ├── events.py       #   GET /events (SSE stream)
│   │   │   ├── system.py       #   GET /health, /gpu
│   │   │   └── settings.py     #   GET/PUT /settings
│   │   ├── services/           # Business logic
│   │   │   ├── pipeline_manager.py    # HeartMuLa model lifecycle
│   │   │   ├── job_queue.py           # SQLite-backed persistent queue
│   │   │   ├── generation_worker.py   # Background generation loop
│   │   │   ├── event_broadcaster.py   # SSE fan-out to clients
│   │   │   ├── gpu_manager.py         # GPU detection + VRAM monitoring
│   │   │   ├── storage_service.py     # File I/O for outputs/uploads
│   │   │   └── transcription_service.py
│   │   └── utils/
│   │       ├── progress_hook.py # tqdm monkey-patch for progress
│   │       └── audio.py         # Duration detection
│   ├── alembic/                # Database migrations
│   ├── data/                   # Runtime: SQLite DB, outputs, uploads
│   └── pyproject.toml
│
├── frontend/                   # Next.js 15 + React 19
│   ├── src/
│   │   ├── app/                # Pages (App Router)
│   │   │   ├── studio/         #   Lyrics editor + queue panel
│   │   │   ├── library/        #   Track grid with filters
│   │   │   ├── transcribe/     #   Audio upload + transcript view
│   │   │   └── settings/       #   Defaults + system info
│   │   ├── components/
│   │   │   ├── layout/         #   Sidebar, Header, AppShell, SSEProvider
│   │   │   ├── studio/         #   LyricsEditor, TagPicker, GenerationParams
│   │   │   ├── queue/          #   JobCard, QueuePanel
│   │   │   ├── library/        #   TrackCard, TrackGrid, TrackFilters
│   │   │   ├── player/         #   MiniPlayer (WaveSurfer.js)
│   │   │   ├── transcription/  #   UploadZone, TranscriptView
│   │   │   ├── shared/         #   GpuStatus, ModelStatus, ProgressBar
│   │   │   └── ui/             #   14 shadcn/ui primitives
│   │   ├── stores/             # Zustand state management
│   │   │   ├── useStudioStore  #   Lyrics, tags, params, submit
│   │   │   ├── useQueueStore   #   Job queue, SSE-driven updates
│   │   │   ├── usePlayerStore  #   Audio playback state
│   │   │   ├── useLibraryStore #   Tracks, search, pagination
│   │   │   ├── useSystemStore  #   Connection, GPU, model state
│   │   │   └── useSettingsStore#   User preferences
│   │   ├── hooks/              #   useSSE, useAudioPlayer, useDebounce
│   │   ├── lib/                #   API client, SSE manager, utils
│   │   └── types/              #   TypeScript interfaces
│   └── package.json
│
└── Makefile                    # dev, setup, migrate commands

Tech Stack

Backend

FastAPI — async Python API framework
SQLAlchemy 2.0 — async ORM
SQLite — WAL mode, zero-config database
SSE-Starlette — Server-Sent Events
Alembic — database migrations
Pydantic v2 — validation & settings
uv — fast Python package manager

Frontend

Next.js 15 — App Router, React Server Components
React 19 — UI library
TypeScript 5 — type safety
Tailwind CSS 4 — utility-first styling
shadcn/ui — Radix-based component library
Zustand — lightweight state management
WaveSurfer.js 7 — audio waveform visualization

Data Flow

sequenceDiagram
    participant U as Browser
    participant F as Next.js Frontend
    participant B as FastAPI Backend
    participant Q as Job Queue (SQLite)
    participant W as Generation Worker
    participant M as HeartMuLa Model

    U->>F: Write lyrics + select tags
    F->>B: POST /api/generate
    B->>Q: Enqueue job (pending)
    B-->>F: SSE: job:queued
    B->>U: { job_id, queue_position }

    W->>Q: Dequeue job → processing
    W-->>F: SSE: job:started
    W->>M: Generate audio
    loop Every ~50 frames
        M-->>W: Progress update
        W-->>F: SSE: job:progress
    end
    M->>W: Audio file saved
    W->>Q: Mark completed
    W-->>F: SSE: job:completed { output_url }

    U->>F: Click Play
    F->>B: GET /outputs/...mp3
    F->>U: WaveSurfer renders waveform

API Reference

Generation

Method	Endpoint	Description
`POST`	`/api/generate`	Submit a generation job
`GET`	`/api/jobs`	List all jobs (filterable by `status`)
`GET`	`/api/jobs/{id}`	Get job details
`DELETE`	`/api/jobs/{id}`	Cancel a pending job

POST /api/generate — Request body

{
  "lyrics": "[Verse]\nHello world\n\n[Chorus]\nSing along",
  "tags": "pop, electronic, female vocal",
  "max_length_ms": 240000,
  "temperature": 1.0,
  "topk": 50,
  "cfg_scale": 1.5
}

Field	Type	Default	Range	Description
`lyrics`	string	required	—	Song lyrics with optional `[Section]` markers
`tags`	string	required	—	Comma-separated style tags
`max_length_ms`	int	`240000`	30,000–360,000	Maximum audio duration
`temperature`	float	`1.0`	0.1–2.0	Sampling temperature
`topk`	int	`50`	1–500	Top-K sampling
`cfg_scale`	float	`1.5`	1.0–10.0	Classifier-free guidance scale

Tracks

Method	Endpoint	Description
`GET`	`/api/tracks`	List tracks with search, sort, pagination
`GET`	`/api/tracks/{id}`	Get track details
`PATCH`	`/api/tracks/{id}`	Update title, tags, or favorite
`DELETE`	`/api/tracks/{id}`	Delete track and audio file

Transcription

Method	Endpoint	Description
`POST`	`/api/transcribe`	Upload audio → extract lyrics (SSE result)

System

Method	Endpoint	Description
`GET`	`/api/health`	Health check with model state
`GET`	`/api/gpu`	GPU/VRAM status
`GET`	`/api/events`	SSE stream (real-time updates)
`GET`	`/api/settings`	Get user preferences
`PUT`	`/api/settings`	Update user preferences

SSE Events

Event	Payload	Description
`system:connected`	`{}`	Initial connection established
`model:loading`	`{ progress, message }`	Model loading progress
`model:ready`	`{ state }`	Model loaded and ready
`gpu:status`	`{ name, vram_total_gb, ... }`	GPU status update
`job:queued`	`{ job_id, status, ... }`	New job added to queue
`job:started`	`{ job_id }`	Job dequeued, generation starting
`job:progress`	`{ job_id, step, total_steps, progress }`	Frame-by-frame progress
`job:completed`	`{ job_id, track_id, output_url, duration_ms }`	Generation finished
`job:failed`	`{ job_id, error }`	Generation error
`job:cancelled`	`{ job_id }`	Job cancelled by user
`transcription:completed`	`{ job_id, lyrics }`	Transcription finished
`transcription:failed`	`{ job_id, error }`	Transcription error

Configuration

Environment Variables

Variable	Default	Description
`HEARTMULA_DATABASE_URL`	`sqlite+aiosqlite:///data/heartmula.db`	Database connection string
`HEARTMULA_OUTPUT_DIR`	`data/outputs`	Audio output directory
`HEARTMULA_UPLOAD_DIR`	`data/uploads`	Upload temp directory
`HEARTMULA_MODEL_PATH`	(auto-download)	Path to HeartMuLa model weights
`NEXT_PUBLIC_API_URL`	(empty — uses proxy)	Backend URL override for frontend

Style Tags

The tag picker includes 72 curated tags across 5 categories:

Category	Tags
Genre	pop, rock, hip-hop, r&b, jazz, classical, electronic, country, folk, metal, punk, blues, soul, reggae, latin, indie
Mood	happy, sad, energetic, calm, romantic, dark, uplifting, melancholic, aggressive, dreamy, nostalgic, epic
Instruments	piano, guitar, drums, bass, synthesizer, violin, trumpet, saxophone, flute, organ, ukulele, cello
Vocals	male vocal, female vocal, duet, choir, rap, whisper, falsetto, opera
Tempo	slow, medium tempo, fast, uptempo

Database Schema

erDiagram
    generation_jobs {
        string id PK
        string status
        text lyrics
        text tags
        int max_length_ms
        float temperature
        int topk
        float cfg_scale
        string output_path
        int duration_ms
        text error
        datetime created_at
        datetime started_at
        datetime completed_at
    }

    tracks {
        string id PK
        string job_id FK
        string title
        text tags
        text lyrics
        string output_url
        int duration_ms
        int file_size_bytes
        boolean favorite
        datetime created_at
        datetime updated_at
    }

    user_settings {
        int id PK
        float default_temperature
        int default_topk
        float default_cfg_scale
        int default_max_length_ms
        string theme
        boolean auto_save_tracks
    }

    generation_jobs ||--o| tracks : "produces"

Development

Makefile Commands

Command	Description
`make dev`	Start both backend and frontend in parallel
`make dev-backend`	Start backend only (with hot reload)
`make dev-frontend`	Start frontend only
`make setup-backend`	Install Python deps + run migrations
`make setup-frontend`	Install Node.js dependencies
`make migrate`	Run pending database migrations
`make new-migration msg="description"`	Generate a new Alembic migration

Project Stats

Metric	Value
Total files	110
Lines of code	~11,000
Frontend components	22
Zustand stores	6
API endpoints	12
SSE event types	12
shadcn/ui primitives	14

Roadmap

Connect real HeartMuLa pipeline (replace mock)
Waveform preview in track cards
Batch generation (multiple songs from one prompt)
Audio post-processing (normalize, fade in/out)
Export playlist / zip download
Docker Compose deployment
Light theme support
Keyboard shortcuts

Credits

Built on top of the incredible work by the HeartMuLa team — a 3-billion parameter open-source music generation model that creates full songs with vocals from lyrics and style tags.

HeartMuLa Studio — Generate music, locally.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
backend		backend
frontend		frontend
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
start.bat		start.bat
start.sh		start.sh
stop.bat		stop.bat
stop.sh		stop.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HeartMuLa Studio

AI Music Generation Web UI

Features

Screenshots

Getting Started

Prerequisites

Quick Start

Manual Setup

Architecture

Tech Stack

Data Flow

API Reference

Generation

Tracks

Transcription

System

SSE Events

Configuration

Environment Variables

Style Tags

Database Schema

Development

Makefile Commands

Project Stats

Roadmap

Credits

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

rustyorb/heartmula-studio

Folders and files

Latest commit

History

Repository files navigation

HeartMuLa Studio

AI Music Generation Web UI

Features

Screenshots

Getting Started

Prerequisites

Quick Start

Manual Setup

Architecture

Tech Stack

Data Flow

API Reference

Generation

Tracks

Transcription

System

SSE Events

Configuration

Environment Variables

Style Tags

Database Schema

Development

Makefile Commands

Project Stats

Roadmap

Credits

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages