Spotify Music Taste Clustering

Cluster your Spotify library using interpretable audio features and GPT-powered lyric analysis. The pipeline produces a 33-dimensional feature vector in which every dimension has a human-readable meaning.

For the full methodology, design decisions, and technical deep-dives, see the accompanying essay.

Time estimate: ~3-4 hours for 1,500 songs, mostly spent waiting on downloads, audio extraction, and GPT API calls. All steps cache progress, so you can stop and resume at any point.

What It Does

Extracts your Spotify library with metadata
Downloads your Spotify saved songs as MP3s
Fetches lyrics from Genius + Musixmatch
Extracts audio features via Essentia (genre, mood, energy, etc.)
Classifies lyrics via GPT (valence, themes, explicit content, etc.)
Clusters songs into meaningful groups
Visualizes with interactive 3D UMAP

Quickstart

Prerequisites

Python 3.9+
FFmpeg (brew install ffmpeg / apt install ffmpeg)
~2GB disk for Essentia models
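
The prerequisites above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only verifies the Python version and that an `ffmpeg` executable is on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)  # minimum version listed in the prerequisites


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `version_info` and `which` are injectable so the check can be tested
    without changing the real environment (a choice made for this sketch).
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.9+ is required")
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

If the script prints nothing, both checks passed. It does not check free disk space for the Essentia models.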

1. Setup

Clone and install dependencies.

git clone https://github.com/yourusername/spotify-clustering.git
cd spotify-clustering
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

2. API Keys

You'll need Spotify (to fetch your library), Genius (for lyrics), and OpenAI (for lyric classification).

# .env file

# Spotify - https://developer.spotify.com/dashboard
# Set redirect URI to http://127.0.0.1:3000/callback
SPOTIFY_CLIENT_ID=...
SPOTIFY_CLIENT_SECRET=...

# Genius - https://genius.com/api-clients
GENIUS_ACCESS_TOKEN=...

# OpenAI - https://platform.openai.com/api-keys
OPENAI_API_KEY=...
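
A missing key usually surfaces late, partway through a long step, so it can help to verify all four up front. This is a hedged sketch with a hypothetical helper, `missing_keys`. It assumes the values from `.env` have already been loaded into the process environment; how the project itself loads them is not shown here.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = (
    "SPOTIFY_CLIENT_ID",
    "SPOTIFY_CLIENT_SECRET",
    "GENIUS_ACCESS_TOKEN",
    "OPENAI_API_KEY",
)


def missing_keys(env=None):
    """Return the required variable names that are unset or blank."""
    if env is None:
        env = os.environ
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
```

Note that a placeholder value such as `...` still counts as set, so this only catches keys that are absent or empty.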

3. Fetch Your Library

Pulls metadata for your saved tracks from Spotify. The first run opens a browser for OAuth.

python spotify/fetch_spotify_saved_songs.py

4. Download Songs

Downloads MP3s for local audio analysis. Safe to stop/resume.

python songs/download_via_spotdl.py   # or download_via_ytdlp.py
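
"Safe to stop/resume" typically means the script skips work whose output already exists. The sketch below illustrates that general pattern only; it is not the repository's implementation. The file naming scheme (`<track_id>.mp3`) and the `download_one` callback are assumptions made for the example.

```python
from pathlib import Path


def pending_tracks(track_ids, output_dir):
    """Return the track IDs that do not yet have an MP3 in output_dir."""
    output_dir = Path(output_dir)
    return [t for t in track_ids if not (output_dir / f"{t}.mp3").exists()]


def download_all(track_ids, output_dir, download_one):
    """Download only the missing tracks.

    `download_one(track_id, destination_path)` is a caller-supplied
    function (hypothetical here) that writes one MP3 to disk.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for track_id in pending_tracks(track_ids, output_dir):
        download_one(track_id, output_dir / f"{track_id}.mp3")
```

Because the existence check runs on every invocation, interrupting the loop and rerunning it picks up where it left off. A more careful version would write to a temporary file and rename it on success, so a partially downloaded file is never mistaken for a finished one.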

5. Fetch Lyrics

Fetches lyrics from Genius. Also safe to stop/resume.

python lyrics/fetch_lyrics.py

6. Run Analysis

Extracts audio features (Essentia) and classifies lyrics (GPT). The first run is slow (~2-3 hours for 1,500 songs: ~90 min audio extraction + ~60 min GPT API calls). Subsequent runs use the cache.

python analysis/run_analysis.py --songs songs/data/ --lyrics lyrics/data/
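
To get a rough first-run estimate for a library of a different size, the figures quoted above can be scaled. This sketch assumes the time grows linearly with the number of songs, which is an assumption and not a measured property of the pipeline; the function name is hypothetical.

```python
REFERENCE_SONGS = 1500  # library size the quoted times refer to
AUDIO_MINUTES = 90      # approximate audio extraction time for that library
GPT_MINUTES = 60        # approximate GPT API time for that library


def estimate_first_run_minutes(num_songs):
    """Scale the quoted first-run times linearly to num_songs."""
    audio = AUDIO_MINUTES * num_songs / REFERENCE_SONGS
    gpt = GPT_MINUTES * num_songs / REFERENCE_SONGS
    return {"audio": audio, "gpt": gpt, "total": audio + gpt}


if __name__ == "__main__":
    print(estimate_first_run_minutes(500))
```

At the reference size this works out to about 3.6 seconds of audio extraction and 2.4 seconds of GPT calls per song. Actual times will depend on your machine and on API latency.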

7. Interactive Dashboard

Explore clusters, tune parameters, and visualize results.

streamlit run analysis/interactive_interpretability.py

8. Export to Spotify Playlists (Optional)

Creates Spotify playlists from your clusters.

python export/export_clusters_as_playlists.py --dry-run  # preview
python export/export_clusters_as_playlists.py            # create
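
The `--dry-run` flag follows a common pattern: walk through the same loop as a real run, but print the intended actions without performing them. The sketch below shows that pattern with `argparse`. It is an illustration under assumptions, not the export script itself: the `clusters` mapping and the `create_playlist` callback are hypothetical.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Export clusters as playlists (sketch)")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="print what would be created without calling Spotify",
    )
    return parser.parse_args(argv)


def export(clusters, create_playlist, dry_run):
    """Create one playlist per cluster and return the names created.

    `clusters` maps a playlist name to a list of track IDs, and
    `create_playlist(name, track_ids)` performs the real API call.
    Both are placeholders for this example.
    """
    created = []
    for name, track_ids in clusters.items():
        if dry_run:
            print(f"[dry run] would create '{name}' with {len(track_ids)} tracks")
            continue
        create_playlist(name, track_ids)
        created.append(name)
    return created
```

Running the preview first is worthwhile because playlist creation modifies your Spotify account.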

Key Files

| File | Purpose |
|------|---------|
| `analysis/run_analysis.py` | Main entry point |
| `analysis/interactive_interpretability.py` | Streamlit dashboard |
| `analysis/pipeline/interpretable_features.py` | 33-dim vector construction |
| `analysis/pipeline/audio_analysis.py` | Essentia feature extraction |
| `analysis/pipeline/lyric_features.py` | GPT lyric classification |
| `analysis/pipeline/config.py` | Configuration & scales |
