Open
Description
Thank you for sharing this with the community. It is a great tool, and I would call it UMAP+HDBSCAN on steroids!
Quick question, though: when I cluster 30k text embeddings, a lot of the texts are grouped as outliers. I have tried changing parameters such as noise_level, base_min_cluster_size, and min_number_clusters, but about 10-15% of the population still ends up as outliers. If I run UMAP+HDBSCAN manually, I get a significantly lower number of outliers.
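To make the comparison concrete, here is a minimal sketch of how the outlier share could be measured from a label array. It assumes noise points carry the label -1, which is the HDBSCAN convention; whether this tool uses the same convention is an assumption. The helper name `outlier_fraction` and the toy labels are made up for illustration and are not part of any library.

```python
def outlier_fraction(labels, noise_label=-1):
    """Return the share of points whose cluster label equals noise_label.

    Assumes noise is marked with -1 (the HDBSCAN convention); pass a
    different noise_label if the clustering output uses another marker.
    """
    labels = list(labels)
    if not labels:
        return 0.0
    noise_count = sum(1 for label in labels if label == noise_label)
    return noise_count / len(labels)


# Toy labels for illustration only, not real clustering output.
example_labels = [0, 0, 1, -1, 2, -1, 1, 0, 2, 2]
print(f"outlier share: {outlier_fraction(example_labels):.1%}")
```

Running the same helper on the labels from both pipelines, on the same embeddings, would give directly comparable percentages.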