Open
Hi,
I am interested in producing a clustering similar to the one you did with arXiv. I'm working with a set of web pages from Common Crawl (~6M URLs), which I have reduced to embeddings using this. For the arXiv project, how did you decide on the values of node_embedding_dim, neighbor_scale, and n_neighbors? Or, at least, what are reasonable ranges, so that I can search in those areas? Currently I end up with ~65% of points labeled as noise (assigned to no cluster), even with noise_level=0.
thanks
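
The kind of search described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `cluster_fn` is a hypothetical stand-in for whatever call produces cluster labels from the embeddings (it is not the library's API), noise points are assumed to carry the label -1, and the values in `EXAMPLE_GRID` are arbitrary placeholders rather than recommended ranges.

```python
from itertools import product

import numpy as np


def noise_fraction(labels):
    """Fraction of points labeled -1 (assumed convention for noise)."""
    labels = np.asarray(labels)
    return float(np.mean(labels == -1))


def sweep(embeddings, cluster_fn, grid):
    """Call cluster_fn once per parameter combination in grid.

    cluster_fn is a hypothetical callable: cluster_fn(embeddings, **params)
    must return one integer label per row of embeddings.
    Returns a list of (noise_fraction, params), lowest noise first.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        labels = cluster_fn(embeddings, **params)
        results.append((noise_fraction(labels), params))
    # Sort on the noise fraction only, so the dicts are never compared.
    results.sort(key=lambda item: item[0])
    return results


# Placeholder values only; these are not recommended ranges.
EXAMPLE_GRID = {
    "n_neighbors": [15, 30, 60],
    "neighbor_scale": [0.5, 1.0, 2.0],
    "node_embedding_dim": [4, 8, 16],
}
```

With ~6M points, running such a sweep on a random subsample first would keep each trial cheap, although the best settings on a subsample may not carry over unchanged to the full set.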