Description
SUMO generates plots of the cophenetic correlation coefficient and the proportion of ambiguously clustered pairs to assist with determining the optimal number of clusters. The following additional metrics can be helpful in certain scenarios and should also be generated:
- Jaccard index: In some cases, as we go from `k` clusters to `k+1` clusters, only a tiny number of samples are assigned to the new cluster. In such a scenario, `k+1` clusters may offer little additional information regarding classification compared to `k` clusters. If `a` is the number of pairs of samples that are in the same subgroup for both `k` and `k+1` clusters, and `b` is the number of pairs of samples that are in the same group for `k` but different groups for `k+1`, or in the same group for `k+1` but different groups for `k`, then the index can be calculated as `a / (a+b)`.
- Silhouette score: can be calculated from the `H` matrix obtained in each run, and the final score can be based on those per-run scores.
- Agreement score: How many pairs of samples in each run of the solver get assigned labels that agree with the consensus labels.
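
A minimal sketch of the two pair-counting metrics above, written in plain Python. The function names `pair_jaccard` and `pair_agreement` are hypothetical and not part of SUMO. The agreement score is interpreted here, as an assumption, as the number of sample pairs whose same-cluster or different-cluster status in a single run matches their status under the consensus labels.

```python
from itertools import combinations


def pair_jaccard(labels_k, labels_k1):
    """Jaccard index a / (a + b) between two labelings of the same samples.

    a: pairs grouped together in both labelings.
    b: pairs grouped together in exactly one of the two labelings.
    """
    if len(labels_k) != len(labels_k1):
        raise ValueError("both labelings must cover the same samples")
    a = b = 0
    for i, j in combinations(range(len(labels_k)), 2):
        same_k = labels_k[i] == labels_k[j]
        same_k1 = labels_k1[i] == labels_k1[j]
        if same_k and same_k1:
            a += 1
        elif same_k != same_k1:
            b += 1
    # Assumed convention: if no pair is ever grouped together, return 1.0.
    return a / (a + b) if (a + b) else 1.0


def pair_agreement(run_labels, consensus_labels):
    """Return (agreeing_pairs, total_pairs) for one solver run.

    Assumption: a pair "agrees" when it is co-clustered in both the run
    and the consensus, or separated in both.
    """
    if len(run_labels) != len(consensus_labels):
        raise ValueError("both labelings must cover the same samples")
    agreeing = 0
    total = 0
    for i, j in combinations(range(len(run_labels)), 2):
        same_run = run_labels[i] == run_labels[j]
        same_consensus = consensus_labels[i] == consensus_labels[j]
        agreeing += same_run == same_consensus
        total += 1
    return agreeing, total


# Toy example: going from 2 to 3 clusters splits off a single sample.
labels_k = [0, 0, 0, 1, 1, 1]
labels_k1 = [0, 0, 0, 1, 1, 2]
jaccard = pair_jaccard(labels_k, labels_k1)
agreeing, total = pair_agreement(labels_k1, labels_k)
```

In this toy case only one sample moves to the new cluster, so most co-clustered pairs are preserved. A value of the index close to 1 would suggest that `k+1` clusters add little over `k` clusters.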