-
Notifications
You must be signed in to change notification settings - Fork 7
Description
Hi Cathrine,
First of all thanks for developing this very interesting method! I tried out CHOIR on one of my datasets to see how the results would compare with my previous clustering analysis. I understand that one of the main advantages of CHOIR is that in principle it should identify the "correct" number of clusters without over or under clustering.
I have a snRNA-seq dataset of postmortem human cortical tissue from 8 individuals totaling ~66k nuclei. Based on my previous analysis and what we expect based on other studies, this dataset has clusters corresponding to seven major cell lineages (oligodendrocytes, OPCs, microglia, astrocytes, vascular cells, excitatory neurons, and inhibitory neurons). We generally expect that each of these major cell types would have some subclusters as well. I was surprised that after running CHOIR, I ended up with 132 clusters identified in my dataset, many of which contained very few nuclei. I ran CHOIR using the harmony dim reduction that I had already computed. I am not sure what is going on here or if you have any advice in this scenario? Below I am including the code that I ran as well as a UMAP plot comparing my previous clustering to the CHOIR clustering.
seurat_obj <- CHOIR(
seurat_obj,
reduction = seurat_obj@reductions$harmony@cell.embeddings,
var_features = VariableFeatures(seurat_obj)
)
