Unexpectedly large number of clusters #7

@smorabit

Hi Cathrine,

First of all, thanks for developing this very interesting method! I tried out CHOIR on one of my datasets to see how the results would compare with my previous clustering analysis. I understand that one of the main advantages of CHOIR is that, in principle, it should identify the "correct" number of clusters without over- or under-clustering.

I have a snRNA-seq dataset of postmortem human cortical tissue from 8 individuals, totaling ~66k nuclei. Based on my previous analysis and on other studies, this dataset should contain clusters corresponding to seven major cell lineages (oligodendrocytes, OPCs, microglia, astrocytes, vascular cells, excitatory neurons, and inhibitory neurons). We generally expect each of these major cell types to have some subclusters as well. I was surprised that after running CHOIR I ended up with 132 clusters, many of which contain very few nuclei. I ran CHOIR using the Harmony dimensionality reduction that I had already computed. I am not sure what is going on here. Do you have any advice for this scenario? Below I am including the code that I ran, as well as a UMAP plot comparing my previous clustering to the CHOIR clustering.

seurat_obj <- CHOIR(
    seurat_obj, 
    reduction = seurat_obj@reductions$harmony@cell.embeddings,
    var_features = VariableFeatures(seurat_obj)
)
[Figure "CHOIR_clusters": UMAP plot comparing my previous clustering to the CHOIR clustering]
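
In case it helps with diagnosing this, the "many clusters with very few nuclei" observation could be quantified roughly as follows. This is only a minimal base-R sketch on a made-up label vector: `labels`, `min_nuclei`, and the cluster names are hypothetical stand-ins, and with a real object the labels would be read from wherever the clustering results are stored.

```r
# Toy stand-in for per-nucleus cluster labels (hypothetical, not the real dataset).
labels <- c(
  rep("big_1", 500),
  rep("big_2", 300),
  rep("small_1", 12),
  rep("small_2", 5)
)

# Count nuclei per cluster, largest first.
cluster_sizes <- sort(table(labels), decreasing = TRUE)

# Hypothetical cutoff for "very few nuclei"; pick whatever suits the dataset.
min_nuclei <- 20
small_clusters <- names(cluster_sizes)[cluster_sizes < min_nuclei]

# Fraction of all nuclei that fall into those small clusters.
frac_in_small <- sum(cluster_sizes[small_clusters]) / length(labels)

print(cluster_sizes)
print(small_clusters)
print(frac_in_small)
```

A summary like this shows whether the extra clusters account for a meaningful share of the nuclei or only a small tail, which may help narrow down where the unexpected splits come from.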
