Hi CHOIR team,
I’m trying to use CHOIR to sub-cluster a T-cell subset, but I ended up with ~50 clusters, which seems higher than expected. The full T-cell object has 34,581 genes × 228,417 cells. Here is my workflow:
# 1. Extract T cells
Idents(all) <- "celltype_broad"
t_cell <- subset(all, subset = celltype_broad %in% c("CD4 T", "CD8 T", "MAIT"))
# 2. Clear previous embeddings
t_cell@reductions <- list()
# 3. Standard Seurat preprocessing
t_cell <- NormalizeData(t_cell)
t_cell <- FindVariableFeatures(t_cell, nfeatures = 3000)
t_cell <- ScaleData(t_cell)
t_cell <- RunPCA(t_cell, verbose = FALSE)
# 4. CHOIR clustering & visualization
library(CHOIR)
t_cell <- CHOIR(t_cell, n_cores = 4)
t_cell <- runCHOIRumap(t_cell, reduction = "P0_reduction")
plotCHOIR(t_cell, accuracy_scores = TRUE, plot_nearest = FALSE)
Questions
1. Cluster count: Is obtaining ~50 clusters for a CD4/CD8/MAIT subset typical with CHOIR, or does it indicate over-splitting?
2. Workflow validation: Is my preprocessing + CHOIR() + runCHOIRumap() + plotCHOIR() pipeline correct?
3. Resetting reductions: Could t_cell@reductions <- list() remove critical metadata or embeddings?
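To make question 1 concrete, here is a minimal sketch of how cluster counts and sizes could be checked against the broad labels. It runs on toy data in base R so that it is self-contained; with the real object, `meta` would be `t_cell@meta.data`. The column name `CHOIR_clusters_0.05` is an assumption about where CHOIR stores its labels at the default significance level, and `min_cells` is a hypothetical threshold chosen only for illustration.

```r
# Toy stand-in for t_cell@meta.data (assumption: real labels would live in a
# metadata column named "CHOIR_clusters_0.05").
set.seed(1)
meta <- data.frame(
  celltype_broad = sample(c("CD4 T", "CD8 T", "MAIT"), 1000,
                          replace = TRUE, prob = c(0.6, 0.3, 0.1)),
  CHOIR_clusters_0.05 = sample(1:12, 1000, replace = TRUE)
)

# How many clusters are there, and how many cells fall in each?
cluster_sizes <- table(meta$CHOIR_clusters_0.05)
n_clusters <- length(cluster_sizes)

# Cross-tabulate clusters against the broad annotation to see whether
# clusters subdivide one cell type or mix several.
cross_tab <- table(meta$CHOIR_clusters_0.05, meta$celltype_broad)

# Flag clusters smaller than a hypothetical minimum size.
min_cells <- 50
small_clusters <- names(cluster_sizes)[cluster_sizes < min_cells]

print(n_clusters)
print(cross_tab)
```

If many clusters are very small, or if most of them subdivide a single broad cell type, that may point toward over-splitting rather than distinct populations, although this check alone cannot settle it.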
The UMAP looks like this: (image not included)
Thank you for your guidance!