Pruning stuck checking for underclustering #29

@doliv071

Hi All,

I was testing CHOIR and everything ran smoothly until the final pruning step, where it got stuck trying to resolve underclustering in one cluster. It repeated the check 20 times over roughly an hour, with the cluster count staying at 27 each time, before I killed it. Is there a parameter that limits these attempts?

> clusSCE <- CHOIR::CHOIR(clusSCE)
----------------------------------------
- CHOIR - Part 1: Build clustering tree
----------------------------------------
2025-04-09 12:00:10 PM : (Step 1/7) Checking inputs and preparing object..

Input data:
 - Object type: SingleCellExperiment
 - # of cells: 13731
 - # of batches: 1
 - # of modalities: 1
 - ATAC data: FALSE
 - Countsplitting: FALSE
 - Assay used to build tree: logcounts
 - Assay used to prune tree: logcounts

Proceeding with the following parameters:
 - Intermediate data stored under key: CHOIR
 - Alpha: 0.05
 - Multiple comparison adjustment: bonferroni
 - Features to train RF: var
 - # of excluded features: 0
 - # of permutations: 100
 - # of RF trees: 50
 - Use variance: TRUE
 - Minimum accuracy: 0.5
 - Minimum connections: 1
 - Maximum repeated errors: 20
 - Distance approximation: TRUE
 - Maximum cells sampled: Inf
 - Downsampling rate: 0.3689
 - Minimum reads: >0 reads
 - Maximum clusters: auto
 - Minimum cluster depth: 2000
 - Normalization method: none
 - Subtree dimensionality reductions: TRUE
 - Dimensionality reduction method: Default
 - Dimensionality reduction parameters provided: No
 - # of variable features: Default
 - Batch correction method: none
 - Batch correction parameters provided: No
 - Nearest neighbor parameters provided: 
     - verbose: FALSE
 - Clustering parameters provided: 
     - algorithm: 1
     - group.singletons: TRUE
     - verbose: FALSE
 - # of cores: 22
 - Random seed: 1

2025-04-09 12:00:10 PM : (Step 2/7) Running initial dimensionality reduction..
2025-04-09 12:00:10 PM : Preparing input matrix using 'logcounts' assay..
2025-04-09 12:00:18 PM : Running PCA with 2000 variable features..
2025-04-09 12:00:37 PM : (Step 3/7) Generating initial nearest neighbors graph..
2025-04-09 12:00:42 PM : (Step 4/7) Identify starting clustering resolution..
                      [[ Current tree: 6 iterations in 14s ]]                       Starting resolution: 0.001
2025-04-09 12:01:05 PM : (Step 5/7) Building root clustering tree..
                      [[ Current tree: 8 iterations in 22s ]] 
                      
                      Identified 2 clusters in root tree.
2025-04-09 12:01:30 PM : (Step 6/7) Subclustering root tree..
2025-04-09 12:01:52 PM : 10% (Subtree 1/2, 13708 cells), 2 total clusters.                             
2025-04-09 12:01:56 PM : 15% (Subtree 1/2, 13708 cells), 2 total clusters.                             
2025-04-09 12:08:35 PM : 27% (Subtree 1/2, 13708 cells), 58 total clusters.                            
2025-04-09 12:09:10 PM : 35% (Subtree 1/2, 13708 cells), 64 total clusters.                            
2025-04-09 12:11:37 PM : 42% (Subtree 1/2, 13708 cells), 90 total clusters.                            
2025-04-09 12:12:04 PM : 57% (Subtree 1/2, 13708 cells), 99 total clusters.                            
2025-04-09 12:12:06 PM : 65% (Subtree 1/2, 13708 cells), 100 total clusters.                           
2025-04-09 12:12:15 PM : 72% (Subtree 1/2, 13708 cells), 104 total clusters.                           
2025-04-09 12:12:23 PM : 87% (Subtree 1/2, 13708 cells), 108 total clusters.                           
2025-04-09 12:12:28 PM : 95% (Subtree 1/2, 13708 cells), 112 total clusters.                           
2025-04-09 12:12:30 PM : 100% (Subtree 2/2, 23 cells), 112 total clusters.                             
2025-04-09 12:12:30 PM : 100% (Subtree 2/2, 23 cells), 113 total clusters.                             
Generating subtrees.. [==============================================================] 100% in 00:10:59

2025-04-09 12:12:30 PM : (Step 7/7) Compiling full clustering tree..
                      Full tree has 75 levels and 111 clusters.

----------------------------------------
- CHOIR - Part 2: Prune clustering tree
----------------------------------------
2025-04-09 12:12:32 PM : (Step 1/2) Checking inputs and preparing object..

Input data:
 - Object type: SingleCellExperiment
 - # of cells: 13731
 - # of batches: 1
 - # of modalities: 1
 - # of subtrees: 3
 - # of levels: 75
 - # of starting clusters: 111
 - Countsplitting: FALSE
 - Assay used to build tree: logcounts
 - Assay used to prune tree: logcounts

Proceeding with the following parameters:
 - Intermediate data stored under key: CHOIR
 - Alpha: 0.05
 - Multiple comparison adjustment: bonferroni
 - Features to train RF: var
 - # of excluded features: 0
 - # of permutations: 100
 - # of RF trees: 50
 - Use variance: TRUE
 - Minimum accuracy: 0.5
 - Minimum connections: 1
 - Maximum repeated errors: 20
 - Distance approximation: TRUE
 - Distance awareness: 2
 - All metrics collected: FALSE
 - Maximum cells sampled: Inf
 - Downsampling rate: 0.3689
 - Minimum reads: >0 reads
 - Normalization method: none
 - Batch correction method: none
 - Clustering parameters provided: 
     - algorithm: 1
     - group.singletons: TRUE
     - verbose: FALSE
 - # of cores: 22
 - Random seed: 1

2025-04-09 12:12:33 PM : (Step 2/2) Iterating through clustering tree..
2025-04-09 12:13:46 PM : 10% (12/75 levels) in 1.21 min. 101 clusters remaining.                       
2025-04-09 12:15:32 PM : 20% (23/75 levels) in 2.97 min. 79 clusters remaining.                        
2025-04-09 12:17:49 PM : 30% (34/75 levels) in 5.26 min. 74 clusters remaining.                        
2025-04-09 12:19:18 PM : 40% (46/75 levels) in 6.74 min. 56 clusters remaining.                        
2025-04-09 12:21:04 PM : 50% (57/75 levels) in 8.51 min. 41 clusters remaining.                        
2025-04-09 12:26:18 PM : 60% (68/75 levels) in 13.75 min. 40 clusters remaining.                       
2025-04-09 12:28:01 PM : 70% (70/75 levels) in 15.47 min. 40 clusters remaining.                       
2025-04-09 12:33:43 PM : 81% (72/75 levels) in 21.16 min. 38 clusters remaining.                       
2025-04-09 12:34:00 PM : 90% (74/75 levels) in 21.44 min. 37 clusters remaining.                       
2025-04-09 12:34:58 PM : Additional comparisons necessary. 36 clusters remaining.                      
2025-04-09 12:35:52 PM : Additional comparisons necessary. 35 clusters remaining.                      
2025-04-09 12:36:54 PM : Additional comparisons necessary. 34 clusters remaining.                      
2025-04-09 12:37:51 PM : Additional comparisons necessary. 33 clusters remaining.                      
2025-04-09 12:37:57 PM : Checking for underclustering in 7 clusters.                                   
2025-04-09 12:38:14 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:39:32 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:41:30 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:42:45 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:44:45 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:46:09 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:48:13 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:49:29 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:51:33 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:52:49 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:54:46 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:56:04 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 12:58:15 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 12:59:39 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:01:48 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:03:12 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:05:25 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:06:51 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:09:04 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:10:26 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:12:34 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:14:02 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:16:13 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:17:37 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:19:42 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:21:05 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:22:57 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:24:14 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:26:09 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:27:25 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:29:25 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:30:43 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:32:39 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:33:54 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:35:48 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:37:06 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:39:17 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:40:48 PM : Checking for underclustering in 1 clusters.                                   
2025-04-09 01:42:59 PM : Additional comparisons necessary. 27 clusters remaining.                      
2025-04-09 01:44:25 PM : Checking for underclustering in 1 clusters. 
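
To quantify the stall, the repeated checks in the log above can be counted with a short script. This is a minimal Python sketch that only parses the log text and is not part of CHOIR; the names `summarize_stall` and `LOG` are hypothetical, and `LOG` holds just an excerpt of the stalled section (paste the full log to reproduce the count of 20).

```python
import re
from datetime import datetime

# Excerpt of the stalled section of the pruning log (not the full log).
LOG = """\
2025-04-09 12:38:14 PM : Additional comparisons necessary. 27 clusters remaining.
2025-04-09 12:39:32 PM : Checking for underclustering in 1 clusters.
2025-04-09 12:41:30 PM : Additional comparisons necessary. 27 clusters remaining.
2025-04-09 12:42:45 PM : Checking for underclustering in 1 clusters.
2025-04-09 01:42:59 PM : Additional comparisons necessary. 27 clusters remaining.
2025-04-09 01:44:25 PM : Checking for underclustering in 1 clusters.
"""

# Timestamp, then " : ", then the message (trailing padding is ignored).
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [AP]M) : (.*?)\s*$")
CHECK_RE = re.compile(r"Checking for underclustering in (\d+) clusters?\.")
REMAIN_RE = re.compile(r"(\d+) clusters remaining\.")


def summarize_stall(log_text):
    """Summarize underclustering checks and the cluster counts seen after them."""
    check_times = []
    remaining_after_check = set()
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip headers, progress bars, and blank lines
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %I:%M:%S %p")
        message = match.group(2)
        if CHECK_RE.search(message):
            check_times.append(stamp)
        elif check_times:
            remaining = REMAIN_RE.search(message)
            if remaining:
                remaining_after_check.add(int(remaining.group(1)))
    elapsed = 0.0
    if check_times:
        elapsed = (check_times[-1] - check_times[0]).total_seconds()
    return {
        "checks": len(check_times),
        "remaining_counts": sorted(remaining_after_check),
        "elapsed_seconds": elapsed,
    }


if __name__ == "__main__":
    print(summarize_stall(LOG))
```

A single value in `remaining_counts` alongside a growing `checks` count indicates that the checks are repeating without changing the number of clusters.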
