Description
Hello,
I am trying to apply CHOIR to the RNA assay of a multiomic dataset, and I did this in two ways. First, I applied Harmony integration using the harmony package, followed by Seurat clustering with a resolution of 2 to be conservative. Second, I ran CHOIR with the following command:
cellranger <- readRDS("integrated_cellranger_peaks_RNA.rds")
cellranger
DefaultAssay(cellranger) <- "RNA"
cellranger
cellranger <- CHOIR(cellranger,
                    batch_correction_method = "Harmony",
                    batch_labels = "sample",
                    use_assay = "RNA",
                    n_cores = 50)
However, the CHOIR clusters do not look good compared to Seurat's clustering. See comparison below:
- How is PCA used when CHOIR reports that it is computing PCA? Does it make sense to start the tree with PCA? Shouldn't CHOIR instead use the dimensionality reduction obtained from Harmony to build the tree and the graphs?
- Should I add use_slot = counts or use_slot = data?
- I would like to understand better how the dimensionality reduction works when setting subtree_reductions = TRUE.
Warning message in .validInput(subtree_reductions, "subtree_reductions", list(reduction, :
“Supplied dimensionality reduction matrix for parameter 'subtree_reductions' will only be used for the root tree. Thereafter, the dimensionality reductions for each subtree will be calculated according to the specified 'reduction_method'. To use only the supplied dimensionality reduction matrix, set parameter 'subtree_reductions' to FALSE.”
How are the subsequent dimensionality reductions computed? My understanding is that Harmony only provides a corrected reduced-dimension embedding, and that's it. It does not provide a corrected counts or corrected data matrix. So do you compute PCA on the counts matrix every time for subsequent subtrees? If so, where does Harmony integration come into play? I suspect the discrepancy is coming from the buildTree() function. These integration methods do not provide corrected counts or corrected scaled data matrices, so how do integration and batch correction come into play here?
Similarly, I get errors and issues with the ATAC Harmony and multiomic Harmony integrations. Also, should I be using counts or data for the use_slot parameter below?
cellranger <- CHOIR(cellranger,
                    use_assay = c("peaks", "RNA"),
                    use_slot = c("data", "data"),
                    atac = c(TRUE, FALSE),
                    batch_correction_method = c("Harmony", "Harmony"),
                    batch_labels = c("sample", "sample"),
                    distance_approx = FALSE,
                    n_cores = 50)
Note that I have 8 different batches.

