Is it appropriate/possible to run IntegrateData following GLM-PCA? #37

@jeremymsimon

Hi @willtownes,
My typical Seurat workflow for multiple samples and conditions is to run PCA followed by CCA-based integration (now IntegrateLayers in Seurat v5), then identify one joint set of clusters. If I want to try swapping in GLM-PCA, is that supposed to work as-is or do I need to adjust somehow?

I just ran a test on real data using the approximation method mentioned here: I computed null residuals from my raw counts with nullResiduals, ran PCA on those residuals restricted to the top 3,000 deviant genes, and then ran IntegrateLayers.
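
For concreteness, here is a minimal sketch of what that approximation computes, assuming the binomial null model (each gene takes a constant fraction of every cell's total counts) with Pearson residuals. It is a pure-Python toy for illustration only, not the package's implementation and not suitable for 50k cells; the function names `gene_fractions`, `binomial_deviance`, `top_deviant_genes` and `pearson_null_residuals` are hypothetical. The residual matrix it returns is what would then go into an ordinary PCA.

```python
import math


def gene_fractions(counts):
    """counts: list of cells, each a list of raw per-gene counts.

    Returns per-cell totals n_i and the pooled per-gene fractions pi_j.
    """
    totals = [sum(cell) for cell in counts]
    grand_total = sum(totals)
    n_genes = len(counts[0])
    pi = [sum(cell[j] for cell in counts) / grand_total for j in range(n_genes)]
    return totals, pi


def _xlogx_over(x, mu):
    # x * log(x / mu), with the convention that the term is 0 when x == 0
    return 0.0 if x == 0 else x * math.log(x / mu)


def binomial_deviance(counts):
    """Per-gene deviance of the observed counts from the constant-fraction null."""
    totals, pi = gene_fractions(counts)
    deviances = []
    for j, p in enumerate(pi):
        if p <= 0.0 or p >= 1.0:
            deviances.append(0.0)  # gene never (or always) observed: uninformative
            continue
        d = 0.0
        for cell, n in zip(counts, totals):
            y = cell[j]
            d += _xlogx_over(y, n * p) + _xlogx_over(n - y, n * (1.0 - p))
        deviances.append(2.0 * d)
    return deviances


def top_deviant_genes(counts, k):
    """Indices of the k genes with the largest null deviance."""
    dev = binomial_deviance(counts)
    return sorted(range(len(dev)), key=lambda j: dev[j], reverse=True)[:k]


def pearson_null_residuals(counts, genes):
    """Cells x selected-genes matrix of (y - mu) / sqrt(var) under the null."""
    totals, pi = gene_fractions(counts)
    residuals = []
    for cell, n in zip(counts, totals):
        row = []
        for j in genes:
            mu = n * pi[j]
            var = mu * (1.0 - pi[j])
            row.append((cell[j] - mu) / math.sqrt(var) if var > 0 else 0.0)
        residuals.append(row)
    return residuals


# Toy data: 4 cells x 3 genes. Gene 0 is always half of each cell's counts,
# so it fits the null exactly; genes 1 and 2 vary between cells.
counts = [[10, 0, 10], [10, 10, 0], [20, 20, 0], [20, 0, 20]]
selected = top_deviant_genes(counts, 2)
residual_matrix = pearson_null_residuals(counts, selected)
```

In this toy case gene 0 has zero deviance and is dropped, while the two variable genes are kept. Note that the residuals are computed per gene from pooled fractions across all cells, so whether they are computed on each sample separately or on the merged data changes the input to PCA and to any downstream integration.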

The resulting UMAP and clusters looked nothing like my PCA-based analysis, so either I did something wrong or it is not appropriate to do this in the first place.

Can you share your thoughts on whether this is possible, and if so, some best practices for doing this in Seurat when the dataset is large (>50k cells)? The RunGLMPCA() helper function was itself taking too long on these data.

Thanks!
