Dear Will,
I am currently working with a CITE-seq dataset that contains both gene expression data and surface marker information. Do you see any inherent problem with using all of this data as input for the GLM-PCA analysis? I have 9 surface markers that separate all major cell subsets, and I believe that including them would "weight" the analysis so that, for example, CD4 and CD8 cells are readily separated. However, if doing so would violate some assumptions about the data distribution, then of course I should avoid it.
Best regards,
Jakob