Guidance on selecting parameters and choice of data scaling

Hi,

I've been using TAPE to deconvolve some cases from the TCGA dataset using reference single-cell sequencing data (100-200 cells per phenotype, across 6 phenotypes). It works well and is very impressive. However, I sometimes get widely varying results depending on some of the parameters.

Is it important that all the cell types expected to be in the bulk data are represented in the reference dataset?

In what situations should StandardScaler be used, and in what situations MinMaxScaler?
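For context, this is my understanding of what the two scalers do to a single gene across samples. It is a minimal sketch in plain Python of the standard formulas (z-score versus rescaling to [0, 1]), not TAPE's own code; the helper names `standard_scale` and `min_max_scale` and the toy values are made up for illustration.

```python
from statistics import mean, pstdev


def standard_scale(values):
    """Centre to mean 0 and scale to unit variance (population std)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Toy expression values for one gene, with a single extreme sample.
expression = [1.0, 2.0, 3.0, 4.0, 100.0]

standardized = standard_scale(expression)
min_maxed = min_max_scale(expression)

print("standardized:", [round(v, 3) for v in standardized])
print("min-max:     ", [round(v, 3) for v in min_maxed])
```

With one extreme sample, min-max scaling squeezes the other four values into a narrow band near 0, whereas standardization is unbounded and lets the extreme value sit far from the rest. I am unsure which of these behaviours is preferable for deconvolution.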

Can you offer any advice on selecting a variance_threshold? Some of your examples use 0.98, while the default is 0.8. Varying this parameter can strongly affect the estimated proportions, sometimes even when it is only altered slightly (e.g. ±0.05).
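To make the question concrete, here is a rough sketch of how I imagine the threshold behaves. It assumes that variance_threshold is the fraction of genes kept after ranking by variance, which is my reading and may not match what TAPE does internally. The function `select_genes_by_variance`, the parameter `keep_fraction`, and the toy data are hypothetical.

```python
from statistics import pvariance


def select_genes_by_variance(expression_by_gene, keep_fraction):
    """Keep the top `keep_fraction` of genes ranked by variance.

    ASSUMPTION: this mirrors my guess at what variance_threshold means;
    it is not taken from TAPE's source.
    """
    variances = {
        gene: pvariance(values) for gene, values in expression_by_gene.items()
    }
    ranked = sorted(variances, key=variances.get, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


# Toy data: 20 genes, two samples each, with variance increasing by gene index.
toy_expression = {f"gene_{i}": [0.0, float(i)] for i in range(1, 21)}

half = select_genes_by_variance(toy_expression, 0.5)
everything = select_genes_by_variance(toy_expression, 1.0)

for fraction in (0.8, 0.98):
    kept = select_genes_by_variance(toy_expression, fraction)
    print(f"keep_fraction={fraction}: {len(kept)} genes kept")
```

If the parameter does work like this, a small change in the threshold adds or removes the lowest-variance genes near the cutoff, which might explain the sensitivity I am seeing if those genes are informative for some phenotypes.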

I've generated d_priors for my analyses from references. Does including them always increase the accuracy?

Guidance on selecting parameters and choice of data scaling #9
