Guidance on selecting parameters and choice of data scaling #9

@dr-michael-haley

Hi,

I've been using TAPE to deconvolve some cases from the TCGA dataset using reference single-cell sequencing data (100-200 cells per phenotype, across 6 different phenotypes). It works well and is very impressive. However, I sometimes get very different results depending on some of the parameters.

Is it important that all the cell types expected to be in the bulk data are represented in the reference dataset?

In what situations should StandardScaler or MinMax scaler be used?
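For context on the scaling question, here is a minimal sketch of what the two kinds of scaling do to a single vector of values. It is a toy pure-Python re-implementation for illustration only, not TAPE's code; the function names and the example values are hypothetical.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

def standard_scale(values):
    """Centre values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant input: no variance to normalise
    return [(v - mu) / sigma for v in values]

# Hypothetical expression values with a single outlier.
expression = [1.0, 2.0, 3.0, 4.0, 100.0]

mm = min_max_scale(expression)   # bounded in [0, 1]; the outlier sets the top of the range
ss = standard_scale(expression)  # unbounded; mean 0, standard deviation 1

print("min-max :", [round(v, 3) for v in mm])
print("standard:", [round(v, 3) for v in ss])
```

With min-max scaling the outlier compresses the other values towards 0, whereas standard scaling keeps them centred but unbounded, which may be part of why the two choices can lead to different estimated proportions.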

Can you offer any advice on selecting a variance_threshold? Some of your examples use 0.98, while the default is 0.8. Varying this parameter can strongly affect the estimated proportions, sometimes even when it is only altered slightly (e.g. ±0.05).
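To make the sensitivity concrete, here is a minimal sketch of a variance-based gene filter. It assumes the threshold acts as the fraction of genes retained after ranking by variance; that is an assumption and has not been confirmed against TAPE's source. The helper name and the toy data are hypothetical.

```python
from statistics import pvariance

def keep_high_variance_genes(expression_by_gene, fraction):
    """Return the names of the top `fraction` of genes, ranked by variance.

    ASSUMPTION: `fraction` plays the role of variance_threshold here;
    TAPE's actual filtering may differ.
    """
    ranked = sorted(
        expression_by_gene,
        key=lambda gene: pvariance(expression_by_gene[gene]),
        reverse=True,
    )
    n_keep = int(len(ranked) * fraction)
    return ranked[:n_keep]

# Hypothetical toy data: gene -> expression across four samples.
# GENE_i alternates between 0 and i, so its variance grows with i.
toy = {f"GENE_{i}": [0.0, float(i), 0.0, float(i)] for i in range(10)}

kept_80 = keep_high_variance_genes(toy, 0.80)
kept_98 = keep_high_variance_genes(toy, 0.98)

print(len(kept_80), "genes kept at 0.80")
print(len(kept_98), "genes kept at 0.98")
print("only kept at 0.98:", sorted(set(kept_98) - set(kept_80)))
```

Under this assumption, a small change in the threshold adds or removes the genes nearest the cutoff, so the input features seen by the model change even when the threshold moves only slightly.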

I've generated d_priors for my analyses from the references. Does including them always increase accuracy?
