episode 02 improvement ideas #8

@mgalland

For the statistical refresher part:

What does a good p-value histogram look like? It should show a high peak on the left (near 0), suggesting that you are comparing two conditions that have different distributions.
If the distribution is roughly uniform, the data show no evidence of a difference between the experimental conditions being tested.
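
To make the two histogram shapes concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not the episode's actual material: the mean heights (170 and 177 cm), the standard deviation (7 cm), the sample size (20 per group) and all function names are assumptions picked for the example. The two-sided p-value comes from a pooled-variance Student's t-test, with the t distribution integrated numerically by Simpson's rule.

```python
import math
import random
import statistics


def two_sided_p(t, df, steps=200):
    """Two-sided p-value of Student's t, integrating the pdf with Simpson's rule."""
    t = abs(t)
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def t_test_p(a, b):
    """Pooled-variance two-sample t-test, returning the two-sided p-value."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return two_sided_p(t, na + nb - 2)


def simulate_p_values(mean_a, mean_b, sd=7.0, n=20, n_tests=1000, seed=1):
    """Repeat the experiment n_tests times and collect one p-value per repeat."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(n_tests):
        a = [rng.gauss(mean_a, sd) for _ in range(n)]
        b = [rng.gauss(mean_b, sd) for _ in range(n)]
        p_values.append(t_test_p(a, b))
    return p_values


def histogram(p_values, bins=10):
    """Count p-values per equal-width bin on [0, 1]."""
    counts = [0] * bins
    for p in p_values:
        counts[min(int(p * bins), bins - 1)] += 1
    return counts


# Hypothetical heights in cm: same population vs. populations 7 cm apart.
null_counts = histogram(simulate_p_values(170, 170))
alt_counts = histogram(simulate_p_values(170, 177))

for label, counts in (("identical populations", null_counts),
                      ("different populations", alt_counts)):
    print(label)
    for i, count in enumerate(counts):
        print(f"  {i / 10:.1f}-{(i + 1) / 10:.1f} {'#' * (count // 10)}")
```

With identical populations the bars should come out roughly level, while with different populations most p-values should pile up in the leftmost bin.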

Perhaps also split episode 02 into a "statistical refresher" and a new episode 03, "statistics applied to RNA-seq".

For episode 02:

  1. population and sample notions
  2. simulate two populations from two countries with different heights.
  3. draw a sample, then increase the sample size and estimate the mean and standard deviation each time.
  4. Case 1 = identical populations (= same country)
    • draw N groups of samples of similar size, e.g. N = 5, N = 10 or N = 10,000 groups.
    • perform a t-test for each of these three values of N.
    • draw a p-value histogram for each of these three values of N.
    • apply an FDR procedure to control false positives (type I errors) across the multiple tests.
  5. Case 2 = different populations
    • draw N groups of samples of similar size, e.g. N = 5, N = 10 or N = 10,000 groups.
    • perform a t-test for each of these three values of N.
    • draw a p-value histogram for each of these three values of N.
    • apply an FDR procedure to control false positives (type I errors) across the multiple tests.
  6. Type I error and type II error.
  7. FDR procedure
  8. Power
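
For the FDR steps in the outline above, a small worked example could help. The issue does not name a specific procedure, so the sketch below assumes the Benjamini-Hochberg step-up procedure, one common choice. The function name and the list of p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the null hypothesis is rejected.

    Benjamini-Hochberg step-up: sort the m p-values, find the largest rank k
    with p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(p < 0.05 for p in p_values)
bh_flags = benjamini_hochberg(p_values, alpha=0.05)

print("p < 0.05 without correction:", naive_hits)
print("rejected after Benjamini-Hochberg:", sum(bh_flags))
```

In this toy list, five tests pass the uncorrected 0.05 threshold but only the two smallest p-values survive the correction, which is the contrast the episode could use to introduce type I errors under multiple testing.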

For episode 03 = application to RNA-seq

  • maximise biological replicate sample numbers to increase statistical power.
  • talk about sample sequencing depth = rarefaction curve.
  • p-value histogram profiles and what to do about them.
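
For the sequencing depth point, a rarefaction curve can be sketched with a toy simulation. Everything below is assumed for illustration: the library of 500 genes, the skewed abundance weights, the 20,000 reads and the function name are hypothetical, and no real RNA-seq data is involved. The idea is to subsample reads at increasing depths and count how many distinct genes are detected.

```python
import random


def rarefaction_curve(read_pool, depths, seed=0):
    """Return (depth, distinct genes detected) pairs from nested subsamples.

    The pool is shuffled once and each depth takes a prefix of the shuffled
    reads, so a deeper subsample always contains the shallower ones.
    """
    rng = random.Random(seed)
    shuffled = list(read_pool)
    rng.shuffle(shuffled)
    return [(depth, len(set(shuffled[:depth]))) for depth in depths]


# Hypothetical library: 500 genes, gene of rank r has weight 1 / (r + 1),
# so a few genes dominate and many are rare.
rng = random.Random(42)
n_genes = 500
weights = [1 / (rank + 1) for rank in range(n_genes)]
read_pool = rng.choices(range(n_genes), weights=weights, k=20000)

depths = [100, 500, 1000, 5000, 10000, 20000]
curve = rarefaction_curve(read_pool, depths)

for depth, genes in curve:
    print(f"{depth:>6} reads -> {genes} genes detected")
```

The number of detected genes should rise quickly at low depth and then flatten, which is the saturation behaviour a rarefaction curve is meant to show.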

Useful links
https://www.bioconductor.org/packages/release/bioc/html/qvalue.html

Labels: enhancement
