leishenggit/CircleBase2

CircleBase V2

An Integrated Platform for eccDNA Annotation Across Cancers and Species. See also the homepage.

Scoring system for human

Dependencies

Tip: Anaconda is a convenient way to install the dependencies.

Input files: bed files for the six regulatory categories and eccDNAs

  1. Chromatin_access.bed download
  2. Chromatin_interaction.bed download
  3. Epigenetic_regulation.bed download
  4. Genetic_variant.bed download
  5. Regulatory_elements.bed download
  6. Targeting_genes.bed download
  7. eccDNA_core.hg19.bed download

Supplementary files: chromosome-specific density of regulatory elements

  1. stat.Chromatin_access.bed download
  2. stat.Chromatin_interaction.bed download
  3. stat.Epigenetic_regulation.bed download
  4. stat.Genetic_variant.bed download
  5. stat.Regulatory_elements.bed download
  6. stat.Targeting_genes.bed download

How to run

  1. Go to the scoring system/human directory and set up all the dependencies
  2. Download all the input and supplementary files listed above and decompress them
  3. Run the run.sh shell script

Output

  • hits.stat.* files give the number of annotated hits (records) for each eccDNA in each of the six regulatory categories. The last field is the count.
  • *.score files give the score of each eccDNA under the corresponding Gaussian model in each of the six regulatory categories. The fields are:
  1. eccDNA id.
  2. Chromosome to which the eccDNA belongs.
  3. Number of hits for the eccDNA.
  4. Number of hits for the eccDNA after Box-Cox transformation.
  5. Mean of the hit numbers of all eccDNAs on the chromosome listed in the second field (i.e., 𝜇 of the Gaussian distribution).
  6. Standard deviation of the hit numbers of all eccDNAs on the chromosome listed in the second field (i.e., 𝜎 of the Gaussian distribution).
  7. Probability of a value greater than the eccDNA's hit number under the corresponding Gaussian distribution.
  8. Score for the eccDNA (i.e., the negative base-10 logarithm of the probability in field 7).
  • *.nor files give the normalized score for each category. The first 8 columns are the same as in the *.score files; column 9 is the Z-score for the regulatory category and column 10 is the normalized score.
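
The relationship between fields 4 to 8 of a *.score file can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the code that run.sh executes: the function names are hypothetical, the Box-Cox parameter `lam` is an assumption (the README does not state which value is used), and comparing the transformed hit number against 𝜇 and 𝜎 is likewise an assumption.

```python
import math
import sys


def box_cox(x, lam):
    """Box-Cox transform of a positive value x (lam is an assumed parameter)."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def upper_tail_probability(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma), via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))


def ecc_score(x, mu, sigma):
    """Negative base-10 logarithm of the upper-tail probability."""
    p = upper_tail_probability(x, mu, sigma)
    # Guard against log10(0) for extreme values; this floor is an assumption.
    p = max(p, sys.float_info.min)
    return -math.log10(p)
```

An eccDNA whose value equals 𝜇 has an upper-tail probability of 0.5 and therefore a score of about 0.301; values further above 𝜇 give higher scores.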

  • final.score.txt contains the final result. The fields are:

  1. eccDNA id.
  2. Average of normalized scores for all six regulatory categories. download here
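
The averaging step behind final.score.txt can be sketched as follows. This is a minimal sketch, not the repository's implementation: the function name and the nested-dictionary layout are hypothetical, and keeping only eccDNAs that have a normalized score in every category is an assumption about how missing values are handled.

```python
from statistics import mean


def final_scores(normalized_by_category):
    """Average normalized scores per eccDNA.

    normalized_by_category maps a category name to a dict of
    {eccDNA id: normalized score} (a hypothetical in-memory layout).
    """
    if not normalized_by_category:
        return {}
    per_category = list(normalized_by_category.values())
    # Assumption: keep only eccDNAs scored in every category.
    shared_ids = set.intersection(*(set(scores) for scores in per_category))
    return {
        ecc_id: mean(scores[ecc_id] for scores in per_category)
        for ecc_id in sorted(shared_ids)
    }
```

With six category dictionaries (human) or four (mouse), each shared eccDNA id maps to the mean of its column-10 values from the corresponding *.nor files.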

Scoring system for mouse

Dependencies

Same as for human; see above.

Input files: bed files for the four regulatory categories and eccDNAs

  1. Chromatin_access.bed download
  2. Epigenetic_regulation.bed download
  3. Genetic_variant.bed download
  4. Regulatory_elements.bed download
  5. eccDNA_core.mm10.bed download

Supplementary files: chromosome-specific density of regulatory elements

  1. stat.Chromatin_access.bed download
  2. stat.Epigenetic_regulation.bed download
  3. stat.Genetic_variant.bed download
  4. stat.Regulatory_elements.bed download

How to run

  1. Go to the scoring system/mouse directory and set up all the dependencies
  2. Download all the input and supplementary files listed above and decompress them
  3. Run the run.sh shell script

Output

  • hits.stat.* files give the number of annotated hits (records) for each eccDNA in each of the four regulatory categories. The last field is the count.
  • *.score files give the score of each eccDNA under the corresponding Gaussian model in each of the four regulatory categories. The fields are:
  1. eccDNA id.
  2. Chromosome to which the eccDNA belongs.
  3. Number of hits for the eccDNA.
  4. Number of hits for the eccDNA after Box-Cox transformation.
  5. Mean of the hit numbers of all eccDNAs on the chromosome listed in the second field (i.e., 𝜇 of the Gaussian distribution).
  6. Standard deviation of the hit numbers of all eccDNAs on the chromosome listed in the second field (i.e., 𝜎 of the Gaussian distribution).
  7. Probability of a value greater than the eccDNA's hit number under the corresponding Gaussian distribution.
  8. Score for the eccDNA (i.e., the negative base-10 logarithm of the probability in field 7).
  • *.nor files give the normalized score for each category. The first 8 columns are the same as in the *.score files; column 9 is the Z-score for the regulatory category and column 10 is the normalized score.
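
For downstream analysis, the eight-column layout above can be read into named fields. The sketch below assumes the files are tab-delimited with no header line, which the README does not state; `ScoreRecord`, `parse_score_line` and `score_matches_probability` are hypothetical names introduced for illustration. Because the first 8 columns of *.nor files are the same, the parser also accepts those lines and ignores the extra columns.

```python
import math
from typing import NamedTuple


class ScoreRecord(NamedTuple):
    ecc_id: str
    chrom: str
    hits: float
    hits_boxcox: float
    mu: float
    sigma: float
    probability: float
    score: float


def parse_score_line(line, sep="\t"):
    """Parse one line of a *.score (or *.nor) file; tab separation is assumed."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 8:
        raise ValueError("expected at least 8 columns, got %d" % len(fields))
    return ScoreRecord(fields[0], fields[1], *(float(v) for v in fields[2:8]))


def score_matches_probability(record, tol=1e-6):
    """Check that field 8 equals -log10(field 7) within a tolerance."""
    return math.isclose(record.score, -math.log10(record.probability), abs_tol=tol)
```

The second function is a quick sanity check on a parsed line: it confirms that the score column is the negative base-10 logarithm of the probability column.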

  • final.score.txt contains the final result. The fields are:

  1. eccDNA id.
  2. Average of normalized scores for all four regulatory categories. download here
