This page contains a list of pipelines and analysis packages

sadikmz/pipelines

A collection of pipelines and tools for analyzing various genomic datasets

Inspired by Awesome Pipeline

Reads correction

  • RAFT: Repeat Aware Fragmentation Tool

Long-read genome assembly and analysis tools

  • hifiasm
  • verkko
  • Nallo - a Nextflow pipeline for comprehensive human long-read genome analysis.

Assessing genome assembly

  • GQC - Genome Quality Checker
  • Inspector - assessing genome assembly based on long-read sequencing
  • KMGC: Kmer-based tools to evaluate and improve T2T level genome assemblies

Kmer profiling

  • Brisk - Exact resource-efficient dictionary for k-mers.
  • digest - Fast and flexible minimizer digestion with digest.

Repetitive element identification and annotation

  • ModDotPlot - Rapid and interactive visualization of complex repeats.
  • sTELLeR - Detecting transposable elements in long-read genomes.
  • TeloSearchLR - Telomere search using long sequencing reads.
  • telescope - Single locus resolution of Transposable ELEment expression.
  • teloscope - A universal telomere annotation tool for genome assemblies.
  • TE-seq - A Transposable Element Annotation and RNA-Seq Pipeline.
  • HiTE - Transposable Elements detection
  • TEtrimmer - Manual curation of TEs.
  • TEnest
  • MCHelper - Curates transposable element libraries.
  • anianns: Ani augmented Annotation of satellite arrays.
  • TRF-mod: A modified version of TRF that uses the identical algorithm.
  • srf: Satellite Repeat Finder
  • pacvar - a pipeline for analyzing long-read PacBio whole genome and repeat expansion sequencing data

Protein coding gene prediction

  • ensembl-anno
  • MAKER
  • AMAW: Automated MAKER2 Annotation Wrapper
  • Genomeannotator: A Nextflow pipeline for the annotation of metazoan genomes. Nothing in the pipeline is particularly specific to this taxonomic group, but it has only been developed and tested for this purpose.
  • Funannotate
  • AnnotaPipeline
  • BRAKER
  • EASEL: Efficient, Accurate, Scalable Eukaryotic modeLs, a tool for improving eukaryotic genome annotation.
  • GALBA
  • Helixer: Helixer is a tool for structural genome annotation. It utilizes Deep Neural Networks and a Hidden Markov Model to directly provide primary gene models in a gff3 file.
  • TAGADA: Transcript And Gene Assembly, Deconvolution, Analysis
  • GeneForge
  • Hayai-Annotation: A functional gene prediction tool that integrates orthologs and gene ontology for network analysis in plant species.

Orthology inference

  • Orthofinder
  • TOGA: Tool to infer orthologs from genome alignments.
  • OrthoMCL
  • FastOMA: A scalable software package to infer orthology relationships.

Structural variants (SVs)

  • SVbyEye - A visual tool to characterize structural variation among whole genome assemblies.
  • longcallD

Sequence alignment

  • FastGA - A Fast Genome Aligner.
  • Accelign - Fast GPU-accelerated sequence alignments.

Pangenome

  • gfatools - Processing pangenome alignments.
  • Gretl - Variation graph evaluation TooLkit.
  • Pandagma - A tool for identifying pan-gene sets and gene families at desired evolutionary depths and accommodating whole genome duplications.
  • Pangene - Constructing a pangenome gene graph.
  • mumemto - finding multi-MUMs and MEMs in pangenomes.
  • multi-MUMs - Improved pangenomic classification accuracy with chain statistics.
  • varigraph - Pangenome graph-based variant genotyper for diploid and polyploid genomes.
  • PVGwfa: Multi-Level Parallel Sequence-to-Graph Alignment Tool

Protein families / domains

DNA/Protein structure/modelling

Related lists

  • gget - querying of genomic reference databases.
  • MSAplot - MSA visualization.
  • tangermeme - Implementations of FIMO and TOMTOM.
  • dSQ - Submit Job Arrays with dSQ
  • xsra: A performant and storage-efficient CLI tool to extract sequences from an SRA archive with support for FASTA, FASTQ, and BINSEQ outputs.
  • eccLib: Parsing GTF and FASTA files using the eccLib Library
  • PHast: Perfect Hashing with fast evaluation
  • SimdSketch: A SIMD-accelerated library to compute two types of sketches:

Visualization tools collection

Other

  • CLUES2 - A program to infer selection coefficients, evaluate the statistical evidence for selection, and reconstruct historic allele frequencies.
