blindschleiche

A collection of bioinformatics / sequence utilities needed for my research, and hopefully useful for yours.

Install

pip install blindschleiche
# or for the current main branch:
# pip install git+https://github.com/kdm9/blindschleiche.git

Usage

USAGE: blsl <subtool> [options...]


Where <subtool> is one of:

  deepclust2fa:         Split a .faa by the clusters diamond deepclust finds
  ebiosra2rl2s:         INTERNAL: MPI Tübingen tool. Make a runlib-to-sample map table from ebio sra files
  equalbestblast:       Output only the best blast hits.
  esearchandfetch:      Use the Entrez API to search for and download something. A CLI companion to the NCBI search box
  falen:                Tabulate the lengths of sequences in a FASTA file
  farename:             Rename sequences in a fasta file sequentially
  fastasanitiser:       Sanitise fasta IDs to something sane, then back again
  fqstat:               Estimate stats from a fastq file based on the first kilobytes of the file, keeping high accuracy
  galhist:              Make a summary histogram of git-annex-list output
  genebed:              Extract a BED file of genes from a GFF
  genigvjs:             Generate a simple IGV.js visualisation of some bioinf files.
  gffattr2tsv:          Extract the attributes column into a TSV
  gffcat:               Concatenate GFF3 files, respecting header lines and FASTA sections
  gffcsqify:            Format a reasonably compliant GFF for use with bcftools csq
  gffparse:             Format a GFF sanely
  gfftagsane:           Sanitise a messy gff attribute column to just simple tags 
  gg2k:                 Summarise a table with GreenGenes-style lineages into a kraken-style report.
  ildemux:              Demultiplex modern illumina reads from read headers.
  ilsample:             Sample a fraction of read pairs from an interleaved fastq file
  jsonl2csv:            Parse jsonlines into a C/TSV
  liftoff-gff3:         Obtain an actually-useful GFF3 from Liftoff by fixing basic GFF3 format errors
  mask2bed:             The inverse of bedtools maskfasta: softmasked fasta -> unmasked fasta + mask.bed
  n50:                  Calculate N50 and total length of a set of contigs
  nstitch:              Combine R1 + R2 into single sequences, with an N in the middle
  pairs:                Handle paired-end reads, with various transformations.
  pairslash:            Add an old-style /1 /2 pair indicator to paired-end fastq files (DEPRECATED, use blsl pairs)
  pansn-rename:         Add, remove, or modify PanSN-style prefixes to contig/chromosome names in references
  regionbed:            Make a bed/region file of genome windows
  shannon-entropy:      Calculate Shannon's entropy (in bits) at each column of one or more alignments
  tabcat:               Concatenate table (c/tsv) files, adding the filename as a column
  telogrep:             Search contigs for known telomere repeats
  tidyqc:               What if MultiQC was in the tidyverse? (and much worse)
  uniref-acc2taxid:     Make a ncbi-style acc2taxid.map file for a uniref fasta
  vcfparallel:          Parallelise a bcf processing pipeline across regions
  vcfreport:            Prepare a basic html report about a VCF file
  vcfstats:             Use bcftools to calculate various statistics, outputting an R-ready table
  help:                 Print this help message


Use blsl <subtool> --help to get help about a specific tool.
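
Two of the subtools above (n50 and shannon-entropy) describe standard calculations. As a rough illustration of what those statistics mean, here is a minimal sketch in plain Python. It is not the code blsl runs: the function names n50 and column_entropy are hypothetical, and counting every character of a column as-is (gaps included) is an assumption, not necessarily what the tool does.

```python
import math
from collections import Counter


def n50(lengths):
    """Return (N50, total length) for an iterable of contig lengths.

    N50 is the length of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches half of
    the total assembly length.
    """
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic
        if 2 * running >= total:
            return length, total


def column_entropy(column):
    """Shannon entropy, in bits, of one alignment column given as a string.

    Assumption: every character counts as its own symbol, gaps included.
    """
    if not column:
        raise ValueError("need a non-empty column")
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

For example, n50([2, 3, 4, 5, 6]) walks down from the longest contig until the running sum covers half of the 20 total bases, and column_entropy("AACC") describes a column split evenly between two residues, which carries exactly one bit.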

Why the name Blindschleiche?

They're awesome animals.
Their English name is Slow Worm, which is appropriate for this set of low-performance tools in Python.
All tools implemented in Python must be named with a snake pun, and they're kind of a snake (not really: they're legless lizards).
