Python-based applications use isoform.py for common functions.

- cmpiso - compares collections of isoforms in GFF
- geniso - generates isoforms and their probabilities
- modelbuilder - creates the various model files
- optiso - optimizes model parameters with a genetic algorithm
- run_apc - builds Makefile to run isoformer and optiso on apc set

The genomikon repo contains a couple of faster implementations in the
isoformer directory.

- isoformer - this is the same as geniso but ~100x faster
- isocounter - as above, but only counting, not calculating probabilities
- isorandom - counting isoforms in random sequences

- conformity.py - compares outputs of geniso and isoformer
- optiso-mp - multi-processing version with some odd bugs
- speedo.py - compares speeds of geniso and isoformer
- summary.py - creates TSV of the apc set

Data collection is described in datacore2024/project_splicing. The data are
the 1045 genes of the smallgenes dataset.
See the models directory for standard models and modelbuilder for how to
build the models.
There are 19.938 billion RNASeq_splice records in WormBase. As a rough estimate of intron frequency, divide intron counts by 20 billion.
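
The estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `intron_frequency`, the constant name, and the example counts are hypothetical and are not part of isoform.py or any program in this repo.

```python
# Rough denominator: 19.938 billion RNASeq_splice records, rounded to 20 billion.
TOTAL_SPLICE_RECORDS = 20e9


def intron_frequency(count, total=TOTAL_SPLICE_RECORDS):
    """Return a rough intron frequency: intron count divided by total records.

    Hypothetical helper for illustration; not a function from this repo.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return count / total


if __name__ == "__main__":
    # Made-up intron counts, for illustration only.
    for count in (2_000, 1_000_000):
        print(f"{count}\t{intron_frequency(count):.3e}")
```

Because the denominator is rounded up from 19.938 billion, the result slightly underestimates the frequency (by about 0.3%), which is fine for a rough estimate.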