Hello!
I have RNA-seq and Iso-Seq data from a wide array of tissues, and I noticed some peculiar behaviour with stringtie --merge. Specifically, for some (but not all) genes whose expression is limited to 1-2 tissues, stringtie --merge does not produce a final gene model. Please find two examples below: Lin28b, and Trpv5/Trpv6.
One potential consideration is that both of these loci are duplicated in another region of the same chromosome, but my understanding is that stringtie --merge does not consider MAPQ or TPM and relies only on the GFF files from each tissue? The "full_annotation.gff" file combines gene models from stringtie, braker, toga, and liftoff using Mikado, which is why "stringtie.merged.gtf" is empty at these loci while the final file is annotated. That said, the Iso-Seq data is the only source where we have proper UTRs for these genes, so it would be super valuable if stringtie kept the gene models.
There are other tissue specific genes that were properly merged, so this is not a behaviour that happens every time:
That said, I've experimented with -c and -f for each tissue's annotation, and stringtie --merge consistently fails to produce a final annotation for Lin28b and Trpv5.
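For anyone who wants to reproduce the check, here is a minimal sketch of how the per-tissue files can be compared against the merged output at a single locus. It assumes GTF-formatted input with `transcript` feature lines and a `transcript_id` attribute, as StringTie writes them; the function names are made up for illustration, and the file paths and coordinates are left to the caller.

```python
import re
from typing import Dict, Iterable, List


def transcripts_in_region(gtf_lines: Iterable[str], chrom: str,
                          start: int, end: int) -> List[str]:
    """Return transcript_id values of 'transcript' features that overlap
    chrom:start-end (1-based, inclusive coordinates, as in GTF)."""
    hits = []
    for line in gtf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has 9 tab-separated columns; keep transcript rows only.
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        if fields[0] != chrom:
            continue
        f_start, f_end = int(fields[3]), int(fields[4])
        # Standard interval-overlap test.
        if f_start <= end and f_end >= start:
            match = re.search(r'transcript_id "([^"]+)"', fields[8])
            hits.append(match.group(1) if match else "?")
    return hits


def count_per_file(paths: Iterable[str], chrom: str,
                   start: int, end: int) -> Dict[str, int]:
    """Count overlapping transcripts in each GTF file (hypothetical helper)."""
    counts = {}
    for path in paths:
        with open(path) as handle:
            counts[path] = len(transcripts_in_region(handle, chrom, start, end))
    return counts
```

Calling `count_per_file` with the per-tissue GTFs plus "stringtie.merged.gtf" and the coordinates of a locus shows which inputs carry a transcript there and whether the merged file has none.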
I would very much appreciate any insight you may have with this and I'm happy to share files if you'd like to take a closer look.
Best!
Dustin