Open
Labels
algorithm (Issue requires algorithmic improvement), weird results (Something looks odd in the resulting files)
Description
Dear @andrewprzh ,
Thanks for your continued efforts on this great project. I have another problem regarding the IsoQuant output files. This is my command:
isoquant.py --reference ref.fa --genedb ref.gtf --complete_genedb --bam 'multiple bam files' --labels 'multiple labels' --data_type nanopore --stranded forward --output output --prefix all --threads 32 --sqanti_output --model_construction_strategy sensitive_ont --report_novel_unspliced false
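Many transcripts are classified as "novel_not_in_catalog" in novel_vs_known.SQANTI-like.tsv, yet the same transcripts are assigned "novel_gene_***" gene_id attributes in transcript_models.gtf. Two examples follow, first the rows from the SQANTI-like table and then the corresponding records from the GTF: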
transcript242.NC_016195.1.nnic NC_016195.1 - 370 2 novel_not_in_catalog KEH30_p11 unassigned_transcript_945 1554 1 -55 -3 -55 -3 extra_intron_novel;terminal_site_match_left_precise FALSE True NA NA NA NA NA NA NA False NA NA NA NA NA NA NA NA 5471 7021 NA 0.35 TTAAACACTCAGCCATTTTA NA NA NA NA NA NA NA NA NA
transcript248.NC_016195.1.nnic NC_016195.1 - 1059 2 novel_not_in_catalog KEH30_p11 unassigned_transcript_945 1554 1 -50 -3 -50 -3 extra_intron_novel;terminal_site_match_left_precise FALSE True NA NA NA NA NA NA NA False NA NA NA NA NA NA NA NA 5471 7021 NA 0.35 TTAAACACTCAGCCATTTTA NA NA NA NA NA NA NA NA NA
NC_016195.1 IsoQuant gene 5468 7100 . - . gene_id "novel_gene_NC_016195.1_249"; transcripts "2";
NC_016195.1 IsoQuant transcript 5468 7100 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript242.NC_016195.1.nnic"; similar_reference_id "unassigned_transcript_945"; alternatives "extra_intron_novel:5783-7045,tes_match_precise:3"; exons "2";
NC_016195.1 IsoQuant exon 7046 7100 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript242.NC_016195.1.nnic"; exon_number "1"; exon_id "90";
NC_016195.1 IsoQuant exon 5468 5782 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript242.NC_016195.1.nnic"; exon_number "2"; exon_id "91";
NC_016195.1 IsoQuant transcript 5468 7095 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript248.NC_016195.1.nnic"; similar_reference_id "unassigned_transcript_945"; alternatives "extra_intron_novel:6477-7045,tes_match_precise:3"; exons "2";
NC_016195.1 IsoQuant exon 7046 7095 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript248.NC_016195.1.nnic"; exon_number "1"; exon_id "92";
NC_016195.1 IsoQuant exon 5468 6476 . - . gene_id "novel_gene_NC_016195.1_249"; transcript_id "transcript248.NC_016195.1.nnic"; exon_number "2"; exon_id "93";
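To list every transcript affected by this mismatch, a cross-check of the two files could look roughly like the sketch below. It is not part of IsoQuant; the function names `gtf_gene_ids` and `find_mismatches` are made up for illustration. It assumes both files are tab-separated, that the SQANTI-like table has the isoform ID, structural category, and associated gene in columns 1, 6, and 7 (as in the rows pasted above), and that novel genes in the GTF carry the `novel_gene_` prefix.

```python
import re

def gtf_gene_ids(gtf_lines):
    """Map transcript_id -> gene_id using the 'transcript' records of a GTF."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue
        # Attribute column looks like: key "value"; key "value"; ...
        attrs = dict(re.findall(r'(\w+) "([^"]*)"', fields[8]))
        if "transcript_id" in attrs and "gene_id" in attrs:
            mapping[attrs["transcript_id"]] = attrs["gene_id"]
    return mapping

def find_mismatches(sqanti_lines, gtf_map):
    """Return (transcript_id, associated_gene, gtf_gene_id) for transcripts that
    are novel_not_in_catalog in the table but sit under a novel_gene_* in the GTF."""
    mismatches = []
    for line in sqanti_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue
        # Assumed column layout: 0 = isoform, 5 = structural_category, 6 = associated_gene
        tid, category, assoc_gene = fields[0], fields[5], fields[6]
        gene_id = gtf_map.get(tid, "")
        if category == "novel_not_in_catalog" and gene_id.startswith("novel_gene_"):
            mismatches.append((tid, assoc_gene, gene_id))
    return mismatches

# Possible usage (paths are placeholders for the actual output files):
# with open("transcript_models.gtf") as gtf, open("novel_vs_known.SQANTI-like.tsv") as tsv:
#     for row in find_mismatches(tsv, gtf_gene_ids(gtf)):
#         print(*row, sep="\t")
```

Each reported row pairs the reference gene named in the SQANTI-like table with the novel gene ID written to the GTF, which makes the inconsistent assignments easy to count and inspect.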
I inspected these transcripts in the IGV browser, and they really are transcripts of gene KEH30_p11, not of novel genes.
I am using IsoQuant 3.10.0. Why would these wrong gene_id assignments happen? It really confuses me.