
Fix collapse_annotation.py KeyError on UCSC GTFs missing transcript_type #71

Open
dksenthil wants to merge 1 commit into c3g:dev from dksenthil:fix/collapse-gtf-ucsc

Conversation

@dksenthil

Summary

  • UCSC GTFs (e.g., mm10) lack gene_type/transcript_type attributes that the GTEx pipeline's collapse_annotation.py expects, causing a KeyError in the rnaseqc2 step
  • Added a local patched copy (genpipes/tools/collapse_annotation.py) that defaults missing types gracefully and warns about UCSC annotation limitations
  • Updated genpipes/bfx/gtex_pipeline.py to use the local script instead of the module-provided one
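
For readers who have not opened the diff, the defaulting behaviour described above could look roughly like the sketch below. This is an illustration only, not the code in the patched script: the helper names `parse_attributes` and `types_with_default`, the `"unknown"` fallback value, and the warning text are all assumptions.

```python
import re
import warnings

# Hypothetical fallback value; the patched script may use a different default.
DEFAULT_TYPE = "unknown"

# Matches key "value" pairs in the ninth (attributes) column of a GTF record.
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(attr_field):
    """Parse a GTF attributes column into a dict (last value wins on repeats)."""
    return dict(_ATTR_RE.findall(attr_field))


def types_with_default(attrs, default=DEFAULT_TYPE):
    """Return (gene_type, transcript_type), falling back to `default`.

    Using dict.get() instead of attrs["transcript_type"] avoids the KeyError
    on annotations that do not carry these attributes.
    """
    missing = [k for k in ("gene_type", "transcript_type") if k not in attrs]
    if missing:
        warnings.warn(
            "GTF record lacks gene_type/transcript_type; "
            "using a default value, so type-based filtering is not meaningful."
        )
    return attrs.get("gene_type", default), attrs.get("transcript_type", default)


# A UCSC-style record: no type attributes, so both values fall back.
ucsc = parse_attributes('gene_id "Xkr4"; transcript_id "NM_001011874";')
print(types_with_default(ucsc))

# A GENCODE-style record: existing values are returned unchanged.
gencode = parse_attributes(
    'gene_id "ENSMUSG00000051951"; gene_type "protein_coding"; '
    'transcript_type "protein_coding";'
)
print(types_with_default(gencode))
```

With Python's default warning filters, an identical message raised from the same line is shown only once per run, which keeps the output readable on large GTFs.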

Test plan

  • Run rnaseq pipeline with mm10 UCSC GTF and verify collapse_gtf step completes without KeyError
  • Verify warning message is printed for UCSC GTFs missing type attributes
  • Confirm no regression with GENCODE/Ensembl GTFs that already have gene_type/transcript_type

@MareikeJaniak
Collaborator

Thanks for doing this, Senthil. We might want to think about adding the script to c3g_tools instead of having it in the genpipes tools directory. The genpipes tools directory is more for tools that are genpipes-specific, whereas scripts like collapse_annotation.py are usually housed under c3g_tools. What do you think, Paul?

@paulstretenowich
Collaborator

I agree with Mareike.
