@olgabot olgabot commented Nov 4, 2016

If a chromosome was in the GTF annotation file but not in the genome FASTA file, then outrigger validate would fail with the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-73c99f62ab80> in <module>()
----> 1 check_splice_sites.read_splice_sites(bed, genome, fasta)

/home/obotvinnik/workspace-git/outrigger/outrigger/validate/check_splice_sites.pyc in read_splice_sites(bed, genome, fasta, direction)
     61         records = SeqIO.parse(f, 'fasta')
     62         records = pd.Series([str(r.seq) for r in records],
---> 63                             index=[b.name for b in bed])
     64     # import pdb; pdb.set_trace()
     65     return records

/home/obotvinnik/anaconda/envs/outrigger/lib/python2.7/site-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
    241                                        raise_cast_failure=True)
    242 
--> 243                 data = SingleBlockManager(data, index, fastpath=True)
    244 
    245         generic.NDFrame.__init__(self, data, fastpath=True)

/home/obotvinnik/anaconda/envs/outrigger/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, block, axis, do_integrity_check, fastpath)
   4045         if not isinstance(block, Block):
   4046             block = make_block(block, placement=slice(0, len(axis)), ndim=1,
-> 4047                                fastpath=True)
   4048 
   4049         self.blocks = [block]

/home/obotvinnik/anaconda/envs/outrigger/lib/python2.7/site-packages/pandas/core/internals.pyc in make_block(values, placement, klass, ndim, dtype, fastpath)
   2662                      placement=placement, dtype=dtype)
   2663 
-> 2664     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2665 
   2666 # TODO: flexible with index=None and/or items=None

/home/obotvinnik/anaconda/envs/outrigger/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, ndim, fastpath, placement, **kwargs)
   1794 
   1795         super(ObjectBlock, self).__init__(values, ndim=ndim, fastpath=fastpath,
-> 1796                                           placement=placement, **kwargs)
   1797 
   1798     @property

/home/obotvinnik/anaconda/envs/outrigger/lib/python2.7/site-packages/pandas/core/internals.pyc in __init__(self, values, placement, ndim, fastpath)
    108             raise ValueError('Wrong number of items passed %d, placement '
    109                              'implies %d' % (len(self.values),
--> 110                                              len(self.mgr_locs)))
    111 
    112     @property

ValueError: Wrong number of items passed 153907, placement implies 153920

This happens because fewer sequences were extracted (153907) than there were events going in (153920), so the data and the index passed to pd.Series had different lengths. Now only the sequences that were found are reported.
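
To illustrate the idea, here is a minimal sketch and not the actual change in this pull request. It assumes each extracted sequence can be paired with the name of the event it came from. The function name sequences_by_name and the variables found and event_names are hypothetical and do not come from outrigger. The Series is indexed only by the names that actually produced a sequence, so the data and index lengths always agree.

```python
import pandas as pd


def sequences_by_name(found, event_names):
    """Build a Series of sequences indexed by event name.

    found       -- iterable of (name, sequence) pairs that were extracted
                   (hypothetical input; assumes names are recoverable)
    event_names -- all event names that were requested
    """
    wanted = set(event_names)
    # Keep only the pairs whose name was requested, preserving order.
    pairs = [(name, seq) for name, seq in found if name in wanted]
    names = [name for name, _ in pairs]
    seqs = [seq for _, seq in pairs]
    # Index and data come from the same list, so their lengths match.
    return pd.Series(seqs, index=names, dtype=object)


# Toy data: "event2" sits on a chromosome missing from the genome file,
# so no sequence was extracted for it.
event_names = ["event1", "event2", "event3"]
found = [("event1", "GT"), ("event3", "AG")]

series = sequences_by_name(found, event_names)
# Events that were requested but produced no sequence.
missing = [name for name in event_names if name not in series.index]
```

Collecting the missing names separately makes it possible to warn about events that were dropped instead of silently omitting them.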

olgabot commented Nov 4, 2016

Todos:

  • Add test
  • Add test data
