[Enhancement suggestion]: Enable parallelisation for peak comparison #12

@sof202

Enhancement suggestion

Currently, the only easy way to parallelise the peak comparison tool is to write a wrapper script with a high array count. Comparing peaks across multiple regions is embarrassingly parallel, but the tool doesn't lend itself well to this. The main problem is that every run reads exactly the same data, and reading the data is the bottleneck here.

It would be good to accept a vector of start values and a vector of end values instead of a single start and end. The files would then only need to be read once (and if all the regions are on one chromosome, you can still use grep and other faster tools to cut the files down first).
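A rough sketch of the "read once, query many regions" idea, not the tool's actual code. `read_peaks` and `peaks_in_regions` are hypothetical names, the input is assumed to be BED-like text (chromosome, start, end, half-open intervals), and the in-memory string stands in for a real file.

```python
import io


def read_peaks(handle, chromosome):
    """Parse BED-like lines once, keeping only peaks on `chromosome`."""
    peaks = []
    for line in handle:
        fields = line.split()
        # Skip blank/short lines and peaks on other chromosomes.
        if len(fields) < 3 or fields[0] != chromosome:
            continue
        peaks.append((int(fields[1]), int(fields[2])))
    return peaks


def peaks_in_regions(peaks, starts, ends):
    """For each (start, end) pair, return the peaks overlapping that region."""
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    return [
        # Half-open interval overlap test: [s, e) against [start, end).
        [(s, e) for s, e in peaks if s < end and e > start]
        for start, end in zip(starts, ends)
    ]


# Made-up example data standing in for a peak file.
bed_text = "chr1\t100\t200\nchr1\t500\t650\nchr2\t100\t200\nchr1\t900\t1000\n"

peaks = read_peaks(io.StringIO(bed_text), "chr1")  # single read
per_region = peaks_in_regions(peaks, starts=[0, 600], ends=[300, 950])
```

The file is parsed one time and every region is then served from the in-memory list, so adding more regions does not add more reads.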

With vectors of start and end values, you could build an array of BedBase objects (one per region), which should allow the comparison to be parallelised within Python.
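Something along these lines might work, as a sketch only. `RegionView` is a hypothetical stand-in for a per-region BedBase object (the real class's interface is not assumed here), `covered_bases` is a placeholder for the actual per-region comparison, and the peaks are made-up, non-overlapping example data.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionView:
    """Hypothetical stand-in for one BedBase-like object per region."""
    start: int
    end: int
    peaks: tuple


def build_views(peaks, starts, ends):
    """Build one view per (start, end) pair from peaks already in memory."""
    return [
        RegionView(start, end,
                   tuple(p for p in peaks if p[0] < end and p[1] > start))
        for start, end in zip(starts, ends)
    ]


def covered_bases(view):
    """Placeholder comparison: bases of the region covered by its peaks.

    Assumes the peaks do not overlap each other.
    """
    return sum(min(e, view.end) - max(s, view.start) for s, e in view.peaks)


def compare_all(views, max_workers=4):
    """Run the per-region work concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(covered_bases, views))


peaks = [(100, 200), (500, 650), (900, 1000)]  # already read once
views = build_views(peaks, starts=[0, 600], ends=[300, 950])
results = compare_all(views)
```

Threads are used here only to keep the sketch simple. For CPU-bound pure-Python work the GIL limits what threads can gain, so `concurrent.futures.ProcessPoolExecutor`, which offers the same `map` interface, may be the better fit as long as the per-region objects can be pickled.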
