
Can Hostile Output Full FASTQ Headers for Medaka Compatibility? #63

@xiaoli-dong

We are using Hostile for dehosting across several projects. Recently, we encountered an issue with medaka_consensus version 2.0.1.

Upon investigation, I found that medaka_consensus relies on the basecalling model information embedded in the FASTQ headers. However, the FASTQ files produced by Hostile during dehosting seem to strip this information, retaining only the read IDs.

For example, a typical original header looks like this:
@d776c6f5-9501-41e3-8631-4966c9c35566 runid=1a64aa91730686f5bb6ec4c17cbd38ed80b8e9dd sampleid=no_sample read=20450 ch=259 start_time=2023-03-07T13:23:06Z model_version_id=dna_r10.4.1_e8.2_260bps_sup@v3.5.2 barcode=barcode14
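For illustration, here is a minimal Python sketch that splits a header like the one above into the read ID and its `key=value` tags. `parse_fastq_header` is a hypothetical helper written for this example, not part of Hostile or Medaka, and it assumes the tags are whitespace-separated `key=value` pairs as in the example header.

```python
def parse_fastq_header(header):
    """Split a FASTQ header line into (read_id, tags).

    Assumes whitespace-separated fields, where the first field is the
    read ID and the remaining fields are key=value tags.
    """
    text = header.strip()
    if text.startswith("@"):
        text = text[1:]
    fields = text.split()
    if not fields:
        raise ValueError("empty FASTQ header")
    read_id = fields[0]
    tags = {}
    for field in fields[1:]:
        # Split on the first "=" only, so values may contain "=" or "@".
        key, sep, value = field.partition("=")
        if sep:
            tags[key] = value
    return read_id, tags


header = (
    "@d776c6f5-9501-41e3-8631-4966c9c35566 "
    "runid=1a64aa91730686f5bb6ec4c17cbd38ed80b8e9dd sampleid=no_sample "
    "read=20450 ch=259 start_time=2023-03-07T13:23:06Z "
    "model_version_id=dna_r10.4.1_e8.2_260bps_sup@v3.5.2 barcode=barcode14"
)
read_id, tags = parse_fastq_header(header)
print(read_id)                   # d776c6f5-9501-41e3-8631-4966c9c35566
print(tags["model_version_id"])  # dna_r10.4.1_e8.2_260bps_sup@v3.5.2
```

After dehosting, only the read ID survives, so the `model_version_id` tag is no longer available to downstream tools.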

Is it possible for Hostile to retain the full original header in the dehosted FASTQ output, rather than outputting only the read IDs? This would ensure compatibility with tools like Medaka that rely on full header metadata.
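In the meantime, a possible stopgap is to copy the full headers back from the original FASTQ after dehosting. The following is a rough sketch only, not a Hostile feature: `restore_headers` and `read_id_of` are hypothetical names, and the code assumes strict four-line FASTQ records and that the dehosted output keeps the original read IDs unchanged.

```python
def read_id_of(header):
    """Return the read ID: the first whitespace-delimited token after '@'."""
    return header[1:].split(maxsplit=1)[0]


def restore_headers(original, dehosted):
    """Yield dehosted FASTQ lines with full headers copied from the original.

    Both arguments are iterables of text lines (for example open file
    handles). Assumes four lines per record. Headers with no match in the
    original are passed through unchanged.
    """
    full_headers = {}
    for i, line in enumerate(original):
        line = line.rstrip("\n")
        # Every fourth line, starting at index 0, is a header line.
        if i % 4 == 0 and line.startswith("@") and len(line) > 1:
            full_headers[read_id_of(line)] = line

    for i, line in enumerate(dehosted):
        line = line.rstrip("\n")
        if i % 4 == 0 and line.startswith("@") and len(line) > 1:
            line = full_headers.get(read_id_of(line), line)
        yield line + "\n"


if __name__ == "__main__":
    import sys

    # Usage: python restore_headers.py original.fastq dehosted.fastq > fixed.fastq
    with open(sys.argv[1]) as orig, open(sys.argv[2]) as clean:
        sys.stdout.writelines(restore_headers(orig, clean))
```

This keeps every original header in memory, so it may not suit very large runs. For gzip-compressed files, `gzip.open(path, "rt")` can replace `open(path)`.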

Labels: enhancement
