DSRC may corrupt data if the input contains repetitive sequences.
Steps to reproduce:
cd /tmp
mkdir dsrc-repro-3
cd dsrc-repro-3
git clone https://github.com/refresh-bio/DSRC
cd DSRC
make -f Makefile.c++11 bin
cd ..
wget http://kirill.med.u-tokai.ac.jp/data/temp/dsrc-repro-3.fastq.gz
gzip -dc <dsrc-repro-3.fastq.gz >3.fastq
./DSRC/bin/dsrc c -t1 3.fastq 3.dsrc
./DSRC/bin/dsrc d -t1 3.dsrc 3d.fastq
cmp 3.fastq 3d.fastq
The test data is 260 MB uncompressed, but only 37 MB to download in gzipped form. It was reduced from a 6 GB dataset and could probably be reduced further.
There are no crashes or error messages, but the decompressed file differs from the original. Output of the final "cmp" command:
3.fastq 3d.fastq differ: byte 259819080, line 1027394
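To see which record is affected, the reported line number can be mapped back to its FASTQ record. The helper below is only a sketch: `fastq_record_at` is a hypothetical name, not part of DSRC, and it assumes the standard four-lines-per-record FASTQ layout with no wrapped sequences. Under that assumption, line 1027394 = 4 × 256848 + 2, which is the second line of a record, i.e. the sequence line.

```shell
# Print the 4-line FASTQ record that contains a given 1-based line number.
# Assumption: every record occupies exactly 4 lines (header, sequence, '+', quality).
# fastq_record_at is a hypothetical helper name, not a DSRC command.
fastq_record_at() {
    file=$1
    line=$2
    # First line of the record containing $line (integer division rounds down).
    start=$(( (line - 1) / 4 * 4 + 1 ))
    end=$(( start + 3 ))
    sed -n "${start},${end}p" "$file"
}

# Example, using the line number reported by cmp:
#   fastq_record_at 3.fastq  1027394
#   fastq_record_at 3d.fastq 1027394
```

Running it on both `3.fastq` and `3d.fastq` with the same line number should show the original and the corrupted record side by side. `cmp -l 3.fastq 3d.fastq | wc -l` would additionally give the total number of differing bytes, provided the two files have the same length.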
Let me know if you need other information or tests.