Test_data_examples
- test data can be downloaded from short_reads
- paired fastq files
  - test_R1.fastq
  - test_R2.fastq
- test host genome
  - test.host.fasta
conda activate Eukfinder
Eukfinder read_prep --r1 test_R1.fastq --r2 test_R2.fastq \
-n 10 --hcrop 10 -l 15 -t 15 --wsize 40 --qscore 25 --mlen 40 --mhlen 40 \
-o read_prep -i "adaptor_file_location" --hg test.host.fasta
read_prep produces five output files:
- Two paired fastq files
  - read_prep_p.1.fastq
  - read_prep_p.2.fastq
- One unpaired fastq file
  - read_prep_un.fastq
- Two centrifuge result files for paired and unpaired reads
  - read_prep_centrifuge_P
  - read_prep_centrifuge_UP
read_prep output files can also be downloaded from prep output
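Before moving on to short_seqs, it can help to confirm that all five read_prep files are in place. The following is a minimal sketch and not part of Eukfinder itself: `check_read_prep_outputs` is a hypothetical helper name, and it assumes the output names follow the `-o read_prep` prefix exactly as in the file names listed above.

```shell
# Hypothetical helper (not an Eukfinder command): report which of the five
# expected read_prep outputs are missing from a directory.
# Assumption: output names are "<prefix>_p.1.fastq" etc., where <prefix>
# is the value passed to -o (read_prep in this example).
check_read_prep_outputs() {
    dir="${1:-.}"
    prefix="${2:-read_prep}"
    missing=0
    for f in "${prefix}_p.1.fastq" "${prefix}_p.2.fastq" "${prefix}_un.fastq" \
             "${prefix}_centrifuge_P" "${prefix}_centrifuge_UP"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every expected file was found.
    [ "$missing" -eq 0 ]
}
```

Calling `check_read_prep_outputs . read_prep` in the working directory returns a non-zero status and names each missing file on stderr.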
Use the 5 output files from Eukfinder read_prep above:
- Two paired fastq files
  - read_prep_p.1.fastq
  - read_prep_p.2.fastq
- One unpaired fastq file
  - read_prep_un.fastq
- Two centrifuge result files for paired and unpaired reads
  - read_prep_centrifuge_P
  - read_prep_centrifuge_UP
conda activate Eukfinder
Eukfinder short_seqs --r1 read_prep_p.1.fastq --r2 read_prep_p.2.fastq --un read_prep_un.fastq \
-o shortread_test -n 10 -z 10 -t T --max_m 100 \
-e 0.01 --pid 60 --cov 30 --mhlen 50 \
--pclass read_prep_centrifuge_P --uclass read_prep_centrifuge_UP
Located in directory Eukfinder_results.
Up to six fastq files are possible (bacterial, archaeal, eukaryotic, viral, unknown and eukaryotic+unknown).
Note this example includes only Archaea, Bacteria, Eukaryote and unknown sequences, so no Misc.fq file will be created.
- shortread_test.Arch.fq (contigs classified as archaea)
- shortread_test.Bact.fq (contigs classified as bacteria)
- shortread_test.Euk.fq (contigs classified as eukaryote)
- shortread_test.Unk.fq (unclassified contigs)
- shortread_test.EUnk.fq (contigs classified as eukaryote or unclassified)
Example output results can be downloaded from here: short_seqs output
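To get a quick sense of how the sequences were split across categories, the records in each output file can be counted. This is a rough sketch that assumes the files are standard unwrapped FASTQ (four lines per record). The function names `count_fastq_reads` and `summarize_fq_dir` are illustrative and not part of Eukfinder.

```shell
# Count records in a FASTQ file.
# Assumption: unwrapped FASTQ, i.e. exactly four lines per record.
count_fastq_reads() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Print "<file name><TAB><record count>" for every .fq file in a directory
# (defaults to Eukfinder_results, the directory named above).
summarize_fq_dir() {
    for fq in "${1:-Eukfinder_results}"/*.fq; do
        [ -f "$fq" ] || continue   # the glob matched nothing
        printf '%s\t%s\n' "$(basename "$fq")" "$(count_fastq_reads "$fq")"
    done
}
```

Running `summarize_fq_dir Eukfinder_results` after the short_seqs step prints one line per category file, which makes an empty or unexpectedly small category easy to spot.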
- test data can be downloaded from longreads.fastq
  - longreads.fastq (example long-read sequence data)
conda activate Eukfinder
python eukfinder.py long_seqs -l longreads.fastq -o longreads_test \
-n 48 -z 6 -t False \
-e 0.01 --pid 60 --cov 30 --mhlen 100
## Output files
Located in directory Eukfinder_results.
Up to six fastq files are possible (bacterial, archaeal, eukaryotic, viral, unknown and eukaryotic+unknown).
Note this example includes only Bacteria, Eukaryote and unknown sequences, so no Arch.fq or Misc.fq file will be created.
- longreads_test.Bact.fq (contigs classified as bacteria)
- longreads_test.Euk.fq (contigs classified as eukaryote)
- longreads_test.Unk.fq (unclassified contigs)
- longreads_test.EUnk.fq (contigs classified as eukaryote or unclassified)
Example output files can be downloaded from here: long_seqs output
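Since not every category is produced for every dataset, a small loop can report which of the six possible files exist for a given run. This is a sketch that assumes the `<prefix>.<suffix>.fq` naming seen in the file names on this page. The `Misc` suffix is taken from the note above, and `list_classes` is a hypothetical helper, not an Eukfinder command.

```shell
# Hypothetical helper: for a results directory and an output prefix,
# report which category files are present.
# Assumption: files are named "<prefix>.<suffix>.fq" with the suffixes below.
list_classes() {
    dir="$1"
    prefix="$2"
    for suffix in Arch Bact Euk Misc Unk EUnk; do
        if [ -f "$dir/$prefix.$suffix.fq" ]; then
            echo "$suffix present"
        else
            echo "$suffix absent"
        fi
    done
}
```

For this example, `list_classes Eukfinder_results longreads_test` would be expected to mark Arch and Misc as absent, in line with the note above.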