
tzSchErrt1 EAR #284

Merged
diegomics merged 4 commits into ERGA-consortium:main from phuongdoand:phuongdoand-tzSchErrt1_EAR
Feb 13, 2026

Conversation

@phuongdoand
Contributor

Assembly review request

  • ToLID: tzSchErrt1
  • Species: Schizoporella errata
  • Project: ERGA-BGE
  • Affiliation: Genoscope

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 20, 2025

Hi @phuongdoand, thanks for sending the EAR of Schizoporella errata.
I added the corresponding tag to the PR and will contact a supervisor and a reviewer ASAP.

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 20, 2025

Hi @diegomics, do you agree to supervise this assembly?
Please reply to this message only with OK to acknowledge.

@diegomics
Collaborator

ok

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 20, 2025

*****
EAR Reviewer Selection Process
Date: 2025-11-20 15:22

All Eligible Candidates:

Github ID     | Full Name             | Institution | Total Reviews | Last Review | Active | Working PRs | Calling Score | Adjusted Score
-----------------------------------------------------------------------------------------------------------------------------------------
gitcruz       | Fernando Cruz         | CNAG        | 13            | 2025-11-17  | Y      | 0           | 1024          | 1074          
tommathers    | Tom Mathers           | Sanger      | 7             | 2025-10-16  | Y      | 0           | 1018          | 1068          
gbdias        | Guilherme Dias        | SciLifeLab  | 4             | 2025-11-19  | Y      | 0           | 1017          | 1067          
DomAbsolon    | Dom Absolon           | Sanger      | 8             | 2025-11-04  | Y      | 0           | 1017          | 1067          
talioto       | Tyler Alioto          | CNAG        | 8             | 2025-10-21  | Y      | 1           | 1028          | 1058          
jesgomez      | Jessica Gomez Garrido | CNAG        | 11            | 2025-11-10  | Y      | 1           | 1026          | 1056          
MartinPippel  | Martin Pippel         | SciLifeLab  | 2             | 2025-10-06  | Y      | 1           | 1019          | 1049          
SarahPelan    | Sarah Pelan           | Sanger      | 6             | 2025-09-16  | Y      | 1           | 1019          | 1049          
joannacollins | Jo Collins            | Sanger      | 6             | 2025-10-10  | Y      | 1           | 1019          | 1049          
diegomics     | Diego De Panis        | IZW         | 11            | 2025-09-22  | Y      | 0           | 992           | 1037          
tbrown91      | Tom Brown             | IZW         | 11            | 2025-07-29  | Y      | 1           | 995           | 1020          

Selected reviewer: Fernando Cruz (gitcruz)
The decision was based on:
- different institution ('CNAG')
- active ('Y')
- working on 0 PR(s) currently
- highest adjusted calling score in this particular selection (1074)
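
For readers unfamiliar with the bot, the criteria listed above can be sketched roughly as follows. This is only an illustration under assumptions: `Candidate` and `select_reviewer` are hypothetical names, the adjusted scores are copied from the table rather than recomputed (the thread does not state how they are derived), and the bot's real implementation may well differ.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    github_id: str
    institution: str
    active: bool
    working_prs: int      # kept for reference; assumed to be reflected in adjusted_score
    adjusted_score: int   # copied from the table, not recomputed


def select_reviewer(candidates, submitter_institution):
    """Pick the active candidate from another institution with the top adjusted score."""
    eligible = [
        c for c in candidates
        if c.active and c.institution != submitter_institution
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.adjusted_score)


# A few rows copied from the table above (2025-11-20 selection).
candidates = [
    Candidate("gitcruz", "CNAG", True, 0, 1074),
    Candidate("tommathers", "Sanger", True, 0, 1068),
    Candidate("talioto", "CNAG", True, 1, 1058),
    Candidate("tbrown91", "IZW", True, 1, 1020),
]

selected = select_reviewer(candidates, submitter_institution="Genoscope")
print(selected.github_id)
```

When a candidate declines or times out, the later tables in this thread suggest the list is simply re-evaluated and the next highest adjusted score is picked.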

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 20, 2025

Hi @gitcruz, do you agree to review this assembly?
Please reply to this message only with Yes or No by 26-Nov-2025 at 20:22 CET

@gitcruz
Collaborator

gitcruz commented Nov 21, 2025

No

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 21, 2025

@gitcruz Ok thank you, I will look for the next reviewer on the list :)

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 21, 2025

*****
EAR Reviewer Selection Process
Date: 2025-11-21 09:29

All Eligible Candidates:

Github ID     | Full Name             | Institution | Total Reviews | Last Review | Active | Working PRs | Calling Score | Adjusted Score
-----------------------------------------------------------------------------------------------------------------------------------------
tommathers    | Tom Mathers           | Sanger      | 7             | 2025-10-16  | Y      | 0           | 1018          | 1068          
gbdias        | Guilherme Dias        | SciLifeLab  | 4             | 2025-11-19  | Y      | 0           | 1017          | 1067          
DomAbsolon    | Dom Absolon           | Sanger      | 8             | 2025-11-04  | Y      | 0           | 1017          | 1067          
talioto       | Tyler Alioto          | CNAG        | 8             | 2025-10-21  | Y      | 1           | 1028          | 1058          
jesgomez      | Jessica Gomez Garrido | CNAG        | 11            | 2025-11-10  | Y      | 1           | 1026          | 1056          
gitcruz       | Fernando Cruz         | CNAG        | 13            | 2025-11-17  | Y      | 1           | 1024          | 1054          
MartinPippel  | Martin Pippel         | SciLifeLab  | 2             | 2025-10-06  | Y      | 1           | 1019          | 1049          
SarahPelan    | Sarah Pelan           | Sanger      | 6             | 2025-09-16  | Y      | 1           | 1019          | 1049          
joannacollins | Jo Collins            | Sanger      | 6             | 2025-10-10  | Y      | 1           | 1019          | 1049          
diegomics     | Diego De Panis        | IZW         | 11            | 2025-09-22  | Y      | 0           | 992           | 1037          
tbrown91      | Tom Brown             | IZW         | 11            | 2025-07-29  | Y      | 1           | 995           | 1020          

Selected reviewer: Tom Mathers (tommathers)
The decision was based on:
- different institution ('Sanger')
- active ('Y')
- working on 0 PR(s) currently
- highest adjusted calling score in this particular selection (1068)

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 21, 2025

Hi @tommathers, do you agree to review this assembly?
Please reply to this message only with Yes or No by 27-Nov-2025 at 14:29 CET

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 27, 2025

@tommathers Time is out! I will look for the next reviewer on the list :)

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 27, 2025

*****
EAR Reviewer Selection Process
Date: 2025-11-27 21:17

All Eligible Candidates:

Github ID     | Full Name             | Institution | Total Reviews | Last Review | Active | Working PRs | Calling Score | Adjusted Score
-----------------------------------------------------------------------------------------------------------------------------------------
additive3     | Jo Wood               | Sanger      | 8             | 2025-10-17  | Y      | 0           | 1017          | 1062          
jesgomez      | Jessica Gomez Garrido | CNAG        | 12            | 2025-11-26  | Y      | 1           | 1026          | 1056          
gitcruz       | Fernando Cruz         | CNAG        | 13            | 2025-11-17  | Y      | 1           | 1025          | 1055          
MartinPippel  | Martin Pippel         | SciLifeLab  | 2             | 2025-10-06  | Y      | 1           | 1021          | 1051          
gbdias        | Guilherme Dias        | SciLifeLab  | 4             | 2025-11-19  | Y      | 1           | 1019          | 1049          
SarahPelan    | Sarah Pelan           | Sanger      | 6             | 2025-09-16  | Y      | 1           | 1019          | 1049          
joannacollins | Jo Collins            | Sanger      | 6             | 2025-10-10  | Y      | 1           | 1019          | 1049          
tommathers    | Tom Mathers           | Sanger      | 7             | 2025-10-16  | Y      | 1           | 1018          | 1048          
DomAbsolon    | Dom Absolon           | Sanger      | 8             | 2025-11-04  | Y      | 1           | 1017          | 1047          
talioto       | Tyler Alioto          | CNAG        | 8             | 2025-10-21  | Y      | 2           | 1029          | 1039          
diegomics     | Diego De Panis        | IZW         | 11            | 2025-09-22  | Y      | 0           | 992           | 1037          
tbrown91      | Tom Brown             | IZW         | 11            | 2025-07-29  | Y      | 1           | 995           | 1020          

Selected reviewer: Jo Wood (additive3)
The decision was based on:
- different institution ('Sanger')
- active ('Y')
- working on 0 PR(s) currently
- highest adjusted calling score in this particular selection (1062)

@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 27, 2025

Hi @additive3, do you agree to review this assembly?
Please reply to this message only with Yes or No by 04-Dec-2025 at 02:17 CET

@additive3
Collaborator

additive3 commented Nov 28, 2025 via email

@erga-ear-bot erga-ear-bot bot requested a review from additive3 November 28, 2025 13:35
@erga-ear-bot
Contributor

erga-ear-bot bot commented Nov 28, 2025

Thanks for agreeing!
I appointed you as the EAR reviewer.
I will track this as one of your Working PRs until you finish this review.
Please check the Wiki if you need to refresh something. (and remember that you must download the EAR PDF to be able to click on the link to the contact map file!)
Contact the PR assignee for any issues.

@erga-ear-bot
Contributor

erga-ear-bot bot commented Dec 5, 2025

Ping @diegomics,
One week without any movements on this PR!

@erga-ear-bot
Contributor

erga-ear-bot bot commented Dec 13, 2025

Ping @diegomics,
One week without any movements on this PR!

@tbrown91
Collaborator

@additive3 @diegomics Please review. I will have a look now

@tbrown91
Collaborator

There are lots of instances where unloc sequences look an awful lot like haplotigs but don't show the normal drop in coverage (9_unloc_2, 1_unloc_2, 2_unloc_4). I think these would also explain the high number of 2-copy k-mers in the Merqury spectra, so they should be removed.

@erga-ear-bot erga-ear-bot bot removed the STALLED label Jan 12, 2026
@diegomics
Collaborator

Thanks for the ping, Tom, this one slipped through. I'll take a look today.

@additive3
Collaborator

additive3 commented Jan 13, 2026

There is a lot of retained haplotypic duplication.
Some chromosomes are full of it, and much of it is hard to disentangle because of haplotype switching and because it sits in repetitive blocks.
Regarding coverage, much of the issue is hard to see because it is in these repeat blocks, but could the sample be multiple individuals?

Chr1 First 5.4Mb needs removing as haptig, and 5.4-6.96Mb needs flipping; some frag reordering is also required. 12-20Mb is also mixed hap, 40-42.3Mb is hap dup, 47Mb-end needs flipping and the repeats are hap dup. unloc 1, 2, 4 are also hap dup; 3 unsure.

Chr2 6.8-12.7Mb aly hap, 17.14-17.26Mb frag needs (re)moving from its location. Unloc_1 localises to 27.6Mb; it contains unique sequence that is missing from the chromosome, and its hap dup also needs removing. Unloc_2 is a haptig and indicates possible missing sequence from the chromosome at 21Mb. I think the other unlocs are also haptigs.

Chr3 287kb initial frag needs removing, 8.1-16.5Mb is a mixed hap region, and I would suggest 19.1-20.2 needs flipping. There is a suggestion of hap dup along with unloc_3, and a suggestion that unloc 1 needs placing, though it could just be a representation of the other hap. I'd remove all the other unlocs.

Chr5 2.4-3.26Mb needs some rearrangement, possibly haptigs.

Chr6 20-20.9Mb needs removing as hap dup, haptig issues at 24.2Mb, haplotype assembly issues at 26.6Mb.

Chr7 hap issues at 3.4Mb, 29.89-29.96Mb haptig needs removing.

Chr8 8.1-8.26Mb haptig needs removing, 16.94-17Mb haptig needs removing, 22.49-22.55Mb alt hap likely should be removed, 25.33-25.42Mb needs flipping, with a suggestion of hap errors as well. I would probably remove unloc_1.

Chr9 Unloc_1 localises to 15.74Mb and contains unique sequence; it also reveals some haptig that needs removing. Unloc_2 is a haptig and needs removing.

Chr11 6.73Mb hap dup needs removing, 8.69-11.2Mb has retained hap errors and also needs some rearrangement, 17.72-end needs removing as haptig, unloc_1 is a haptig.

A lot of the tail of small contigs looks suspiciously like contamination, which shows in the k-mer profile and blob plot.

@additive3
Collaborator

Ping @phuongdoand @diegomics @tbrown91

@diegomics
Collaborator

Sorry for the delay. Thanks for taking a look, guys.
@phuongdoand maybe you are showing the contact map without the contamination and haplotypic duplications removed?

@phuongdoand
Contributor Author

@tbrown91 I agree that many of the unloc sequences show segments with a haplotype-like signal. However, they don't show a significant coverage drop, which makes it difficult to determine if it's a real haplotype or just a normal duplication.

@diegomics No, the sequences were demultiplexed and decontaminated before manual curation, so the plot should be "clean."

@additive3 Thank you for the detailed comment. I placed many of the unlocs as unloc because they show a hap-like signal but normal coverage, so I was uncertain about removing them as hap dup.

  • Chr1:
    • 5.4-6.96Mb needs to be reversed, I agree
    • 40-42.3Mb hap dup, yes, the signal looks like a hap, but there is no drop in coverage
    • unloc 1, 2, 4, I agree. At first check I suspected them to be haps, but again there is no drop in coverage, which makes me uncomfortable cutting them out as hap dup. I will do so in the next version and check the stats
  • Chr2:
    • 6.8-12.7Mb aly hap, I'm sorry, but what is an aly hap?
    • 17.14-17.26Mb frag need (re)moving from location, done, will be removed
  • Chr3:
    • 287kb initial frag needs removing, agree
    • 8.1-16.5Mb mixed hap region, yes, it does look like one, but I am not sure how to deal with this type of data
    • 19.1-20.2 needs flipping, agree
    • unloc 2, hap
    • unloc 3, hap
    • unloc 4, not quite sure
  • Chr5: 2.4-3.26Mb needs some rearrangement, yes, I agree that the signal is a bit weird
  • Chr6: 20-20.9Mb needs removing as hap dup, will be removed
  • Chr7: 29.89-29.96Mb haptig needs removing, agree, will be removed
  • Chr8:
    • 8.1-8.26Mb haptig needs removing, agree, will be removed
    • 16.94-17, agree, there is a slight coverage drop as well, will be removed
    • 22.49-22.55Mb, will be removed
    • 25.33-25.42Mb, there is a slight coverage drop as well, I will remove it
    • unloc 1 points to a very complex region at 22Mb, I will remove it
  • Chr9:
    • Unloc_1 localises to 15.74Mb, agree. There is also a small hap dup at the end once it is localised; I will remove it.
    • unloc_2, no coverage drop, but I will remove
  • Chr11:
    • 6.73-6.95 will be removed as hap dup
    • 17.72-end needs removing as haptig, unloc_1 is haptig, both will be removed as hap dup

For the other positions, I will take a closer look to see if there is anything else we could do with them.

@diegomics
Collaborator

@phuongdoand please share the savestate

@phuongdoand
Contributor Author

Hi @diegomics , here is the current save state.
tzSchErrt1_hr.pretext.map.savestate_4.zip

@erga-ear-bot
Contributor

erga-ear-bot bot commented Jan 29, 2026

Attention @additive3, the EAR PDF was updated.

@additive3
Collaborator

@phuongdoand @diegomics

Did you produce an updated contact map file?

@additive3
Collaborator

@phuongdoand

I placed many of the unlocs as unloc because they show a hap-like signal but normal coverage, so I was uncertain about removing them as hap dup.

Most of these retained hapdup regions show reduced/haploid coverage
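
As a rough illustration of the coverage argument in this exchange: a retained haplotig is expected to sit at about half the diploid read depth, so comparing a region's depth to the assembly-wide median is one way to check. A minimal sketch, assuming per-region mean depths are already available; the function name, the example depths, and the 0.35-0.65 window are all hypothetical choices, not values from this assembly.

```python
from statistics import median


def haploid_like(region_depths, low=0.35, high=0.65):
    """Return names of regions whose depth is roughly half the median depth.

    region_depths maps a region name to its mean read depth.
    The low/high bounds are illustrative assumptions.
    """
    diploid_depth = median(region_depths.values())
    return sorted(
        name
        for name, depth in region_depths.items()
        if low <= depth / diploid_depth <= high
    )


# Made-up depths for illustration only.
depths = {
    "region_a": 40.0,
    "region_b": 42.0,
    "region_c": 39.0,
    "region_d": 21.0,  # about half the typical depth
    "region_e": 41.0,
}
flagged = haploid_like(depths)
print(flagged)
```

Depth inside repeat blocks can be misleading, as noted earlier in the thread, so a check like this would complement the Hi-C signal rather than replace it.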

@phuongdoand
Contributor Author

@additive3 Yes, I made some changes to the contact map and updated it in the EAR report.

@additive3
Collaborator

additive3 commented Feb 3, 2026

Hi @phuongdoand @diegomics

Ok, I've been through this and removed/marked up haplotypic duplication. It's a bit of a hack job and a compromise, because there are clear flips between the two haplotypes and some of the resulting chromosome scaffolds are a mash of both haplotypes in places, 11 being the worst.
From what I have seen in bryozoan assemblies, these large indels between the haplotypes are common.

The tail of small contigs is also predominantly contamination. I've tagged up some of it, but not all of the contigs that need to be removed.

Savestate attached:
tzSchErrt1_hr.pretext.map.savestate_jmdw.txt

@phuongdoand
Contributor Author

phuongdoand commented Feb 4, 2026

Hi @additive3 ,
Thank you for your reply. I tried to load the savestate with both contact maps (before and after my modification), but it doesn't seem to work with either of them. Could you please take another look?

@additive3
Collaborator

Hi @phuongdoand

I have a feeling it is because you (according to the EAR) are using a very old PretextView version and I've added tags that are not found...

Ideally I would use the latest version, but let me see if I can redo the save and make it compatible.

@phuongdoand
Contributor Author

There might be a mistake in the EAR report PDF generation; we are actually using PretextView 1.0.3. But yes, I will update it to 1.0.5 and recheck.

@additive3
Collaborator

@phuongdoand
Apologies, it's an error on my part: I renamed a file, which is causing your issue. I'll fix and update.

1.0.3/1.0.5 shouldn't throw errors.

@additive3
Collaborator

Hi @phuongdoand

Hopefully this save state works!...once again sorry.
tzSchErrt1_hr.pretext.map.savestate_jmdwV2.txt

Just to reiterate, I've not tagged all the contamination, but you can infer from the Hi-C contact/coverage which contigs are contamination.

Let me know if access problems still arise :)

@phuongdoand
Contributor Author

@additive3
Thank you, it works now. I will check the modifications and let you know if anything needs further discussion.

Remove 20 haplotigs and 560 scaffolds that are contaminated (mostly Chlorophyta)
@erga-ear-bot
Contributor

erga-ear-bot bot commented Feb 13, 2026

Attention @additive3, the EAR PDF was updated.

@phuongdoand
Contributor Author

Hi @additive3, I have updated the EAR report after removing the haplotigs you tagged. We also ran MetaCC v1.2.0 to cluster the scaffolds with Hi-C reads and removed 560 scaffolds that are either Chlorophyta (algae) or Apicomplexa.
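
For anyone reproducing this final clean-up step, dropping a known list of scaffolds from a FASTA can be sketched as below. This assumes the contaminant scaffold names have already been collected into a set (how MetaCC reports its clusters is not shown in this thread); the function name and the example records are hypothetical.

```python
def drop_scaffolds(fasta_lines, names_to_drop):
    """Yield FASTA lines, skipping records whose ID is in names_to_drop."""
    keep = True
    for line in fasta_lines:
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            parts = line[1:].split()
            record_id = parts[0] if parts else ""
            keep = record_id not in names_to_drop
        if keep:
            yield line


# Toy input; real scaffold names and sequences will differ.
fasta = [
    ">scaffold_1 length=8",
    "ACGTACGT",
    ">scaffold_2 length=4",
    "GGCC",
    ">scaffold_3 length=4",
    "TTAA",
]
cleaned = list(drop_scaffolds(fasta, {"scaffold_2"}))
print("\n".join(cleaned))
```

Because the function works on an iterable of lines, it could also be fed an open file handle (with trailing newlines stripped) instead of an in-memory list.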

@additive3
Collaborator

Hi @phuongdoand
Thanks for making these changes, and nice to see that the contaminants could be identified.
I'm happy with this assembly. Nice work, congratulations.

@erga-ear-bot
Contributor

erga-ear-bot bot commented Feb 13, 2026

Thanks @additive3 for the review.
I will add a new reviewed species for you to the table when @diegomics merges the PR ;)

Congrats on the assembly @phuongdoand!
Please make sure that the fasta file to upload to ENA is generated based on the final reviewed version of the assembly.

After confirmation from @diegomics, you can start with the assembly submission to save time.
The PR will be merged only when the final version of the EAR pdf is available.

@diegomics diegomics merged commit aa636a9 into ERGA-consortium:main Feb 13, 2026
2 checks passed