Add onsite module #626
base: dev
This PR is against the
Important
Review skipped
Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the
You can disable this status message by setting the
Walkthrough
This PR replaces the LUCIPHOR tool with ONSITE for phosphorylation site localization across the entire workflow. Changes include adding a new ONSITE module, updating configuration parameters, modifying workflow subcomponents to reference ONSITE outputs, and revising documentation to reflect the new tool.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 2
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- README.md
- conf/modules/modules.config
- conf/modules/verbose_modules.config
- docs/output.md
- modules/local/openms/onsite/main.nf
- modules/local/openms/onsite/meta.yml
- nextflow.config
- nextflow_schema.json
- subworkflows/local/dda_id/main.nf
- subworkflows/local/id/main.nf
- subworkflows/local/phospho_scoring/main.nf
- subworkflows/local/phospho_scoring/meta.yml
🔇 Additional comments (14)
docs/output.md (1)
23-23: LGTM! Documentation correctly updated to reflect the new onsite tool for modification localization.
conf/modules/verbose_modules.config (1)
160-162: LGTM! The verbose module configuration has been correctly updated to reference the ONSITE module and publish to the appropriate directory.
subworkflows/local/phospho_scoring/meta.yml (1)
11-11: LGTM! The component reference has been correctly updated from luciphor to onsite in the phospho_scoring subworkflow metadata.
subworkflows/local/dda_id/main.nf (1)
171-171: LGTM! The output reference has been consistently updated to use id_onsite, matching the change in subworkflows/local/id/main.nf. This ensures uniform handling of ONSITE outputs across both DDA_ID and ID workflows.
README.md (1)
39-39: The onsite repository exists and is publicly accessible. The repository bigbio/onsite exists and was last updated August 27, 2025, confirming the URL referenced in the documentation is valid and publicly available.
subworkflows/local/id/main.nf (1)
67-67: PHOSPHO_SCORING subworkflow correctly emits id_onsite output. Verified that the PHOSPHO_SCORING subworkflow (subworkflows/local/phospho_scoring/main.nf) emits the id_onsite output at line 28. The output reference in line 67 of the current file is valid.
conf/modules/modules.config (1)
112-125: Parameter definition verified and properly configured. The params.onsite_debug parameter is defined in nextflow.config with default value 0 and fully documented in nextflow_schema.json as an integer type with description "Debug level for onsite step. Increase for verbose logging and keeping temp files." No action required.
subworkflows/local/phospho_scoring/main.nf (1)
6-28: LGTM! Clean migration from LUCIPHOR to ONSITE. The workflow correctly replaces LUCIPHOR with ONSITE across both execution branches (multi-engine and single-engine paths). The channel naming, version tracking, and emit declarations are all consistently updated to reference ONSITE outputs.
modules/local/openms/onsite/main.nf (4)
54-67: Target modifications logic is well-structured. The code correctly handles the target modifications for LucXor:
- Parses comma-separated modifications from params.mod_localization
- Adds decoy modification when enabled
- Provides sensible defaults when no modifications are specified
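The three behaviours listed for the target modifications logic can be illustrated with a minimal Python sketch. It is not the module's Groovy code: the function name, the default modification list, and the decoy modification name are assumptions chosen for illustration only.

```python
# Hypothetical values: the actual defaults and decoy name used by the module are not shown in this review.
ASSUMED_DEFAULTS = ("Phospho (S)", "Phospho (T)", "Phospho (Y)")
ASSUMED_DECOY = "PhosphoDecoy (A)"

def build_target_modifications(mod_localization, add_decoy=False,
                               decoy_mod=ASSUMED_DECOY, defaults=ASSUMED_DEFAULTS):
    """Return the list of target modifications for the localization step."""
    # Split on commas, trim whitespace, and drop empty entries.
    mods = [m.strip() for m in (mod_localization or "").split(",") if m.strip()]
    # Fall back to the defaults when nothing usable was supplied.
    if not mods:
        mods = list(defaults)
    # Append the decoy modification once, only when enabled.
    if add_decoy and decoy_mod not in mods:
        mods.append(decoy_mod)
    return mods

print(build_target_modifications("Phospho (S), Phospho (T)", add_decoy=True))
```

Keeping the fallback before the decoy step means the decoy is also added when the defaults are used, which is one reasonable reading of the behaviour described above.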
13-16: Output declarations are correct. The output patterns correctly use wildcards to match algorithm-specific filenames, and the emit name ptm_in_id_onsite aligns with the workflow's expectations.
95-98: Verify the version extraction command matches the onsite tool's output format. The command at line 97 extracts version using grep -oP 'version \\K[0-9.]+'. This pattern assumes onsite --version outputs text containing the word "version" followed by dot-separated numbers. If the tool's output format differs (e.g., "onsite X.Y.Z" without the "version" prefix), the regex will fail and the fallback "0.0.1" will always be recorded.
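The failure mode described for the version extraction can be reproduced with a small Python sketch. Python's re module has no \K, so a capture group stands in for it; the two sample output strings are hypothetical, since the actual output of onsite --version is not shown here.

```python
import re

FALLBACK_VERSION = "0.0.1"  # fallback value mentioned in the review comment

def extract_version(tool_output, fallback=FALLBACK_VERSION):
    """Mimic grep -oP 'version \\K[0-9.]+' with a fallback when nothing matches."""
    # The capture group returns only the digits and dots after "version ".
    match = re.search(r"version ([0-9.]+)", tool_output)
    return match.group(1) if match else fallback

# Hypothetical outputs: one with the "version" prefix, one without it.
print(extract_version("onsite, version 1.2.3"))
print(extract_version("onsite 1.2.3"))
```

With the second string the pattern finds no match, so the fallback is recorded even though a real version number is present in the output.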
32-32: Add validation to ensure meta.dissociationmethod is populated when using SDRF input. When using SDRF, the DissociationMethod column is read without validation. If this column is missing, null, or contains an unrecognized value that doesn't match the known dissociation types (CID, HCD, ETD, ECD), meta.dissociationmethod could become null. Although params.onsite_fragment_method has a default fallback to 'CID' at line 32, other modules like luciphor and msgf use meta.dissociationmethod directly (see lines 44 and 80 in their respective main.nf files), making them susceptible to null values. Add a null-check or default value in the input channel creation to ensure this field is always populated.
nextflow.config (1)
147-156: Parameter configuration is well-structured. The onsite parameters are properly defined with sensible defaults. The algorithm, fragmentation method, and tolerance settings align with the module implementation.
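The null-check suggested above for meta.dissociationmethod could look roughly like the following Python sketch. The real fix would live in the pipeline's Groovy input channel code; the function name and the exact matching rules here are assumptions, and only the four method names and the 'CID' fallback come from the review comment.

```python
KNOWN_METHODS = {"CID", "HCD", "ETD", "ECD"}  # dissociation types named in the review

def normalize_dissociation_method(value, default="CID"):
    """Return a known dissociation method, or the default for missing/unknown input."""
    if value is None:
        return default
    # Tolerate surrounding whitespace and lowercase spellings.
    cleaned = str(value).strip().upper()
    return cleaned if cleaned in KNOWN_METHODS else default

print(normalize_dissociation_method(None))
print(normalize_dissociation_method(" hcd "))
print(normalize_dissociation_method("unrecognized"))
```

Applying the normalization once, where the metadata map is built, would mean every downstream module sees a populated value instead of each module adding its own fallback.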
nextflow_schema.json (1)
433-504: Schema updates are comprehensive and consistent. The JSON schema properly defines all onsite parameters with:
- Correct types and constraints (enums, number ranges)
- Default values matching nextflow.config
- Appropriate help text and UI metadata
- Proper field visibility settings
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
    'https://depot.galaxyproject.org/singularity/pyonsite:0.0.1--pyhdfd78af_0' :
    'quay.io/biocontainers/pyonsite:0.0.1--pyhdfd78af_0' }"
🧩 Analysis chain
🏁 Script executed:
# Search for pyonsite references in the codebase
find . -type f -name "*.nf" | head -20
Repository: bigbio/quantms
Length of output: 956
🏁 Script executed:
# Check if the file exists and show its content
if [ -f "modules/local/openms/onsite/main.nf" ]; then
cat -n "modules/local/openms/onsite/main.nf" | head -20
fi
# Also search for any pyonsite references in the repo
rg "pyonsite" -t nextflow 2>/dev/null || echo "No pyonsite references found with ripgrep"
Repository: bigbio/quantms
Length of output: 920
The pyonsite:0.0.1--pyhdfd78af_0 container does not appear to exist in public registries and will cause runtime failures.
Verify that this container is available in both Quay.io (quay.io/biocontainers/pyonsite:0.0.1--pyhdfd78af_0) and the Galaxy Singularity depot before deploying. If the package is in Bioconda, ensure it has been released and the BioContainers images have been built. If not yet released, this process module will fail at execution time.
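To make clear which image each execution mode would try to pull, the ternary in the container directive can be mirrored in a short Python sketch. The function name is hypothetical; the two image references are copied from the directive. The sketch only selects a reference and does not check whether the image exists in either registry.

```python
SINGULARITY_IMAGE = "https://depot.galaxyproject.org/singularity/pyonsite:0.0.1--pyhdfd78af_0"
DOCKER_IMAGE = "quay.io/biocontainers/pyonsite:0.0.1--pyhdfd78af_0"

def select_container(container_engine, singularity_pull_docker_container=False):
    """Mirror the directive: Singularity depot image unless Docker pulling is forced."""
    if container_engine == "singularity" and not singularity_pull_docker_container:
        return SINGULARITY_IMAGE
    # Every other engine, and Singularity with the pull flag set, uses the Quay.io image.
    return DOCKER_IMAGE

print(select_container("singularity"))
print(select_container("docker"))
```

Both references point at the same pyonsite build, so if that build has not been published, either branch would fail at execution time.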
name: onsite
description: PTM site localization using onsite (AScore, PhosphoRS, or LucXor).
keywords:
  - onsite
  - AScore
  - PhosphoRS
  - LucXor
  - modification
  - PTM
tools:
  - onsite:
      description: |
        Mass spectrometry post-translational modification localization tool
      homepage: https://github.com/bigbio/onsite
      documentation: https://github.com/bigbio/onsite
input:
  - meta:
      type: map
      description: Groovy Map containing sample information
  - mzml_file:
      type: file
      description: Input spectrum file.
      pattern: "*.mzML"
  - id_file:
      type: file
      description: Protein/peptide identifications file
      pattern: "*.idXML"
output:
  - meta:
      type: map
      description: Groovy Map containing sample information
  - out_id_ptmscores:
      type: file
      description: Protein/peptide identifications file with scored PTM sites
      pattern: "*.idXML"
  - log:
      type: file
      description: log file
      pattern: "*.log"
  - version:
      type: file
      description: File containing software version
      pattern: "versions.yml"
authors:
  - "@weizhongchun"
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify onsite module implementation exists
echo "=== Checking if onsite module main.nf exists ==="
fd -t f 'main.nf' modules/local/openms/onsite/
echo -e "\n=== Verifying output channel names in onsite module ==="
if [ -f modules/local/openms/onsite/main.nf ]; then
rg -n 'emit:' modules/local/openms/onsite/main.nf
echo -e "\n=== Full onsite main.nf content ==="
cat modules/local/openms/onsite/main.nf
else
echo "Module file not found"
fi
echo -e "\n=== Checking if phospho_scoring subworkflow exists ==="
fd -t f 'main.nf' subworkflows/local/phospho_scoring/
echo -e "\n=== Verifying phospho_scoring subworkflow uses correct output names ==="
if [ -f subworkflows/local/phospho_scoring/main.nf ]; then
rg -n 'ONSITE' subworkflows/local/phospho_scoring/main.nf
else
echo "Subworkflow file not found"
fi
Repository: bigbio/quantms
Length of output: 4894
Update meta.yml output names to match main.nf emit declarations.
The meta.yml file declares output out_id_ptmscores, but the corresponding main.nf module emits the channel as ptm_in_id_onsite. Update the meta.yml to declare:
- ptm_in_id_onsite instead of out_id_ptmscores
The phospho_scoring subworkflow correctly uses ONSITE.out.ptm_in_id_onsite, confirming the main.nf implementation is correct and the meta.yml documentation is out of sync.
🤖 Prompt for AI Agents
modules/local/openms/onsite/meta.yml lines 1-45: the outputs section declares
out_id_ptmscores but the module actually emits ptm_in_id_onsite; update the
outputs key name to ptm_in_id_onsite (keep the same type, description and
pattern values), replacing the out_id_ptmscores entry so the meta.yml matches
main.nf and the phospho_scoring subworkflow.
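A mismatch like this one between main.nf emit names and meta.yml output keys could be caught with a small consistency check. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, the parsing is a simple line-based approximation rather than a real Nextflow or YAML parser, and the two sample texts are shortened stand-ins, not the actual files from this PR.

```python
import re

def emitted_names(main_nf_text):
    """Collect channel names declared with 'emit: <name>'."""
    return set(re.findall(r"emit:\s*(\w+)", main_nf_text))

def declared_outputs(meta_yml_text):
    """Collect '- <name>:' entries listed under the top-level 'output:' key."""
    names, in_output = set(), False
    for line in meta_yml_text.splitlines():
        # An unindented 'key:' line starts a new top-level section.
        if re.match(r"[A-Za-z_]\w*:", line):
            in_output = line.startswith("output:")
            continue
        entry = re.match(r"\s*-\s*(\w+):\s*$", line)
        if in_output and entry:
            names.add(entry.group(1))
    return names

def compare(main_nf_text, meta_yml_text, ignore=("meta",)):
    """Report emits missing from meta.yml and meta.yml outputs that are never emitted."""
    emitted = emitted_names(main_nf_text)
    declared = declared_outputs(meta_yml_text) - set(ignore)
    return {"undocumented": sorted(emitted - declared),
            "stale": sorted(declared - emitted)}

# Hypothetical, shortened stand-ins for the two files.
MAIN_NF = """
output:
tuple val(meta), path("*.idXML"), emit: ptm_in_id_onsite
path "*.log", emit: log
"""
META_YML = """
output:
  - meta:
      type: map
  - out_id_ptmscores:
      type: file
  - log:
      type: file
authors:
  - "@someone"
"""

print(compare(MAIN_NF, META_YML))
```

Running such a check in CI would flag the renamed output before review, at the cost of maintaining a parser that only approximates both file formats.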