When we had the original discussions about a new workflow format to replace the de facto standard `ocrd process` syntax in the core implementation, there was a general understanding that the spec must not fall short of the following features (met by the implementation):
- declarative – workflows contain no program code (thus are easy to understand and maintain)
- universal – workflows can be formulated independent of the installation details (e.g. paths)
- pure – workflows can be formulated independent of the data (e.g. paths)
- well-defined – workflows can be validated without actually running them
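To make the "well-defined" criterion concrete, here is a minimal sketch of validating a workflow without running it. It assumes a hypothetical representation in which each step names only a processor and its input and output fileGrps (no paths, no installation details); the `Step` class and `validate` function are illustrative and not part of any existing spec or of core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One workflow step (hypothetical structure, for illustration only)."""
    processor: str      # executable name, e.g. "ocrd-cis-ocropy-binarize"
    inputs: tuple       # input fileGrp names
    outputs: tuple      # output fileGrp names


def validate(steps, initial_groups):
    """Check fileGrp dependencies statically and return a list of error messages."""
    available = set(initial_groups)
    errors = []
    for index, step in enumerate(steps):
        for grp in step.inputs:
            if grp not in available:
                errors.append(
                    f"step {index} ({step.processor}): input fileGrp {grp!r} "
                    "is not produced by any earlier step"
                )
        for grp in step.outputs:
            if grp in available:
                errors.append(
                    f"step {index} ({step.processor}): output fileGrp {grp!r} "
                    "already exists"
                )
        available.update(step.outputs)
    return errors


good = [
    Step("ocrd-cis-ocropy-binarize", ("OCR-D-IMG",), ("OCR-D-BIN",)),
    Step("ocrd-tesserocr-recognize", ("OCR-D-BIN",), ("OCR-D-OCR",)),
]
bad = [
    Step("ocrd-tesserocr-recognize", ("OCR-D-BIN",), ("OCR-D-OCR",)),
]
print(validate(good, {"OCR-D-IMG"}))
print(validate(bad, {"OCR-D-IMG"}))
```

Because the steps refer to fileGrp names only, the same description stays independent of both the installation and the data.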
However, as it stands, the Workflow Format spec does not seem to meet these criteria. It raises questions …
- Why is it necessary to specify the `venv` (and even de/activate it) and the `workspace`/`mets` path in NF?
- Why are `reads` and `outs` formulated as absolute paths in NF (instead of just fileGrp names)?
- Where is the actual METS perspective (instead of just filesystem artifacts like output directories, which can be empty or incomplete or simply not reflected in the METS at all)? Shouldn't it be possible to formulate Channels and Processes in a way that progress gets reflected by the actual METS/PAGE results?
- Why do output fileGrps have to be explicitly (manually) named (instead of using NF's pipe operator)?
- How do you continue processing a workflow on a workspace after it has failed earlier, or after another workflow with some shared steps has already run on it (i.e. incremental processing)?
- How can you monitor job status and access job logs? (Is the NF call meant to be combined with `-with-tower` or `-with-report` arguments? If so, how does the caller get to know which job is which during runtime?)
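As a rough illustration of what a METS perspective on incremental processing could look like, the sketch below reads the fileGrps that actually contain files from a METS document and skips steps whose outputs are already present. The step tuples and both function names are hypothetical, and treating a non-empty fileGrp as "done" is a simplifying assumption: a real check would have to compare pages, since a fileGrp can be incomplete.

```python
import xml.etree.ElementTree as ET

# Official METS namespace
NS = {"mets": "http://www.loc.gov/METS/"}


def populated_file_groups(mets_xml):
    """Return the USE names of all fileGrps that contain at least one file."""
    root = ET.fromstring(mets_xml)
    return {
        grp.get("USE")
        for grp in root.iterfind(".//mets:fileGrp[@USE]", NS)
        if grp.find("mets:file", NS) is not None
    }


def remaining_steps(steps, present):
    """Keep only steps whose output fileGrp is not yet populated.

    Each step is a (processor, input_group, output_group) tuple.
    """
    return [step for step in steps if step[2] not in present]


METS = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:fileSec>
    <mets:fileGrp USE="OCR-D-IMG"><mets:file ID="IMG_0001"/></mets:fileGrp>
    <mets:fileGrp USE="OCR-D-BIN"><mets:file ID="BIN_0001"/></mets:fileGrp>
    <mets:fileGrp USE="OCR-D-OCR"/>
  </mets:fileSec>
</mets:mets>"""

steps = [
    ("ocrd-cis-ocropy-binarize", "OCR-D-IMG", "OCR-D-BIN"),
    ("ocrd-tesserocr-recognize", "OCR-D-BIN", "OCR-D-OCR"),
]
present = populated_file_groups(METS)
print(remaining_steps(steps, present))
```

Here the empty `OCR-D-OCR` group does not count as a result, so only the recognition step would be rerun.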
I understand that you tried to apply Nextflow to the OCR-D CLI directly. But currently I don't see a benefit over running the shell scripts directly (from a custom Workflow executor in core).