Draft: pythonic run_module(**kwargs) with auto-grouping #18

Draft
mathysgrapotte wants to merge 5 commits into master from feat/pythonic-api

@mathysgrapotte
Owner

@mathysgrapotte mathysgrapotte commented Jan 31, 2026

This PR is a draft / first pass at a more Pythonic nf-core module API.

Goal

  • Enable the call style:
    • result = pynf.run_module("<module_id>", param1=..., param2=...)
  • Where param* are module inputs, and py-nf auto-maps them onto Nextflow input channels.

Summary of code changes (by file)

src/pynf/__init__.py

  • Re-purposed pynf.run_module(...) to mean: run an nf-core module via a keyword-arg API.
    • New signature:
      • run_module(module_id: str, *, executor="local", docker=False, cache_dir=..., github_token=None, params=None, verbose=False, force_download=False, **inputs) -> ModuleResult
  • Added ModuleResult (a NamedTuple) as the “tuple-like” return type:
    • output_files: list[str]
    • workflow_outputs: list[dict]
    • report: dict
    • raw: NextflowResult
  • Implementation details:
    • Uses api.get_module_inputs(...) to introspect expected input channels
    • Calls auto_group_inputs(...) to convert **inputs → ExecutionRequest.inputs (list-of-dicts)
    • Runs the module using existing run_nfcore_module(...)
    • Wraps the underlying NextflowResult into ModuleResult for easy unpacking + still exposing raw

Design choice: ModuleResult keeps the “pythonic tuple” feel without losing access to the full NextflowResult.
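
A minimal sketch of the return type described above, for illustration only. The four field names come from this PR; the NextflowResult class here is a hypothetical stand-in (the real one lives in pynf and its attributes may differ), and wrap_result is an assumed helper name, not something this PR defines.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


@dataclass
class NextflowResult:
    """Hypothetical stand-in for pynf's NextflowResult (attributes assumed)."""

    output_files: list[str] = field(default_factory=list)
    workflow_outputs: list[dict] = field(default_factory=list)
    report: dict = field(default_factory=dict)


class ModuleResult(NamedTuple):
    """Tuple-like result: unpackable, but still exposes the raw result."""

    output_files: list[str]
    workflow_outputs: list[dict]
    report: dict
    raw: NextflowResult


def wrap_result(raw: NextflowResult) -> ModuleResult:
    """Assumed helper: copy the convenient fields and keep a reference to raw."""
    return ModuleResult(
        output_files=list(raw.output_files),
        workflow_outputs=list(raw.workflow_outputs),
        report=dict(raw.report),
        raw=raw,
    )


# Usage: positional unpacking and attribute access both work.
raw = NextflowResult(
    output_files=["out/sample.html"],
    workflow_outputs=[{"html": "out/sample.html"}],
    report={"completed": 1},
)
result = wrap_result(raw)
output_files, workflow_outputs, report, raw_result = result
```

Because a NamedTuple is an ordinary tuple underneath, callers who only want the files can write `files, *_ = result`, while `result.raw` remains available for anything the convenience fields do not cover.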

src/pynf/_core/pythonic_api.py (new)

  • Adds the auto-grouping algorithm that maps keyword args onto channels.
  • Core behavior:
    • Unknown kwarg names → error (lists valid names)
    • If a kwarg name matches multiple channels, we try a constrained backtracking assignment
    • If there is no unambiguous assignment → error (no silent guessing)
    • After assignment, validate that every channel has all required param names

Design choice: this is “constrained magic” — ambiguity is an error, because guessing would be too surprising.

tests/test_pythonic_api_grouping.py (new)

  • Unit tests for grouping logic without running Nextflow.
  • Uses monkeypatch to stub:
    • pynf.api.get_module_inputs(...)
    • pynf.run_nfcore_module(...)
  • Covers:
    • single-channel success
    • unknown kwarg error
    • ambiguous mapping error

template/pynf/__init__.pyi

  • Updated stubs to match the new public API:
    • ModuleResult
    • pythonic run_module(...) signature

template/pynf/_core/pythonic_api.pyi (new)

  • Stub for grouping helpers.

README.md

  • Updated examples that previously used run_module("path/to/script.nf") to use run_script(...).
    • (Since run_module now means “run nf-core module by id”.)

Design / tradeoffs

  • Strictness over surprising heuristics:
    • If **inputs can’t be mapped uniquely to the module input channels, we raise and tell users to pass explicit grouped inputs (via ExecutionRequest.inputs).
  • Returns are tuple-like but still rich:
    • You can unpack ModuleResult, but you can also inspect raw (the underlying NextflowResult).
  • No execution changes:
    • This PR only adds a nicer front-door and grouping; it reuses the existing execution machinery.

Follow-ups / open questions

  • Should we keep the old run_module(path: Path) name somewhere (e.g. run_script alias), or is renaming acceptable?
  • Should we expose an explicit run_module_groups(module_id, groups=[...]) helper as the non-magic escape hatch?
  • Docker config currently defaults to quay.io when docker=True; we might want a more explicit docker_config= option here.

@mathysgrapotte
Owner Author

Pushed a readability-focused refactor of the kwargs auto-grouping (src/pynf/_core/pythonic_api.py).

Main simplifications:

  • Removed the backtracking / recursive assignment logic.
  • New rule: if the module schema itself is ambiguous (same input name appears in multiple channels), we raise immediately and require explicit ExecutionRequest.inputs.
  • Straight mapping: name -> channel index, then fill groups[index][name].
  • Kept the same validations (unknown keys, missing required inputs).

This trades away “clever” mapping in favor of explicit + predictable behavior.
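
The simplified rules above can be sketched roughly as follows. This is not the code in src/pynf/_core/pythonic_api.py: the function name group_inputs_sketch is hypothetical, and the schema shape (one list of input names per channel) is an assumption about what api.get_module_inputs(...) yields.

```python
from typing import Any


def group_inputs_sketch(
    channels: list[list[str]], inputs: dict[str, Any]
) -> list[dict[str, Any]]:
    """Map keyword inputs onto channels; returns one dict per channel.

    `channels` is an assumed schema shape: the input names of each channel.
    """
    # Build name -> channel index; an ambiguous schema raises immediately.
    name_to_channel: dict[str, int] = {}
    for index, names in enumerate(channels):
        for name in names:
            if name_to_channel.get(name, index) != index:
                raise ValueError(
                    f"Input name {name!r} appears in multiple channels; "
                    "pass explicit ExecutionRequest.inputs instead."
                )
            name_to_channel[name] = index

    # Unknown keys are an error that lists the valid names.
    unknown = sorted(set(inputs) - set(name_to_channel))
    if unknown:
        raise ValueError(
            f"Unknown input(s) {unknown}; valid names: {sorted(name_to_channel)}"
        )

    # Straight mapping: fill groups[index][name].
    groups: list[dict[str, Any]] = [{} for _ in channels]
    for name, value in inputs.items():
        groups[name_to_channel[name]][name] = value

    # Every channel must have received all of its input names.
    for index, names in enumerate(channels):
        missing = [name for name in names if name not in groups[index]]
        if missing:
            raise ValueError(f"Channel {index} is missing input(s) {missing}")
    return groups


# Usage with a made-up two-channel schema.
example_channels = [["meta", "reads"], ["fasta"]]
example_groups = group_inputs_sketch(
    example_channels,
    {"fasta": "ref.fa", "meta": {"id": "s1"}, "reads": ["r1.fq"]},
)
```

The resulting list-of-dicts has the same shape that explicit grouped inputs would take, so the escape hatch and the keyword path end up producing the same kind of value.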
