[Core feature] map_task to support ContainerTask #6277

@litaifang

Motivation: Why do you think this is important?

Being able to run a task in any arbitrary Docker image (i.e., a ContainerTask) is an important feature in many bioinformatics workflow schemes. However, ContainerTask is currently not supported by map tasks.
Two earlier pull requests tried to add this support, but neither was merged, so the capability never made it into Flyte.
It would be great to be able to use map_task to run ContainerTasks in parallel.

Goal: What should the final outcome look like, ideally?

Something like this should work as intended:

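from flytekit import ContainerTask, dynamic, kwtypes, map_task

@dynamic
def map_task_of_container_task(...) -> None:
    # Any container image and shell command, wrapped as a Flyte task.
    inner_task = ContainerTask(
        name="inner-task-bring-your-own-container",
        input_data_dir="/var/inputs",
        output_data_dir="/var/outputs",
        inputs=kwtypes(...),
        outputs=kwtypes(...),
        image="docker.io/ubuntu:latest",
        command=["sh", "-c", ...],
    )
    # Desired behavior: fan the container task out over a list of inputs.
    map_task(inner_task)(map_input=[...])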

Describe alternatives you've considered

Alternatively, I can just loop over the ContainerTask inside a @dynamic workflow, but that doesn't have the benefits of a map_task.

Propose: Link/Inline OR Additional context

Two pull requests attempted to add this feature, but neither was merged.

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
