A suggestion on your furrr implementation #4

@DavisVaughan

Thanks a lot for incorporating furrr! It's really great to see it get some love and use in other packages. I have a suggestion on "best practices" for using future-based packages; hopefully you find it useful.

I suggest you remove the future::plan() call from featureSelection(). It is best practice to let the user supply the plan, and the developer only worries about what code is parallelized, not how it is parallelized.

The reason is that by calling plan(multiprocess) inside the function, you limit the user to their local computer for parallel feature selection. future can do much more than this, such as running on EC2 or a remote cluster. Ideally, this is what you'd have:

# by default, future_map() runs sequentially if you don't specify any plan
featureSelection(...)

# runs in parallel on your local computer
plan(multiprocess)
featureSelection(...)

# runs in parallel sharded over a cluster somewhere
plan(cluster)
featureSelection(...)

# runs in parallel on multiple ec2 instances
plan(cluster, workers = ec2_ip_addresses)
featureSelection(...)

# sends x, y, and z each to a node of the cluster AND runs in parallel on those cluster nodes
plan(list(cluster, multiprocess))
future_map(list(x, y, z), ~ featureSelection(.x))

See how many fun things you can do if you let the user specify the plan?
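
On the package side, the change could look roughly like the sketch below. It is only an illustration: feature_selection_sketch() and its features argument are hypothetical stand-ins, since the real signature and body of featureSelection() are not shown in this issue, and mean() stands in for the actual per-feature work. The only point is that the function never calls plan() itself.

```r
library(future)
library(furrr)

# Hypothetical stand-in for featureSelection(); the real arguments and
# per-feature computation are assumptions made for illustration only.
feature_selection_sketch <- function(features) {
  # No plan() call here: future_map() uses whatever plan the caller has set
  # and runs sequentially if the caller never set one.
  future_map(features, function(f) mean(f))
}

# With no plan set by the caller, this runs sequentially.
res <- feature_selection_sketch(list(a = 1:3, b = 4:6))
```

A caller who sets a plan before calling the function gets parallel execution without any change to the function itself.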

Labels: enhancement
