Flexecutor is a Python framework for building and optimizing serverless workflows. It extends Lithops by providing a high-level workflow abstraction (DAG-based) and integrating multiple state-of-the-art smart provisioning strategies. This allows users to deploy parallel workloads on serverless platforms without manually tuning resource configurations.

    git clone https://github.com/CLOUDLAB-URV/flexecutor.git
    cd flexecutor
    pip install .

- Cloud-agnostic: Runs on any Lithops-supported backend (AWS Lambda, Azure Functions, GCP Functions, IBM Code Engine, Kubernetes, etc.).
- DAG Workflow Model: Define multi-stage workflows declaratively.
- Automatic Data Handling: Input and output data are managed through object storage, with customizable distribution strategies.
- Smart Provisioning: Choose resource configurations based on time, cost, or a trade-off.
- Extensible Design: Integrate new provisioning strategies without modifying workflow logic.

| Concept | Description |
|---|---|
| `FlexData` | Defines input/output data and how it is distributed to workers. |
| `Stage` | Unit of computation that processes inputs and produces outputs. |
| `DAG` | Workflow composed of dependent stages. |
| `DAGExecutor` | Executes, profiles, trains, optimizes and runs workflows. |
| `Scheduler` | Algorithm that selects resource configurations (workers, CPU, memory). |

A Flexecutor application consists of three main components:
- User functions: Python callables that perform your computation
- Data definitions: `FlexData` objects specifying input/output locations
- Workflow graph: `DAG` and `Stage` objects defining execution order

Create Python functions that accept a `StageContext` parameter.
This context provides access to input files, output paths, and parameters (`get_input_paths()`, `next_output_path()`, `get_param()`).
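
A minimal sketch of what such a user function can look like. To stay runnable without a cloud backend, it uses `FakeStageContext`, a hypothetical local stand-in written for this example; it is not Flexecutor's `StageContext`. Only the three method names come from the text above. Their argument lists (a data identifier, a parameter name), the `"txt"` and `"count"` identifiers, and the `min_word_length` parameter are assumptions.

```python
import os
import tempfile


class FakeStageContext:
    """Local stand-in for StageContext (NOT the Flexecutor class).

    Method names follow the README; their signatures are assumed.
    """

    def __init__(self, inputs, out_dir, params=None):
        self._inputs = inputs        # data id -> list of local file paths
        self._out_dir = out_dir
        self._params = params or {}
        self._counter = 0

    def get_input_paths(self, data_id):
        return list(self._inputs[data_id])

    def next_output_path(self, data_id):
        # Hand out a fresh path on every call so outputs never collide.
        path = os.path.join(self._out_dir, f"{data_id}_{self._counter}.txt")
        self._counter += 1
        return path

    def get_param(self, name):
        return self._params[name]


def word_count(ctx):
    """User function: write one word count per input file."""
    min_len = ctx.get_param("min_word_length")
    for in_path in ctx.get_input_paths("txt"):
        with open(in_path, encoding="utf-8") as f:
            words = [w for w in f.read().split() if len(w) >= min_len]
        with open(ctx.next_output_path("count"), "w", encoding="utf-8") as f:
            f.write(str(len(words)))


def run_demo():
    """Run word_count against a temporary file and return the count."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.txt")
        with open(src, "w", encoding="utf-8") as f:
            f.write("to be or not to be that is the question")
        ctx = FakeStageContext({"txt": [src]}, tmp, {"min_word_length": 3})
        word_count(ctx)
        with open(os.path.join(tmp, "count_0.txt"), encoding="utf-8") as f:
            return int(f.read())
```

The function body only talks to the context object, so the same logic can be exercised locally with a stand-in and later handed to a real stage.
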
Wrap your main function with the `@flexorchestrator` decorator.
This initializes the execution environment and configures the bucket and execution path.
Define your input and output data using `FlexData`.
- `StrategyEnum.SCATTER`: split files across workers
- `StrategyEnum.BROADCAST`: distribute all files to every worker

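
The difference between the two strategies can be sketched in plain Python. The functions `scatter` and `broadcast` below are hypothetical helpers written for illustration, not Flexecutor code. The round-robin split is an assumed policy; the library may partition files differently.

```python
def scatter(files, num_workers):
    """Split files across workers (round-robin is an assumed policy)."""
    return [files[i::num_workers] for i in range(num_workers)]


def broadcast(files, num_workers):
    """Give every worker its own copy of the complete file list."""
    return [list(files) for _ in range(num_workers)]


files = ["part-0.csv", "part-1.csv", "part-2.csv", "part-3.csv", "part-4.csv"]
scattered = scatter(files, 2)     # each file goes to exactly one worker
replicated = broadcast(files, 2)  # each worker sees all five files
```

Scattering suits data-parallel inputs, while broadcasting suits shared inputs such as a lookup table that every worker needs.
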
Build `Stage` objects that bind user functions to input/output `FlexData`.
Each stage needs:
- A `stage_id`
- A function
- Lists of input and output `FlexData`
- Optional parameters

Create a `DAG` container and add your stages.
Use the `>>` operator to define dependencies.
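
As an illustration of how a `>>` dependency operator is commonly built in Python, here is a self-contained sketch using `__rshift__`. The `Stage` class below is a hypothetical stand-in that models only dependency bookkeeping; it is not Flexecutor's `Stage`. It assumes that `a >> b` means "`b` runs after `a`".

```python
class Stage:
    """Stand-in node; only the dependency bookkeeping is modelled."""

    def __init__(self, stage_id):
        self.stage_id = stage_id
        self.parents = []
        self.children = []

    def __rshift__(self, other):
        # a >> b records that b depends on a; returning b allows chaining.
        self.children.append(other)
        other.parents.append(self)
        return other


def topological_order(stages):
    """Kahn's algorithm: emit a stage once all its parents were emitted."""
    pending = {s: len(s.parents) for s in stages}
    ready = [s for s in stages if pending[s] == 0]
    order = []
    while ready:
        stage = ready.pop(0)
        order.append(stage.stage_id)
        for child in stage.children:
            pending[child] -= 1
            if pending[child] == 0:
                ready.append(child)
    if len(order) != len(stages):
        raise ValueError("dependency cycle detected")
    return order


split, map_a, map_b, reduce_ = (
    Stage(name) for name in ("split", "map_a", "map_b", "reduce")
)
split >> map_a >> reduce_   # chain: split -> map_a -> reduce
split >> map_b >> reduce_   # second branch joining at reduce
order = topological_order([split, map_a, map_b, reduce_])
```

Because `__rshift__` returns its right-hand operand, a chain such as `split >> map_a >> reduce_` reads left to right in execution order.
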
Choose a smart provisioning strategy depending on your optimization goal.
You will pass the scheduler instance to the `DAGExecutor`.
See the `Caerus`, `Ditto`, `Orion`, and `Jolteon` classes and review the input data each scheduler expects.
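
To make the idea of an optimization goal concrete, the sketch below picks a configuration by time, by cost, or by a weighted trade-off. It is a generic illustration and does not reproduce Caerus, Ditto, Orion, or Jolteon. The function `pick_config`, the min-max normalization, and the prediction values are all assumptions made up for this example.

```python
def pick_config(predictions, objective="time", time_weight=0.5):
    """Return the config name that minimises the chosen objective.

    predictions maps a config name to (seconds, cost). For "tradeoff",
    both metrics are min-max normalised and blended with time_weight.
    """
    if objective == "time":
        return min(predictions, key=lambda c: predictions[c][0])
    if objective == "cost":
        return min(predictions, key=lambda c: predictions[c][1])
    if objective != "tradeoff":
        raise ValueError(f"unknown objective: {objective}")

    times = [t for t, _ in predictions.values()]
    costs = [c for _, c in predictions.values()]

    def norm(value, lo, hi):
        return 0.0 if hi == lo else (value - lo) / (hi - lo)

    def score(name):
        t, c = predictions[name]
        return (time_weight * norm(t, min(times), max(times))
                + (1 - time_weight) * norm(c, min(costs), max(costs)))

    return min(predictions, key=score)


# Illustrative, made-up predictions: (seconds, cost in dollars).
predictions = {"4w": (40.0, 0.010), "16w": (14.0, 0.014), "64w": (9.0, 0.036)}
fastest = pick_config(predictions, "time")
cheapest = pick_config(predictions, "cost")
balanced = pick_config(predictions, "tradeoff")
```

With these made-up numbers the three objectives each select a different configuration, which is why the choice of scheduler objective matters.
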
Flexecutor provides a provisioning loop that evaluates execution configurations, learns performance models, and selects the optimal configuration automatically. The loop consists of:
- Profile: Run the workflow under different configurations and collect metrics.
- Train: Learn a performance model from the profiling data.
- Optimize: Compute the best configuration according to the scheduler objective.
- Execute: Run the workflow using the optimal configuration.
Example:
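
    # `executor` is the DAGExecutor created earlier with your DAG and scheduler.
    config_space = [
        {"process": {"workers": 4, "cpu": 1, "memory": 1024}},
        {"process": {"workers": 16, "cpu": 1, "memory": 1024}},
    ]
    executor.profile(config_space)  # run each configuration and collect metrics
    executor.train()                # learn a performance model
    executor.optimize()             # compute the best configuration
    executor.execute()              # run the workflow with that configuration
    executor.shutdown()

Call `shutdown()` on the executor before your program exits.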
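
The four phases can also be traced end to end in a toy, self-contained sketch. Nothing here is Flexecutor's internal model: `simulated_run` is a synthetic stand-in for real executions, the performance model `t(w) = a + b / w` is an assumed shape, and the deadline-based `optimize` is one possible objective among many.

```python
def simulated_run(workers):
    """Synthetic stand-in for a real execution; returns seconds."""
    return 2.0 + 64.0 / workers


def profile(config_space):
    """Profile: run every configuration and record (workers, seconds)."""
    return [(cfg["workers"], simulated_run(cfg["workers"])) for cfg in config_space]


def train(samples):
    """Train: least-squares fit of t(w) = a + b / w (assumed model shape)."""
    xs = [1.0 / w for w, _ in samples]
    ys = [t for _, t in samples]
    n = len(samples)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b


def optimize(model, candidates, deadline):
    """Optimize: fewest workers whose predicted time meets the deadline."""
    a, b = model
    feasible = [w for w in candidates if a + b / w <= deadline]
    return min(feasible) if feasible else max(candidates)


config_space = [{"workers": 4}, {"workers": 16}]
model = train(profile(config_space))
best_workers = optimize(model, [4, 8, 16, 32, 64], deadline=5.0)
final_time = simulated_run(best_workers)  # Execute with the chosen config
```

Note that the chosen worker count was never profiled directly; the fitted model is what lets the loop evaluate configurations outside the profiled set.
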
Flexecutor is presented in the accepted paper “Flexecutor: Out-of-the-Box Smart Provisioning for Serverless Workflows”. To cite it:

    @inproceedings{molina2025flexecutor,
      title     = {Flexecutor: Out-of-the-Box Smart Provisioning for Serverless Workflows},
      author    = {Molina-Gim{\'e}nez, Enrique and Barcelona-Pons, Daniel and Iacoponelli, Octavio H. and Garc{\'i}a-L{\'o}pez, Pedro},
      booktitle = {Proceedings of the 11th Workshop of Serverless Computing (WoSC'25)},
      year      = {2025},
      location  = {Nashville, TN, USA},
      publisher = {ACM},
      address   = {New York, NY, USA},
      pages     = {1--6},
      doi       = {10.1145/3774899.3775013},
      url       = {https://github.com/CLOUDLAB-URV/flexecutor}
    }