@kagrawa2 kagrawa2 commented Mar 24, 2025

As part of this PR, we are migrating the Task_1 challenge to the Workflow API interface. For more details on the Workflow API, see: https://openfl.readthedocs.io/en/latest/about/features_index/workflowinterface.html

We are running the experiment with LocalRuntime (https://openfl.readthedocs.io/en/latest/about/features_index/workflowinterface.html#localruntime).

Python 3.10 to 3.13 are now the supported versions, and OpenFL is upgraded to 1.7.1.
Accordingly, we have upgraded GaNDLF to version 0.1.0.
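
For anyone setting up an environment for this branch, a guard like the following can catch an unsupported interpreter early. This is only a sketch: the version bounds come from the description above, while the function name `is_supported_python` is hypothetical and not part of this PR, OpenFL, or GaNDLF.

```python
import sys

# Bounds taken from the PR description: Python 3.10 through 3.13.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 13)


def is_supported_python(version=None):
    """Return True if (major, minor) lies within the supported range.

    `version` is any sequence whose first two items are major and minor;
    when omitted, the running interpreter's version is used.
    Hypothetical helper, for illustration only.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED
```

Calling `is_supported_python()` with no argument checks the current interpreter; passing a tuple such as `(3, 9)` checks a specific version.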

Open issues:

  1. Running the experiment with the "Ray" backend fails in the "end" step of the federated flow.
  2. Data loaders returned from GaNDLF cannot be passed as private attributes of collaborators due to a deep-copy issue.
  3. The pre-trained model is currently not compatible with GaNDLF version 0.1.0 (because of this issue: "Corrected forward operations order", mlcommons/GaNDLF#897 (comment)).

Testing:

Tested the changes locally by running the experiment with a single process on CPU.
Testing the changes locally by running the experiment with a single process on GPU.

Significant speedup observed with Ray backend on 4 GPUs.
Execution time reduced by ~75% compared to TaskRunner.
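
To put the ~75% figure in more familiar terms, a reduction in execution time converts to a speedup factor as 1 / (1 - reduction). The helper below is a hypothetical illustration of that arithmetic only; it uses no measurements beyond the rounded figure quoted above.

```python
def speedup_from_reduction(reduction):
    """Convert a fractional reduction in run time into a speedup factor.

    A reduction of 0.75 means the new run takes 25% of the old run time,
    so the speedup is old_time / new_time = 1 / (1 - 0.75).
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - reduction)


print(speedup_from_reduction(0.75))  # prints 4.0
```

So a ~75% reduction corresponds to roughly a 4x speedup over TaskRunner.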

@kagrawa2 kagrawa2 changed the title from "Migrating TaskRunner based Task_1 challenge experiment to Workflow API" to "Migrating TaskRunner based FeTS Task_1 Challenge to Workflow API" Mar 24, 2025
@psfoley psfoley self-requested a review March 24, 2025 20:48
@sarthakpati
Member

Thanks for the PR!

@Linardos: would it be possible for you to run through this branch with your existing FeTS Challenge setup to check if it works as expected? On a related note, we should put together a few unit tests for this.

@bandatarunkumar bandatarunkumar force-pushed the upgrade_openfl branch 2 times, most recently from ec06ddb to 74cc059 Compare March 25, 2025 13:32
@kagrawa2 kagrawa2 force-pushed the upgrade_openfl branch 4 times, most recently from a66ddb0 to 472c5a8 Compare March 27, 2025 07:59
kagrawa2 and others added 13 commits March 27, 2025 01:01
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Added workspace directory changes
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Tarunkumar, Banda <tarunkumar.banda@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
Signed-off-by: Agrawal, Kush <kush.agrawal@intel.com>
@kminhta kminhta requested a review from Linardos April 14, 2025 15:10
@kminhta
Collaborator

kminhta commented Apr 14, 2025

Thanks for this @kagrawa2 !

Regarding

Data Loaders returned from GaNDLF cannot be passed as private attributes of collaborators due to the deep-copy issue

Do you happen to have any insights into what is specifically causing this issue? For simulation purposes, it is likely fine, but this seems like it could be a big issue in the long run if participants can potentially access another participant's dataloader.

I've also seen deep-copy issues come up in the past (unrelated to GaNDLF), so it may be worth tracking and addressing at some point in a more generic sense, too.
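
For reference, here is a minimal sketch of one common way a deep-copy failure can arise in plain Python. It is not confirmed as the cause in this PR. `LockedLoader` and `LoaderSpec` are hypothetical stand-ins, not GaNDLF or OpenFL classes. The assumption is that the object holds a resource that cannot be copied (a thread lock here), so `copy.deepcopy` raises `TypeError`.

```python
import copy
import threading


class LockedLoader:
    """Hypothetical stand-in for a data loader that owns a lock."""

    def __init__(self, samples):
        self.samples = list(samples)
        # Thread locks cannot be pickled, so deepcopy fails on them.
        self._lock = threading.Lock()

    def __iter__(self):
        with self._lock:
            return iter(list(self.samples))


class LoaderSpec:
    """Plain-data description of a loader; copies cleanly.

    The loader itself is only built on demand, after any copying.
    """

    def __init__(self, samples):
        self.samples = list(samples)

    def build(self):
        return LockedLoader(self.samples)


def can_deepcopy(obj):
    """Return True if copy.deepcopy(obj) succeeds, False on TypeError."""
    try:
        copy.deepcopy(obj)
        return True
    except TypeError:
        return False


print(can_deepcopy(LockedLoader([1, 2, 3])))  # prints False
print(can_deepcopy(LoaderSpec([1, 2, 3])))    # prints True
```

If the real failure has a similar shape, one possible direction is to keep only a copyable description as the private attribute and build the loader inside the collaborator step. Each copied spec then builds its own loader, so nothing is shared between participants.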

Collaborator

@kminhta kminhta left a comment

After an offline sync, this looks good on my end on the technical side (migrating TaskRunner to the Workflow API). Regarding the open issues, 1 and 2 are under investigation, and 3 is an expected issue that can easily be resolved as required.

I think it'll be contingent on compatibility with @Linardos's challenge setup. Also, I agree that it would be good to set up some tests. Let us know if there are any specific ones you want us to help create (cc @sarthakpati).

@sarthakpati
Member

Thanks, I will sync offline with @Linardos to expedite the test + merge.

@sarthakpati
Member

Hi @kagrawa2: this PR needs a bit of TLC to fix the merge conflicts after #201

@kagrawa2
Author

Hi @kagrawa2: this PR needs a bit of TLC to fix the merge conflicts after #201

@sarthakpati I will look into it and rebase.

5 participants