job: Manage job lifecycles individually instead of as groups #62
Merged
bc2c578 to
a57ae1d
Compare
pippolo84 (Member) approved these changes on Dec 15, 2025 and left a comment:
LGTM, just a style nit left inline. 💯
Refactor job groups so the registry owns the lifecycle. This ensures that jobs are always started in the order they're added and not in the order the job groups are created.
- move lifecycle hooks onto the registry type and keep start/stop logic centralized
- drop group-level context/wait-group management and route Group.Add through the registry
- adjust job implementations/tests and all callers to the simplified API
"make test-race" was failing due to racy access to `times`. Use a mutex. Signed-off-by: Jussi Maki <jussi.maki@isovalent.com>
Looks like this hadn't been updated when an indirect dependency became a direct dependency. Signed-off-by: Jussi Maki <jussi@isovalent.com>
a57ae1d to
6c85d8c
Compare
dylandreimerink (Member) approved these changes on Dec 20, 2025 and left a comment:
Neat, I like re-using the lifecycles instead of the custom implementation, nice cleanup.
The job scheduler used to register entire job.Groups as lifecycle hooks via module-private providers (cilium#pkg/hive/hive.go). That meant a group (and any “dynamic” jobs attached to it) could be started before all of its dependencies were known, so a job's goroutine might run before the component it relies on had been started. Likewise, a job might be stopped before other jobs that depend on it were stopped.
For example:
The `jg` is constructed by the first invoke, and this causes it to be appended to the hive lifecycle. Assuming `B` wasn't constructed yet, the second invoke will construct it, and its constructor potentially adds `B`'s start hooks to the lifecycle. Now `jg` will start after `A`'s start hooks but before `B`'s start hooks. This means there's a chance that the job that uses `B` will execute before `B`'s start hook is called.
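The ordering described above can be modelled with a toy hook list. This is a sketch only and does not use the real hive or job APIs: `lifecycle`, `oldOrder`, `newOrder`, `position`, and the hook names are all hypothetical. It assumes that hooks start in the order they are appended, and that with per-job lifecycles a job's hook is appended at the point the job is added.

```go
package main

import "fmt"

// lifecycle is a toy stand-in for a start-hook list: hooks are
// started in the order they were appended.
type lifecycle struct{ hooks []string }

func (l *lifecycle) Append(name string) { l.hooks = append(l.hooks, name) }

// oldOrder models group-level hooks: the group becomes one hook at
// the moment it is constructed, before B exists.
func oldOrder() []string {
	lc := &lifecycle{}
	lc.Append("A.start")  // A is constructed first
	lc.Append("jg.start") // first invoke constructs the job group
	lc.Append("B.start")  // second invoke constructs B
	// The job that uses B is added to jg, so it runs at "jg.start".
	return lc.hooks
}

// newOrder models per-job hooks: each job is appended when it is
// added, which happens after its dependencies were constructed.
func newOrder() []string {
	lc := &lifecycle{}
	lc.Append("A.start")
	lc.Append("B.start")           // second invoke constructs B first
	lc.Append("job-using-B.start") // then the job is added
	return lc.hooks
}

// position returns the index of name in hooks, or -1 if absent.
func position(hooks []string, name string) int {
	for i, h := range hooks {
		if h == name {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println("group-level hooks:", oldOrder())
	fmt.Println("per-job hooks:    ", newOrder())
}
```

In the first model the hook that runs the job sits ahead of `B.start`; in the second it can only come after it.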
The refactor fixes that by:
Together these changes eliminate the race where a job could start before its dependency, while also giving more precise lifecycle logging.