Skip to content

Comments

Add node neighbor link updater controller#9

Open
MitchLewis930 wants to merge 1 commit intopr_049_beforefrom
pr_049_after
Open

Add node neighbor link updater controller#9
MitchLewis930 wants to merge 1 commit intopr_049_beforefrom
pr_049_after

Conversation

@MitchLewis930
Copy link

PR_049

This controller is responsible for monitoring nodes that request a neighbor
link update.
It provides a single place in the application where all node neighbor link
are updates are performed as insertNeighbor are sprinkled in the code base.
Additionally given the central location we can provide additional
module health signals should neighbor link updates failed.

Signed-off-by: Fernand Galiana <fernand.galiana@gmail.com>
@MitchLewis930 MitchLewis930 requested a review from Copilot January 31, 2026 01:15
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new controller for managing node neighbor link updates in the Cilium datapath. The main purpose is to centralize and provide health monitoring for node neighbor link operations, replacing the previous goroutine-based approach with a queue-based controller system.

Changes:

  • Added a new controller StartNodeNeighborLinkUpdater that processes node neighbor link updates from a managed queue
  • Modified NodeNeighborRefresh to accept a refresh parameter for timestamp-based rate limiting
  • Replaced direct insertNeighbor goroutine calls with an enqueue mechanism for better monitoring and error handling

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/option/config.go Added blank line for formatting consistency
pkg/node/manager/queue_test.go Added comprehensive test coverage for the new queue data structure
pkg/node/manager/manager.go Implemented the core queue-based controller logic and Enqueue method
pkg/node/manager/cell.go Added interface definitions for the new enqueuer functionality
pkg/datapath/types/node.go Updated NodeNeighborRefresh signature to include refresh parameter
pkg/datapath/types/config.go Added NodeNeighborEnqueuer interface definition
pkg/datapath/linux/node_test.go Updated test instantiations to include mock enqueuer
pkg/datapath/linux/node_linux_test.go Updated test cases to use new signature and mock enqueuer
pkg/datapath/linux/node.go Replaced direct neighbor insertion with enqueue calls and updated error handling
pkg/datapath/linux/fuzz_test.go Added mock enqueuer implementation for fuzzing tests
pkg/datapath/linux/datapath.go Wired up NodeManager dependency for enqueuer
pkg/datapath/fake/types/node.go Updated fake implementation signature
pkg/datapath/cells.go Added NodeManager dependency injection
daemon/cmd/status.go Updated client implementation signature
daemon/cmd/daemon_main.go Initialized the new neighbor link updater controller

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if n == nil {
log.WithFields(logrus.Fields{
logfields.LogSubsys: "enqueue",
}).Warn("Skipping nodeNeighbor insert: No node given")
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function logs a warning when n is nil but continues to push the nil node to the queue anyway. This will cause errors downstream in StartNodeNeighborLinkUpdater where it checks for nil nodes. Add an early return after the warning log.

Suggested change
}).Warn("Skipping nodeNeighbor insert: No node given")
}).Warn("Skipping nodeNeighbor insert: No node given")
return

Copilot uses AI. Check for mistakes.
Comment on lines +877 to +880
if e == nil || e.node == nil {
errs = errors.Join(errs, fmt.Errorf("invalid node spec found in queue: %#v", e))
break
}
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop breaks immediately when encountering a nil entry or nil node, which prevents processing remaining valid entries in the queue. The loop should use continue instead of break to skip invalid entries and process remaining valid ones, or check isEmpty() to determine when to exit.

Suggested change
if e == nil || e.node == nil {
errs = errors.Join(errs, fmt.Errorf("invalid node spec found in queue: %#v", e))
break
}
if e == nil {
break
}
if e.node == nil {
errs = errors.Join(errs, fmt.Errorf("invalid node spec found in queue: %#v", e))
continue
}

Copilot uses AI. Check for mistakes.
Comment on lines 1383 to +1390
n.mutex.Lock()
isInitialized := n.isInitialized
n.mutex.Unlock()
if !isInitialized {
if !n.isInitialized {
n.mutex.Unlock()
// Wait until the node is initialized. When it's not, insertNeighbor()
// is not invoked, so there is nothing to refresh.
return nil
}

return n.insertNeighbor(ctx, &nodeToRefresh, true)
n.mutex.Unlock()
Copy link

Copilot AI Jan 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mutex lock/unlock pattern here is unnecessarily split. Consider using defer n.mutex.Unlock() immediately after the lock to simplify the control flow and prevent potential future errors if additional code paths are added.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants