fix: disconnect peers after pong timeout#424
Conversation
Allow other peers to come in instead. I think the timeout of 60s is enough to get rid of them if they fail to pong.
📝 WalkthroughWalkthroughThe changes enhance peer ping management in the network maintenance loop. The Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@dash-spv/src/network/manager.rs`:
- Around line 890-893: The maintenance loop calls pool.remove_peer(&addr) on
ping-timeout peers but doesn't decrement the cached connected_peer_count or emit
network events, causing counter drift and stale event state; modify the
maintenance loop closure to capture network_event_sender, and after successful
removal invoke the same cleanup path used in start_peer_reader: call
connected_peer_count.fetch_sub(1, Ordering::SeqCst) (or the existing decrement
helper), send NetworkEvent::PeerDisconnected with the peer id and then send
NetworkEvent::PeersUpdated so consumers stay in sync; ensure you reference
pool.remove_peer, connected_peer_count, network_event_sender, and mirror the
logic from start_peer_reader to avoid double-removal.
| for addr in peers_to_disconnect { | ||
| log::warn!("Disconnecting peer {} - ping timeout", addr); | ||
| pool.remove_peer(&addr).await; | ||
| } |
There was a problem hiding this comment.
connected_peer_count is never decremented for ping-timeout disconnections, causing counter drift.
When pool.remove_peer is called here, the peer reader task will find the peer gone from the pool (Line 354) and break, but since remove_peer already returned None at that point (Line 637), the fetch_sub on connected_peer_count (Line 640) is skipped. Over time the cached counter will overcount connected peers, breaking is_connected() and peer_count().
You should also emit NetworkEvent::PeerDisconnected / PeersUpdated here to keep event consumers in sync, matching the cleanup in start_peer_reader (Lines 638–654).
Proposed fix
for addr in peers_to_disconnect {
log::warn!("Disconnecting peer {} - ping timeout", addr);
- pool.remove_peer(&addr).await;
+ if pool.remove_peer(&addr).await.is_some() {
+ connected_peer_count.fetch_sub(1, Ordering::Relaxed);
+ let count = connected_peer_count.load(Ordering::Relaxed);
+ let addresses = pool.get_connected_addresses().await;
+ let best_height = pool.get_best_height().await;
+ let _ = network_event_sender.send(NetworkEvent::PeerDisconnected {
+ address: addr,
+ });
+ let _ = network_event_sender.send(NetworkEvent::PeersUpdated {
+ connected_count: count,
+ addresses,
+ best_height,
+ });
+ }
}This requires capturing network_event_sender into the maintenance loop closure (it isn't currently captured).
🤖 Prompt for AI Agents
In `@dash-spv/src/network/manager.rs` around lines 890 - 893, The maintenance loop
calls pool.remove_peer(&addr) on ping-timeout peers but doesn't decrement the
cached connected_peer_count or emit network events, causing counter drift and
stale event state; modify the maintenance loop closure to capture
network_event_sender, and after successful removal invoke the same cleanup path
used in start_peer_reader: call connected_peer_count.fetch_sub(1,
Ordering::SeqCst) (or the existing decrement helper), send
NetworkEvent::PeerDisconnected with the peer id and then send
NetworkEvent::PeersUpdated so consumers stay in sync; ensure you reference
pool.remove_peer, connected_peer_count, network_event_sender, and mirror the
logic from start_peer_reader to avoid double-removal.
Allow other peers to come in instead. I think the timeout of 60s is enough to get rid of them if they fail to pong.
Summary by CodeRabbit