
[draft] [v3.29] fix for issues on ipv6 enabled clusters#864

Draft
sknat wants to merge 13 commits into release/v3.29.0 from nsk-dhcp6-fix-v329

@sknat (Collaborator) commented Jan 19, 2026

[WIP] cherry-pick of fixes from master for issues on ipv6 enabled clusters

This supersedes #834

@sknat sknat force-pushed the nsk-dhcp6-fix-v329 branch from 74020c2 to df3305a Compare January 19, 2026 16:02
@aritrbas aritrbas force-pushed the nsk-dhcp6-fix-v329 branch 6 times, most recently from b18bb49 to b163a1d Compare January 26, 2026 08:18
Aritra Basu and others added 8 commits February 3, 2026 02:14
Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Signed-off-by: Aritra Basu <aritrbas@cisco.com>
This patch removes the nodeIP from the tap0 interface in VPP.
With this patch, for each uplink interface eth0 with IP 192.168.0.1/24
we create a corresponding tap0 set up the following way:

* In VRF:0
  * we create the af_packet interface with IP 192.168.0.1/24
  * we receive 192.168.0.1/32 locally, traffic to 192.168.0.1 without listeners
    will end up in punt
* In the punt table
  * we route 192.168.0.1/24 via tap0 192.168.0.1
* In Linux
  * tap0 has the 192.168.0.1/24 address
  * tap0 will respond to ARPs as VPP has arp proxy enabled
* In a host-tap-eth0-v4 VRF
  * we place the tap0 interface
  * we give it the 169.254.0.1/32 address, overridable with CALICOVPP_TAP0_ADDR
  * we enable IP6 without setting an address
  * we add a static neighbor for 192.168.0.1 to the MAC of the Linux side of the tap
* If we specify a rule in redirectToHostRules (e.g. for DNS in kind)
  * we will have the classifier entry redirect to tap0 192.168.0.1

Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
IPv6 gateway traffic (DHCPv6/ICMP) fails when VPP takes over the uplink.
- Without gateway ND proxy, host NS for the default gateway is dropped by VPP
  with "neighbor solicitations for unknown targets" error due to missing /128
  target entry in the tap FIB.

Fix:
- Enable ND proxy for the gateway on the tap so the host can resolve the
  gateway via VPP.

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
Aritra Basu added 5 commits February 4, 2026 19:04
Configure an ip6tables mangle rule to set the hop limit to 2 for DHCPv6 OUTPUT
traffic from the client (sport 546) to the server (dport 547). This prevents VPP
from dropping DHCPv6 SOLICIT/REQUEST packets when it decrements the hop limit
by 1 during forwarding. Clients generate SOLICIT/REQUEST with hop-limit=1,
so without this rule VPP drops the packet (ip6 ttl <= 1) and replies
with ICMP time exceeded, causing DHCPv6 lease negotiation to fail.

The rule is checked for existence before adding to prevent duplicates
since ip6tables does not auto-dedupe rules. The rule is also cleaned
up during configuration restoration.

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Link-local addresses are not routable. When synchronizing Linux
routes to VPP's uplink interface, filter out link-local addresses
so that they are not added to VPP's main VRF routing table.

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Capture ID_NET_NAME_* properties before VPP driver unbind and restore them
via udev rules after VPP creates host-facing tap/tun interface. This is
needed for IAID generation by DHCPv6 client in systemd-networkd to be
consistent across VPP lifecycle on the node.

Key changes:
- Repurpose BEFORE_IF_READ hook to capture udev properties before driver unbind
- Move SetInterfaceNames() before HookBeforeIfRead so interface names are available
- Store ID_NET_NAME_* values and MAC address while interface still has original driver
- Create udev rules for the interface to restore ID_NET_NAME_* values after VPP runs
- Cleanup udev rules on VPP shutdown
- BEFORE_IF_READ → capture, VPP_RUNNING → create, VPP_DONE_OK/ERRORED → cleanup
- Add EnableUdevNetNameRules config knob in CalicoVppDebugConfigType (default: true)
  - Allows disabling udev net name rules generation (if needed). When disabled, skips
    captureHostUdevProps(), createUdevNetNameRules() and removeUdevNetNameRules()
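
The hook-to-action mapping listed above can be summarised with a small dispatch sketch. The hook names are written as plain strings here, with `VPP_ERRORED` assumed as the full name behind "ERRORED"; the function is hypothetical and only returns which step would run.

```go
package main

import "fmt"

// udevActionForHook returns the udev step that would run for a hook,
// or "" when nothing runs. With the knob disabled, every step is skipped.
func udevActionForHook(hook string, enableUdevNetNameRules bool) string {
	if !enableUdevNetNameRules {
		return ""
	}
	switch hook {
	case "BEFORE_IF_READ":
		return "capture" // captureHostUdevProps()
	case "VPP_RUNNING":
		return "create" // createUdevNetNameRules()
	case "VPP_DONE_OK", "VPP_ERRORED":
		return "cleanup" // removeUdevNetNameRules()
	default:
		return ""
	}
}

func main() {
	for _, h := range []string{"BEFORE_IF_READ", "VPP_RUNNING", "VPP_DONE_OK", "VPP_ERRORED"} {
		fmt.Printf("%s -> %q\n", h, udevActionForHook(h, true))
	}
}
```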

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
IPv6 ping between nodes fails with "l3 mac mismatch" error in VPP's
ethernet-input node. Packets arriving on tap0 with destination MAC
set to the infrastructure gateway's MAC are dropped.

- IPv4 (ARP Proxy): Host sends ARP request, VPP responds with its own
  tap interface MAC. All subsequent IPv4 packets use VPP's MAC as the
  destination, passing VPP's L3 MAC filter check.

- IPv6 (ND Proxy + Neighbor Advertisement): While VPP's ND proxy responds
  to Neighbor Solicitations with the tap interface MAC, the host also
  receives Neighbor Advertisement (NA) packets from the real gateway.
  These NA packets contain the Target Link-Layer Address Option (TLLAO)
  with the real gateway's MAC address. The host overwrites its neighbor
  cache with this information and sends IPv6 packets to the real gateway
  MAC instead of VPP's tap MAC.

Capture the gateway's MAC address from Linux neighbor cache before VPP
takes over the interface, then add it as a secondary MAC address on the
tap interface using VPP's existing sw_interface_add_del_mac_address API.

VPP's ethernet-input node accepts packets with either the primary MAC
or any configured secondary MAC addresses, allowing traffic to flow
regardless of which MAC address the host learned (from ND proxy or NA).

This is a control plane only fix that requires no VPP patches.

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
In dual-stack or IPv6-enabled clusters, the agent can crash when it attempts
to announce or withdraw a BGP path for an IPv6 address, but the node does
not have a corresponding IPv6 address configured in HostMetadata.

Before this change, common.MakePath() returned a generic error ('no ip6
address for node'). That error was wrapped by the routing_server and
propagated back to tomb, causing the routing watcher to stop and the
main process to tear down (ending in a fatal gRPC server error).

Changes:
- Added sentinel errors ErrNoNodeIPv4 and ErrNoNodeIPv6 in common.go
- Added helper function IsMissingNodeIP() to detect these specific errors
- Updated MakePath() to return sentinel errors (including for SRv6 next-hop)
- Updated routing_server and prefix_watcher to treat missing-node-IP as a
  non-fatal condition: log a warning indicating we skip announce/withdraw,
  returning nil so tomb does not enter Dying state

This prevents the agent from crashing and leaves a clear warning log for operators.

Signed-off-by: Aritra Basu <aritrbas@cisco.com>