calicovppctl: add BPF filtering support to trace/pcap/dispatch commands#859
sknat
left a comment
thanks a lot for putting this together ! Looks neat 😄
I have a bit of a concern about the caller erroring out while running the CLI and the cleanup failing. This was probably fine for trace / pcap / dispatch, but a BPF filter left in place is going to impact performance.
    useBPF = true
    defer func() {
        printColored("blue", "Clearing BPF filter...")
        err := clearBPFFilter(k, validatedNode, true)
I am a bit worried about us running into a nondeterministic state if we fail to run the cleanup function.
Could we look at adding an HTTP server that would provide a backend for BPF filters?
This CLI would then call something like
kubectl exec -it xxxx -c agent -- curl localhost:9999/api/bpftrace?key=value
The backend could then handle timeouts (if for some reason the caller disconnects) and user conflicts (e.g. two users calling at the same time)
Finally, the backend could rely on the binary API, which is expected to be more stable than the CLI.
Ack, will work on it.
Since adding an HTTP server to the agent will involve creating a new image, I have come up with an alternative solution in the second commit that adds concurrency and cleanup enhancements:
- Capture operations (trace/pcap/dispatch) are serialized per VPP pod using an in-pod lock file (/tmp/calicovppctl.lock).
- A forced cleanup option (`calicovppctl capture clear -node <node>`) has been added to deal with situations where a capture fails midway. It clears traces, stops `pcap trace` and `pcap dispatch trace`, clears BPF filters and restores default filter functions, and removes the hanging in-pod lock file to restore the VPP instance to a clean state.
Force-pushed from 10449e0 to fed6073
- Add CLI flags: -srcip, -dstip, -srcport, -dstport, -protocol
- Implement BPF filter building and application using VPP CLI
- Handle empty capture files gracefully
- Support BPF filtering for trace, pcap, and dispatch commands

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
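As a rough illustration of what "BPF filter building" from these flags could look like, here is a small Go sketch that turns the five flag values into a pcap-filter style expression. The `captureFilter` struct, the `buildBPFExpression` function, the validation rules and the accepted protocol list are all assumptions; the actual implementation in the PR and the VPP CLI command it issues are not shown here.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// captureFilter mirrors the -srcip, -dstip, -srcport, -dstport and -protocol
// flags. The struct is hypothetical; zero values mean "flag not set".
type captureFilter struct {
	SrcIP, DstIP     string
	SrcPort, DstPort int
	Protocol         string
}

// buildBPFExpression joins the fields that are set with "and".
// An empty result means no filter was requested.
func buildBPFExpression(f captureFilter) (string, error) {
	var terms []string
	for _, ip := range []struct{ dir, val string }{{"src", f.SrcIP}, {"dst", f.DstIP}} {
		if ip.val == "" {
			continue
		}
		if net.ParseIP(ip.val) == nil {
			return "", fmt.Errorf("invalid %s IP %q", ip.dir, ip.val)
		}
		terms = append(terms, fmt.Sprintf("%s host %s", ip.dir, ip.val))
	}
	for _, p := range []struct {
		dir string
		val int
	}{{"src", f.SrcPort}, {"dst", f.DstPort}} {
		if p.val == 0 {
			continue
		}
		if p.val < 1 || p.val > 65535 {
			return "", fmt.Errorf("invalid %s port %d", p.dir, p.val)
		}
		terms = append(terms, fmt.Sprintf("%s port %d", p.dir, p.val))
	}
	if f.Protocol != "" {
		proto := strings.ToLower(f.Protocol)
		switch proto {
		case "tcp", "udp", "icmp": // assumed subset, for illustration
			terms = append(terms, proto)
		default:
			return "", fmt.Errorf("unsupported protocol %q", f.Protocol)
		}
	}
	return strings.Join(terms, " and "), nil
}

func main() {
	expr, err := buildBPFExpression(captureFilter{SrcIP: "10.0.0.1", DstPort: 80, Protocol: "TCP"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(expr) // src host 10.0.0.1 and dst port 80 and tcp
}
```

Validating the inputs before building the string keeps malformed flag values from reaching the filter at all.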
- Serialize capture operations (trace/pcap/dispatch) per VPP pod using an in-pod lock file (/tmp/calicovppctl.lock), preventing parallel captures from multiple clients
- Provide clear error output when a capture is already running
- Add forced cleanup option: `calicovppctl capture clear -node <node>`
  - clears trace
  - stops `pcap trace` and `pcap dispatch trace`
  - clears BPF filters and restores default filter functions
  - removes hanging in-pod lock file

Signed-off-by: Aritra Basu <aritrbas@cisco.com>
Force-pushed from fed6073 to 0ca6ae3
sknat
left a comment
looks neat, thanks a lot, this will be super useful !
This PR has 2 commits:

The first commit adds BPF filtering support to the `calicovppctl` `trace`, `pcap` and `dispatch` commands through the CLI flags `-srcip`, `-dstip`, `-srcport`, `-dstport` and `-protocol`.

The second commit adds concurrency and cleanup enhancements to `calicovppctl`:
- Capture operations (`trace`/`pcap`/`dispatch`) are serialized per VPP pod using an in-pod lock file (/tmp/calicovppctl.lock). This prevents parallel captures from multiple clients; a client that finds a capture already running gets an error message.
- A forced cleanup option (`calicovppctl capture clear -node <node>`) has been added to deal with situations where a capture fails midway, for example when the process is killed with `SIGKILL`, the network disconnects during `kubectl exec`, the system crashes, the container restarts, or the terminal is closed abruptly while a capture is underway. It clears traces, stops `pcap trace` and `pcap dispatch trace`, clears BPF filters and restores default filter functions, and removes the hanging in-pod lock file.