Signed-off-by: Matthias Bertschy <matthias.bertschy@gmail.com>
📝 Walkthrough

The PR disables the host sensor by default in configuration and introduces a `TracerManager` type to orchestrate tracer registration, retrieval, and lifecycle management, with memory-conscious startup sequencing, error resilience, and procfs tracer initialization priority.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant tm as TracerManager
    participant tf as TracerFactory
    participant procfs as Procfs Tracer
    participant other as Other Tracers
    participant ctx as Context
    tm->>tf: Initialize all tracers
    activate tf
    tf-->>tm: Return tracer instances
    deactivate tf
    tm->>procfs: Start procfs tracer first
    activate procfs
    procfs-->>tm: Startup complete
    deactivate procfs
    tm->>tm: Wait for procfs scan interval
    loop For each remaining tracer
        tm->>other: Start tracer
        activate other
        other-->>tm: Started (or error logged)
        deactivate other
        tm->>tm: Wait 2 seconds (memory-conscious delay)
    end
    note over tm: All tracers started with staggered timing<br/>to reduce kernel eBPF load contention
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
> ⚠️ Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/containerwatcher/v2/tracer_manager.go (2)
141-161: ⚠️ Potential issue | 🟠 Major

Unconditional 30s wait even when procfs tracer is absent or disabled.

The `select` block at lines 154-158 waits for `ProcfsScanInterval` (default 30s) regardless of whether the procfs tracer was found or started. If the procfs tracer doesn't exist in the map or is disabled, this is a pointless 30-second stall on every startup. Move the wait inside the conditional block so it only applies when the tracer was actually started.
Proposed fix:

```diff
 func (tm *TracerManager) startProcfsTracer(ctx context.Context) error {
 	if tracer, exists := tm.GetTracer(utils.ProcfsEventType); exists {
 		delete(tm.tracers, utils.ProcfsEventType)
 		if tracer.IsEnabled(tm.cfg) {
 			logger.L().Info("Starting procfs tracer before other tracers")
 			if err := tracer.Start(ctx); err != nil {
 				return fmt.Errorf("starting procfs tracer: %w", err)
 			}
+			// Wait for procfs scan to complete before starting eBPF tracers.
+			// This delay ensures container/process data is ready for enrichment.
+			select {
+			case <-time.After(tm.cfg.ProcfsScanInterval):
+			case <-ctx.Done():
+				return ctx.Err()
+			}
 		}
 	}
-	// Wait for procfs scan to complete before starting eBPF tracers.
-	// This delay ensures container/process data is ready for enrichment.
-	select {
-	case <-time.After(tm.cfg.ProcfsScanInterval):
-	case <-ctx.Done():
-		return ctx.Err()
-	}
-
 	return nil
 }
```
15-29: ⚠️ Potential issue | 🟡 Minor

`thirdPartyTracers` has no registration method and remains unused.

The `thirdPartyTracers` slice is initialized in the constructor but never populated. All tracers, including third-party tracers, are registered via `RegisterTracer()`, which adds them to the `tracers` map. This makes the `thirdPartyTracers` field dead code, as `stopThirdPartyTracers()` iterates over an always-empty slice. Either remove this unused field or populate it during tracer registration if separate handling of third-party tracers is intended.
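If separate handling is intended, registration could populate both structures. A minimal sketch follows; the `thirdParty` flag, the string keys, and the simplified types are assumptions for illustration, not the PR's actual API.

```go
package main

import "fmt"

// Tracer is a hypothetical stand-in for the PR's tracer interface.
type Tracer interface{ Name() string }

type namedTracer struct{ name string }

func (t namedTracer) Name() string { return t.name }

type TracerManager struct {
	tracers           map[string]Tracer
	thirdPartyTracers []Tracer
}

// RegisterTracer records every tracer in the map, and additionally tracks
// third-party tracers in the slice so stopThirdPartyTracers has work to do.
func (tm *TracerManager) RegisterTracer(key string, t Tracer, thirdParty bool) {
	tm.tracers[key] = t
	if thirdParty {
		tm.thirdPartyTracers = append(tm.thirdPartyTracers, t)
	}
}

func main() {
	tm := &TracerManager{tracers: map[string]Tracer{}}
	tm.RegisterTracer("exec", namedTracer{"exec"}, false)
	tm.RegisterTracer("custom", namedTracer{"custom"}, true)
	fmt.Println(len(tm.tracers), len(tm.thirdPartyTracers))
}
```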
🧹 Nitpick comments (4)
pkg/containerwatcher/v2/tracer_manager.go (4)
76-118: Non-deterministic tracer startup order due to map iteration.

`tm.tracers` is a `map[utils.EventType]TracerInterface`, and Go map iteration order is randomized. This means the startup sequence of tracers (beyond the procfs tracer) will vary between runs. If the staggered startup is meant to optimize memory usage patterns, the lack of a stable order could make performance characteristics harder to reason about and reproduce. Consider using an ordered data structure (e.g., a `[]TracerInterface` slice) for startup ordering, or sort the map keys before iterating.
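The sort-the-keys option is a few lines. A sketch using a string-keyed map for simplicity (the real code keys on `utils.EventType`, so it would sort on that type instead):

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the map's keys in lexicographic order, giving a
// deterministic iteration order that Go's native map range does not.
func sortedKeys(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	tracers := map[string]int{"open": 2, "exec": 1, "network": 3}
	for _, k := range sortedKeys(tracers) {
		fmt.Println(k) // always exec, network, open — stable across runs
	}
}
```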
96-117: Unnecessary 2s delay after the last tracer starts.

The delay `select` is executed after every tracer start, including the last one, adding an unnecessary 2-second wait at the end of `StartAllTracers`. You could skip the delay for the last tracer or move the delay to the beginning of each iteration (skipping it on the first).
49-51: `GetAllTracers` exposes internal mutable state.

Returning the internal `tm.tracers` map directly allows callers to mutate it (add/delete entries). If this is intentional for the factory pattern, that's fine, but it's worth noting the coupling. A read-only copy or a dedicated registration callback would be safer.
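A defensive copy is a one-liner with the standard-library `maps` package (Go 1.21+). Sketch with a string-keyed map for illustration; `Manager` is a hypothetical name:

```go
package main

import (
	"fmt"
	"maps"
)

type Manager struct {
	tracers map[string]int
}

// GetAllTracers returns a shallow copy, so inserts/deletes by the caller
// do not affect the manager's internal registry.
func (m *Manager) GetAllTracers() map[string]int {
	return maps.Clone(m.tracers)
}

func main() {
	m := &Manager{tracers: map[string]int{"exec": 1}}
	cp := m.GetAllTracers()
	delete(cp, "exec") // mutates only the copy
	fmt.Println(len(m.tracers), len(cp))
}
```

Note the copy is shallow: the tracer values themselves are still shared, which is fine here since only map membership needs protecting.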
123-133: `StopAllTracers` discards all errors except the last.

When multiple tracers fail to stop, only `lastErr` is returned. Consider using `errors.Join` (Go 1.20+) to aggregate all shutdown errors, making debugging easier when multiple tracers fail simultaneously.

Proposed fix:

```diff
 func (tm *TracerManager) StopAllTracers() error {
 	tm.stopThirdPartyTracers()

-	var lastErr error
+	var errs []error
 	for _, tracer := range tm.tracers {
 		if err := tracer.Stop(); err != nil {
-			lastErr = err
+			errs = append(errs, err)
 		}
 	}

-	return lastErr
+	return errors.Join(errs...)
 }
```