Add Kubernetes autoscaling support #418
Conversation
- Implement HPA resources for 8 services (fuzzer-bot, coverage-bot, build-bot, pov-reproducer, tracer-bot, patcher, seed-gen, task-downloader)
- Add autoscaling configuration to service values files (disabled by default)
- Update deployment templates to support conditional replica counts when HPA is enabled
- Adjust resource requests for better HPA accuracy (fuzzer-bot, coverage-bot, pov-reproducer, tracer-bot)
- Add autoscaling documentation to global values.yaml

Resolves trailofbits#348
dguido left a comment
PR Review: Add Kubernetes autoscaling support
Thanks for adding HPA support! The implementation is solid, but there are a few items to address before merging.
🔴 Blocking: Resource Request Increases Need Justification
The PR silently increases resource requests, which will affect cluster capacity even when HPA is disabled:
| Service | CPU Request | Memory Request |
|---|---|---|
| coverage-bot | 250m → 500m | 256Mi → 6Gi (24x increase) |
| fuzzer-bot | — | 256Mi → 1536Mi (6x) |
| pov-reproducer | 100m → 500m | 1Gi → 3Gi (3x) |
| tracer-bot | — | 256Mi → 1536Mi (6x) |
The PR description mentions "adjust resource requests for better HPA accuracy", but these are significant changes that need explicit justification.
Action required: Either:
- Add justification for these values (OOM observations, profiling data, etc.), or
- Split resource request changes into a separate PR so they can be reviewed independently
🟡 Suggestions
Consider a shared HPA template
All 8 hpa.yaml files are nearly identical (272 lines of duplication). Consider a helper template in _helpers.tpl:
{{- define "common.hpa" -}}
{{- if and .Values.enabled .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
...
{{- end }}
{{- end }}

Not blocking, but would reduce maintenance burden.
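For illustration only, a fuller helper could look like the sketch below; the values keys, chart layout, and use of `.Chart.Name` for the target Deployment are assumptions about this chart, not taken from the PR:

```yaml
{{/* Hypothetical shared HPA helper; keys and names are assumed, adjust to the chart's conventions */}}
{{- define "common.hpa" -}}
{{- if and .Values.enabled .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ .Chart.Name }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Chart.Name }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
{{- end }}
{{- end }}
```

Each service's `templates/hpa.yaml` would then shrink to a single `{{ include "common.hpa" . }}`.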
Add memory-based scaling for memory-intensive services
Only pov-reproducer has targetMemoryUtilizationPercentage. Given fuzzer-bot, coverage-bot, and tracer-bot are memory-intensive, they might benefit from memory-based scaling too.
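For instance, a memory target could sit next to the CPU one in a service's autoscaling block; everything here except `targetMemoryUtilizationPercentage` (which the PR already uses for pov-reproducer) is an assumed key name with placeholder values:

```yaml
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 8
  targetCPUUtilizationPercentage: 80
  # Scale out when average memory usage across pods exceeds 75% of the request
  targetMemoryUtilizationPercentage: 75
```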
Document the scale-down timing differences
Some services use 600s stabilization (coverage-bot, tracer-bot, patcher), others use 300s. A brief comment in the values files explaining why would help future maintainers.
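Even a one-line comment next to the window value would be enough; a hypothetical example (the rationale given here is a guess and should be replaced with the real one):

```yaml
behavior:
  scaleDown:
    # 600s instead of the default 300s: coverage/trace runs are long, so wait
    # longer before scaling in to avoid churning pods mid-run (assumed rationale)
    stabilizationWindowSeconds: 600
```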
🟢 What looks good
- Uses `autoscaling/v2` (current stable API) ✓
- Correctly gates replica count with `{{- if not .Values.autoscaling.enabled }}` ✓
- Disabled by default (safe rollout) ✓
- Condition `and .Values.enabled .Values.autoscaling.enabled` is correct ✓
- Well-structured behavior policies with stabilization windows ✓
Optional improvements (not blocking)
- Consider adding PodDisruptionBudgets for critical services when scaling down (see the sketch after this list)
- Update deployment docs with metrics-server setup instructions for different cluster types
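A PDB would not change HPA scale-in itself, but it does protect critical pods from voluntary evictions (e.g. during node scale-down). A minimal sketch, assuming an `app: patcher` label on the Deployment's pods:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: patcher
spec:
  # Keep at least one patcher running through voluntary disruptions
  minAvailable: 1
  selector:
    matchLabels:
      app: patcher
```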
- Revert resource request increases to original values
- Extract shared HPA template into _helpers.tpl to reduce duplication
- Add memory-based scaling for memory-intensive services
- Document scale-down stabilization window choices
Hi @ArsalanAnwer0,

Could you please explain the use case you are targeting and how you've seen these changes benefit it? When we've talked about this in the past it was mostly as a way of scaling build-bot (and potentially patcher) up and down. We believe a metric based on pending work would more accurately reflect when scaling needs to happen. I.e. if there are three pending patch requests, we could scale to three patchers to work in parallel; when they complete, the patcher count could go back down to 1 (or 0?). Same for the builder: it is only used at the beginning of processing, but you could benefit from having more builders and scaling them down once all builds are complete.

Fuzzers could potentially benefit from the current change, but you would likely scale up to max as soon as you start processing (at least for well-written harnesses), so the main benefit there would be scaling down when idle. Also, when scaling down one would need to be careful not to kill a pod that's doing long-running useful work (e.g. patcher) but instead one of the idle ones.

Another thing that complicates this is that a lot of the work is done by docker-in-docker and not e.g. build-bot itself, so the CPU usage of the build-bot doesn't accurately reflect the load on the system.

Happy to hear your thoughts on this and your experience with auto-scaling for Buttercup.
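For reference, pending-work scaling of that kind is usually expressed with an external scaler rather than a resource-metric HPA. A minimal sketch using a KEDA ScaledObject with a Prometheus trigger; the metric name, Prometheus address, and threshold below are purely illustrative, not existing Buttercup metrics:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: patcher
spec:
  scaleTargetRef:
    name: patcher            # Deployment to scale
  minReplicaCount: 0         # scale to zero when there is no pending work
  maxReplicaCount: 3
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090   # illustrative address
        query: sum(buttercup_pending_patch_requests)           # hypothetical metric
        threshold: "1"                                         # one patcher per pending request
```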
Implement HorizontalPodAutoscaler (HPA) for 8 services to enable dynamic scaling based on CPU/memory utilization.
Changes
- Shared HPA helper template (`buttercup.hpa`) in `_helpers.tpl` to eliminate duplication

Configuration
Autoscaling is disabled by default. Enable per service in values.yaml:
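A hypothetical excerpt (the per-service key and the values shown are placeholders; use the keys added by this PR):

```yaml
fuzzer-bot:
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 8
```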
Stabilization Windows
Prerequisites
Requires the Kubernetes metrics-server to be installed in the cluster so the HPA can read pod CPU/memory usage.
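Some managed clusters ship it preinstalled; on others the upstream manifest can typically be applied directly:

```sh
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```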
Resolves #348