-
Notifications
You must be signed in to change notification settings - Fork 20
SHIP-0046: Resources override in Build and BuildRuns #290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add stepResources field in Build and BuildRun APIs, to override resources for specific steps in a BuildStrategy or ClusterBuildStrategy. Signed-off-by: Anchita Borah <anborah@redhat.com>
adambkaplan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
This is an excellent 1st SHIP, and definitely meets the requirements for a provisional proposal. We could upgrade this status to "implementable", however I think to do so warrants further discussion that would block merge. I don't think we should do that - a separate PR encourages contribution from others.
My top of mind for making this "implementable" and ready for merge:
- How should we use of Tekton's
computResourcesAPI or Kubernetes pod resources? Should these be an implementation detail, or a separate feature? - Should we add features to the shp CLI, or leave this as a pure "YAML-only" developer experience?
- Do we need to add anything to the operator? For example, does our implementation logic need to change depending on the version of Tekton or Kubernetes is deployed?
I think it's fine to add these as "open questions" in the proposal document.
|
|
||
| ## Summary | ||
|
|
||
| This feature introduces a `stepResources` field in the Build and BuildRun APIs, enabling developers to `override` CPU, memory, or ephemeral storage requirements for specific steps defined in a BuildStrategy or ClusterBuildStrategy, without needing to duplicate and modify the strategy itself. Currently, resource requirements are hardcoded at the strategy level, for which either we need to duplicate strategies for different resource needs or request platform admins to modify shared strategies. With this change, users can specify per-step resource overrides directly in their Build or BuildRun specs, and the controller will merge these overrides at runtime when generating the Tekton TaskRun with BuildRun overrides taking precedence over Build overrides, which in turn override strategy defaults. This follows the existing Shipwright override patterns for parameters, volumes, and env., maintaining API consistency while enabling fine-grained, per-build resource customization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: split to multiple lines (I recommend after 100 characters). Not all systems wrap text on a single line.
| ### Goals | ||
|
|
||
| - Users can specify step-level resource overrides in Build spec and BuildRun spec, without creating duplicate BuildStrategies or requesting changes to shared ClusterBuildStrategies. | ||
| - When resources are specified at multiple levels, BuildRun overrides take precedence over Build overrides, which take precedence over strategy defaults. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 - this lines up with our existing precedence patterns.
| stepResources: | ||
| - name: build | ||
| resources: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 - I think it makes sense to put this API in the strategy section.
|
|
||
| ## Drawbacks | ||
|
|
||
| - `stepResources` will be placed under `spec.strategy` while similar override fields (`paramValues`, `volumes`, `env`) are at the `spec` level. This inconsistency may confuse users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to call out here, but I think your instincts to place this in strategy is the right choice, since the step names are directly coupled to the selected build strategy.
| ## Drawbacks | ||
|
|
||
| - `stepResources` will be placed under `spec.strategy` while similar override fields (`paramValues`, `volumes`, `env`) are at the `spec` level. This inconsistency may confuse users. | ||
| - Allowing arbitrary overrides may lead to builds failing in unexpected ways (e.g., setting memory too low causes OOMkilled, or too high wastes cluster resources). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise good to call out expected failure modes, including "silent" failures such as underutilization.
|
|
||
| ## Alternatives | ||
|
|
||
| - Pass step resource overrides directly to Tekton's `TaskRun.spec.taskSpec.stepSpecs[].computeResources` instead of implementing Shipwright-native abstraction. Trade-off: Shipwright's abstraction is broken and creates inconsistency. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯
| - Users create their own namespace-scoped `BuildStrategy` with custom resources. Trade-off: It is cumbersome as user needs to duplicate the BuildStrategy to add their custom resource requirements. | ||
| - Inject default resources into ClusterBuildStrategies at deployment time via operator. Trade-off: Only sets global defaults, no per-build control. | ||
| - Use `LimitRange` to set default resources across namespace. Trade-off: Applies to entire namespace, not build-specific. | ||
| - Use Kubernetes native pod-level resource specification.`Kubernetes v1.34 [beta]`(enabled by default). Resources are shared across all containers in pod. Trade-off: Requires PodLevelResources feature gate enabled; build steps run sequentially so no concurrent sharing benefit. No per-step control. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Congrats - you have convinced me the value of having per-step control!
One of the challenges of using Tekton is that from the perspective of Kubernetes, all containers in a Tekton TaskRun pod execute simultaneously. This causes the scheduler to assign pods to a node whose capacity is equal to the sum of requested resources across all steps. See Tekton's article on compute resources for details.
Use of Tekton's computeResources and/or Kubernetes Pod Level Resources could be left as an implementation detail, or a follow up feature.
| - Feature works with both BuildStrategy and ClusterBuildStrategy. | ||
| - Invalid step names in `stepResources` rejects the Build with clear errors. | ||
| - Existing builds without stepResources continue to work unchanged (The feature is backward compatible). Builds that do not specify `stepResources` continue to use strategy-defined defaults with no behavioral change. | ||
| - The Tekton TaskRun created by the controller contains the merged resource values in `spec.taskSpec.steps[].computeResources`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd leave this out as a goal, and move it to a detail in the implementation notes.
| - TBD | ||
| creation-date: 2026-01-23 | ||
| last-updated: 2026-01-23 | ||
| status: provisional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This meets the criteria for an implementable proposal 🎉 , however to get it "over the finish line" we would need more community discussion and feedback. I would rather merge as "provisional" now and have a follow-up PR where those details are ironed out.
| - Existing builds without stepResources continue to work unchanged (The feature is backward compatible). Builds that do not specify `stepResources` continue to use strategy-defined defaults with no behavioral change. | ||
| - The Tekton TaskRun created by the controller contains the merged resource values in `spec.taskSpec.steps[].computeResources`. | ||
|
|
||
| ### Non-Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider the resources for the Git clone and output image tasks as being in scope, or out of scope as a non-goal?
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: adambkaplan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Changes
Add stepResources field in Build and BuildRun APIs, to override resources for specific steps in a BuildStrategy or ClusterBuildStrategy.
Fixes shipwright-io/build#1894
Submitter Checklist
See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.
Release Notes