Skip to content

Conversation

@anchi205
Copy link

@anchi205 anchi205 commented Jan 23, 2026

Changes

Add stepResources field in Build and BuildRun APIs, to override resources for specific steps in a BuildStrategy or ClusterBuildStrategy.

Fixes shipwright-io/build#1894

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Set a kind label on this PR
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

Add stepResources field in Build and BuildRun APIs, to override resources for specific steps in a BuildStrategy or ClusterBuildStrategy.

Signed-off-by: Anchita Borah <anborah@redhat.com>
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jan 23, 2026
Copy link
Member

@adambkaplan adambkaplan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

This is an excellent 1st SHIP, and definitely meets the requirements for a provisional proposal. We could upgrade this status to "implementable", however I think to do so warrants further discussion that would block merge. I don't think we should do that - a separate PR encourages contribution from others.

My top of mind for making this "implementable" and ready for merge:

  1. How should we use of Tekton's computResources API or Kubernetes pod resources? Should these be an implementation detail, or a separate feature?
  2. Should we add features to the shp CLI, or leave this as a pure "YAML-only" developer experience?
  3. Do we need to add anything to the operator? For example, does our implementation logic need to change depending on the version of Tekton or Kubernetes is deployed?

I think it's fine to add these as "open questions" in the proposal document.


## Summary

This feature introduces a `stepResources` field in the Build and BuildRun APIs, enabling developers to `override` CPU, memory, or ephemeral storage requirements for specific steps defined in a BuildStrategy or ClusterBuildStrategy, without needing to duplicate and modify the strategy itself. Currently, resource requirements are hardcoded at the strategy level, for which either we need to duplicate strategies for different resource needs or request platform admins to modify shared strategies. With this change, users can specify per-step resource overrides directly in their Build or BuildRun specs, and the controller will merge these overrides at runtime when generating the Tekton TaskRun with BuildRun overrides taking precedence over Build overrides, which in turn override strategy defaults. This follows the existing Shipwright override patterns for parameters, volumes, and env., maintaining API consistency while enabling fine-grained, per-build resource customization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: split to multiple lines (I recommend after 100 characters). Not all systems wrap text on a single line.

### Goals

- Users can specify step-level resource overrides in Build spec and BuildRun spec, without creating duplicate BuildStrategies or requesting changes to shared ClusterBuildStrategies.
- When resources are specified at multiple levels, BuildRun overrides take precedence over Build overrides, which take precedence over strategy defaults.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - this lines up with our existing precedence patterns.

Comment on lines +96 to +98
stepResources:
- name: build
resources:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 - I think it makes sense to put this API in the strategy section.


## Drawbacks

- `stepResources` will be placed under `spec.strategy` while similar override fields (`paramValues`, `volumes`, `env`) are at the `spec` level. This inconsistency may confuse users.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to call out here, but I think your instincts to place this in strategy is the right choice, since the step names are directly coupled to the selected build strategy.

## Drawbacks

- `stepResources` will be placed under `spec.strategy` while similar override fields (`paramValues`, `volumes`, `env`) are at the `spec` level. This inconsistency may confuse users.
- Allowing arbitrary overrides may lead to builds failing in unexpected ways (e.g., setting memory too low causes OOMkilled, or too high wastes cluster resources).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise good to call out expected failure modes, including "silent" failures such as underutilization.


## Alternatives

- Pass step resource overrides directly to Tekton's `TaskRun.spec.taskSpec.stepSpecs[].computeResources` instead of implementing Shipwright-native abstraction. Trade-off: Shipwright's abstraction is broken and creates inconsistency.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

- Users create their own namespace-scoped `BuildStrategy` with custom resources. Trade-off: It is cumbersome as user needs to duplicate the BuildStrategy to add their custom resource requirements.
- Inject default resources into ClusterBuildStrategies at deployment time via operator. Trade-off: Only sets global defaults, no per-build control.
- Use `LimitRange` to set default resources across namespace. Trade-off: Applies to entire namespace, not build-specific.
- Use Kubernetes native pod-level resource specification.`Kubernetes v1.34 [beta]`(enabled by default). Resources are shared across all containers in pod. Trade-off: Requires PodLevelResources feature gate enabled; build steps run sequentially so no concurrent sharing benefit. No per-step control.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats - you have convinced me the value of having per-step control!

One of the challenges of using Tekton is that from the perspective of Kubernetes, all containers in a Tekton TaskRun pod execute simultaneously. This causes the scheduler to assign pods to a node whose capacity is equal to the sum of requested resources across all steps. See Tekton's article on compute resources for details.

Use of Tekton's computeResources and/or Kubernetes Pod Level Resources could be left as an implementation detail, or a follow up feature.

- Feature works with both BuildStrategy and ClusterBuildStrategy.
- Invalid step names in `stepResources` rejects the Build with clear errors.
- Existing builds without stepResources continue to work unchanged (The feature is backward compatible). Builds that do not specify `stepResources` continue to use strategy-defined defaults with no behavioral change.
- The Tekton TaskRun created by the controller contains the merged resource values in `spec.taskSpec.steps[].computeResources`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave this out as a goal, and move it to a detail in the implementation notes.

- TBD
creation-date: 2026-01-23
last-updated: 2026-01-23
status: provisional
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This meets the criteria for an implementable proposal 🎉 , however to get it "over the finish line" we would need more community discussion and feedback. I would rather merge as "provisional" now and have a follow-up PR where those details are ironed out.

- Existing builds without stepResources continue to work unchanged (The feature is backward compatible). Builds that do not specify `stepResources` continue to use strategy-defined defaults with no behavioral change.
- The Tekton TaskRun created by the controller contains the merged resource values in `spec.taskSpec.steps[].computeResources`.

### Non-Goals
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider the resources for the Git clone and output image tasks as being in scope, or out of scope as a non-goal?

@openshift-ci
Copy link

openshift-ci bot commented Jan 27, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adambkaplan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[FEATURE] Override Strategy Step Resources in Build and BuildRuns

2 participants