Skip to content

chore: add unit test for jobsets#3176

Open
upsaurav12 wants to merge 2 commits intokubeflow:masterfrom
upsaurav12:builder-jobset
Open

chore: add unit test for jobsets#3176
upsaurav12 wants to merge 2 commits intokubeflow:masterfrom
upsaurav12:builder-jobset

Conversation

@upsaurav12
Copy link

What this PR does / why we need it:
This PR adds unit test for builder module under jobset folder, which includes testing for Initialize and Trainer.
I have written two tests TestInitialize and TestTrainer which covers the test cases such as edge cases or conditions.

Which issue(s) this PR fixes (optional, in Fixes #<issue number>, #<issue number>, ... format, will close the issue(s) when PR gets merged):
Fixes #3152

Signed-off-by: upsaurav12 <sauravup041103@gmail.com>
Copilot AI review requested due to automatic review settings February 4, 2026 19:01
@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds unit tests for the builder module in the jobset plugin to improve test coverage as part of issue #3152. The PR introduces two test functions (TestInitializer and TestTrainer) that test the Initializer and Trainer methods of the Builder type.

Changes:

  • Added TestInitializer function with 3 test cases for dataset and model initializers
  • Added TestTrainer function with 2 test cases for trainer configuration

Comment on lines +306 to +310
tests := []struct {
name string
jobset *jobsetv1alpha2ac.JobSetApplyConfiguration
trainJob *trainer.TrainJob
}{
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test structure doesn't follow the established convention in this codebase. Other tests in pkg/runtime/framework/plugins use map[string]struct for test cases instead of []struct. For reference, see pkg/runtime/framework/plugins/jobset/jobset_test.go:51 and pkg/runtime/framework/plugins/mpi/mpi_test.go:48.

Copilot uses AI. Check for mistakes.
},
},
{
name: "ancestor is dataset initializer",
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All three test cases have identical names "ancestor is dataset initializer". The first test case should be "ancestor is dataset initializer", the second should be "ancestor is dataset initializer with storage uri env", and the third should be "ancestor is model initializer".

Suggested change
name: "ancestor is dataset initializer",
name: "ancestor is model initializer",

Copilot uses AI. Check for mistakes.
},
},
{
name: "ancestor is dataset initializer",
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both test cases have identical names "ancestor is dataset initializer". The first should be "ancestor is dataset initializer" and the second should be "ancestor is trainer" to reflect what they're actually testing.

Suggested change
name: "ancestor is dataset initializer",
name: "ancestor is trainer",

Copilot uses AI. Check for mistakes.
Comment on lines +294 to +299
builder.isRunLauncherAsNode(&runtime.Info{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
})
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isRunLauncherAsNode method is being called without checking its return value or verifying its behavior. This test appears incomplete - it doesn't assert anything about the method's behavior and doesn't relate to the Initializer test's purpose. Either add proper assertions or remove this line.

Copilot uses AI. Check for mistakes.
}

for _, test := range tests {
t.Run(t.Name(), func(t *testing.T) {
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is calling t.Name() instead of using test.name. This will result in all test cases having the same test name in the output, making it impossible to identify which test case failed. Use test.name instead.

Suggested change
t.Run(t.Name(), func(t *testing.T) {
t.Run(test.name, func(t *testing.T) {

Copilot uses AI. Check for mistakes.
Comment on lines +265 to +302
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
builder := NewBuilder(test.jobset)
builder.Initializer(test.trainJob)
rj := builder.Spec.ReplicatedJobs[0]

if *rj.Replicas != 1 {
t.Fatalf("expected replicas=1, got %d", *rj.Replicas)
}

containers := rj.Template.Spec.Template.Spec.Containers
env := containers[0].Env

found := false
for _, e := range env {
if *e.Name == jobsetplgconsts.InitializerEnvStorageUri {
found = true
}
}

if !found {
t.Fatalf("storage uri env not injected")
}

envFrom := containers[0].EnvFrom
if len(envFrom) == 0 {
t.Fatalf("expected secret envFrom injected")
}

builder.isRunLauncherAsNode(&runtime.Info{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
})
})

}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TestInitializer test doesn't test the Model Initializer path, which is handled by the third test case but the test logic doesn't verify model-specific behavior (it only checks dataset initializer behavior for all three cases). Consider adding proper assertions for the Model Initializer case or testing both paths correctly.

Copilot uses AI. Check for mistakes.
Comment on lines 36 to 475
func TestInitializer(t *testing.T) {
tests := []struct {
name string
jobset *jobsetv1alpha2ac.JobSetApplyConfiguration
trainJob *trainer.TrainJob
}{
{
name: "ancestor is dataset initializer",
jobset: &jobsetv1alpha2ac.JobSetApplyConfiguration{
Spec: &jobsetv1alpha2ac.JobSetSpecApplyConfiguration{
ReplicatedJobs: []jobsetv1alpha2ac.ReplicatedJobApplyConfiguration{
{
Template: &v1.JobTemplateSpecApplyConfiguration{
Spec: &v1.JobSpecApplyConfiguration{
Completions: ptr.To(int32(2)),
Template: &corev.PodTemplateSpecApplyConfiguration{
Spec: &corev.PodSpecApplyConfiguration{
Containers: []corev.ContainerApplyConfiguration{
{
Env: []corev.EnvVarApplyConfiguration{
{
Name: ptr.To("example_1"),
},
},
EnvFrom: []corev.EnvFromSourceApplyConfiguration{
{
SecretRef: &corev.SecretEnvSourceApplyConfiguration{
Optional: ptr.To(false),
},
},
},
Name: ptr.To(constants.DatasetInitializer),
},
},
},
},
},
ObjectMetaApplyConfiguration: &metav1.ObjectMetaApplyConfiguration{
Labels: map[string]string{
constants.LabelTrainJobAncestor: constants.DatasetInitializer,
},
},
},
Name: ptr.To("test_replica_name_1"),
GroupName: ptr.To("test_group_name_1"),
Replicas: ptr.To(int32(1)),
},
},
},
},
trainJob: &trainer.TrainJob{
Spec: trainer.TrainJobSpec{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
RuntimeRef: trainer.RuntimeRef{
Name: "",
APIGroup: ptr.To(""),
Kind: ptr.To(""),
},
Initializer: &trainer.Initializer{
Model: &trainer.ModelInitializer{
StorageUri: ptr.To("example_1"),
},
Dataset: &trainer.DatasetInitializer{
StorageUri: ptr.To("example_1"),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
},
Trainer: &trainer.Trainer{
Image: ptr.To(""),
},
},
},
},
{
name: "ancestor is dataset initializer",
jobset: &jobsetv1alpha2ac.JobSetApplyConfiguration{
Spec: &jobsetv1alpha2ac.JobSetSpecApplyConfiguration{
ReplicatedJobs: []jobsetv1alpha2ac.ReplicatedJobApplyConfiguration{
{
Template: &v1.JobTemplateSpecApplyConfiguration{
Spec: &v1.JobSpecApplyConfiguration{
Completions: ptr.To(int32(2)),
Template: &corev.PodTemplateSpecApplyConfiguration{
Spec: &corev.PodSpecApplyConfiguration{
Containers: []corev.ContainerApplyConfiguration{
{
Env: []corev.EnvVarApplyConfiguration{
{
Name: ptr.To(jobsetplgconsts.InitializerEnvStorageUri),
},
},
EnvFrom: []corev.EnvFromSourceApplyConfiguration{
{
SecretRef: &corev.SecretEnvSourceApplyConfiguration{
Optional: ptr.To(false),
},
},
},
Name: ptr.To(constants.DatasetInitializer),
},
},
},
},
},
ObjectMetaApplyConfiguration: &metav1.ObjectMetaApplyConfiguration{
GenerateName: ptr.To(""),
Labels: map[string]string{
constants.LabelTrainJobAncestor: constants.DatasetInitializer,
},
},
},
Name: ptr.To("test_replica_name_1"),
GroupName: ptr.To("test_group_name_1"),
Replicas: ptr.To(int32(2)),
},
},
},
},
trainJob: &trainer.TrainJob{
Spec: trainer.TrainJobSpec{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
RuntimeRef: trainer.RuntimeRef{
Name: "",
APIGroup: ptr.To(""),
Kind: ptr.To(""),
},
Initializer: &trainer.Initializer{
Model: &trainer.ModelInitializer{
StorageUri: ptr.To("uri_example"),
},
Dataset: &trainer.DatasetInitializer{
StorageUri: ptr.To("uri_example"),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
},
Trainer: &trainer.Trainer{
Image: ptr.To(""),
},
},
},
},
{
name: "ancestor is dataset initializer",
jobset: &jobsetv1alpha2ac.JobSetApplyConfiguration{
Spec: &jobsetv1alpha2ac.JobSetSpecApplyConfiguration{
ReplicatedJobs: []jobsetv1alpha2ac.ReplicatedJobApplyConfiguration{
{
Template: &v1.JobTemplateSpecApplyConfiguration{
Spec: &v1.JobSpecApplyConfiguration{
Completions: ptr.To(int32(2)),
Template: &corev.PodTemplateSpecApplyConfiguration{
Spec: &corev.PodSpecApplyConfiguration{
Containers: []corev.ContainerApplyConfiguration{
{
Env: []corev.EnvVarApplyConfiguration{
{
Name: ptr.To("example_1"),
},
},
EnvFrom: []corev.EnvFromSourceApplyConfiguration{
{
SecretRef: &corev.SecretEnvSourceApplyConfiguration{
Optional: ptr.To(false),
},
},
},
Name: ptr.To(constants.ModelInitializer),
},
},
},
},
},
ObjectMetaApplyConfiguration: &metav1.ObjectMetaApplyConfiguration{
GenerateName: ptr.To(""),
Labels: map[string]string{
constants.LabelTrainJobAncestor: constants.ModelInitializer,
},
},
},
Name: ptr.To("test_replica_name_1"),
GroupName: ptr.To("test_group_name_1"),
Replicas: ptr.To(int32(2)),
},
},
},
},
trainJob: &trainer.TrainJob{
Spec: trainer.TrainJobSpec{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
RuntimeRef: trainer.RuntimeRef{
Name: "",
APIGroup: ptr.To(""),
Kind: ptr.To(""),
},
Initializer: &trainer.Initializer{
Model: &trainer.ModelInitializer{
StorageUri: ptr.To("example_1"),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
Dataset: &trainer.DatasetInitializer{
StorageUri: ptr.To("example_1"),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
},
Trainer: &trainer.Trainer{
Image: ptr.To(""),
},
},
},
},
}

for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
builder := NewBuilder(test.jobset)
builder.Initializer(test.trainJob)
rj := builder.Spec.ReplicatedJobs[0]

if *rj.Replicas != 1 {
t.Fatalf("expected replicas=1, got %d", *rj.Replicas)
}

containers := rj.Template.Spec.Template.Spec.Containers
env := containers[0].Env

found := false
for _, e := range env {
if *e.Name == jobsetplgconsts.InitializerEnvStorageUri {
found = true
}
}

if !found {
t.Fatalf("storage uri env not injected")
}

envFrom := containers[0].EnvFrom
if len(envFrom) == 0 {
t.Fatalf("expected secret envFrom injected")
}

builder.isRunLauncherAsNode(&runtime.Info{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
})
})

}
}

func TestTrainer(t *testing.T) {
tests := []struct {
name string
jobset *jobsetv1alpha2ac.JobSetApplyConfiguration
trainJob *trainer.TrainJob
}{
{
name: "ancestor is dataset initializer",
jobset: &jobsetv1alpha2ac.JobSetApplyConfiguration{
Spec: &jobsetv1alpha2ac.JobSetSpecApplyConfiguration{
ReplicatedJobs: []jobsetv1alpha2ac.ReplicatedJobApplyConfiguration{
{
Template: &v1.JobTemplateSpecApplyConfiguration{
Spec: &v1.JobSpecApplyConfiguration{
Completions: ptr.To(int32(2)),
Template: &corev.PodTemplateSpecApplyConfiguration{
Spec: &corev.PodSpecApplyConfiguration{
Containers: []corev.ContainerApplyConfiguration{
{
Name: ptr.To(constants.DatasetInitializer),
},
},
},
},
},
ObjectMetaApplyConfiguration: &metav1.ObjectMetaApplyConfiguration{
GenerateName: ptr.To(""),
Labels: map[string]string{
constants.LabelTrainJobAncestor: constants.DatasetInitializer,
},
},
},
Name: ptr.To("test_replica_name_1"),
GroupName: ptr.To("test_group_name_1"),
Replicas: ptr.To(int32(2)),
},
},
},
},
trainJob: &trainer.TrainJob{
Spec: trainer.TrainJobSpec{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
RuntimeRef: trainer.RuntimeRef{
Name: "runtime_example",
APIGroup: ptr.To("example_api_group"),
Kind: ptr.To("pods"),
},
Initializer: &trainer.Initializer{
Model: &trainer.ModelInitializer{
StorageUri: ptr.To("model_storage_uri"),
},
Dataset: &trainer.DatasetInitializer{
StorageUri: ptr.To("dataset_storage_uri"),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
},
Trainer: &trainer.Trainer{
Image: ptr.To(""),
},
},
},
},
{
name: "ancestor is dataset initializer",
jobset: &jobsetv1alpha2ac.JobSetApplyConfiguration{
Spec: &jobsetv1alpha2ac.JobSetSpecApplyConfiguration{
ReplicatedJobs: []jobsetv1alpha2ac.ReplicatedJobApplyConfiguration{
{
Template: &v1.JobTemplateSpecApplyConfiguration{
Spec: &v1.JobSpecApplyConfiguration{
Completions: ptr.To(int32(2)),
Template: &corev.PodTemplateSpecApplyConfiguration{
Spec: &corev.PodSpecApplyConfiguration{
Containers: []corev.ContainerApplyConfiguration{
{
Name: ptr.To(constants.Node),
},
},
},
},
},
ObjectMetaApplyConfiguration: &metav1.ObjectMetaApplyConfiguration{
GenerateName: ptr.To(""),
Labels: map[string]string{
constants.LabelTrainJobAncestor: constants.AncestorTrainer,
},
},
},
Name: ptr.To("test_replica_name_1"),
GroupName: ptr.To("test_group_name_1"),
Replicas: ptr.To(int32(2)),
},
},
},
},
trainJob: &trainer.TrainJob{
Spec: trainer.TrainJobSpec{
Labels: map[string]string{
"hello_world": "1234",
"hello_world_1": "3456",
},
RuntimeRef: trainer.RuntimeRef{
Name: "",
APIGroup: ptr.To(""),
Kind: ptr.To(""),
},
Initializer: &trainer.Initializer{
Model: &trainer.ModelInitializer{
StorageUri: ptr.To(""),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
Dataset: &trainer.DatasetInitializer{
StorageUri: ptr.To(""),
SecretRef: &corev1.LocalObjectReference{
Name: "",
},
},
},
Trainer: &trainer.Trainer{
Image: ptr.To(""),
Command: []string{"cmd1", "cmd2", "cmd3"},
Args: []string{"args1", "args2", "arg3"},
ResourcesPerNode: &corev1.ResourceRequirements{
Limits: corev1.ResourceList{},
},
},
},
},
},
}

for _, test := range tests {
t.Run(t.Name(), func(t *testing.T) {
builder := NewBuilder(test.jobset)

builder.Trainer(&runtime.Info{}, test.trainJob)
builder.Build()

c := builder.Spec.ReplicatedJobs[0].
Template.Spec.Template.Spec.Containers[0]

ancestor := test.jobset.Spec.ReplicatedJobs[0].
Template.ObjectMetaApplyConfiguration.Labels[constants.LabelTrainJobAncestor]

if ancestor == constants.AncestorTrainer {
if c.Image == nil {
t.Fatalf("trainer image not injected")
}

if *c.Image != *test.trainJob.Spec.Trainer.Image {
t.Fatalf("image mismatch")
}

if len(c.Command) != len(test.trainJob.Spec.Trainer.Command) {
t.Fatalf("command not applied")
}
} else {
if c.Image != nil {
t.Fatalf("image should not be set for non-trainer job")
}
}
})
}
}
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test doesn't use go-cmp for comparing expected and actual results, which is the established convention in this codebase. All other test files in pkg/runtime/framework/plugins use github.com/google/go-cmp/cmp for comparisons instead of manual assertions. Consider using cmp.Diff to compare expected vs actual states for more informative test failures.

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +42
tests := []struct {
name string
jobset *jobsetv1alpha2ac.JobSetApplyConfiguration
trainJob *trainer.TrainJob
}{
{
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test structure doesn't follow the established convention in this codebase. Other tests in pkg/runtime/framework/plugins use map[string]struct for test cases instead of []struct, which allows better test case identification. For reference, see pkg/runtime/framework/plugins/jobset/jobset_test.go:51 and pkg/runtime/framework/plugins/mpi/mpi_test.go:48.

Copilot uses AI. Check for mistakes.
Signed-off-by: upsaurav12 <sauravup041103@gmail.com>
@google-oss-prow google-oss-prow bot added size/L and removed size/XXL labels Feb 4, 2026
@upsaurav12
Copy link
Author

@andreyvelich i was wondering that why kubeflow/trainers is not using assert library, because it will make more easier to verify things ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature]: Add unit tests for core and framework to cover reasonable coverage

1 participant

Comments