Validate single node disk removals #244
Conversation
w13915984028
left a comment
FYI.
Force-pushed d0a8ce6 to 0cc1d7c
Force-pushed 0deeccb to a62db7c
Addressing feedback from Moritz including the following:
* adding longhorn volume resource to the ndm clusterrole
* adding virtualmachineimage resource to the clusterrole
* removing redundant usage of slice after logic was changed
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Force-pushed 38e105b to 00c5d9c
Adding fake implementations and a unit test skeleton for the pipeline runs. For now there is only one test with minimal coverage, for the purpose of validating that the tests work as expected and cover the basic part of the code. Test cases will be added in an upcoming commit. Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Force-pushed 00c5d9c to d1725ba
Adding unit test cases covering mostly the new logic. Removed the slice allocation size since it produced empty values, which would have required new logic to avoid them, so for simplicity removal was the best choice. Also augmented the instantiation of the objects to allow empty slices and to extend test coverage when the caches are empty. Signed-off-by: Martin Dekov <martin.dekov@suse.com>
m-ildefons
left a comment
LGTM.
cd $(dirname $0)/..

echo Running unit tests
go test -v -cover ./pkg/...
Just a nit, you don't have to correct this:
If you want coverage output from the containerized build (i.e. using make unit-test), then you need to do some additional things:
- specify a coverage output file:
  go test -v -cover -coverprofile=coverage.out ./pkg/...
- do the conversion to the final output format (e.g. HTML) inside the container:
  go test [...]
  go tool cover -html=coverage.out -o coverage.html
- add the final coverage output file in Dockerfile.dapper where the environment variable DAPPER_OUTPUT is defined, e.g.:
  [...]
  ENV DAPPER_OUTPUT ./bin ./manifests ./pkg ./coverage.html

Then you can do make unit-test, and the coverage report will be generated and placed outside of the build container so you can analyze it.
Thanks Moritz, I will leave this open. I intentionally made it simple; I didn't want to overcrowd this change as it already contains too many different things. Augmenting the pipeline to output coverage in the right way is work on its own. Also, I think there were badges to show coverage on the front page, etc.
We can open a separate issue for this in this repo I think? WDYT?
That's completely fair, in fact I think it would be better to have a separate PR for that. I just assumed that you added the -cover flag to get coverage data.
Feel free to tag me as reviewer when you open that PR.
WebberHuang1118
left a comment
Please review the discussion related to the Longhorn backing image, thanks.
Replacing validation of blockdevice deletion from virtualmachineimage to backingimage. The logic rejects disk removals in case there are backingimages in a non-ready state attached to the blockdevice. We index the caches for both longhorn nodes and backingimages to reach all backingimages related to blockdevices through the longhorn nodes.
Additional changes:
* Replaces fakes and testing for the new objects
* Fixed some RBAC permissions for the new objects
* Registered longhorn factory as caches didn't actually return anything
* Removed unused objects and prints
* Fixed selection based on Spec.Tags to Status.Tags as in a real environment tags are stored in status, not in the spec
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Force-pushed 924f071 to 42148b6
Some of the logic for checking backing images seems incorrect to me; please see my comment.
Also suggest reviewing the Backing Image LEP for the relevant data structures.
Thanks
var failedBackingImages []string
for _, backingImage := range backingImages {
	if backingImage.Status.DiskFileStatusMap[uuid].State != lhv1.BackingImageStateReady {
		failedBackingImages = append(failedBackingImages, backingImage.Name)
	}
}
if len(failedBackingImages) > 0 {
	failedBackingImageList := strings.Join(failedBackingImages, ",")
	return fmt.Errorf("the following backingimages: %v attached to blockdevice: %v are in a %s state; make sure state is fixed before disk deletion",
		failedBackingImageList, old.Name, lhv1.BackingImageStateFailed)
}
If I understand correctly, the current logic is: for the disk that’s about to be removed, check whether any backing image has a copy on that disk, and whether the copy is not in the ready state.
However, what we actually need to check is: for the disk that’s about to be removed, determine if it holds the only valid copy of certain backing images.
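For illustration, a minimal sketch of that suggested check against the Longhorn v1beta2 types already used in this PR; the import path and the helper name diskHoldsOnlyReadyCopy are assumptions made for the sketch, not code from this PR:

```go
package backingimage

import (
	lhv1 "github.com/longhorn/longhorn-manager/k8s/pkg/apis/longhorn/v1beta2"
)

// diskHoldsOnlyReadyCopy reports whether the disk identified by diskUUID holds
// the only ready copy of the given backing image, i.e. whether removing that
// disk would lose the image entirely.
func diskHoldsOnlyReadyCopy(backingImage *lhv1.BackingImage, diskUUID string) bool {
	fileStatus, hasCopyOnDisk := backingImage.Status.DiskFileStatusMap[diskUUID]
	if !hasCopyOnDisk || fileStatus == nil || fileStatus.State != lhv1.BackingImageStateReady {
		// No ready copy on this disk, so removing it cannot lose the last ready copy.
		return false
	}
	for uuid, status := range backingImage.Status.DiskFileStatusMap {
		if uuid != diskUUID && status != nil && status.State == lhv1.BackingImageStateReady {
			// Another disk still holds a ready copy, so this removal is safe.
			return false
		}
	}
	return true
}
```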
I think that this:
determine if it holds the only valid copy of certain backing images.
is part of the UI warning change in that issue: harvester/harvester#9218. There is no PR for it yet.
The current PR refers to the case where we have a single Harvester node and we have bad volumes/backingimages, and all storage classes have replica count 1 when Harvester consists of a single node, covering this requirement by @Vicente-Cheng:
- block by webhook - number of replica = 1
IMO, if we have a single node, those disks will effectively be attached to that single node whatever you do; the moment you add more nodes, we don't reach this logic here at all. As per my current understanding, in multi-node setups replication of data becomes more than 1, and then we go to the warning before deletion but don't reject it through the webhook, though this is to be discussed I think.
Why does this PR only target a single-node environment? If I understand correctly, the original issue #3344 should also cover multi-node environments.
As I see it, this PR is attempting to address #3344, so this PR should not be limited to single-node setups. From my perspective, #3344 and #9218 appear to be duplicates.
Regarding:
The current PR refers to the case where we have a single harvester node and we have bad volumes/backingimages. And all storage classes have replica 1 when harvester consists of a single node covering this requirement by @Vicente-Cheng:
block by webhook - number of replica = 1
Even in a single-node environment, users can attach multiple disks so that a storageClass targets more than one replica.
If I understand correctly, what @Vicente-Cheng means is that the webhook should block node removal if a volume has only one healthy replica on the disk that’s about to be deleted. We should trace this through the route volume → engine → replicas, rather than relying on the selector, since the selector is part of the volume spec (the desired state) and not the actual state we need to verify.
The same logic should also apply to backing images.
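For illustration, a hedged sketch of the replica-based check described above, written against the Longhorn v1beta2 types; the helper name and the exact replica fields (Spec.VolumeName, Spec.DiskID, Spec.FailedAt) are assumptions and may differ between Longhorn API versions:

```go
package volume

import (
	lhv1 "github.com/longhorn/longhorn-manager/k8s/pkg/apis/longhorn/v1beta2"
)

// volumeWouldLoseLastHealthyReplica reports whether the disk identified by
// diskUUID holds the only healthy replica of the given volume. It inspects the
// actual replica objects (volume -> replicas) rather than the volume's
// diskSelector, which only describes the desired state.
func volumeWouldLoseLastHealthyReplica(replicas []*lhv1.Replica, volumeName, diskUUID string) bool {
	healthyOnDisk := 0
	healthyElsewhere := 0
	for _, r := range replicas {
		if r.Spec.VolumeName != volumeName {
			continue
		}
		// A replica with FailedAt set is no longer a healthy copy of the data.
		if r.Spec.FailedAt != "" {
			continue
		}
		if r.Spec.DiskID == diskUUID {
			healthyOnDisk++
		} else {
			healthyElsewhere++
		}
	}
	return healthyOnDisk > 0 && healthyElsewhere == 0
}
```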
I captured our offline discussions in a comment on the issue:
harvester/harvester#3344 (comment)
That is:
- Prevent the user, through the webhook, from deleting a disk on the hosts page in a single-node Harvester in case a volume is in Degraded state and is attached to the disk through a storage class
- In all cases warn the user that when a disk is deleted from the hosts page in Harvester you should be cautious and evict it first - 1st click warn, second click allow to pass
So the implementation reflects that.
func (v *Validator) validateDegradedVolumes(old *diskv1.BlockDevice) error {
	volumeList, err := v.volumeCache.List(utils.LonghornSystemNamespaceName, labels.Everything())
	if err != nil {
		return err
	}
	if len(volumeList) == 0 {
		return nil
	}
	degradedVolumes := []string{}
	for _, vol := range volumeList {
		if vol.Status.Robustness == lhv1.VolumeRobustnessDegraded {
			degradedVolumes = append(degradedVolumes, vol.Name)
		}
	}
	if len(degradedVolumes) == 0 {
		return nil
	}
	selectorDegradedVol := make(map[string][]string)
	for _, name := range degradedVolumes {
		pv, err := v.pvCache.Get(name)
		if err != nil {
			return err
		}
		diskSelector := ""
		if pv.Spec.CSI != nil {
			diskSelector = pv.Spec.CSI.VolumeAttributes[utils.DiskSelectorKey]
		}
		if len(diskSelector) != 0 {
			selectorDegradedVol[diskSelector] = append(selectorDegradedVol[diskSelector], pv.Name)
		}
	}
	degradedVolString := ""
	for _, diskTag := range old.Status.Tags {
		if val, ok := selectorDegradedVol[diskTag]; ok {
			degradedVolString += fmt.Sprintf(" %s: %v", diskTag, val)
		}
	}
	if len(degradedVolString) > 0 {
		return fmt.Errorf("the following tags with volumes:%s attached to disk: %s are in degraded state; evict disk before proceeding",
			degradedVolString, old.Spec.DevPath)
	}
	return nil
}
May I know why we need to care about the selector? If I understand correctly, the only check required is whether the disk slated for removal contains the sole valid replica of any volume.
The selector is how we trace which volumes are attached to the disk we are currently trying to remove, so:
- Tag a disk
- Use that tag to create a storage class
- Use that storage class to create volumes/VMs
This tag is used as the selector across those components. That way, if I delete the disk, I trace back the volumes created through it via the disk -> tag -> sc -> volumes chain.
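For illustration, a StorageClass in that chain might look roughly like the sketch below; the object name is made up, while numberOfReplicas and diskSelector are the Longhorn parameters being referred to:

```go
package example

import (
	storagev1 "k8s.io/api/storage/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// A disk tagged "new" is targeted by a StorageClass whose diskSelector is "new".
// Volumes provisioned from this class land on disks carrying that tag, which is
// what lets the webhook trace disk -> tag -> storage class -> volumes.
var singleDiskStorageClass = storagev1.StorageClass{
	ObjectMeta:  metav1.ObjectMeta{Name: "single-disk-sc"},
	Provisioner: "driver.longhorn.io",
	Parameters: map[string]string{
		"numberOfReplicas": "1",
		"diskSelector":     "new", // same value as the blockdevice's disk tag
	},
}
```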
Also, both this logic and the backingimages logic above are triggered only on single-node setups, so replication can only be 1 in that case.
Even in a single-node environment, users can attach multiple disks so that a storageClass targets more than one replica, so we cannot make that assumption.
@WebberHuang1118 thanks for pointing out the HEP, I will definitely go through it! Generally, the current change handles bad backingimages and volumes when Harvester is a single node, meaning replication is 1 as well. Any bad backingimage or volume is actually the last valid one in that single-node setup.
Indexing backing image by disk UUID taken from the status, not from the spec, as per feedback from Webber
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Hi @martindekov
The original issue #3344 should also cover multi-node environments. As I see it, this PR is attempting to address #3344, so the PR shouldn't be limited to single-node setups. From my perspective, #3344 and #9218 appear to be duplicates.
IMO, we should design a unified solution that applies to both single-node and multi-node environments in this PR. There's no need to separate it into another issue (#9218) and another PR.
To reiterate, generally we agreed that the solution for all node setups is a warning, not webhook rejection. Webhook rejection is for the corner case where we cannot really have replicated storage even if we set the affinity rule, as per the comment. The issue harvester/harvester#9218 refers to the warning/UI part, so the work is split between the issues: one for the UI and one for the backend. If we decide all the warning work should go to the webhook backend and leave out the front-end warning aspect, we also need to find ways to force delete disks in a bad state, as the deletion will be rejected at the API level. This design would further extend the scope of the issue.
Not every detail becomes clear until the PR is created. IMO, for reviewers, changes or updates to the original design or proposal are sometimes necessary.
I agree that #3344 targets the backend part only. However, based on my code reading, the current implementation only includes a webhook to reject disk removal for preventing data loss in a single-node cluster. This is not sufficient for #3344 for the following reasons:
TL;DR: Meanwhile, #3344 should focus on the original context: if removing a disk would cause any volume to lose its only healthy replica or any backing image to lose its only ready copy, the webhook should strongly block the operation to prevent data loss, regardless of how many nodes are in the cluster. If we continue with the current PR design, I believe the context and scope of #3344 should at least be reduced, for example to
Ping @Vicente-Cheng @ihcsim for your thoughts on this topic.
Closing and giving priority to the multi-node implementation here: #248
Won't update this PR as it became overcrowded with comments and hard to review.
Rejecting disk removals when there are attached volumes in Degraded state on single-node Harvester setups. We also reject removals when backing images on the block device are in a failed state.
The logic is split in two parts: for degraded volumes we compare the storage class selectors against the disk tags, and for backing images we reach them through the Longhorn node via indexing.
In both scenarios, in case there is a match while the disk is being disabled, we list for the user which volumes and which images need to be fixed.
Signed-off-by: Martin Dekov <martin.dekov@suse.com>
Problem:
Users can directly delete disks without any warning, as explained in the issue:
harvester/harvester#3344
Solution:
Reject deletion of disks when there are degraded volumes and the Harvester cluster consists of only a single node
Related Issue:
harvester/harvester#3344
Test plan:
Unit testing added
Replicate Degraded Volume Scenario:
newtag for example
new (same as disk tag above)
Replicate Failed Backing Image scenario: