Skip to content

Integer comparison on qube volume size#775

Closed
ben-grande wants to merge 2 commits intoQubesOS:mainfrom
ben-grande:preload-fix-vol-size
Closed

Integer comparison on qube volume size#775
ben-grande wants to merge 2 commits intoQubesOS:mainfrom
ben-grande:preload-fix-vol-size

Conversation

@ben-grande
Copy link
Contributor

Disposable size is integer and disposable template size is string.

For: QubesOS/qubes-issues#1512
For: QubesOS/qubes-issues#10525
Fixes: QubesOS/qubes-issues#10572
For: #771

@ben-grande ben-grande requested a review from HW42 January 14, 2026 14:59
@ben-grande
Copy link
Contributor Author

ben-grande commented Jan 14, 2026

openQArun TEST=system_tests_dispvm

@qubesos-bot
Copy link

qubesos-bot commented Jan 14, 2026

OpenQA test summary

Complete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=202601221354-4.3&flavor=pull-requests

Test run included the following:

New failures, excluding unstable

Compared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2026011214-devel&flavor=update

Failed tests

7 failures

Fixed failures

Compared to: https://openqa.qubes-os.org/tests/165203#dependencies

6 fixed
  • system_tests_network

  • system_tests_dispvm

    • system_tests: Fail (unknown)
      Tests qubes.tests.integ.dispvm failed (exit code 1), details report...

    • system_tests: Failed (test died)
      # Test died: Some tests failed at qubesos/tests/system_tests.pm lin...

    • TC_20_DispVM_whonix-workstation-18: test_015_preload_race_more (error)
      raise TimeoutError from exc_val... TimeoutError

    • TC_20_DispVM_whonix-workstation-18: test_016_preload_race_less (error)
      raise TimeoutError from exc_val... TimeoutError

Unstable tests

Details

Performance Tests

Performance degradation:

No issues

Remaining performance tests:

No remaining performance tests

@ben-grande
Copy link
Contributor Author

Just would like to note that it also didn't fail openqa last time: #771 (comment), but was broken on real use case, which is sad... it requires some qubesd restart to change the state, which I don't think I can do mid test run.

Anyway, new integration tests introduced for the cases that should discard.

self.volume_config["private"]["size"]
!= appvm.volume_config["private"]["size"]
if int(self.volume_config["private"]["size"]) != int(
appvm.volume_config["private"]["size"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to use .volume_config at all here? Using .volumes["private"].size would avoid this issue (and should remove the hack in your test).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Cause I didn't know...
  2. Now that I changed to it, unit tests became a bit harder because I wasn't creating the volumes, the storage was being mocked. There are a few dispvm tests doing it, will try to use that.

@HW42
Copy link
Contributor

HW42 commented Jan 20, 2026

While strictly speaking out of scope of this PR, what is the intended behavior for the next lines?

        if any(vol for vol in self.volumes.values() if vol.is_outdated()):
            # Volume name is irrelevant. We use any() to return fast.
            differed["volumes"] = ["root"]
            return differed

Either only the "root" volume matters, then the iterator is unnecessary, or the return value is (potentially) wrong.

@ben-grande
Copy link
Contributor Author

While strictly speaking out of scope of this PR, what is the intended behavior for the next lines?

        if any(vol for vol in self.volumes.values() if vol.is_outdated()):
            # Volume name is irrelevant. We use any() to return fast.
            differed["volumes"] = ["root"]
            return differed

Either only the "root" volume matters, then the iterator is unnecessary, or the return value is (potentially) wrong.

The iterator stops on the first is_outdated, but we don't need to know which. I just placed "root" there to inform that it was from a template shutdown (volumes become outdated). It is not that only "root" matters, any matters, so any outdated should report it as such. Reporting as "root" as it just go to logs. If you don't think this is sane, I can return all the outdated volumes instead.

@ben-grande ben-grande force-pushed the preload-fix-vol-size branch from 8dee80e to aee1e29 Compare January 20, 2026 09:59
@ben-grande
Copy link
Contributor Author

openQArun TEST=system_tests_dispvm

@HW42
Copy link
Contributor

HW42 commented Jan 20, 2026

[...]

Either only the "root" volume matters, then the iterator is unnecessary, or the return value is (potentially) wrong.

The iterator stops on the first is_outdated, but we don't need to know which. I just placed "root" there to inform that it was from a template shutdown (volumes become outdated). It is not that only "root" matters, any matters, so any outdated should report it as such. Reporting as "root" as it just go to logs. If you don't think this is sane, I can return all the outdated volumes instead.

I'm fine with returning on the first outdated volume, that also matched the other checks. But then the return value should accurately report which volume is outdated, otherwise things can get very confusing when debugging. Given that we are talking about a DispVM, it's actually a normal case that the private volume of the AppVM template gets outdated without the root volume being outdated. Should be very straight forward to implement:

        for vol_name, vol in self.volumes.items():
            if vol.is_outdated():
                differed["volumes"] = [vol_name]
                return differed

[This should go into a separate PR, just commented here since this jumped my into my eye, while trying to understand the original issue]

self.volume_config["private"]["size"]
!= appvm.volume_config["private"]["size"]
if int(self.volumes["private"].size) != int(
appvm.volumes["private"].size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for int() here. Here we have a proper Volume object, which should always return an int for it's size property.

@ben-grande
Copy link
Contributor Author

[...]

Either only the "root" volume matters, then the iterator is unnecessary, or the return value is (potentially) wrong.

The iterator stops on the first is_outdated, but we don't need to know which. I just placed "root" there to inform that it was from a template shutdown (volumes become outdated). It is not that only "root" matters, any matters, so any outdated should report it as such. Reporting as "root" as it just go to logs. If you don't think this is sane, I can return all the outdated volumes instead.

I'm fine with returning on the first outdated volume, that also matched the other checks. But then the return value should accurately report which volume is outdated, otherwise things can get very confusing when debugging. Given that we are talking about a DispVM, it's actually a normal case that the private volume of the AppVM template gets outdated without the root volume being outdated. Should be very straight forward to implement:

        for vol_name, vol in self.volumes.items():
            if vol.is_outdated():
                differed["volumes"] = [vol_name]
                return differed

[This should go into a separate PR, just commented here since this jumped my into my eye, while trying to understand the original issue]

If the first outdated volume is "private", than the logs can't distinguish between the current check of outdated volume and the prior check of volume size... So if I do your way, I'd prefer returning every volume, which yes, doesn't guarantee that only "private" will appear. And I think it makes sense to include in this PR, if you agree. OpenQA didn't run yet because of temporary build failure.

@ben-grande ben-grande force-pushed the preload-fix-vol-size branch from aee1e29 to 7f126d3 Compare January 20, 2026 11:18
@codecov
Copy link

codecov bot commented Jan 20, 2026

Codecov Report

❌ Patch coverage is 30.00000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.17%. Comparing base (756ff94) to head (71d268f).
⚠️ Report is 3 commits behind head on main.

Files with missing lines Patch % Lines
qubes/vm/mix/dvmtemplate.py 0.00% 4 Missing ⚠️
qubes/vm/dispvm.py 50.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #775      +/-   ##
==========================================
- Coverage   70.23%   70.17%   -0.07%     
==========================================
  Files          61       61              
  Lines       13975    13977       +2     
==========================================
- Hits         9815     9808       -7     
- Misses       4160     4169       +9     
Flag Coverage Δ
unittests 70.17% <30.00%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@HW42
Copy link
Contributor

HW42 commented Jan 20, 2026

If the first outdated volume is "private", than the logs can't distinguish between the current check of outdated volume and the prior check of volume size... So if I do your way, I'd prefer returning every volume, [...]

If this your concern then listing every volume doesn't help. If you resize "private" is will return {"volumes": "private"} due to the first check. If you started the DispVMs template AppVM, but not the TemplateVM it's based on, it will return {"volumes": "private"} due to the second check.

So if you want to be able to distinguish those cases in the return value you need to use different keys (for example "volumes_size" and "volume_outdated").

And I think it makes sense to include in this PR, if you agree.

As noted above I would have put it into a separate, but fine with me to keep it here if you like.

The size type on volume_config, depends how the qube is loaded, from
disk or from the store. The volumes have a size property, use that
instead, but still convert to integer, no cost operation.

For: QubesOS/qubes-issues#1512
For: QubesOS/qubes-issues#10525
Fixes: QubesOS/qubes-issues#10572
For: QubesOS#771
@ben-grande ben-grande force-pushed the preload-fix-vol-size branch from 7f126d3 to 958dcec Compare January 21, 2026 04:21
Makes it easier to debug, even for volumes, which may have been for
different reasons, template shutdown or resize.

For: QubesOS/qubes-issues#1512
For: QubesOS/qubes-issues#10525
Fixes: QubesOS/qubes-issues#10572
For: QubesOS#771
@ben-grande ben-grande force-pushed the preload-fix-vol-size branch from 958dcec to 71d268f Compare January 21, 2026 12:03
@ben-grande
Copy link
Contributor Author

ben-grande commented Jan 21, 2026

openQArun TEST=system_tests_dispvm


Didn't create a gitlab or openqa pipeline.

@ben-grande
Copy link
Contributor Author

ben-grande commented Jan 21, 2026

openQArun TEST=system_tests_dispvm

@ben-grande

This comment was marked as outdated.

@ben-grande

This comment was marked as outdated.

@ben-grande
Copy link
Contributor Author

ben-grande commented Jan 22, 2026

openQArun TEST=system_tests_dispvm,system_tests_network,system_tests_network_ipv6

@ben-grande ben-grande mentioned this pull request Jan 22, 2026
@HW42
Copy link
Contributor

HW42 commented Jan 24, 2026

Merged as a7c6042

I updated the commit message to match the current version. Also dropped the "Not needed to return all volumes" comment (only describes what the code does not why. Also why comment the early return here and not in the other cases).

(As a nitpick on the commits: For a fix like this I would have put the changed test into a separate commit.)

The failures we saw on some openQA are not related to this change as far as I can see.

Merged this now to get the fix to users (although it only affected the testing repo). But what I think we want to address in some future change:

  • Look at the code coverage report. Some of the complaints look like real issues.
  • Add a test that would have caught the regression
  • Don't unnecessarily mix types in volumes_config

@HW42 HW42 closed this Jan 24, 2026
@ben-grande
Copy link
Contributor Author

(As a nitpick on the commits: For a fix like this I would have put the changed test into a separate commit.)

I normally put tests in the same commit, it allows automatic git bisect using run-tests to check. Is there a resource that recommends tests to be in separate commits?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Preloaded disposable qubes keep recreating instead of being instantly available

3 participants