Skip to content

Conversation

@smklein
Copy link
Collaborator

@smklein smklein commented Dec 17, 2025

Adds a handful of omdb commands to support bundle collection.

Additionally, expands support bundle integration testing to ensure that commands are
actually valid omdb subcommands, even if they aren't executing against a real system.

Fixes #9509

Comment on lines +72 to +75
pub async fn collect(
collection: &BundleCollection,
dir: &Utf8Path,
) -> anyhow::Result<CollectionStepOutput> {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command is currently being executed unconditionally; I'm interested in tweaking https://github.com/oxidecomputer/omicron/blob/main/nexus/src/app/background/tasks/support_bundle/request.rs#L21-L51 but was interested in feedback here...

  • I could add a category for "omdb" commands. That seems easy. But the downside here is that "omdb commands can check a lot of different data"?
  • So the alternative would be: Should I be changing the set of sub-commands we execute here, based on the BundleRequest? e.g.: should we have a category for "Nexus" (including background tasks, quiescing state, sagas), a category for "Updates" (including mgs-updates, update-status) and continue to have other commands be controlled by other more specific bundle requests?

TL;DR:

  • I could have this be lumped under the broad "omdb" category
  • Or I could split it up into different categories, depending on "what happens to be queried", and let the fact that it's queried by omdb be kinda an implementation detail

@smklein smklein marked this pull request as ready for review December 17, 2025 22:21
[[profile.default.scripts]]
# Build omdb for usage within Nexus integration tests.
# This was initially added for the support bundle integration tests.
filter = 'rdeps(nexus-test-utils) - package(omicron-dev) - package(omicron-live-tests)'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I don't have access to an omdb binary, it's pretty hard to validate that the commands issues by the support bundle are actually valid.

.clone(),
};

// Configure the omdb binary path for tests.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a little sketchy to me - it depends on the changes I made in config/nextest.toml - so if there's something more resilient I could / should be doing here, I'm open to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on using this -dup pattern? Basically build another copy of omdb for use within nexus's integration tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option, and maybe the one we use more commonly, is to make the relevant tests live in the omdb crate.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer the -dup option (will look into it) - this is one spot Nexus calls omdb, but now that Nexus expects to "Find omdb at a path, and take that as input argument" it seems totally reasonable for Nexus tests to depend on omdb.

The reverse seems like a mess - omdb depending on the entirety of Nexus' test utils seems like a really heavy dependency for omdb to add?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that does seem to be a pretty large dependency. That's kind of why we did this -dup thing for reconfigurator-cli, because most tests were purely in memory and we didn't want to have to compile all the db code each iteration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that does seem to be a pretty large dependency. That's kind of why we did this -dup thing for reconfigurator-cli, because most tests were purely in memory and we didn't want to have to compile all the db code each iteration.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a shot in adc037a

Basically:

  • Split up omdb into library/binary
  • Adds omdb-dup as a binary in nexus
  • Find the path to it in the nexus_test macro
  • Remove all the nextest scripts / workspace-based path navigation I was doing previously

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is breaking ls-apis for a couple reasons:

  • I implemented adc037a by creating an omdb-dup binary in the nexus crate, which depends on the omdb crate as a dependency
  • This creates a problematic API dependency:
Nexus Internal Lockstep API (client: nexus-lockstep-client)
    consumed by: omicron-nexus (omicron/nexus) via 1 path
        error: unexpected dependency
    consumed by: omicron-sled-agent (omicron/sled-agent) via 1 path
        status: expected, reason: Only RSS uses this API and it is not part of upgrade

I can't just make the omdb dependency a dev-dependency, because I don't think I have a way to build it as a "dev-dependency [[bin]]" target. I think nexus-reconfigurator-cli-integration-tests gets around this by being built as a totally distinct crate.

I think I might need to do the same here - build all nexus integration tests as their own crate to make this work?


files_checked += 1;

// Validate that the omdb command is valid, even if it can't connect
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I validated this manually - typo-ing the spelling of omdb nexus as omdb nah from the support bundle; I see this as an output, and the assertion gets tripped

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support Bundle: More omdb-related stuff

3 participants