-
Notifications
You must be signed in to change notification settings - Fork 28
Description
there are at least two bugs here, one much more obvious than the other.. I was trying to tickle a different but I suspect exists, but have been stymied by these for the time being.
I got here via ./propolis-standalone -s ubuntu.22.04.5.toml, ctrl+c, and ./propolis-standalone -r <snapshot_tar> with a file-backed disk and a mem-async disk, but the disk types shouldn't really matter (both are wedged)
issue one is that NVMe devices now have an is_enabled to avoid locking NvmeCtrl to read ctrl.cc on hot paths. that's great, except when we import a device we don't peel out is_enabled again, so the imported controller has cc declaring it's enabled, but is functionally disabled. attempting to submit I/O (when run via propolis-standalone) then gets you nvme reg r/w failure, error: the submission queue specified (4) is invalid, register: IOQueueDoorBells, offset: 4128 in ring_doorbell.
adding a
if ctrl.ctrl.cc.enabled() {
self.is_enabled.store(true, Ordering::Releae);
}right after
propolis/lib/propolis/src/hw/nvme/mod.rs
Line 1371 in fd36368
| ctrl.import(input, self)?; |
gets you past that and onto the second issue, which I've not debugged at all: I/Os seem to just not get acked (maybe not serviced?) on a migrated controller. it may be relevant that I've also got a local patch for const MAX_NUM_QUEUES: usize = 5; but I've not tested. we'll see how cc43ca0 fares!
edit/ps: #966 didn't trip over this because we use state.ctrl.cc.enabled() on the admin SQ/CQ doorbells, but self.is_enabled on the I/O doorbells, and the rudimentary driver there doesn't actually do I/O. so.. probably worth making it do some I/Os..