Drop `as` conversion in inventory #9541

labbott · 2025-12-18T17:01:30Z

as will silently succed which is not what we want

andrewjstone

Looks good. Thanks for the fix!

davepacheco · 2025-12-18T18:09:19Z

Definitely like trading as for an explicit conversion, but should we not be handling the error? Why is it okay to unwrap() here?

labbott · 2025-12-18T19:46:15Z

Definitely like trading as for an explicit conversion, but should we not be handling the error? Why is it okay to unwrap() here?

This came out of a comment from @andrewjstone #9524 (comment)

To actually hit this panic, we'd either have to have a zone manifest from a sled return a wildly large size from one part of the databse query or the other

omicron/nexus/db-queries/src/db/datastore/inventory.rs

Lines 245 to 267 in 06c0808

    
           // Pull zone manifest zones out of all sled agents. 
        
           let zone_manifest_zones: Vec<_> = collection 
        
               .sled_agents 
        
               .iter() 
        
               .filter_map(|sled_agent| { 
        
                   sled_agent 
        
                       .zone_image_resolver 
        
                       .zone_manifest 
        
                       .boot_inventory 
        
                       .as_ref() 
        
                       .ok() 
        
                       .map(|artifacts| { 
        
                           artifacts.artifacts.iter().map(|artifact| { 
        
                               InvZoneManifestZone::new( 
        
                                   collection_id, 
        
                                   sled_agent.sled_id, 
        
                                   artifact, 
        
                               ) 
        
                           }) 
        
                       }) 
        
               }) 
        
               .flatten() 
        
               .collect();

omicron/nexus/db-queries/src/db/datastore/inventory.rs

Lines 3601 to 3641 in 06c0808

    
           // Load zone_manifest_zone rows. 
        
           let mut zone_manifest_artifacts_by_sled_id = { 
        
               use nexus_db_schema::schema::inv_zone_manifest_zone::dsl; 
        
               let mut by_sled_id: BTreeMap< 
        
                   SledUuid, 
        
                   IdOrdMap<ZoneArtifactInventory>, 
        
               > = BTreeMap::new(); 
        
               let mut paginator = Paginator::new( 
        
                   batch_size, 
        
                   dropshot::PaginationOrder::Ascending, 
        
               ); 
        
               while let Some(p) = paginator.next() { 
        
                   let batch = paginated_multicolumn( 
        
                       dsl::inv_zone_manifest_zone, 
        
                       (dsl::sled_id, dsl::zone_file_name), 
        
                       &p.current_pagparams(), 
        
                   ) 
        
                   .filter(dsl::inv_collection_id.eq(db_id)) 
        
                   .select(InvZoneManifestZone::as_select()) 
        
                   .load_async(&*conn) 
        
                   .await 
        
                   .map_err(|e| { 
        
                       public_error_from_diesel(e, ErrorHandler::Server) 
        
                   })?; 
        
                   paginator = p.found_batch(&batch, &|row| { 
        
                       (row.sled_id, row.zone_file_name.clone()) 
        
                   }); 
        
                   for row in batch { 
        
                       by_sled_id 
        
                           .entry(row.sled_id.into()) 
        
                           .or_default() 
        
                           .insert_unique(row.into()) 
        
                           .expect("database ensures the row is unique"); 
        
                   } 
        
               } 
        
               by_sled_id 
        
           };

we could either say this is unlikely enough to happen or I could attempt to do something like

omicron/nexus/db-queries/src/db/datastore/inventory.rs

Lines 166 to 184 in 06c0808

    
                   let error_values = collection 
        
                       .errors 
        
                       .iter() 
        
                       .enumerate() 
        
                       .map(|(i, message)| { 
        
                           let index = u16::try_from(i).map_err(|e| { 
        
                               Error::internal_error(&format!( 
        
                                   "failed to convert error index to u16 (too \ 
        
                                   many errors in inventory collection?): {}", 
        
                                   e 
        
                               )) 
        
                           })?; 
        
                           Ok(InvCollectionError::new( 
        
                               collection_id, 
        
                               index, 
        
                               message.clone(), 
        
                           )) 
        
                       }) 
        
                       .collect::<Result<Vec<_>, Error>>()?;

and return an internal error?

davepacheco · 2025-12-18T20:29:43Z

That's what I would do (InternalError), but I realize that will require changing at least this function to be fallible and I don't know how big the blast radius is. But in general it doesn't seem great if reading bad data from the database can result in a panic. (Could uploading a TUF repo with bad metadata also trigger this?)

labbott · 2025-12-18T21:12:09Z

That's what I would do (InternalError), but I realize that will require changing at least this function to be fallible and I don't know how big the blast radius is. But in general it doesn't seem great if reading bad data from the database can result in a panic. (Could uploading a TUF repo with bad metadata also trigger this?)

I think you're correct that a bad manifest would potentially trigger this.

I think both of these functions are already fallible so I don't think that would be extra work. I am going to raise the question if we don't just want to unwrap if returning an internal error is better for our purposes than the current behavior we get with as.

andrewjstone · 2025-12-18T21:20:03Z

That's what I would do (InternalError), but I realize that will require changing at least this function to be fallible and I don't know how big the blast radius is. But in general it doesn't seem great if reading bad data from the database can result in a panic. (Could uploading a TUF repo with bad metadata also trigger this?)

I think you're correct that a bad manifest would potentially trigger this.

I think both of these functions are already fallible so I don't think that would be extra work. I am going to raise the question if we don't just want to unwrap if returning an internal error is better for our purposes than the current behavior we get with as.

I think it's definitely better than using as. We could accidentally end up with negative sizes otherwise or extremely large ones depending upon conversion direction. We'd want to know if the values are incorrect.

I think in order of best to worst the preference should be:

return an error, panic, use as

And we should only panic if the data is untrusted, meaning it comes from code that generates it and not based on user input.

andrewjstone · 2025-12-18T21:22:52Z

That's what I would do (InternalError), but I realize that will require changing at least this function to be fallible and I don't know how big the blast radius is. But in general it doesn't seem great if reading bad data from the database can result in a panic. (Could uploading a TUF repo with bad metadata also trigger this?)

I think you're correct that a bad manifest would potentially trigger this.
I think both of these functions are already fallible so I don't think that would be extra work. I am going to raise the question if we don't just want to unwrap if returning an internal error is better for our purposes than the current behavior we get with as.

I think it's definitely better than using as. We could accidentally end up with negative sizes otherwise or extremely large ones depending upon conversion direction. We'd want to know if the values are incorrect.

I think in order of best to worst the preference should be:

return an error, panic, use as

And we should only panic if the data is untrusted, meaning it comes from code that generates it and not based on user input.

Also, I had assumed, perhaps incorrectly that this was trusted data, which is why I figured panicking was fine. If not, I'd suggest returning an error if we can. We've definitely seen problems with sizes of things being incorrect and not discovering it for a while after it was silently causing issues. I think this was on disk sizes but I don't recall. For that reason, panicking is better than silently doing the wrong thing, but if we can not do the wrong thing and not crash, we are in a better place.

`as` will silently succed which is not what we want

labbott requested a review from andrewjstone December 18, 2025 17:01

andrewjstone approved these changes Dec 18, 2025

View reviewed changes

labbott enabled auto-merge (squash) December 18, 2025 17:56

labbott disabled auto-merge December 18, 2025 19:29

Drop as conversion in inventory

c0c79cc

`as` will silently succed which is not what we want

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Drop `as` conversion in inventory #9541

Drop `as` conversion in inventory #9541

Uh oh!

labbott commented Dec 18, 2025

Uh oh!

andrewjstone left a comment

Uh oh!

davepacheco commented Dec 18, 2025

Uh oh!

labbott commented Dec 18, 2025

Uh oh!

davepacheco commented Dec 18, 2025

Uh oh!

labbott commented Dec 18, 2025

Uh oh!

andrewjstone commented Dec 18, 2025

Uh oh!

andrewjstone commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Drop as conversion in inventory #9541

Are you sure you want to change the base?

Drop as conversion in inventory #9541

Uh oh!

Conversation

labbott commented Dec 18, 2025

Uh oh!

andrewjstone left a comment

Choose a reason for hiding this comment

Uh oh!

davepacheco commented Dec 18, 2025

Uh oh!

labbott commented Dec 18, 2025

Uh oh!

davepacheco commented Dec 18, 2025

Uh oh!

labbott commented Dec 18, 2025

Uh oh!

andrewjstone commented Dec 18, 2025

Uh oh!

andrewjstone commented Dec 18, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Drop `as` conversion in inventory #9541

Drop `as` conversion in inventory #9541