Contributor

rainest commented Aug 20, 2025

Add the System.Actions field and populate it with all possible actions during fake discovery. This allows PCS to attempt those actions.

Fake discovery does not directly expose the System struct, so there wasn't an existing place to slot this into config. We'll probably need to revisit that eventually: there are many other fields in SMD/schemas types that aren't yet in the fake discovery type, and I expect there are more unexpected behaviors when they have zero values.

Providing a permissive default on the SMD side is probably good enough for fake discovery, since you can just ignore (actually) unsupported actions by not requesting PCS perform them.
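As a rough illustration of the permissive default described above, here is a minimal Go sketch. The `System` struct is a trimmed, hypothetical stand-in for the fake discovery type, and `newFakeSystem` and `allResetTypes` are illustrative names, not the identifiers used in this PR. The JSON field names and the list of reset types are taken from the request body logged later in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// allResetTypes lists every action the fake discovery default advertises.
// The values mirror the "actions" array in the request body logged in this
// thread; the variable name itself is hypothetical.
var allResetTypes = []string{
	"On", "ForceOff", "GracefulShutdown", "GracefulRestart",
	"ForceRestart", "Nmi", "ForceOn", "PushPowerButton",
	"PowerCycle", "Suspend", "Pause", "Resume",
}

// System is a hypothetical, reduced stand-in for the shadow type discussed
// in this PR. Only the fields needed for the example are included.
type System struct {
	URI     string   `json:"uri"`
	Name    string   `json:"name"`
	Actions []string `json:"actions"`
}

// newFakeSystem builds a System whose Actions field allows every reset type.
// The slice is copied so callers cannot mutate the shared default.
func newFakeSystem(uri, name string) System {
	actions := make([]string, len(allResetTypes))
	copy(actions, allResetTypes)
	return System{URI: uri, Name: name, Actions: actions}
}

func main() {
	s := newFakeSystem(
		"https://demo.openchami.cluster:8443/redfish/v1/Systems/x3001c1s7b70n0",
		"re99",
	)
	out, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the default is permissive, a caller that knows an action is not really supported can simply avoid requesting it from PCS.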

Downstream, this looks like:

  1. PCS's SMD client pulls allowable actions from the associated SMD Redfish info type.
  2. The power status component tracker reworks this into an intermediate type, transforming Redfish power actions into PCS transition types.
  3. The transition runner rejects transitions if they're not in the supported list.

Collaborator

synackd left a comment

I think this looks good, just need to update some comments. I also still need to test this.

Comment on lines 93 to 94
// This struct is a partially duplicated copy of the schemas InventoryDetail type, originally from SMD:
// https://github.com/OpenCHAMI/schemas/blob/9aad17a286c405e1d75463298ec8db553cc4ca12/schemas/inventory.go#L61-L88
// smd.parseRedfishEndpointDataV2 unmarshals the Systems field of a RedfishEndpoint request into one:
// https://github.com/OpenCHAMI/smd/blob/f07680ffb6c41e75945bc32bc7ba948a56afe2e5/cmd/smd/smd-api.go#L2826-L2842
// SMD will have zero values for any fields this currently lacks, and will have unexpected behavior when it tries to
// use them later.

// System represents data that would be retrieved from BMC System data, except
// reduced to a minimum needed for discovery.
Collaborator

The merge of OpenCHAMI/smd#74 and #48 will require the first part of this comment to be revised. The definition will be here (this is a branch so the URI will need to be changed when merged).

Also, this comment block should be connected to the struct comment below it to indicate that the comment describes the struct. While we're at it, could we limit the line width to 80 characters to match the other comment lines? E.g.

// System represents data that would be retrieved from BMC System data, except
// reduced to a minimum needed for discovery. It is a partially duplicated copy
// of ...

Contributor Author

We're just gonna toss these local definitions then? #48 doesn't do that yet, but it looks like the end expectation is we just import the SMD types, and tossing this entirely avoids git confusion.

Collaborator

I think that is the end goal. Looks like #48 needs to be rebased, so we don't have to revise the comment in this PR and instead revise it in that one (which probably makes sense scope-wise). My main concern with my review comment here is the second part of it, which is to merge this comment to the struct comment. With that change, I think we can resolve this.

Contributor Author

On a second read I think you maybe just meant the package path.

To clarify, will we still have a separate ochami/pkg/client/smd.System shadow type, or will ochami CLI code use smd/v2/pkg/schemas.InventoryDetail directly? If it's the latter we don't need this comment at all.

If it's the former we should still have it, and ideally use a tag instead of a commit. Do you know how soon we plan to have a release with OpenCHAMI/smd#74 ?

I'd intentionally kept it separate to not include it in the docstring, since it is, for lack of better description, messy--it's halfway to being a TODO to stop using the shadow type if we keep having to duplicate fields into it.

Collaborator

> On a second read I think you maybe just meant the package path.

Yep.

> To clarify, will we still have a separate ochami/pkg/client/smd.System shadow type, or will ochami CLI code use smd/v2/pkg/schemas.InventoryDetail directly? If it's the latter we don't need this comment at all.
>
> If it's the former we should still have it, and ideally use a tag instead of a commit.

The goal is to move entirely to a package within SMD and move away from the schemas repo completely. That way, SMD owns its own schema and no shadow types are needed. The shadow types were created because a while back it looked like we might transition everything to a schemas repo and the current schemas repo lacked a lot of things the ochami CLI needed, but the paradigm has since shifted.

> Do you know how soon we plan to have a release with OpenCHAMI/smd#74 ?

I don't really have an accurate gauge. I think we should merge this soon, so I'm inclined to agree that we can keep a TODO comment around for switching over to the SMD package once it gets merged, at which point in time we can get rid of the SMD shadow types.

> I'd intentionally kept it separate to not include it in the docstring, since it is, for lack of better description, messy--it's halfway to being a TODO to stop using the shadow type if we keep having to duplicate fields into it.

That's fair. Since a transition to an SMD client package is underway, I don't think this should be a blocker. If you want to add the TODO and then comment here (so I'm notified) and resolve this, we can move forward with this PR.

rainest force-pushed the rainest/power-action branch 2 times, most recently from 87882ce to c495ae7, August 27, 2025 22:29
rainest requested a review from synackd August 27, 2025 22:29
Collaborator

synackd commented Aug 28, 2025

Do you have instructions I can test this with? Are you using a PCS quadlet or similar deployment?

synackd added the needs testing label Aug 28, 2025
Contributor Author

rainest commented Aug 29, 2025

> Do you have instructions I can test this with? Are you using a PCS quadlet or similar deployment?

The simple option is we simply ignore the E2E testing here, it's just aligning discover static ComponentEndpoints with ones created by Magellan!

IDK how we want to handle that more broadly for this repo--I wouldn't want it to be responsible for E2E tests anyway, but it could maybe do single-service tests for "the actual service API accepts this request". We'd need to build out more SMD mocks for things that aren't SMD though.

The E2E answer is sorta yes, in that I have something that works, but it's still not in release-ready form, and this is one of the chicken-egg problem pieces to getting it there. I can invite you to my sandbox repo. The PCS image currently used by that has hacks to deal with #51 and to avoid using Vault.

Collaborator

synackd commented Sep 2, 2025

> > Do you have instructions I can test this with? Are you using a PCS quadlet or similar deployment?
>
> The simple option is we simply ignore the E2E testing here, it's just aligning discover static ComponentEndpoints with ones created by Magellan!

I should clarify: I'm not currently running PCS on my test systems and so haven't interacted with it much (we should coordinate on adding it to the release repo at some point). I more wanted some guidance on how to run PCS and run the ochami CLI to test these changes. Are you running the quadlet deployment (e.g. from the release repo) somewhere or a docker-compose deployment from the deployment-recipes repo?

> IDK how we want to handle that more broadly for this repo--I wouldn't want it to be responsible for E2E tests anyway, but it could maybe do single-service tests for "the actual service API accepts this request". We'd need to build out more SMD mocks for things that aren't SMD though.

We do need to figure out how to add integration tests at some point. I considered doing SMD mocking, and maybe adding an SMD client package/schema would prevent us from having to ensure the mocks keep up with the SMD API. However, it might be worth a wider discussion on how to go about this org-wide.

> The E2E answer is sorta yes, in that I have something that works, but it's still not in release-ready form, and this is one of the chicken-egg problem pieces to getting it there. I can invite you to my sandbox repo. The PCS image currently used by that has hacks to deal with #51 and to avoid using Vault.

Sure, we can communicate offline about this.

Add the System.Actions field and populate it with all possible actions
during fake discovery. PCS rejects transitions if SMD doesn't recognize
the action for a given system, so this provides a permissive default.

Signed-off-by: Travis Raines <571832+rainest@users.noreply.github.com>
rainest force-pushed the rainest/power-action branch from c495ae7 to 980850b, September 17, 2025 21:33
Contributor Author

rainest commented Sep 17, 2025

Force-pushed and squashed the review fixes to get past a phantom conflict (I do love when git says there's a conflict for lines that no longer exist in the changeset and forces you to fix it twice before squashing makes it irrelevant 😐 )

Collaborator

synackd commented Sep 18, 2025

In my environment, I haven't been able to test performing transitions in PCS after the nodes are discovered, but I can verify that they are being sent:

2025-09-18T15:51:12-06:00 INF http.go:124 > Response status: HTTP/2.0 201 Created
2025-09-18T15:51:12-06:00 DBG client.go:245 > POST: https://demo.openchami.cluster:8443/hsm/v2/Inventory/RedfishEndpoints
2025-09-18T15:51:12-06:00 DBG client.go:266 > Request headers:
2025-09-18T15:51:12-06:00 DBG client.go:268 >   User-Agent: [ochami/v0.5.3]
2025-09-18T15:51:12-06:00 DBG client.go:268 >   Authorization: [Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjdiMTlhY2E4LWNjMTAtNDA1ZC1hYzkyLTdkYTMwYTllY2U4ZSIsInR5cCI6IkpXVCJ9.eyJhdWQiOltdLCJjbGllbnRfaWQiOiJjY2YxZGYxYS01YWU0LTQ3NGQtYjlmYS05NzdhMzA0ZTRlNzYiLCJleHAiOjE3NTgyMzU4NjYsImV4dCI6e30sImlhdCI6MTc1ODIzMjI2NiwiaXNzIjoiaHR0cHM6Ly9yZWRvbmRvLnVzcmMiLCJqdGkiOiJlNGEzYjY4Yi1iODY0LTQ2YmEtOWFmNi1jN2NhYmQwZTQyZDEiLCJuYmYiOjE3NTgyMzIyNjYsInNjcCI6WyJvcGVuaWQiLCJzbWQucmVhZCJdLCJzdWIiOiJjY2YxZGYxYS01YWU0LTQ3NGQtYjlmYS05NzdhMzA0ZTRlNzYifQ.nOeTQYY42tr_QJlGjXZuUgF5Cc1vD5Pd835PPnnasv1TcVEIwRf3R9iMo22tQAJOdsU0DO1YYooxwBLGk53doiVvJKuxkhAkdo2OD4bHW09L4QjHuzddKJO320D_v_6cYiQSXa3cCpaztF9RVinvXrVKgc5W-IZHSdemTIo2GOsnSlFroQHFwphidTrxYEglwwmrj1ct-heCz8p5e-3H-sGd0GArRpbsVw1Xz2ewPvaozJ25qLAOKM8I2QEBysemrSM3udf8nu_u8unvXxYaR3EGMskmyKM_-eHvHR3aWcsj6Lojji1Ojv1X-sItDaz5lJcNn0SNt2aG6QU7KXonZT6GBrcnFdxvhfNNqBInV3ELZ-SG3sCKxJarRFQzCsY2OSlKRiRtB8mSZcMNNWhBws0VS0tY8V7GU4qjSwcL8o-nPg70HnQy8_TEHpB1DHsb4HErf2cOTb0OXr5l5YHwWkXsm0DELLBOOTJaDdvLUwWJuCYgDrD2uvUAEpY-vbsR5UMrquw3YV46K7on-EhalxL25TsMsQUXYtUQbzaQqTAkElJkSfSndoWUxSqwlQtSV0tDjCRGByXVFJp784kM4rwnj3BZQssiztGB8Ftmw_ZOtzhPvWL_z3ZNqr9dEe_-4xsOdl5mGROWPs172L6pkSkabVjka0WgrgQtW9eXeEo]
2025-09-18T15:51:12-06:00 DBG client.go:274 > Request body:
2025-09-18T15:51:12-06:00 DBG client.go:275 > {"ID":"x3001c1s7b70","Type":"NodeBMC","Name":"re99","UUID":"4f5d0f05-a42c-4f29-a8ec-2dd48e3d4db3","MACAddr":"d0:94:66:6b:48:22","IPAddress":"172.16.0.199","DiscoveryInfo":{"LastAttempt":"0001-01-01T00:00:00Z"},"SchemaVersion":1,"Systems":[{"uri":"https://demo.openchami.cluster:8443/redfish/v1/Systems/x3001c1s7b70n0","uuid":"dfbf86d2-4e22-4277-b22f-2443b12452d4","name":"re99","ethernet_interfaces":[{"mac":"d0:94:66:6b:48:28","ip":"172.16.0.99","name":"x3001c1s7b70n0","description":"Interface 0 for re99"}],"actions":["On","ForceOff","GracefulShutdown","GracefulRestart","ForceRestart","Nmi","ForceOn","PushPowerButton","PowerCycle","Suspend","Pause","Resume"]}],"Managers":[{"uri":"https://demo.openchami.cluster:8443/redfish/v1/Managers/x3001c1s7b70","uuid":"4f5d0f05-a42c-4f29-a8ec-2dd48e3d4db3","name":"x3001c1s7b70","ethernet_interfaces":[{"mac":"d0:94:66:6b:48:22","ip":"172.16.0.199","name":"x3001c1s7b70","description":"Interface for BMC x3001c1s7b70"}],"actions":null,"description":"","type":"NodeBMC"}]}
2025-09-18T15:51:12-06:00 DBG client.go:288 > Response status: 201 Created
2025-09-18T15:51:12-06:00 DBG client.go:290 > Response headers:
2025-09-18T15:51:12-06:00 DBG client.go:292 >   Content-Type: [application/json]
2025-09-18T15:51:12-06:00 DBG client.go:292 >   Location: [/hsm/v2/Inventory/RedfishEndpoints/x3001c1s7b70]
2025-09-18T15:51:12-06:00 DBG client.go:292 >   Date: [Thu, 18 Sep 2025 21:51:12 GMT]
2025-09-18T15:51:12-06:00 DBG client.go:292 >   Content-Length: [60]
2025-09-18T15:51:12-06:00 DBG client.go:305 > Response body:
2025-09-18T15:51:12-06:00 DBG client.go:306 > [{"URI":"/hsm/v2/Inventory/RedfishEndpoints/x3001c1s7b70"}]

2025-09-18T15:51:12-06:00 INF http.go:124 > Response status: HTTP/2.0 201 Created

@rainest I don't want complications in my environment to be a blocker. Have you tested this to work on your end?

Contributor Author

rainest commented Sep 18, 2025

It did indeed reboot the machine afterwards in my environment.

In the strictest sense the supported list check happens only at transition creation time, and actually sending them happens after. That's kinda pedantic, but you should be able to check that PCS doesn't accept any transition creates without this, and that it does accept them after (even if the BMC's not reachable).

Collaborator

synackd left a comment

I see that the transitions get added to SMD:

hmsds=> SELECT * FROM comp_endpoints where id='x3000c0s0b0n0';
-[ RECORD 1 ]---+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id              | x3000c0s0b0n0
type            | Node
domain          | 
redfish_type    | ComputerSystem
redfish_subtype | 
rf_endpoint_id  | x3000c0s0b0
mac             | 
uuid            | e3da88d9-993b-4e51-bd97-c278e9f33ef6
odata_id        | /redfish/v1/Systems/x3000c0s0b0n0
component_info  | {"Name":"re01","Actions":{"#ComputerSystem.Reset":{"ResetType@Redfish.AllowableValues":["On","ForceOff","GracefulShutdown","GracefulRestart","ForceRestart","Nmi","ForceOn","PushPowerButton","PowerCycle","Suspend","Pause","Resume"],"@Redfish.ActionInfo":"/redfish/v1/Systems/x3000c0s0b0n0/ResetActionInfo","target":"/redfish/v1/Systems/x3000c0s0b0n0/Actions/ComputerSystem.Reset"}},"EthernetNICInfo":[{"RedfishId":"","@odata.id":"","Description":"Interface 0 for re01","InterfaceEnabled":true,"MACAddress":"5c:ed:8c:21:ac:08"}]}

and @rainest has confirmed this has worked in his setup. This LGTM and rather than delay this any further waiting for my own PCS setup to be complete, I stamp this with my approval.
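For anyone who wants to check the stored record programmatically rather than by eye, the following Go sketch extracts the allowable reset types from a `component_info` value shaped like the one in the query output above. The struct and function names are hypothetical, and the sample JSON in `main` is a trimmed copy of that record, not a full one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resetAction models only the parts of the "#ComputerSystem.Reset" entry that
// this example needs. The type name is hypothetical.
type resetAction struct {
	AllowableValues []string `json:"ResetType@Redfish.AllowableValues"`
	Target          string   `json:"target"`
}

// componentInfo models a subset of the component_info column.
type componentInfo struct {
	Name    string                 `json:"Name"`
	Actions map[string]resetAction `json:"Actions"`
}

// allowableResetTypes returns the reset types listed for the system, or an
// empty result if the record has no "#ComputerSystem.Reset" action.
func allowableResetTypes(raw []byte) ([]string, error) {
	var ci componentInfo
	if err := json.Unmarshal(raw, &ci); err != nil {
		return nil, err
	}
	action, ok := ci.Actions["#ComputerSystem.Reset"]
	if !ok {
		return nil, nil
	}
	return action.AllowableValues, nil
}

func main() {
	// Trimmed sample based on the record shown in this thread.
	raw := []byte(`{"Name":"re01","Actions":{"#ComputerSystem.Reset":{` +
		`"ResetType@Redfish.AllowableValues":["On","ForceOff","GracefulRestart"],` +
		`"target":"/redfish/v1/Systems/x3000c0s0b0n0/Actions/ComputerSystem.Reset"}}}`)

	values, err := allowableResetTypes(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(values)
}
```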

synackd merged commit 1219bee into main Sep 19, 2025
9 checks passed
synackd deleted the rainest/power-action branch September 19, 2025 22:55
synackd removed the needs testing label Oct 6, 2025