-
Notifications
You must be signed in to change notification settings - Fork 177
POC/WIP: Report edge node hardware inventory #5358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC/WIP: Report edge node hardware inventory #5358
Conversation
- use eve-api from https://github.com/christoph-zededa/eve-api/tree/inventory_9000 - use ghw library from https://github.com/christoph-zededa/ghw Signed-off-by: Christoph Ostarek <christoph@zededa.com>
1. as a separate inventory message before /register 2. as part of hwinfo message it reports from: 1. ghw library 2. spec.sh this does not make sense. only one way is needed, but for demonstration it is implemented like this Signed-off-by: Christoph Ostarek <christoph@zededa.com>
| "github.com/lf-edge/eve/pkg/pillar/cmd/inventory" | ||
| ) | ||
|
|
||
| func main() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @eriknordmark that its better to use Zedagent, instead of a new microservice. We can use hwinfo protobuf message and API endpoint to send this to the controller.
@christoph-zededa your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's why I implemented both ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This proof-of-concept PR introduces hardware inventory reporting capabilities to EVE, showcasing two approaches for reporting node hardware information to the controller: as a separate pre-registration inventory message and as part of the hwinfo message. The implementation collects hardware details using both the ghw library and spec.sh script.
Key changes:
- Adds a new
inventoryagent that reports hardware information before device onboarding - Extends hardware info messages with PCI/USB device inventory from ghw library
- Refactors server configuration file access into centralized helper functions
Reviewed Changes
Copilot reviewed 23 out of 196 changed files in this pull request and generated 19 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/pillar/cmd/inventory/inventory.go | New inventory agent implementation with hardware collection and reporting |
| pkg/pillar/hardware/inventory.go | Hardware inventory collection utilities using ghw library |
| pkg/pillar/types/locationconsts.go | Refactored server file access into reusable helper functions |
| pkg/pillar/cmd/zedagent/hardwareinfo.go | Integrated inventory info into existing hardware info messages |
| pkg/pillar/go.mod | Updated dependencies and added local module replacements |
| pkg/pillar/types/global.go | Added Failure() method to SenderStatus |
| pkg/pillar/scripts/device-steps.sh | Added inventory to the list of managed agents |
| Multiple cmd files | Migrated to use new centralized server file access helpers |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| replace github.com/jaypipes/ghw => /home/christoph/projects/ghw |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Local file system paths in go.mod replace directives will break builds on other machines. These should be removed before merging or use relative paths if needed for development.
| replace github.com/jaypipes/ghw => /home/christoph/projects/ghw |
| fmt.Fprintf(os.Stderr, "BBBBB FillPCI\n") | ||
| errs["PCI"] = imc.fillPCI() | ||
| fmt.Fprintf(os.Stderr, "BBBBB FillUSB\n") | ||
| errs["USB"] = imc.fillUSB() | ||
| fmt.Fprintf(os.Stderr, "BBBBB FillUSB done\n") |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statements using fmt.Fprintf to stderr should be removed or replaced with proper logging before merging to production.
| } | ||
|
|
||
| if err != nil { | ||
| fmt.Fprintf(os.Stderr, "BBBBB AddInventoryInfo err: %+v\n", err) |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statement should be removed or replaced with proper logging.
| fmt.Fprintf(os.Stderr, "BBBBB AddInventoryInfo\n") | ||
| err = hardware.AddInventoryInfo(hwInfo) | ||
| if err != nil { | ||
| log.Warnf("could not add inventory info: %v", err) | ||
| } | ||
| fmt.Fprintf(os.Stderr, "BBBBB AddInventoryInfo done\n") |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statements should be removed or replaced with proper logging.
| } else { | ||
| fmt.Fprintf(os.Stderr, "BBBBB I don't know\n") | ||
| } |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debug print statement with unclear message should be removed or replaced with proper error handling and logging.
| Iteration: 0, | ||
| AllowProxy: true, | ||
| }) | ||
| fmt.Fprintf(os.Stderr, "AAAAA sending to %s: %v / %v\n", inventoryURL.String(), rv, err) |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple debug print statements throughout the upload() and process() functions should be removed or replaced with proper logging.
| watches = append(watches, pubsub.ChannelWatch{ | ||
| Chan: reflect.ValueOf(stillRunning.C), | ||
| Callback: func(_ interface{}) (exit bool) { | ||
| fmt.Fprintf(os.Stderr, "AAAAAA still running\n") |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple debug print statements throughout the upload() and process() functions should be removed or replaced with proper logging.
| fmt.Fprintf(os.Stderr, "AAAAAA still running\n") | |
| log.Debug("inventoryReporter: still running") |
| var id uint32 | ||
| _, err := fmt.Sscanf(idString, "0x%x", &id) | ||
| if err != nil { | ||
| log.Warnf("could not decode id '%s': %+v", idString, err) |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The pciHexToUint32 function silently returns 0 on errors after logging warnings. Consider whether returning an error would be more appropriate for the caller to handle failures explicitly.
| } | ||
| bs, err := hex.DecodeString(idString) | ||
| if err != nil { | ||
| log.Warnf("could not decode id '%s': %+v", idString, err) |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The pciHexToUint32 function silently returns 0 on errors after logging warnings. Consider whether returning an error would be more appropriate for the caller to handle failures explicitly.
|
|
||
| _, err := fmt.Sscanf(idString, "%x", &id) | ||
| if err != nil { | ||
| log.Warnf("could not decode id '%s': %+v", idString, err) |
Copilot
AI
Nov 6, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The pciHexToUint32 function silently returns 0 on errors after logging warnings. Consider whether returning an error would be more appropriate for the caller to handle failures explicitly.
| return serverFileName | ||
| } | ||
|
|
||
| func Server() (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very confusing to hide this in here - this is for constants and not functions which can return different output over time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW I'm not a fan of what I call "shallow abstractions" in particular when the names are not very precise.
Introducing a ReadServerName() (and potentially WriteServerName()) functions might make sense, but they definitely don't belong in this file/package.
| Warnf(format string, args ...interface{}) | ||
| } | ||
|
|
||
| func WaitServer(log errAndWarnLogger, stillRunning func()) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a set of WaitFor* functions in a utils package,
It makes sense to add a WaitForServerName() function there (and we are waiting for the name and not for the server!
| } | ||
|
|
||
| func WriteServer(newServer string) error { | ||
| if err := os.WriteFile(ServerFileName(), []byte(newServer), 0644); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you using this?
It can't work for those reasons:
- /config is an overlay mount i.e., memory. Thus any changes are lost on reboot. (see eve config mount command)
- To be robust against random power outages any changes to files on disk should use WriteRename, and the reader might need to look for a backup file in case the new file is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
before, it was here: de8ebaf#diff-0da84ad72bb4fc00d50ec24d823225e04bf9f273d9bd7a613393d9b1f3bb5540L109
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be robust against random power outages any changes to files on disk should use WriteRename, and the reader might need to look for a backup file in case the new file is empty.
I guess I need to update
eve/pkg/pillar/cmd/monitor/monitor.go
Line 88 in 9488a0b
| if err := os.WriteFile(tempServerFile, []byte(newServer), 0644); err != nil { |
WriteRename. @christoph-zededa to avoid code duplication we can move this function to pillar/utils. My implementation mounts a real /config partition to update the file
| { | ||
| "arch": $ARCH, | ||
| "productURL": "$(cat /persist/status/hardwaremodel || cat /config/hardwaremodel)", | ||
| "productURL": "$(cat /persist/status/hardwaremodel 2> /dev/null || cat /config/hardwaremodel 2> /dev/null || echo 'not found' )", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect both the old and new code can product invalid URLs - in particular spaces need to be escaped in URLs.
Instead of tweaking this, isn't it better to just drop the productURL from the output in all cases?
| @@ -0,0 +1,37 @@ | |||
| package main | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing copyright and license statement.
| if devUUID == status.DeviceUUID { | ||
| return | ||
| } | ||
| triggerPublishHwInfo(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a triggerPublishAllInfo below for the cases when it is safe to publish. Why doesn't that publish the HwInfo? It should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this is unnecessary. I removed it and tested with Eden - ZInfoHardware is sent after onboarding and on ever reboot - exactly what we need.
eriknordmark
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you start by creating a draft PR for eve-api?
I can't infer that from the vendored in go files from eve-api.
|
@christoph-zededa I removed the separate |
|
superseeded by #5535 |
Description
This PR is just a showcase to illustrate several ways of reporting hardware information of the node to the controller.
It does not make sense to merge this PR into main.
pillar: report inventory
it reports from: