Add support for dynamic MIG config generation #295
Conversation
This PR eliminates the need to manually maintain static MIG configuration files by generating them at runtime from hardware. Previously, every new MIG-capable GPU required manual updates to `config-default.yaml`, which was time-consuming and required the GPUs to first be available so their MIG profiles could be read. Now, when the `nvidia-mig-manager` systemd service starts (on every node boot), it runs `nvidia-mig-parted generate-config` to query available MIG profiles via NVML and produces a complete configuration file automatically. This includes per-profile configs (e.g., `all-1g.10gb`, `all-7g.80gb`) as well as the `all-balanced` config, with proper device-filter handling for heterogeneous GPU systems. The `k8s-mig-manager` pod also generates this config on startup, writes it to a per-node ConfigMap, and uses it throughout its lifetime instead of requiring `config-default.yaml` to be mounted as a volume.

The implementation adds a new `generate-config` CLI command and introduces two new packages: `pkg/mig/discovery` for querying MIG profiles from hardware using `go-nvlib`, and `pkg/mig/builder` for constructing the mig-parted config spec. The systemd service has been updated to generate the config on every start, falling back to a previously generated config if generation fails.

Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
Some versions of the driver report incorrect MIG profiles for A30 GPUs. The fix will be made on 580 and 590 driver branches, but in the meantime, return a known list of MIG configs when profiles are queried for an A30 GPU. Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
Removed redundant condition in device-filter logic and added detailed comments explaining when device-filter is needed. Also simplify setupMigConfig in mig-manager. Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
* Remove verbose diagnostic messages and redundant echo statements
* Use direct exit code check instead of capturing $? separately
* Add warning when falling back to existing config on generation failure
* Keep error output visible for debugging (no stderr suppression)

The config generation behavior remains the same:
* Generate fresh config from hardware on every boot
* Fall back to existing config.yaml if generation fails
* Exit with error only if no config is available at all

Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
Change app.kubernetes.io/component label value from "mig-manager" to "nvidia-mig-manager" for consistency with other gpu-operator components. Signed-off-by: Rajath Agasthya <ragasthya@nvidia.com>
```go
output, err = json.MarshalIndent(spec, "", " ")
if err != nil {
	return fmt.Errorf("error marshaling MIG config to JSON: %v", err)
}
```
Question -- is there a technical reason for marshaling to json here (at the call site) but not doing the same for the yaml (the marshaling to yaml is not done at the call site)?
I wanted a top level build.GenerateConfigYAML function that I could reference directly in main.go to write the generated file: https://github.com/NVIDIA/mig-parted/pull/295/changes#diff-85fd584658fe1f46bd4d96385d360fa875f8f9bd58b7b1d9e66a44636adafe64R341. Otherwise, I'd have to marshal there too.
Sure, that is fine. Does it make sense to also have a top level equivalent for json, i.e. build.GenerateConfigJSON?
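For context, such a helper could simply mirror the existing YAML one. The sketch below is hypothetical, not code from the PR; the package placement and the `v1.Spec` import path are assumptions:

```go
package builder // hypothetical placement alongside GenerateConfigYAML

import (
	"encoding/json"

	v1 "github.com/NVIDIA/mig-parted/api/spec/v1" // assumed import path
)

// GenerateConfigJSON sketches the reviewer-suggested JSON counterpart to
// GenerateConfigYAML, marshaling the spec the same way the call site does.
func GenerateConfigJSON(spec *v1.Spec) ([]byte, error) {
	return json.MarshalIndent(spec, "", " ")
}
```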
```sh
    fi
}

function maybe_add_config_symlink() {
```
Question -- I forget the meaning of this symlink. Is it relevant for the default config we generate?
I wasn't sure why there was a symlink either. But the code now writes the config directly to config.yaml, so we don't need the symlink.
I think it is fine to remove this code, however we need to update the readme in deployments/systemd, specifically the portion about customizing the default config:
mig-parted/deployments/systemd/README.md, lines 61 to 64 in a5546be:

> Users should only need to customize the `config.yaml` (to add any user-specific
> MIG configurations they would like to apply) and the `hooks.sh` and
> `hooks.yaml` files (to add any user specific services that need to be shutdown
> and restarted when applying a MIG configuration).
Guide to Reviewers
Overview
This PR eliminates the need to manually maintain static MIG configuration files. Instead of shipping a pre-built `config.yaml` that must be updated whenever a new GPU is released, we now generate the configuration at runtime by querying the hardware via NVML.

Before: Manual maintenance of `config-default.yaml` for every new MIG-capable GPU.
After: Config is auto-generated from hardware on every boot/service start.
Architecture
```mermaid
flowchart TB
    subgraph cli [mig-parted CLI]
        GenerateConfig[generate-config command]
    end
    subgraph pkgBuilder [pkg/mig/builder]
        BuildSpec[GenerateConfigSpec]
        BuildYAML[GenerateConfigYAML]
        BuildBalanced[buildAllBalancedConfig]
    end
    subgraph pkgDiscovery [pkg/mig/discovery]
        Discover[DiscoverMIGProfiles]
    end
    subgraph nvml [NVML via go-nvlib]
        VisitDevices[VisitDevices]
        GetMigProfiles[GetMigProfiles]
        GetGpuInstanceProfileInfo[GetGpuInstanceProfileInfo]
    end
    GenerateConfig --> BuildYAML
    BuildYAML --> BuildSpec
    BuildSpec --> Discover
    BuildSpec --> BuildBalanced
    Discover --> VisitDevices
    VisitDevices --> GetMigProfiles
    GetMigProfiles --> GetGpuInstanceProfileInfo
```

Component Changes
1. mig-parted CLI
New command: `nvidia-mig-parted generate-config`

Files:
- `cmd/nvidia-mig-parted/main.go`
- `cmd/nvidia-mig-parted/generateconfig/generate_config.go`
2. Systemd Service
The service now generates a fresh config on every start/boot. If generation fails (e.g., no GPUs, driver not loaded), it falls back to a previously generated config if available.

Files:
- `deployments/systemd/service.sh`
3. k8s-mig-manager (Future Work)
The k8s-mig-manager will:
- generate the config on pod startup,
- write it to a per-node ConfigMap (see the sketch after this list), and
- use the generated config throughout its lifetime instead of requiring `config-default.yaml` to be mounted as a volume.
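A minimal sketch of the ConfigMap step under stated assumptions: client-go is used, and the `namespace`, `nodeName`, and ConfigMap naming scheme below are placeholders. The eventual k8s-mig-manager implementation may differ.

```go
package main

import (
	"context"
	"fmt"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// writeNodeConfigMap stores the generated YAML in a per-node ConfigMap so the
// manager can consume it for the rest of its lifetime.
func writeNodeConfigMap(ctx context.Context, clientset kubernetes.Interface, namespace, nodeName, generatedYAML string) error {
	cm := &corev1.ConfigMap{
		ObjectMeta: metav1.ObjectMeta{
			Name: fmt.Sprintf("mig-parted-config-%s", nodeName), // hypothetical naming scheme
		},
		Data: map[string]string{"config.yaml": generatedYAML},
	}
	_, err := clientset.CoreV1().ConfigMaps(namespace).Create(ctx, cm, metav1.CreateOptions{})
	return err
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	if err := writeNodeConfigMap(context.Background(), clientset, "gpu-operator", "node-a", "version: v1\n"); err != nil {
		log.Fatal(err)
	}
}
```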
NVML Discovery Logic
The discovery layer uses `go-nvlib` to query MIG profiles from hardware.

Flow in pkg/mig/discovery/discovery.go:
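The exact flow isn't quoted here; the sketch below illustrates the calls named in the architecture diagram, assuming a recent go-nvlib where `device.New` takes the NVML handle directly (the constructor signature has varied across go-nvlib versions):

```go
package main

import (
	"fmt"
	"log"

	"github.com/NVIDIA/go-nvlib/pkg/nvlib/device"
	"github.com/NVIDIA/go-nvml/pkg/nvml"
)

func main() {
	nvmllib := nvml.New()
	if ret := nvmllib.Init(); ret != nvml.SUCCESS {
		log.Fatalf("failed to initialize NVML: %v", ret)
	}
	defer nvmllib.Shutdown()

	devlib := device.New(nvmllib)

	// Visit every GPU and list the MIG profiles it supports; go-nvlib
	// resolves profile details via GetGpuInstanceProfileInfo internally.
	err := devlib.VisitDevices(func(i int, d device.Device) error {
		name, _ := d.GetName()
		profiles, err := d.GetMigProfiles()
		if err != nil {
			return fmt.Errorf("device %d: %w", i, err)
		}
		for _, p := range profiles {
			fmt.Printf("GPU %d (%s): %s\n", i, name, p.String())
		}
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```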
Key Types:
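The PR's exact type definitions aren't quoted here; as a rough mental model, hypothetical shapes might look like:

```go
// Hypothetical shapes only; the real types live in pkg/mig/discovery and may
// differ in names and fields.
type DiscoveredProfile struct {
	Name     string // canonical profile name, e.g. "1g.10gb" or "1g.10gb+me"
	MaxCount int    // maximum simultaneous instances of this profile per GPU
}

type DiscoveredGPU struct {
	DeviceID   string // PCI device ID, usable for device-filter entries
	DeviceName string // e.g. "NVIDIA A100-SXM4-80GB"
	Profiles   []DiscoveredProfile
}
```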
Config Building Logic
The builder layer converts discovered profiles into a mig-parted config spec.
Flow in pkg/mig/builder/builder.go:
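A hedged sketch of the per-profile building step; the `MigConfig` shape and helper names below are illustrative, grounded only in the naming rule described under Key Design Decisions:

```go
package builder

import "strings"

// MigConfig mirrors a single entry under a mig-configs key (illustrative,
// not the PR's actual type).
type MigConfig struct {
	Devices    string         `yaml:"devices"`
	MigEnabled bool           `yaml:"mig-enabled"`
	MigDevices map[string]int `yaml:"mig-devices"`
}

// configName turns a discovered profile into its config key:
//   "1g.10gb"    -> "all-1g.10gb"
//   "1g.10gb+me" -> "all-1g.10gb.me"  ("+" is awkward in YAML keys)
func configName(profile string) string {
	return "all-" + strings.ReplaceAll(profile, "+", ".")
}

// buildPerProfileConfigs emits one all-<profile> entry per discovered
// profile, filling the GPU with as many instances as it supports.
func buildPerProfileConfigs(profiles map[string]int) map[string][]MigConfig {
	out := make(map[string][]MigConfig)
	for profile, maxCount := range profiles {
		out[configName(profile)] = []MigConfig{{
			Devices:    "all",
			MigEnabled: true,
			MigDevices: map[string]int{profile: maxCount},
		}}
	}
	return out
}
```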
Example output for A100-80GB:
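An illustrative reconstruction of the generated spec for an A100-80GB, following the mig-parted config format; the counts mirror the historical `config-default.yaml` entries, but treat them as an example rather than exact PR output:

```yaml
version: v1
mig-configs:
  all-disabled:
    - devices: all
      mig-enabled: false

  all-1g.10gb:
    - devices: all
      mig-enabled: true
      mig-devices:
        "1g.10gb": 7

  all-7g.80gb:
    - devices: all
      mig-enabled: true
      mig-devices:
        "7g.80gb": 1

  all-balanced:
    - devices: all
      mig-enabled: true
      mig-devices:
        "1g.10gb": 2
        "2g.20gb": 1
        "3g.40gb": 1
```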
All-Balanced Config
The `all-balanced` config creates a mix of small, medium, and large MIG instances.

Formula in pkg/mig/builder/balanced.go:
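The formula itself isn't quoted above; below is a hedged Go illustration of one split that reproduces the historical all-balanced mixes (2x 1g + 1x 2g + 1x 3g on 7-slice GPUs like A100, 2x 1g + 1x 2g on 4-slice GPUs like A30). The real logic in pkg/mig/builder/balanced.go may differ:

```go
package builder

// balancedSplit maps a GPU's total slice count to a G-value -> instance-count
// mix. For 7 slices it yields {1: 2, 2: 1, 3: 1}; for 4 slices, {1: 2, 2: 1}.
// This is an illustration of the idea, not the PR's actual formula.
func balancedSplit(totalSlices int) map[int]int {
	counts := map[int]int{}
	remaining := totalSlices

	// One "large" instance: roughly half the GPU.
	if g := totalSlices / 2; g > 0 {
		counts[g]++
		remaining -= g
	}
	// One "medium" instance: roughly a third of the GPU, if it still fits.
	if g := totalSlices / 3; g > 0 && g <= remaining {
		counts[g]++
		remaining -= g
	}
	// Fill the remainder with 1-slice instances.
	counts[1] += remaining
	return counts
}
```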
Profile selection: For each G-value, we select the base profile (no `+me`/`+gfx` attributes) with the highest MaxCount. This gives the smallest memory-footprint option, which is most flexible.

Heterogeneous GPU handling: When multiple GPU types exist, each gets its own entry with a `device-filter`:
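An illustrative excerpt, assuming a node that mixes A100-80GB and A30 GPUs; the PCI device IDs shown are examples, not verified against the PR:

```yaml
all-1g.10gb:
  - device-filter: "0x20B210DE"  # e.g. A100-SXM4-80GB
    devices: all
    mig-enabled: true
    mig-devices:
      "1g.10gb": 7

all-1g.6gb:
  - device-filter: "0x20B710DE"  # e.g. A30
    devices: all
    mig-enabled: true
    mig-devices:
      "1g.6gb": 4
```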
Testing
Coverage includes the `dgxa100` mock and CI profile filtering.

Test data: Profile data in tests matches the NVIDIA MIG User Guide.
Files Changed
- `cmd/nvidia-mig-parted/main.go`
- `cmd/nvidia-mig-parted/generateconfig/generate_config.go`
- `pkg/mig/discovery/discovery.go`
- `pkg/mig/discovery/discovery_test.go`
- `pkg/mig/builder/builder.go`
- `pkg/mig/builder/builder_test.go`
- `pkg/mig/builder/balanced.go`
- `pkg/mig/builder/balanced_test.go`
- `deployments/systemd/service.sh`
Runtime generation over static files: Eliminates maintenance burden when new GPUs are released.
Fallback behavior: If generation fails, the systemd service uses a previously generated config. This handles edge cases like driver not being loaded.
CI profile filtering: Compute Instance profiles (e.g., `1c.2g.20gb`) are filtered out since they represent subdivisions of GPU instances, not standalone configs.

Device-filter for heterogeneous systems: When multiple GPU types exist, configs include a `device-filter` to target specific devices. This is necessary because different GPUs may have different profile names or max counts.

Profile normalization: `+` in profile names is converted to `.` for config names (e.g., `1g.10gb+me` becomes `all-1g.10gb.me`) since `+` is not ideal in YAML keys.