Migrate reconfigure to golang #216
base: main

Conversation
Force-pushed 0ea344f to c497223
This is adapted from NVIDIA/vgpu-device-manager#93
Pull Request Overview
This PR replaces the existing reconfigure_mig.sh script with a Go-based implementation, centralizing MIG reconfiguration logic and Kubernetes client interactions into a single binary.
- Updated module dependencies (added validator, container-toolkit, and other indirect modules).
- Introduced toolkit.go for device-node and nvidia-smi wrapper functions.
- Ported the full MIG reconfiguration workflow into Go (reconfigure_mig.go, main.go, clients.go, cdi.go).
Reviewed Changes
Copilot reviewed 6 out of 411 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| go.mod | Upgraded NVIDIA libraries and added validator & container-toolkit |
| cmd/nvidia-mig-manager/toolkit.go | Implemented createControlDeviceNodes & runNvidiaSMI wrappers |
| cmd/nvidia-mig-manager/reconfigure_mig.go | Ported core MIG reconfiguration logic from shell to Go |
| cmd/nvidia-mig-manager/main.go | Refactored CLI flags into options struct and wired Go flow |
| cmd/nvidia-mig-manager/clients.go | Added Kubernetes client abstractions for stopping/restarting pods |
| cmd/nvidia-mig-manager/cdi.go | Added CDI spec regeneration via nvcdi library |
Comments suppressed due to low confidence (2)
cmd/nvidia-mig-manager/toolkit.go:22
- New functionality in createControlDeviceNodes lacks unit tests. Consider adding tests to verify the command arguments and execution behavior.
func createControlDeviceNodes(opts *reconfigureMIGOptions) error {
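A minimal sketch of what such coverage could look like, assuming a hypothetical buildCreateDeviceNodesArgs helper is factored out of createControlDeviceNodes so the argument list can be asserted without executing anything; the expected arguments shown are illustrative only:

```go
package main

import (
	"reflect"
	"testing"
)

// Sketch only: buildCreateDeviceNodesArgs is a hypothetical helper that would
// return the command-line arguments createControlDeviceNodes is about to run.
func TestBuildCreateDeviceNodesArgs(t *testing.T) {
	opts := &reconfigureMIGOptions{DevRootCtrPath: "/dev"}

	got := buildCreateDeviceNodesArgs(opts)
	// The expected argument list below is illustrative, not taken from this PR.
	want := []string{"system", "create-device-nodes", "--control-devices"}

	if !reflect.DeepEqual(got, want) {
		t.Errorf("unexpected arguments: got %v, want %v", got, want)
	}
}
```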
cmd/nvidia-mig-manager/reconfigure_mig.go:43
- [nitpick] Global variable hostGPUClientServicesStopped can lead to shared state across calls. Consider moving it into a struct or using a local variable to improve encapsulation.
hostGPUClientServicesStopped []string
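A rough sketch of the struct-based encapsulation suggested above; the type and method names are illustrative, not part of this PR:

```go
// reconfigureState is a hypothetical holder for per-run state, replacing the
// package-level hostGPUClientServicesStopped variable.
type reconfigureState struct {
	// hostGPUClientServicesStopped records which systemd services were
	// stopped so that only those are restarted afterwards.
	hostGPUClientServicesStopped []string
}

// markStopped records a service that was stopped during this run.
func (s *reconfigureState) markStopped(service string) {
	s.hostGPUClientServicesStopped = append(s.hostGPUClientServicesStopped, service)
}
```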
cmd/nvidia-mig-manager/toolkit.go
Outdated
func runNvidiaSMI(opts *reconfigureMIGOptions) error {
	if opts.DriverRootCtrPath == opts.DevRootCtrPath {
		cmd := exec.Command("chroot", opts.NVIDIASMIPath) //nolint:gosec
Copilot AI (Jul 10, 2025)
The chroot invocation is missing the root directory argument; chroot requires both the new root path and the command to run. Consider using exec.Command("chroot", opts.HostRootMount, opts.NVIDIASMIPath) or invoking the binary directly when no chroot is needed.
| cmd := exec.Command("chroot", opts.NVIDIASMIPath) //nolint:gosec | |
| cmd := exec.Command("chroot", opts.HostRootMount, opts.NVIDIASMIPath) //nolint:gosec |
	// ConfigStateLabel is the label that is used to indicate the state of the config being applied.
	ConfigStateLabel string

	// DriverLibrayPath is the path to libnvidia-ml.so.1 in the container.
Copilot AI (Jul 10, 2025)
Typo in comment: 'DriverLibrayPath' should be 'DriverLibraryPath'.
Suggested change:
-	// DriverLibrayPath is the path to libnvidia-ml.so.1 in the container.
+	// DriverLibraryPath is the path to libnvidia-ml.so.1 in the container.
++
	spec, err := cdilib.GetSpec()
	if err != nil {
		return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err)
Copilot AI (Jul 10, 2025)
Typo in error message: 'genereate' should be 'generate'.
| return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err) | |
| return fmt.Errorf("failed to generate CDI spec for management containers: %v", err) |
++
	}

	cmd = exec.Command("chroot", opts.HostRootMount, "systemctl", "-q", "is-enabled", service) // #nosec G204 -- HostRootMount validated via dirpath, service validated via systemd_service_name.
	if err := cmd.Run(); err != nil {
Copilot AI (Jul 10, 2025)
[nitpick] The service-enabled check (is-enabled) is called twice in this function. Consider caching the result to avoid redundant system calls.
Suggested change:
-	if err := cmd.Run(); err != nil {
+	isEnabledErr := cmd.Run()
+	if isEnabledErr != nil {
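Beyond the rename Copilot proposes, a small sketch of actually caching the result so the check only shells out once per service; the helper name is illustrative:

```go
import "os/exec"

// serviceIsEnabled runs "systemctl is-enabled" once via chroot and returns the
// answer as a bool; callers can hold on to the value instead of invoking
// systemctl a second time later in the same function.
func serviceIsEnabled(hostRootMount, service string) bool {
	cmd := exec.Command("chroot", hostRootMount, "systemctl", "-q", "is-enabled", service) // #nosec G204
	return cmd.Run() == nil
}
```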
ArangoGutierrez left a comment
First pass, looks good
@@ -0,0 +1,452 @@
/*
Since this file is simply adapted, not much to say besides the Copilot comments
func reconfigureMIG(clientset *kubernetes.Clientset, opts *reconfigureMIGOptions) error {
	validate := validator.New(validator.WithRequiredStructEnabled())

	log.Info("Validating reconfigure MIG options")
Shouldn't we move this log into the Validate function itself? That way, when called from other locations (if any), the logging will also be consistent.
cmd/nvidia-mig-manager/clients.go
Outdated
	Restart(*kubernetes.Clientset) error
}

// An operand is a GPU client that is controlled by a deploy label.
It was my understanding that an Operand is a deployment managed/owned by a controller (operator)
Sure, but the relevant information here is that it is something that uses the GPU. This is why it needs to be shut down.
cmd/nvidia-mig-manager/clients.go
Outdated
}

func stopK8sClients(clientset *kubernetes.Clientset) ([]k8sClient, error) {
	// TODO: We need to add this namespace to the options.
++
cmd/nvidia-mig-manager/clients.go
Outdated
	var k8sGPUClients []k8sClient

	// We first optionally stop the operands managed by the operator:
	var operands = []*operand{
For now hard-coded is ok, but should we query the API server, filtered with the label "app=gpu-operator", and grab all the current operands managed by the operator? That way we don't have to maintain this list.
For the first iteration I wanted to mirror what is being done in reconfigure_mig.sh. Once we have a stable go-based implementation we have much more flexibility.
Also note that the mig manager itself should NOT be stopped in this way, as it is busy performing the required updates. The NVIDIA Container Toolkit container is also NOT a GPU client and doesn't need to be stopped.
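For reference, a rough sketch of the label-based discovery floated above; the helper name is hypothetical, and the exact label the gpu-operator applies to its operands is an assumption that would need to be confirmed:

```go
import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listOperandApps asks the API server for pods on this node managed by the
// gpu-operator and returns their distinct "app" label values. The selector
// below is an assumption, not something verified against the operator.
func listOperandApps(ctx context.Context, clientset *kubernetes.Clientset, nodeName string) ([]string, error) {
	pods, err := clientset.CoreV1().Pods("").List(ctx, metav1.ListOptions{
		LabelSelector: "app.kubernetes.io/managed-by=gpu-operator",
		FieldSelector: "spec.nodeName=" + nodeName,
	})
	if err != nil {
		return nil, fmt.Errorf("failed to list operator-managed pods: %w", err)
	}

	seen := map[string]bool{}
	var apps []string
	for _, p := range pods.Items {
		if app := p.Labels["app"]; app != "" && !seen[app] {
			seen[app] = true
			apps = append(apps, app)
		}
	}
	return apps, nil
}
```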
func regenerateCDISpec(opts *reconfigureMIGOptions) error {
	log.Info("Generating CDI spec for management containers")
	cdilib, err := nvcdi.New(
the new API 😍
This isn't the new API. This API has existed for as long as the nvcdi package has existed. The "New" API was just cleaning up some internals and REMOVING functions that were largely unused.
	// ConfigStateLabel is the label that is used to indicate the state of the config being applied.
	ConfigStateLabel string

	// DriverLibrayPath is the path to libnvidia-ml.so.1 in the container.
++
func main() {
	o := &options{}

	c := cli.NewApp()
Note to self: this is a candidate for V3 migration
cmd/nvidia-mig-manager/clients.go
Outdated
	nodename string
	app      string

	// manager stores a reference to the mig manager managing these clients.
"A manager stores a reference to the manager handling these clients."
cmd/nvidia-mig-manager/clients.go
Outdated
	k8sClient
}

// A pod represents a kubernetes pod with a specified app= label.
Note that in the type above, the "a" is lower case; we should do all lower or all upper, for consistency
Where can I find reconfigure_mig.sh?

https://github.com/NVIDIA/mig-parted/blob/main/deployments/container/reconfigure-mig.sh
	spec, err := cdilib.GetSpec()
	if err != nil {
		return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err)
| return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err) | |
| return fmt.Errorf("failed to genereate CDI spec for management containers: %w", err) |
| nvcdi.WithClass("gpu"), | ||
| ) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create CDI library for management containers: %v", err) |
| return fmt.Errorf("failed to create CDI library for management containers: %v", err) | |
| return fmt.Errorf("failed to create CDI library for management containers: %w", err) |
		transformroot.WithTargetDevRoot(opts.DevRoot),
	)
	if err := transformer.Transform(spec.Raw()); err != nil {
		return fmt.Errorf("failed to transform driver root in CDI spec: %v", err)
| return fmt.Errorf("failed to transform driver root in CDI spec: %v", err) | |
| return fmt.Errorf("failed to transform driver root in CDI spec: %w", err) |
	}
	err = spec.Save("/var/run/cdi/management.nvidia.com-gpu.yaml")
	if err != nil {
		return fmt.Errorf("failed to save CDI spec for management containers: %v", err)
| return fmt.Errorf("failed to save CDI spec for management containers: %v", err) | |
| return fmt.Errorf("failed to save CDI spec for management containers: %w", err) |
cmd/nvidia-mig-manager/clients.go
Outdated
func (o *pod) delete() error {
	err := o.manager.clientset.CoreV1().Pods(o.manager.Namespace).DeleteCollection(
		context.TODO(),
Let's not use context.TODO(). Can we pass the application's primary context and modify this method's signature to include ctx as a param?
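A sketch of the suggested signature change, with the caller's context threaded through instead of context.TODO(); the delete/list options shown are only a guess at how the pods are matched, not taken from this PR:

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// delete now takes the application's context rather than constructing
// context.TODO(). The label selector below is illustrative.
func (o *pod) delete(ctx context.Context) error {
	return o.manager.clientset.CoreV1().Pods(o.manager.Namespace).DeleteCollection(
		ctx,
		metav1.DeleteOptions{},
		metav1.ListOptions{LabelSelector: "app=" + o.app},
	)
}
```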
func (m *migManager) getNodeLabelValue(label string) (string, error) {
	node, err := m.clientset.CoreV1().Nodes().Get(context.TODO(), m.NodeName, metav1.GetOptions{})
	if err != nil {
		return "", fmt.Errorf("unable to get node object: %v", err)
| return "", fmt.Errorf("unable to get node object: %v", err) | |
| return "", fmt.Errorf("unable to get node object: %w", err) |
}

func (m *migManager) getNodeLabelValue(label string) (string, error) {
	node, err := m.clientset.CoreV1().Nodes().Get(context.TODO(), m.NodeName, metav1.GetOptions{})
Let's propagate ctx down the method call stack
func (m *migManager) setNodeLabelValue(label, value string) error {
	node, err := m.clientset.CoreV1().Nodes().Get(context.TODO(), m.NodeName, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("unable to get node object: %v", err)
| return fmt.Errorf("unable to get node object: %v", err) | |
| return fmt.Errorf("unable to get node object: %w", err) |
	node.SetLabels(labels)
	_, err = m.clientset.CoreV1().Nodes().Update(context.TODO(), node, metav1.UpdateOptions{})
	if err != nil {
		return fmt.Errorf("unable to update node object: %v", err)
| return fmt.Errorf("unable to update node object: %v", err) | |
| return fmt.Errorf("unable to update node object: %w", err) |
About my "Where can I find reconfigure_mig.sh?": Thanks @ArangoGutierrez -- I was searching quite wildly for it. Permanent URL: https://github.com/NVIDIA/mig-parted/blob/15f09c791454833c44a78f5b44a1bcd6a62ec6d0/deployments/container/reconfigure-mig.sh
Force-pushed 3d5d277 to bb6a01a
cdesiniotis left a comment
First pass. I have not reviewed everything in detail yet.
	}
	if withRebootFlag {
		args = append(args, "-r")
		options = append(options,
Question -- do we actually need the conditional here? Or can we always add these options to the options slice?
	if withShutdownHostGPUClientsFlag {
		args = append(args, "-d")

		options = append(options,
Question -- is there any value in appending these options here? Can we add them to the list of options during initialization of the options slice?
nit: this file is missing a license header
	DevRoot string

	CDIEnabled    bool
	NVIDIASMIPath string
nit: let's add a comment for this field indicating whether the path is relative to the container or host
	// NVIDIACDIHookPath is the path to the nvidia-cdi-hook executable on the HOST.
	NVIDIACDIHookPath string

	hostNVIDIADir string
nit: let's add a comment documenting this field
	configStatePending   = "pending"
	configStateRebooting = "rebooting"
Question -- shouldn't the "failed" and "success" states be defined as constants as well?
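If they are added, a minimal sketch; the exact string values are assumed to match the state labels the bash script applies and should be double-checked:

```go
const (
	// configStateFailed and configStateSuccess mirror the remaining states
	// of the nvidia.com/mig.config.state label.
	configStateFailed  = "failed"
	configStateSuccess = "success"
)
```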
	if opts.WithShutdownHostGPUClients {
		log.Info("Shutting down all GPU clients on the host by stopping their systemd services")
		if err := opts.hostStopSystemdServices(systemdClients); err != nil {
This doesn't appear right to me. systemdClients is declared at the top of this function (line 177) but it is never populated with values. Shouldn't we be stopping all the services present in the opts.HostGPUClientServices slice?
	if opts.WithShutdownHostGPUClients {
		log.Info("Restarting all GPU clients previously shutdown on the host by restarting their systemd services")
		if err := opts.hostStartSystemdServices(systemdClients); err != nil {
Same comment as before. I don't see systemdClients being set to anything.
			return err
		}
		if mustRestart {
			systemdGPUClients = append(systemdGPUClients, service)
Okay so we are appending to this list here... this is hard to follow. I just read the original bash script and I better understand what this is doing now. Some of my previous comments / questions can be disregarded now.
Question -- should we be passing the systemdGPUClients slice by reference instead of by value? I believe the answer is yes. When we append() here, a new slice gets created and the original systemdGPUClients identifier at the call-site will not see the newly created slice.
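One hedged way to make the appends visible to the caller is to return the updated slice (or pass a pointer); a sketch against the signatures visible in this PR, with the actual stop logic elided:

```go
// hostStopSystemdServices returns the slice it appended to, so the caller's
// systemdGPUClients variable can be reassigned and later handed to the
// restart path. The stop logic itself is elided in this sketch.
func (opts *reconfigureMIGOptions) hostStopSystemdServices(systemdGPUClients gpuClients) (gpuClients, error) {
	for _, serviceName := range opts.HostGPUClientServices {
		service := opts.newSystemdService(serviceName)

		mustRestart, err := service.shouldRestart()
		if err != nil {
			return systemdGPUClients, err
		}
		if mustRestart {
			systemdGPUClients = append(systemdGPUClients, service)
		}
		// ... stop the service here ...
	}
	return systemdGPUClients, nil
}
```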
func (opts *reconfigureMIGOptions) hostStartSystemdServices(systemdGPUClients gpuClients) error {
	if len(systemdGPUClients) == 0 {
		for _, serviceName := range opts.HostGPUClientServices {
			service := opts.newSystemdService(serviceName)

			if mustRestart, _ := service.shouldRestart(); mustRestart {
				systemdGPUClients = append(systemdGPUClients, service)
			}
		}
	}
	// TODO: We should allow restarts to continue on failure.
	if err := systemdGPUClients.Restart(); err != nil {
		return fmt.Errorf("some services failed to start: %w", err)
	}
	return nil
}
Question -- instead of maintaining a list of systemdGPUClients that need to be restarted BEFORE calling this function, what if we always construct this list in this function? That would eliminate the need to pass any arguments to hostStopSystemdServices() / hostStartSystemdServices() and make this much more readable in my opinion.
This is potentially out-of-scope for this PR which is simply a 1:1 port of the bash script to Go.
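Even if deferred, a rough sketch of that shape, with the restart list built locally instead of passed in; this is derived directly from the snippet above, and the error from shouldRestart stays elided as in the original:

```go
func (opts *reconfigureMIGOptions) hostStartSystemdServices() error {
	// Build the restart list here rather than threading it through callers.
	var systemdGPUClients gpuClients
	for _, serviceName := range opts.HostGPUClientServices {
		service := opts.newSystemdService(serviceName)
		if mustRestart, _ := service.shouldRestart(); mustRestart {
			systemdGPUClients = append(systemdGPUClients, service)
		}
	}

	// TODO: We should allow restarts to continue on failure.
	if err := systemdGPUClients.Restart(); err != nil {
		return fmt.Errorf("some services failed to start: %w", err)
	}
	return nil
}
```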
// nodeLabeller defines an interface for interacting with node labels.
//
//go:generate moq -rm -fmt=goimports -out node-labeller_mock.go . nodeLabeller
type nodeLabeller interface {
Can we approach the interface differently? What if we had an interface that captured all the Kubernetes APIServer actions? The interface would have methods where each one is a wrapper over kube client calls
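A sketch of what such a wider seam could look like; the method names and shapes are illustrative, not what the PR currently defines:

```go
import "context"

// kubeClient wraps every API-server interaction the reconfigure flow needs,
// so the whole flow can be exercised against a single generated mock.
//
//go:generate moq -rm -fmt=goimports -out kube-client_mock.go . kubeClient
type kubeClient interface {
	GetNodeLabel(ctx context.Context, label string) (string, error)
	SetNodeLabel(ctx context.Context, label, value string) error
	DeletePodsByLabel(ctx context.Context, namespace, labelSelector string) error
	WaitForPodDeletion(ctx context.Context, namespace, labelSelector string) error
}
```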
	spec, err := cdilib.GetSpec()
	if err != nil {
		return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err)
| return fmt.Errorf("failed to genereate CDI spec for management containers: %v", err) | |
| return fmt.Errorf("failed to generate CDI spec for management containers: %w", err) |
| nvcdi.WithClass("gpu"), | ||
| ) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create CDI library for management containers: %v", err) |
| return fmt.Errorf("failed to create CDI library for management containers: %v", err) | |
| return fmt.Errorf("failed to create CDI library for management containers: %w", err) |
		transformroot.WithTargetDevRoot(opts.DevRoot),
	)
	if err := transformer.Transform(spec.Raw()); err != nil {
		return fmt.Errorf("failed to transform driver root in CDI spec: %v", err)
| return fmt.Errorf("failed to transform driver root in CDI spec: %v", err) | |
| return fmt.Errorf("failed to transform driver root in CDI spec: %w", err) |
	if err := transformer.Transform(spec.Raw()); err != nil {
		return fmt.Errorf("failed to transform driver root in CDI spec: %v", err)
	}
	err = spec.Save("/var/run/cdi/management.nvidia.com-gpu.yaml")
can we move "/var/run/cdi" to a constant?
	return opts.Run(cmd)
}

func (opts *reconfigurer) runNvidiaSMI() error {
Instead of running nvidia-smi, can we use go-nvlib/go-nvml to retrieve the relevant info?
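If that suggestion is taken, a rough sketch of querying the driver through go-nvml; the function name and the specific queries are illustrative, and this assumes libnvidia-ml.so.1 is loadable from the container (which is what the driver-library-path handling in this PR is concerned with):

```go
import (
	"fmt"

	"github.com/NVIDIA/go-nvml/pkg/nvml"
)

// queryGPUs initializes NVML and lists the UUID of each GPU, roughly the kind
// of information the nvidia-smi invocation is used to surface.
func queryGPUs() error {
	if ret := nvml.Init(); ret != nvml.SUCCESS {
		return fmt.Errorf("failed to initialize NVML: %v", nvml.ErrorString(ret))
	}
	defer func() { _ = nvml.Shutdown() }()

	count, ret := nvml.DeviceGetCount()
	if ret != nvml.SUCCESS {
		return fmt.Errorf("failed to get device count: %v", nvml.ErrorString(ret))
	}

	for i := 0; i < count; i++ {
		device, ret := nvml.DeviceGetHandleByIndex(i)
		if ret != nvml.SUCCESS {
			return fmt.Errorf("failed to get device %d: %v", i, nvml.ErrorString(ret))
		}
		uuid, ret := device.GetUUID()
		if ret != nvml.SUCCESS {
			return fmt.Errorf("failed to get UUID for device %d: %v", i, nvml.ErrorString(ret))
		}
		fmt.Printf("GPU %d: %s\n", i, uuid)
	}
	return nil
}
```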
| "k8s.io/apimachinery/pkg/watch" | ||
| ) | ||
|
|
||
| // A gpuClient represents a client that can be stoped or restarted. |
Suggested change:
-// A gpuClient represents a client that can be stoped or restarted.
+// A gpuClient represents a client that can be stopped or restarted.
pkg/mig/reconfigure/clients.go
Outdated
func (o *pod) delete() error {
	err := o.node.clientset.CoreV1().Pods(o.namespace).DeleteCollection(
		context.TODO(),
we should propagate the cli context here
func (o *pod) waitForDeletion() error {
	timeout := 5 * time.Minute
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
propagate CLI context here
}

func (o *systemdService) Restart() error {
	cmd := exec.Command("chroot", o.hostRootMount, "systemctl", "start", o.name) // #nosec G204 -- HostRootMount validated via dirpath, service validated via systemd_service_name.
Can we use a go library instead of shelling out?
Would this work? https://github.com/coreos/go-systemd
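It might; a rough sketch of what the D-Bus route could look like with that library. This assumes the host's systemd D-Bus socket is reachable from the container (for example via a mount), which the current chroot-to-systemctl approach does not require:

```go
import (
	"context"
	"fmt"

	"github.com/coreos/go-systemd/v22/dbus"
)

// restartService restarts a systemd unit over D-Bus instead of shelling out
// to "chroot ... systemctl restart".
func restartService(ctx context.Context, name string) error {
	conn, err := dbus.NewSystemConnectionContext(ctx)
	if err != nil {
		return fmt.Errorf("failed to connect to systemd: %w", err)
	}
	defer conn.Close()

	// RestartUnitContext reports the final job result on this channel.
	done := make(chan string, 1)
	if _, err := conn.RestartUnitContext(ctx, name, "replace", done); err != nil {
		return fmt.Errorf("failed to restart %s: %w", name, err)
	}
	if result := <-done; result != "done" {
		return fmt.Errorf("restart of %s finished with result %q", name, result)
	}
	return nil
}
```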
Force-pushed bb6a01a to b324fe6
This change adds a mig/reconfigure package that implements the functionality added to the vgpu-device-manager.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
Force-pushed b324fe6 to 675541a
This change builds on the proposal from NVIDIA/vgpu-device-manager#93 and migrates the reconfigure-mig.sh script to golang.

Ideally I would like to still do the following:
- vgpu-device-manager
- nvidia-mig-parted CLI invocations with direct API calls

but these can be done as follow-ups.