libvirt-hooks

Custom libvirt hook scripts for specific system management
libvirt hooks overview: https://libvirt.org/hooks.html

QEMU hook script

Executed when a QEMU guest is started, stopped, or migrated.
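As a rough illustration of what such a hook receives: according to the libvirt hooks documentation, libvirt calls /etc/libvirt/hooks/qemu with four positional arguments (guest name, operation, sub-operation, extra argument) and supplies the domain XML on standard input. The sketch below only parses that calling convention. The names `HookCall`, `parse_hook_args`, and `describe_call` are hypothetical and are not taken from the script in this repository.

```python
from typing import NamedTuple, Sequence, TextIO


class HookCall(NamedTuple):
    """The four positional arguments libvirt passes to a hook script."""
    guest: str          # libvirt domain name
    operation: str      # for example "prepare", "start", "stopped", "release"
    sub_operation: str  # for example "begin" or "end"
    extra: str          # extra argument, "-" when unused


def parse_hook_args(argv: Sequence[str]) -> HookCall:
    """Parse argv as received by a hook (argv[0] is the script path)."""
    if len(argv) < 5:
        raise ValueError(
            "expected: qemu <guest> <operation> <sub-operation> <extra>"
        )
    return HookCall(argv[1], argv[2], argv[3], argv[4])


def describe_call(argv: Sequence[str], stdin: TextIO) -> str:
    """Summarize one invocation; the domain XML is read from stdin."""
    call = parse_hook_args(argv)
    domain_xml = stdin.read()
    return (f"{call.guest}: {call.operation}/{call.sub_operation} "
            f"({len(domain_xml)} bytes of domain XML)")
```

A real hook would call something like `describe_call(sys.argv, sys.stdin)` from its entry point and then branch on `operation`.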

Custom system management supported for QEMU guest

Shared NVSwitch GPU Partition Automatic Configuration

A custom libvirt hook script for automatic configuration of NVIDIA Fabric Manager Shared NVSwitch GPU Partitions during startup and shutdown of a QEMU guest virtual machine.
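To choose a partition for a guest, a hook plausibly needs to know which PCI devices are passed through to it. This README does not describe how the script does that, so the following is only a sketch under that assumption: it reads the PCI host devices from standard libvirt domain XML (`<hostdev type='pci'>` elements). The function name `passthrough_pci_addresses` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def passthrough_pci_addresses(domain_xml: str) -> List[str]:
    """Return PCI addresses (dddd:bb:ss.f) of hostdev devices in a domain."""
    root = ET.fromstring(domain_xml)
    addresses = []
    # Only PCI passthrough devices are relevant; USB hostdevs are skipped.
    for hostdev in root.iterfind("./devices/hostdev[@type='pci']"):
        addr = hostdev.find("./source/address")
        if addr is None:
            continue
        # libvirt writes these attributes as 0x-prefixed hexadecimal strings.
        domain = int(addr.get("domain", "0x0"), 16)
        bus = int(addr.get("bus", "0x0"), 16)
        slot = int(addr.get("slot", "0x0"), 16)
        function = int(addr.get("function", "0x0"), 16)
        addresses.append(f"{domain:04x}:{bus:02x}:{slot:02x}.{function:x}")
    return addresses
```

Mapping these addresses to a Fabric Manager partition ID is left out here because it depends on the partition layout reported by Fabric Manager on the specific system.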

Limitation

The QEMU libvirt hook script for the automatic configuration of NVIDIA Fabric Manager Shared NVSwitch GPU Partitions is supported only on NVIDIA HGX H100 and later systems that have the Autonomous Link Initialization (ALI) hardware feature.

On NVIDIA HGX-2 and NVIDIA HGX A100 systems without the ALI hardware feature, the GPU reset should be skipped during guest VM start once the Shared NVSwitch GPU partition has been activated. If the GPUs receive a PCIe reset as part of the guest VM launch, the GPU NVLinks will be in an inactive state in the guest VM. Starting the guest VM without a GPU reset might require a modification to the hypervisor's VM launch sequence. For details, refer to the NVIDIA Fabric Manager User Guide.

Installation Prerequisite
  1. The custom libvirt hook script is a Python script.
    Install Python on the system.

  2. The custom libvirt hook script depends on Fabric Manager Partition Manager.
    Install its dependencies.

    1. Install the NVIDIA Fabric Manager Development package.
      On Ubuntu, this package is named "nvidia-fabricmanager-dev-<version>"
      On RHEL, this package is named "nvidia-fabricmanager-devel-<version>"

    2. Install the JSON CPP development package.
      On Ubuntu, this package is named "libjsoncpp-dev"
      On RHEL, this package is named "jsoncpp-devel"
      The EPEL repository must be set up on your system to access this package.

    3. Obtain Fabric Manager Partition Manager source and build it.
      Deploy the binary fmpm in /usr/bin/

  3. Install the NVIDIA Fabric Manager.
    The package is named "nvidia-fabricmanager-<version>"
    The version shall match the version of NVIDIA GPU driver installed on the system.

  4. Configure Fabric Manager to Shared NVSwitch Mode.
    Set FABRIC_MODE=1 in /usr/share/nvidia/nvswitch/fabricmanager.cfg
    Restart Fabric Manager service.

sudo sed -i 's/FABRIC_MODE=./FABRIC_MODE=1/g' /usr/share/nvidia/nvswitch/fabricmanager.cfg
sudo systemctl restart nvidia-fabricmanager.service
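For setups that are provisioned from Python rather than from a shell, the same configuration change can be sketched as a text transformation. This is a minimal sketch and not part of the repository: the function name `set_fabric_mode` is hypothetical, and unlike the sed command above it rewrites only uncommented `FABRIC_MODE=` lines and replaces the whole value rather than a single character.

```python
import re


def set_fabric_mode(cfg_text: str, mode: int = 1) -> str:
    """Return cfg_text with each uncommented FABRIC_MODE line set to mode."""
    # [ \t]* (not \s*) keeps the match on a single line of the config file.
    pattern = r"^([ \t]*FABRIC_MODE[ \t]*=[ \t]*).*$"
    return re.sub(pattern, rf"\g<1>{mode}", cfg_text, flags=re.MULTILINE)
```

Writing the result back to /usr/share/nvidia/nvswitch/fabricmanager.cfg requires root privileges, and the Fabric Manager service still has to be restarted afterwards.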

Deploy the QEMU libvirt hook script

Deploy the libvirt hook script for QEMU at /etc/libvirt/hooks/qemu

sudo wget 'https://raw.githubusercontent.com/NVIDIA/libvirt-hooks/refs/heads/main/qemu' -O /etc/libvirt/hooks/qemu
sudo chmod +x /etc/libvirt/hooks/qemu
sudo systemctl restart libvirtd

If the host manages the NVSwitches, Fabric Manager runs on the host
and is configured to listen on the default interface 127.0.0.1. No change is needed
to the libvirt hook script.

If using a dedicated Service VM to manage the NVSwitches with all NVSwitches passed through
to the Service VM, the Fabric Manager runs on the Service VM and is configured
to listen on the Service VM's network interface xxx.yyy.zzz.www.

Update the libvirt hook script with the IP address and port number that Fabric Manager is configured to listen on:

SERVICE_VM_IP="xxx.yyy.zzz.www"
PORT_NUM="xxx"
sudo sed -i "s/\(FM_IP = \)\"[^\"]*\"/\1\"$SERVICE_VM_IP:$PORT_NUM\"/" /etc/libvirt/hooks/qemu

If the Fabric Manager in the Service VM is configured to listen on the default port number,
skip the PORT_NUM update in the libvirt hook script.

sudo sed -i "s/\(FM_IP = \)\"[^\"]*\"/\1\"$SERVICE_VM_IP\"/" /etc/libvirt/hooks/qemu
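The substitution performed by the sed commands above can also be sketched in Python, which may be easier to unit-test in a deployment tool. This is an illustrative sketch, not code from the hook script: it assumes only what the sed pattern implies, namely that the script contains an assignment of the form `FM_IP = "..."`. The function name `set_fm_address` is hypothetical, and it replaces every such assignment in the text.

```python
import re
from typing import Optional


def set_fm_address(script_text: str, ip: str,
                   port: Optional[int] = None) -> str:
    """Point each FM_IP = "..." assignment at ip, or ip:port if port is given."""
    target = f"{ip}:{port}" if port is not None else ip
    new_text, count = re.subn(
        r'(FM_IP = )"[^"]*"',
        lambda m: f'{m.group(1)}"{target}"',
        script_text,
    )
    if count == 0:
        # Fail loudly instead of silently leaving the script unchanged.
        raise ValueError("no FM_IP assignment found in hook script")
    return new_text
```

Omitting `port` corresponds to the case where Fabric Manager in the Service VM listens on its default port.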

License

By downloading or using this software, I agree to the terms of the LICENSE
