PVE IOMMU Isolation for Passthrough

From WikiMLT
Latest revision as of 17:56, 1 November 2022

IOMMU Hardware Isolation at the Proxmox host

Here we set up IOMMU on an Intel-based host system (Xeon E5-2670 v3). More details, including information about the host's hardware, are provided in the manual QEMU/KVM and GPU Passthrough in Details; only the steps used to isolate one NVIDIA Tesla K20Xm are listed here. The card will be used as the GPU of a Windows guest, but that is another story. One important point in this case: the Tesla K20Xm is a PCI-E 2.0 card, so to keep the server stable I had to go into the BIOS and set the link speed of the slot in use to PCI-E 2.0. The BIOS option Above 4G decoding is also enabled.

Enable IOMMU

Enable IOMMU isolation by performing the following steps.

nano /etc/default/grub
#GRUB_CMDLINE_LINUX_DEFAULT="quiet"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
update-grub
systemctl reboot
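
After the reboot it can be worth confirming that the kernel really booted with the new parameters. The following is a minimal sketch, not part of the original manual: the helper name has_iommu_flags is hypothetical, and it only inspects a command-line string (here the one exposed in /proc/cmdline).

```shell
#!/bin/sh
# Hypothetical helper (not from the original manual): succeeds only if the
# given kernel command line contains both intel_iommu=on and iommu=pt
# as separate, whole words.
has_iommu_flags() {
    case " $1 " in
        *" intel_iommu=on "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" iommu=pt "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the command line of the running kernel.
if has_iommu_flags "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "IOMMU parameters are active"
else
    echo "IOMMU parameters are missing from the kernel command line"
fi
```

The kernel log is a complementary check: dmesg | grep -e DMAR -e IOMMU should show lines about the IOMMU being enabled, although the exact wording depends on the kernel version.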

Find the devices to be isolated

Find the devices to be isolated by using the following script; its source is given in the comment at the top of the script.

nano /usr/local/bin/get_iommu_groups.sh && chmod +x /usr/local/bin/get_iommu_groups.sh
#Script
#!/bin/bash
# https://mathiashueber.com/pci-passthrough-ubuntu-2004-virtual-machine/
# change the 9999 if needed
shopt -s nullglob
for d in /sys/kernel/iommu_groups/{0..9999}/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    lspci -nns "${d##*/}"
done

Run the script and filter the output.

get_iommu_groups.sh | grep -iP 'Ethernet controller|NVIDIA'
#Output
IOMMU Group 40 02:00.0 3D controller [0302]: NVIDIA Corporation GK110GL [Tesla K20Xm] [10de:1021] (rev a1)
IOMMU Group 43 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 44 09:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

We will isolate IOMMU group 40, which contains the device at PCI address 02:00.0 with the vendor:device ID 10de:1021.
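
The vendor:device ID is needed again later for the VFIO configuration, so it can be handy to extract it from the listing instead of copying it by hand. A minimal sketch, assuming the output format shown above; the helper name extract_pci_id is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not from the original manual): for each input line,
# print the last bracketed vvvv:dddd pair. In `lspci -nn` style output this
# is the vendor:device ID; the class code such as [0302] has no colon and
# is therefore skipped.
extract_pci_id() {
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)\].*/\1/p'
}

# Demonstration with the line from the output above.
line='IOMMU Group 40 02:00.0 3D controller [0302]: NVIDIA Corporation GK110GL [Tesla K20Xm] [10de:1021] (rev a1)'
printf '%s\n' "$line" | extract_pci_id   # prints 10de:1021
```

On the host, the same helper could be fed from the script, for example get_iommu_groups.sh | grep -i nvidia | extract_pci_id.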

VFIO Modules

You'll need to add a few VFIO modules to your Proxmox system.

nano /etc/modules
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.

vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf
echo "options kvm ignore_msrs=1" > /etc/modprobe.d/kvm.conf
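
After the next reboot, one way to see whether the modules listed in /etc/modules were loaded is to compare them with the output of lsmod. This is only a sketch under that assumption; the helper name missing_modules is hypothetical. Depending on the kernel version, a module may be built in or merged into another one, in which case it is reported as missing here even though the functionality is present.

```shell
#!/bin/sh
# Hypothetical helper (not from the original manual): read `lsmod` output
# on stdin and print every module named as an argument that is not listed.
missing_modules() {
    # First column of lsmod is the module name; the header word "Module"
    # is harmless because it never matches a requested name.
    loaded=$(awk '{ print $1 }')
    for m in "$@"; do
        printf '%s\n' "$loaded" | grep -qxF "$m" || printf '%s\n' "$m"
    done
}

# Check the running system, if lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
    lsmod | missing_modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd
fi
```

An empty result means that all four modules appear in lsmod.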

Blacklisting Drivers

nano /etc/modprobe.d/blacklist.conf
# Blacklist NVIDIA Modules - 'lsmod', 'dmesg' and 'lspci -nnv'
blacklist nvidiafb
blacklist nouveau
blacklist nvidia_drm
blacklist nvidia
blacklist rivafb
blacklist rivatv
blacklist snd_hda_intel
blacklist radeon

options nouveau modeset=0
update-initramfs -u
reset 
reboot

Adding GPU to VFIO

lspci -n -s 02:00
02:00.0 0302: 10de:1021 (rev a1)
nano /etc/modprobe.d/vfio.conf
softdep nouveau pre: vfio-pci
softdep snd_hda_intel pre: vfio-pci
options vfio-pci ids=10de:1021 disable_vga=1
update-initramfs -u
reset 
systemctl reboot

Verify the Isolation

Verify the isolation of the GPU with the following command. Note the line Kernel driver in use: vfio-pci.

sudo lspci -nnv -s 02:00
#Output
02:00.0 3D controller [0302]: NVIDIA Corporation GK110GL [Tesla K20Xm] [10de:1021] (rev a1)
        Subsystem: NVIDIA Corporation GK110GL [Tesla K20Xm] [10de:097d]
        Physical Slot: 1
        Flags: fast devsel, IRQ 255, NUMA node 0, IOMMU group 40
        Memory at fa000000 (32-bit, non-prefetchable) [disabled] [size=16M]
        Memory at 23fe0000000 (64-bit, prefetchable) [disabled] [size=256M]
        Memory at 23ff0000000 (64-bit, prefetchable) [disabled] [size=32M]
        Expansion ROM at fb000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [420] Advanced Error Reporting
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
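
The same check can be scripted, which may help when the verification has to be repeated after kernel updates. A minimal sketch, assuming the lspci output format shown above; the helper names driver_in_use and is_bound_to_vfio are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original manual).

# Print the value of the "Kernel driver in use:" line from lspci output
# read on stdin (produced by `lspci -nnk` or `lspci -nnv`).
driver_in_use() {
    sed -n 's/^[[:space:]]*Kernel driver in use: //p'
}

# Succeed only if the single device described on stdin is bound to vfio-pci.
is_bound_to_vfio() {
    [ "$(driver_in_use)" = "vfio-pci" ]
}

# Check the GPU at the address used in this manual, if lspci is available.
if command -v lspci >/dev/null 2>&1; then
    if lspci -nnk -s 02:00.0 | is_bound_to_vfio; then
        echo "02:00.0 is bound to vfio-pci"
    else
        echo "02:00.0 is not bound to vfio-pci"
    fi
fi
```

If another driver such as nouveau is reported instead, the blacklist and the vfio.conf entries are the first things to re-check.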

References