QEMU/KVM and GPU Passthrough in Details: Difference between revisions

From WikiMLT
Spas (talk | contribs)
Spas (talk | contribs)
m Text replacement - "mlw-continue" to "code-continue"
 
(26 intermediate revisions by the same user not shown)
Line 4: Line 4:


== The Host System ==
== The Host System ==
The host operating system is Ubuntu Server 20.04 with kernel 5.4. Also ProxmoxVE 7.2 with kernel 5.15 is valid tested host.
The host operating system is Ubuntu Server 20.04 with kernel 5.4. ProxmoxVE 7.2 with kernel 5.15 is also a tested, valid host. The host CPU is an <u>Intel</u> Xeon, but this manual also provides <u>AMD</u>-specific parameters and commands.


=== Host Hardware ===
=== Host Hardware ===
Line 48: Line 48:
sudo nano /etc/default/grub # cat /etc/default/grub | grep 'GRUB_CMDLINE_LINUX_DEFAULT'
sudo nano /etc/default/grub # cat /etc/default/grub | grep 'GRUB_CMDLINE_LINUX_DEFAULT'
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash" class="mlw-continue">
<syntaxhighlight lang="bash" class="code-continue">
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"  # For Intel CPU (current case)
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"  # For Intel CPU (current case)
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="bash" class="mlw-continue">
<syntaxhighlight lang="bash" class="code-continue">
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt"    # For AMD CPU
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt"    # For AMD CPU
</syntaxhighlight>
</syntaxhighlight>
Line 66: Line 66:
{{collapse/end}}
{{collapse/end}}


{{collapse/begin}}
Update the boot manager configuration and reboot the system.
Update the boot manager configuration and reboot the system.
{{collapse/div|#CLI}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
sudo update-grub
sudo update-grub
sudo systemctl reboot
sudo systemctl reboot
</syntaxhighlight>
</syntaxhighlight>
{{collapse/end}}After the reboot verify does IOMMU is enabled:
 
{{collapse/begin}}
For the systemd-boot boot manager, as used in Pop!_OS.
{{collapse/div|#Pop!_OS}}
On operating systems that boot with systemd-boot, the kernelstub tool can be used to provide the kernel boot parameters. Use it like so:
<syntaxhighlight lang="shell" line="1">
sudo kernelstub -o "amd_iommu=on iommu=pt"
</syntaxhighlight>
 
Later, to do the isolation, use (with the correct IDs):
<syntaxhighlight lang="shell" line="1">
sudo kernelstub --add-options "vfio-pci.ids=10de:1b80,10de:10f0,8086:1533"
</syntaxhighlight>
 
References:
 
* MathiasHueber: [https://mathiashueber.com/pci-passthrough-ubuntu-2004-virtual-machine/ Virtual machines with PCI passthrough on Ubuntu 20.04, straightforward guide for gaming on a virtual machine]
{{collapse/end}}
 
After the reboot, verify that the IOMMU is enabled:
{{collapse/begin}}
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
Line 93: Line 110:
</syntaxhighlight>
</syntaxhighlight>
{{collapse/end}}
{{collapse/end}}
{{collapse/begin}}
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
sudo dmesg | grep -i 'vfio'  # For Intel CPU (current case)
sudo dmesg | grep -i 'vfio'  # For Intel CPU (current case)
</syntaxhighlight>
<syntaxhighlight lang="shell" line="1">
sudo dmesg |grep 'AMD-Vi'    # For AMD CPU
</syntaxhighlight>
</syntaxhighlight>
{{collapse/div|#Output}}
{{collapse/div|#Output}}
Line 108: Line 123:
[    0.880568] vfio_pci: add [10de:107c[ffffffff:ffffffff]] class 0x000000/00000000
[    0.880568] vfio_pci: add [10de:107c[ffffffff:ffffffff]] class 0x000000/00000000
[    0.900583] vfio_pci: add [10de:0e08[ffffffff:ffffffff]] class 0x000000/00000000
[    0.900583] vfio_pci: add [10de:0e08[ffffffff:ffffffff]] class 0x000000/00000000
</syntaxhighlight>
{{collapse/end}}
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1" class="mlw-shell-gray">
sudo dmesg |grep 'AMD-Vi'    # For AMD CPU
</syntaxhighlight>
{{collapse/div|#Output}}
<syntaxhighlight lang="text" class="mlw-collapsed-first-element">
[    0.607751] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.608569] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.608569] AMD-Vi: Extended features (0x58f77ef22294a5a): PPR NX GT IA PC GA_vAPIC
[    0.608572] AMD-Vi: Interrupt remapping enabled
[    0.890747] AMD-Vi: AMD IOMMUv2 loaded and initialized
</syntaxhighlight>
</syntaxhighlight>
{{collapse/end}}
{{collapse/end}}
Line 153: Line 182:
=== Isolation of the Guest GPU ===
=== Isolation of the Guest GPU ===


In order to isolate the GPU we have two options. Select the devices by PCI bus address or by device ID. Both options have pros and cons. Here we will isolate FIO-pci driver '''by device id'''. This option should only be used, in case the graphic cards (or other devices that will be isolated) in the system are not exactly the same model, otherwise we need to use isolation by PCI bus, because the devices will have an identical IDs.
In order to isolate the GPU we have two options: select the devices by PCI bus address or by device ID. Both options have pros and cons. Here we will bind the devices to the VFIO-pci driver '''by device ID'''. This option should only be used when the graphics cards (or other devices that will be isolated) in the system are not exactly the same model; otherwise we need to use isolation by PCI bus address, because the devices will have identical IDs.
{{collapse/begin}}
{{collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
Line 229: Line 258:
{{Collapse/begin}}
{{Collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
sudo apt install qemu qemu-kvm libvirt-daemon libvirt-clients bridge-utils virt-manager virtinst libosinfo-bin ovmf
sudo apt install qemu-system-x86 libvirt-daemon bridge-utils
sudo apt install libvirt-clients virtinst libosinfo-bin ovmf
sudo apt install virt-manager virt-viewer remmina # For desktop user
</syntaxhighlight>
</syntaxhighlight>
{{Collapse/div|#Explanations {{!}} #Apply}}
{{Collapse/div|#Explanations {{!}} #Apply}}
Short explanations:
The above set of packages is for the latest Debian-based distributions. For slightly older releases, such as Ubuntu 20.04, try the following set.
<syntaxhighlight lang="shell" line="1"  class="mlw-shell-gray" id="quemu-ubuntu-20">
sudo apt install qemu qemu-kvm bridge-utils
sudo apt install libvirt-daemon libvirt-clients virtinst libosinfo-bin ovmf
sudo apt install virt-manager virt-viewer remmina # For desktop user
</syntaxhighlight>
 
Explanation about the packages:
*The <code>qemu</code> package (quick emulator) is an application that allows you to perform hardware virtualization.
*The <code>qemu</code> package (quick emulator) is an application that allows you to perform hardware virtualization.
*The <code>qemu-kvm</code> package is the main KVM package.
*The <code>qemu-kvm</code> package is the main KVM package.
*The <code>libvirt-daemon</code> is the virtualization daemon.
*The <code>libvirt-daemon</code> is the virtualization daemon.
*The <code>bridge-utils</code> package helps you create a bridge connection to allow other users to access a virtual machine other than the host system.
*The <code>bridge-utils</code> package helps you create a bridge connection to allow other users to access a virtual machine other than the host system.
*The <code>virt-manager</code> is an application for managing virtual machines through a graphical user interface.
*The <code>virtinst</code> package contains programs to create and clone virtual machines. It is a set of command-line tools to create virtual machines using <code>libvirt</code>.
*The <code>virtinst</code> package contains programs to create and clone virtual machines. It is a set of command-line tools to create virtual machines using <code>libvirt</code>.
*The <code>libosinfo-bin</code> package contains tools for querying the osinfo database via <code>libosinfo</code>... It includes a database containing device metadata and provides APIs to match/identify optimal devices for deploying an operating system on a hypervisor.
*The <code>libosinfo-bin</code> package contains tools for querying the osinfo database via <code>libosinfo</code>... It includes a database containing device metadata and provides APIs to match/identify optimal devices for deploying an operating system on a hypervisor.
*The <code>ovmf</code> package is UEFI firmware for 64-bit x86 virtual machines. Open Virtual Machine Firmware is a build of EDK II for 64-bit x86 virtual machines. It includes full support for UEFI, including Secure Boot, allowing use of UEFI in place of a traditional BIOS in your VM.
*The <code>ovmf</code> package is UEFI firmware for 64-bit x86 virtual machines. Open Virtual Machine Firmware is a build of EDK II for 64-bit x86 virtual machines. It includes full support for UEFI, including Secure Boot, allowing use of UEFI in place of a traditional BIOS in your VM.
*The <code>virt-manager</code> is an application for managing virtual machines through a graphical user interface.
*The <code>virt-viewer</code> is a [https://www.spice-space.org/index.html SPICE] GUI client.
*The <code>remmina</code> package is a remote session manager that supports the RDP, VNC, SSH and SFTP protocols.
{{Collapse/end}}


Verify whether the KVM module is loaded into the kernel and whether the libvirt daemon will start automatically:
Verify whether the KVM module is loaded into the kernel and whether the libvirt daemon will start automatically:
Line 247: Line 288:
lsmod | grep -i kvm
lsmod | grep -i kvm
</syntaxhighlight>
</syntaxhighlight>
Basic management – enable, start, get the status or sto and disable <code>libvirtd.service</code>:
Basic management – enable, start, get the status or stop and disable <code>libvirtd.service</code>:
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
sudo systemctl status libvirtd.service   # likewise: enable | start | stop | disable
sudo systemctl status libvirtd.service   # likewise: enable | start | stop | disable
</syntaxhighlight>Add your user to the ''libvirt'' and ''kvm'' groups in order to execute related command without <code>sudo</code>:<syntaxhighlight lang="shell" line="1">
</syntaxhighlight>
Add your user to the ''libvirt'' and ''kvm'' groups in order to execute the related commands without <code>sudo</code>:<syntaxhighlight lang="shell" line="1">
sudo usermod -aG libvirt $USER
sudo usermod -aG libvirt $USER
sudo usermod -aG kvm $USER
sudo usermod -aG kvm $USER
grep "$USER" /etc/group
grep "$USER" /etc/group
</syntaxhighlight>
</syntaxhighlight>
{{Collapse/end}}


=== Host NVIDIA Kernel Modules and Drivers ===
=== NVIDIA Kernel Modules and Drivers at the Host Level ===
{{collapse/begin}}
{{collapse/begin}}
''Note: this guide covers installation on a Linux server where the host won't use the guest GPU or any other NVIDIA GPU.''
{{collapse/div|#References {{!}} #Apply}}
Remove previously installed NVIDIA drivers. And analyze the output of <code>lsmod</code>, <code>dmesg</code> and <code>lspci -nnv</code> in order to find which modules are related to the guest GPU and <code>blacklist</code> them by creating a new section at the bottom of the file <code>/etc/modprobe.d/blacklist.conf</code>.
Remove previously installed NVIDIA drivers. And analyze the output of <code>lsmod</code>, <code>dmesg</code> and <code>lspci -nnv</code> in order to find which modules are related to the guest GPU and <code>blacklist</code> them by creating a new section at the bottom of the file <code>/etc/modprobe.d/blacklist.conf</code>.
{{collapse/div|#References {{!}} #Apply}}
 
<s><syntaxhighlight lang="shell" line="1">
<s><syntaxhighlight lang="shell" line="1">
sudo apt remove --purge nvidia-headless-390 && sudo apt autoremove && sudo apt autoclean
sudo apt remove --purge nvidia-headless-390 && sudo apt autoremove && sudo apt autoclean
Line 267: Line 310:
dpkg -l | grep -i nvidia
dpkg -l | grep -i nvidia
</syntaxhighlight>
</syntaxhighlight>
<syntaxhighlight lang="shell" line="1" class="mlw-bind-collapsed-up mlw-continue-padding-top-07em">
<syntaxhighlight lang="shell" line="1" class="mlw-bind-collapsed-up code-continue-padding-top-07em">
sudo apt-get remove --purge '^nvidia-.*'
sudo apt-get remove --purge '^nvidia-.*'
sudo apt autoremove && sudo apt autoclean
sudo apt autoremove && sudo apt autoclean
Line 287: Line 330:
</syntaxhighlight>
</syntaxhighlight>
References:
References:
* [https://manpages.ubuntu.com/manpages/precise/man5/modprobe.conf.5.html Ubuntu Manuals:  modprobe.d, modprobe.confmodprobe.conf]
* [https://manpages.ubuntu.com/manpages/precise/man5/modprobe.conf.5.html Ubuntu Manuals:  modprobe.d, modprobe.conf]
* [https://linuxconfig.org/how-to-blacklist-a-module-on-ubuntu-debian-linux LinuxConfig: How to blacklist a module on Ubuntu/Debian Linux]
* [https://linuxconfig.org/how-to-blacklist-a-module-on-ubuntu-debian-linux LinuxConfig: How to blacklist a module on Ubuntu/Debian Linux]
* [https://askubuntu.com/a/972534/566421 AskUbuntu: Blacklist a Nvidia gpu for qemu/kvm passthrough (blacklist one of two NVIDIA cards)]
* [https://askubuntu.com/a/972534/566421 AskUbuntu: Blacklist a Nvidia gpu for qemu/kvm passthrough (blacklist one of two NVIDIA cards)]
Line 293: Line 336:


=== Deploy fresh OVMF Firmware for VMs ===
=== Deploy fresh OVMF Firmware for VMs ===
{{Collapse/begin}}
''This step is no longer required. The UEFI images from the latest versions of the <code>ovmf</code> package installed above are robust, stable and fast.''
{{Collapse/div|#References {{!}} #CLI {{!}} #Apply}}
<s>The default OVMF images that was installed in the previous section are relatively old. In addition I didn't succeed to setup WSL2 with these UEFI images. Also with the images provided below the VM performance is about 5% faster.</s>


{{Collapse/begin}}
Download and deploy an appropriate package with fresh OVMF images from [https://www.kraxel.org/repos/jenkins/edk2/ kraxel.org/repos/jenkins/edk2/]:
The default OVMF images that was installed in the previous section are relatively old. In addition I didn't succeed to setup WSL2 with these UEFI images. Also with the images provided below the VM performance is about 5% faser.
{{Collapse/div|#References {{!}} #CLI {{!}} #Apply}}Download and deploy an appropriate package with fresh OVMF images from [https://www.kraxel.org/repos/jenkins/edk2/ kraxel.org/repos/jenkins/edk2/]:
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
mkdir ~/Downloads/kvm-ovmf-kraxel/
mkdir ~/Downloads/kvm-ovmf-kraxel/
Line 312: Line 357:
{{Collapse/end}}
{{Collapse/end}}


== Virtual machine Management ==
== Additional Setup, Issues, Solutions and Tips ==
 
* [[Linux Network Bridge (refs)]]
=== Virsh: Command-line Management User Interface ===
{{:{{FULLPAGENAME}}/Virsh|mw-collapsed}}
 
=== Remote GUI Management ===
{{Collapse/begin}}
For remote GUI management you can use the tool <code>virt-manager</code> on the remote machine. If an SSH connection is already set up (via <code>~/.ssh/config</code>), you can use it and won't have to worry about extra open ports in your firewall. If you are a Windows user with <code>WSL2</code> enabled, you can use <code>virt-manager</code> in at least three ways.
{{Collapse/div|#Refs}}
* Setup xRDP in WSL2, watch [https://youtu.be/IL7Jd9rjgrM WSL2 Ubuntu GUI by David Bombal] for more details.
* [https://opticos.github.io/gwsl/tutorials/manual.html Use GWSL] available in [https://www.microsoft.com/en-us/p/gwsl/9nl6kd1h33v3?activetab=pivot:overviewtab Microsoft Store] – this is really convenient way.
* [https://github.com/microsoft/wslg Use WSLg], that will be available in Windows 11.
{{Collapse/end}}
 
== Related articles ==
* [[Virt-manager Setting-up Windows Virtual Machines]]
* [[Virt-manager Setting-up Windows Virtual Machines]]
*
* [[QEMU/KVM and GPU Passthrough Troubleshooting]]
 
* [[QEMU/KVM Guest tools]]
== Issues, Solutions and Tips ==
* [[QEMU/KVM Virtual machine Management]]
{{:{{FULLPAGENAME}}/Windows 10 Guest Message Signaled Interrupts Setup|mlw-collapsed-gallery}}
 
{{Collapse/begin}}
=== Hide the Warnings KVM: vcpu0 ignored rdmsr ===
{{Collapse/div|#References {{!}} #Apply {{!}} #CLI}}
<syntaxhighlight lang="shell" line="1">
sudo nano /etc/modprobe.d/kvm.conf
</syntaxhighlight>
<syntaxhighlight lang="bash">
# Or: GRUB_CMDLINE_LINUX_DEFAULT="... kvm.report_ignored_msrs=0 ..."
options kvm report_ignored_msrs=0
</syntaxhighlight>
<syntaxhighlight lang="shell" line=1>
sudo update-initramfs -u -k all
</syntaxhighlight>
* Note, according to my experience, this option may cause crashes of the host system!?
References:
* [https://forum.proxmox.com/threads/how-to-stop-warnings-kvm-vcpu0-ignored-rdmsr.28552/ Proxmox: How to stop warnings "kvm: vcpu0 ignored rdmsr"]
* [https://forum.proxmox.com/threads/ignore_msrs-for-host-cpu-being-ignored.42416/ ignore_msrs for host CPU being ignored]
{{Collapse/end}}
 
{{Collapse/begin}}
=== KVM Modinfo ===
{{Collapse/div|#Output {{!}} #CLI}}
<syntaxhighlight lang="shell" line="1">
modinfo kvm
</syntaxhighlight>
 
<syntaxhighlight lang="bash">
filename:      /lib/modules/5.4.0-77-generic/kernel/arch/x86/kvm/kvm.ko
license:        GPL
author:        Qumranet
srcversion:    20C68083F39E14AB616D0B8
depends:
retpoline:      Y
intree:        Y
name:          kvm
vermagic:      5.4.0-77-generic SMP mod_unload modversions
sig_id:        PKCS#7
signer:        Build time autogenerated kernel key
sig_key:        67:66:2F:F8:26:8F:56:E9:37:7F:B7:AD:33:FA:97:31:CA:7F:FF:50
sig_hashalgo:  sha512
signature:      6C:0B:65:F9:46:AC:D8:B7:94:E8:B9:9D:A0:4B:97:E6:63:52:5A:FF:
                ...
parm:          nx_huge_pages:bool
parm:          nx_huge_pages_recovery_ratio:uint
parm:          ignore_msrs:bool
parm:          report_ignored_msrs:bool
parm:          min_timer_period_us:uint
parm:          kvmclock_periodic_sync:bool
parm:          tsc_tolerance_ppm:uint
parm:          lapic_timer_advance_ns:int
parm:          vector_hashing:bool
parm:          enable_vmware_backdoor:bool
parm:          force_emulation_prefix:bool
parm:          pi_inject_timer:bint
parm:          halt_poll_ns:uint
parm:          halt_poll_ns_grow:uint
parm:          halt_poll_ns_grow_start:uint
parm:          halt_poll_ns_shrink:uint
</syntaxhighlight>
{{Collapse/end}}
{{:{{FULLPAGENAME}}/Passthrough GPU Tips}}
{{:{{FULLPAGENAME}}/Network_Bridge|mw-collapsed}}
{{:{{FULLPAGENAME}}/ThinkServer TD350 Issues}}


== General References ==
== General References ==
Line 415: Line 383:
{{devStage  
{{devStage  
  | Прндл  = Virtual Machines
  | Прндл  = Virtual Machines
  | Стадий = 3
  | Стадий = 6
  | Фаза  = Разработване
  | Фаза  = Утвърждаване
  | Статус = Разутвърден
  | Статус = Утвърден
  | ИдтПт  = Spas
  | ИдтПт  = Spas
  | РзбПт  = {{REVISIONUSER}}
  | РзбПт  = Spas
  | АвтПт  = Spas
  | АвтПт  = Spas
  | УтвПт  = Spas
  | УтвПт  = {{REVISIONUSER}}
  | ИдтДт  = 2.07.2022
  | ИдтДт  = 2.07.2022
  | РзбДт  = {{Today}}
  | РзбДт  = 21.08.2022
  | АвтДт  = 20.08.2022
  | АвтДт  = 23.08.2022
  | УтвДт  = 20.08.2022
  | УтвДт  = {{Today}}
  | ИдтРв  = [[Special:Permalink/27120|27120]]
  | ИдтРв  = [[Special:Permalink/27120|27120]]
  | РзбРв  = {{REVISIONID}}
  | РзбРв  = [[Special:Permalink/30514|30514]]
  | АвтРв  =  
  | АвтРв  = [[Special:Permalink/30588|30588]]
  | РзАРв  = [[Special:Permalink/30475|30475]]
  | РзАРв  = [[Special:Permalink/30475|30475]]
  | УтвРв  =  
  | УтвРв  = {{REVISIONID}}
  | РзУРв  = [[Special:Permalink/30477|30477]]
  | РзУРв  = [[Special:Permalink/30477|30477]]
}}
}}
</div>
</div>
</noinclude>
</noinclude>

Latest revision as of 07:28, 26 September 2022

This page documents my experience creating a virtual machine capable of running a Windows 10 guest OS (for desktop operations) on my home server, whose operating system is Ubuntu Server. The Windows 10 guest must itself be capable of running virtualization in order to use WSL2 inside it. The passthrough option is not mandatory for my use case, but I decided to try it.

The Host System

The host operating system is Ubuntu Server 20.04 with kernel 5.4. ProxmoxVE 7.2 with kernel 5.15 is also a tested, valid host. The host CPU is an Intel Xeon, but this manual also provides AMD-specific parameters and commands.

Host Hardware

#Output | #CLI | #Hardware details
lspci | grep VGA
#Output
02:00.0 VGA compatible controller: NVIDIA Corporation GF119 [NVS 315] (rev a1)
07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 30)
sudo lshw -class memory | sed -n -e '/bank:0/,/bank:1/p' -e '/bank:2/,/bank:3/p'| sed -e 's/^.*bank.*$//'
#Output
description: DIMM DDR4 Synchronous 2133 MHz (0.5 ns)
product: HMA41GR7AFR4N-TF
vendor: Hynix Semiconductor
physical id: 0
serial: 517692CB
slot: CPU1 DIMM A1
size: 8GiB
width: 64 bits
clock: 2133MHz (0.5ns)

description: DIMM DDR4 Synchronous 2133 MHz (0.5 ns)
product: HMA41GR7AFR4N-TF
vendor: Hynix Semiconductor
physical id: 2
serial: 51769065
slot: CPU1 DIMM B1
size: 8GiB
width: 64 bits
clock: 2133MHz (0.5ns)
lscpu | sed -nr '/Model name/ s/.*:\s*(.*) @ .*/\1/p'
#Output
Intel(R) Xeon(R) CPU E5-2673 v3
sudo lscpu
#Output
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          24
On-line CPU(s) list:             0-23
Thread(s) per core:              2
Core(s) per socket:              12
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           63
Model name:                      Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping:                        2
CPU MHz:                         1197.645
CPU max MHz:                     3100.0000
CPU min MHz:                     1200.0000
BogoMIPS:                        4789.15
Virtualization:                  VT-x
L1d cache:                       384 KiB
L1i cache:                       384 KiB
L2 cache:                        3 MiB
L3 cache:                        30 MiB
NUMA node0 CPU(s):               0-23
Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
Vulnerability L1tf:              Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds:               Mitigation; Clear CPU buffers; SMT vulnerable
Vulnerability Meltdown:          Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constan
                                 t_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtp
                                 r pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti intel_ppin ssb
                                 d ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dther
                                 m ida arat pln pts md_clear flush_l1d

Test the Virtualization Capabilities of the System

Check whether the system supports virtualization. The following command must return at least 1:

grep -Ec '(vmx|svm)' /proc/cpuinfo
#Ref | #Output | #CLI | #Output
24
Another approach is to use the command kvm-ok from the package cpu-checker:
sudo apt install cpu-checker && kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used
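
The flag check above can be wrapped in a small helper for use in scripts. This is only a minimal sketch, not part of cpu-checker or any other package: the function name count_virt_flags is an assumption made for this example. It reads cpuinfo-formatted text from standard input and counts the logical CPUs whose flags line lists vmx (Intel VT-x) or svm (AMD-V).

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this example only).
# Reads cpuinfo-formatted text from standard input and prints the number of
# "flags" lines that contain vmx (Intel VT-x) or svm (AMD-V) as a whole word.
count_virt_flags() {
    grep -E '^flags' | grep -Ewc 'vmx|svm'
}

# Usage on a real host:
#   count_virt_flags < /proc/cpuinfo
```

A result of 1 or more means the CPU advertises hardware virtualization. Restricting the match to lines that start with flags keeps other lines that merely mention vmx from being counted.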

Enable the relevant virtualization settings (VT-x/AMD-V) in the UEFI/BIOS.

Setting-up the PCI Passthrough

This section is primarily based on Mathias Hueber's great manual, so not only the commands but also some of the sentences are copied from there.

Enabling IOMMU

To enable the IOMMU feature, edit the configuration file /etc/default/grub as follows:

sudo nano /etc/default/grub # cat /etc/default/grub | grep 'GRUB_CMDLINE_LINUX_DEFAULT'
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt"   # For Intel CPU (current case)
GRUB_CMDLINE_LINUX_DEFAULT="amd_iommu=on iommu=pt"     # For AMD CPU
#Ref | #Explanations | #Apply

Short explanations:

  • In computing, an input–output memory management unit (IOMMU) is a memory management unit (MMU) that connects a direct-memory-access–capable (DMA-capable) I/O bus to the main memory. Like a traditional MMU, which translates CPU-visible virtual addresses to physical addresses, the IOMMU maps device-visible virtual addresses (also called device addresses or I/O addresses in this context) to physical addresses. Some units also provide memory protection from faulty or malicious devices.
  • To enable single-root input/output virtualization (SR-IOV) in the kernel, configure intel_iommu=on in the grub file. To get the best performance, add iommu=pt (pass-through) to the grub file when using SR-IOV. In pass-through mode, the adapter does not need to use DMA translation to the memory, which improves performance. iommu=pt is mainly needed when hypervisor performance matters.
  • The Open Virtual Machine Firmware (OVMF) is a project to enable UEFI support for virtual machines. Starting with Linux 3.9 and recent versions of QEMU, it is now possible to pass through a graphics card, offering the VM native graphics performance, which is useful for graphics-intensive tasks.
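
The choice between the Intel and AMD parameter sets can be made from the CPU vendor ID. The following is a minimal sketch under stated assumptions: the function name iommu_kernel_params is hypothetical, and the vendor strings are the ones lscpu reports in its "Vendor ID" field (GenuineIntel, as in the output on this page, or AuthenticAMD).

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this example only).
# Prints the IOMMU kernel parameters used in this guide for a given
# CPU vendor ID, as reported in the "Vendor ID" field of lscpu.
iommu_kernel_params() {
    case "$1" in
        GenuineIntel) echo "intel_iommu=on iommu=pt" ;;
        AuthenticAMD) echo "amd_iommu=on iommu=pt" ;;
        *) echo "unsupported CPU vendor: $1" >&2; return 1 ;;
    esac
}

# Usage on a real host:
#   iommu_kernel_params "$(lscpu | awk -F': *' '/^Vendor ID/ {print $2}')"
```

The printed string is what goes into GRUB_CMDLINE_LINUX_DEFAULT; the helper does not edit /etc/default/grub itself.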

References:

Update the boot manager configuration and reboot the system.

sudo update-grub
sudo systemctl reboot

For the systemd-boot boot manager, as used in Pop!_OS.

#Pop!_OS

On operating systems that boot with systemd-boot, the kernelstub tool can be used to provide the kernel boot parameters. Use it like so:

sudo kernelstub -o "amd_iommu=on iommu=pt"

Later, to do the isolation, use (with the correct IDs):

sudo kernelstub --add-options "vfio-pci.ids=10de:1b80,10de:10f0,8086:1533"

References:

After the reboot, verify that the IOMMU is enabled:

sudo dmesg | grep -i 'IOMMU'
#Output
[0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-77-generic root=UUID=09e7c...14 ro intel_iommu=on iommu=pt
[0.068270] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-77-generic root=UUID=09e7c...14 ro intel_iommu=on iommu=pt
[0.068326] DMAR: IOMMU enabled
[0.140127] DMAR-IR: IOAPIC id 1 under DRHD base  0xfbffc000 IOMMU 0
[0.140129] DMAR-IR: IOAPIC id 2 under DRHD base  0xfbffc000 IOMMU 0
[0.480605] iommu: Default domain type: Passthrough (set via kernel command line)
[0.764139] pci 0000:ff:0b.0: Adding to iommu group 0
[0.764165] pci 0000:ff:0b.1: Adding to iommu group 0
[0.764188] pci 0000:ff:0b.2: Adding to iommu group 0
[0.764348] pci 0000:ff:0c.0: Adding to iommu group 1
...
sudo dmesg | grep -i 'vfio'   # For Intel CPU (current case)
#Output
[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-77-generic root=UUID=09e7c8ed-fb55-4a44-8be4-18b1696fc714 ro intel_iommu=on iommu=pt kvm.ignore_msrs=1 irqpoll vfio-pci.ids=10de:107c,10de:0e08 vfio-pci.disable_vga=1
[    0.068558] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-77-generic root=UUID=09e7c8ed-fb55-4a44-8be4-18b1696fc714 ro intel_iommu=on iommu=pt kvm.ignore_msrs=1 irqpoll vfio-pci.ids=10de:107c,10de:0e08 vfio-pci.disable_vga=1
[    0.862286] VFIO - User Level meta-driver version: 0.3
[    0.862594] vfio-pci 0000:02:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[    0.880568] vfio_pci: add [10de:107c[ffffffff:ffffffff]] class 0x000000/00000000
[    0.900583] vfio_pci: add [10de:0e08[ffffffff:ffffffff]] class 0x000000/00000000
sudo dmesg |grep 'AMD-Vi'     # For AMD CPU
#Output
[    0.607751] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.608569] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.608569] AMD-Vi: Extended features (0x58f77ef22294a5a): PPR NX GT IA PC GA_vAPIC
[    0.608572] AMD-Vi: Interrupt remapping enabled
[    0.890747] AMD-Vi: AMD IOMMUv2 loaded and initialized
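
The greps above can be turned into a yes/no check for scripts. This is a minimal sketch: the function name iommu_markers_present is an assumption made for this example, and it simply looks for the same markers seen in the outputs above ("DMAR: IOMMU enabled" on Intel, "AMD-Vi:" lines on AMD). Finding a marker is a hint that the IOMMU is active, not a full proof.

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this example only).
# Reads kernel log text from standard input and succeeds (exit status 0)
# if it contains one of the IOMMU markers shown in the outputs above.
iommu_markers_present() {
    grep -Eq 'DMAR: IOMMU enabled|AMD-Vi: '
}

# Usage on a real host:
#   sudo dmesg | iommu_markers_present && echo "IOMMU markers found"
```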

Identification of the Group Controllers

To generate a tidy list of your grouped devices, create a script like the following.

nano ~/bin/get_iommu_groups.sh && chmod +x ~/bin/get_iommu_groups.sh
#Script
#!/bin/bash
# https://mathiashueber.com/pci-passthrough-ubuntu-2004-virtual-machine/
# change the 9999 if needed
shopt -s nullglob
for d in /sys/kernel/iommu_groups/{0..9999}/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    lspci -nns "${d##*/}"
done

Run the script and filter the output.

get_iommu_groups.sh | grep -iP 'VGA compatible controller|Ethernet controller|SATA controller|USB controller|NVIDIA'
#Output
IOMMU Group 30 00:11.4 SATA controller [0106]: Intel Corporation C610/X99 series chipset sSATA Controller [AHCI mode] [8086:8d62] (rev 05)
IOMMU Group 31 00:14.0 USB controller [0c03]: Intel Corporation C610/X99 series chipset USB xHCI Host Controller [8086:8d31] (rev 05)
IOMMU Group 33 00:1a.0 USB controller [0c03]: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #2 [8086:8d2d] (rev 05)
IOMMU Group 38 00:1d.0 USB controller [0c03]: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #1 [8086:8d26] (rev 05)
IOMMU Group 39 00:1f.2 SATA controller [0106]: Intel Corporation C610/X99 series chipset 6-Port SATA Controller [AHCI mode] [8086:8d02] (rev 05)
IOMMU Group 40 02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF119 [NVS 315] [10de:107c] (rev a1)
IOMMU Group 40 02:00.1 Audio device [0403]: NVIDIA Corporation GF119 HDMI Audio Controller [10de:0e08] (rev a1)
IOMMU Group 42 07:00.0 VGA compatible controller [0300]: ASPEED Technology, Inc. ASPEED Graphics Family [1a03:2000] (rev 30)
IOMMU Group 43 08:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)
IOMMU Group 44 09:00.0 Ethernet controller [0200]: Intel Corporation I210 Gigabit Network Connection [8086:1533] (rev 03)

We will isolate IOMMU Group 40, which contains the devices at PCI bus addresses 02:00.0 [device ID 10de:107c] and 02:00.1 [device ID 10de:0e08].
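
The device IDs of a group can also be collected automatically from the script output. The following is a minimal sketch: the function name vfio_ids_for_group is hypothetical, and it assumes input in exactly the format printed by get_iommu_groups.sh above, where the vendor:device ID is the only bracketed field of the form xxxx:xxxx.

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this example only).
# Reads the output of get_iommu_groups.sh from standard input and prints
# the vendor:device IDs of the given IOMMU group as a comma-separated list.
vfio_ids_for_group() {
    local group="$1"
    grep -E "^IOMMU Group ${group} " \
        | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | paste -sd, -
}

# Usage on a real host:
#   get_iommu_groups.sh | vfio_ids_for_group 40
```

For the listing above, group 40 yields 10de:107c,10de:0e08, which is the value used for the vfio-pci.ids kernel parameter.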

Isolation of the Guest GPU

In order to isolate the GPU we have two options: select the devices by PCI bus address or by device ID. Both options have pros and cons. Here we will bind the devices to the VFIO-pci driver by device ID. This option should only be used when the graphics cards (or other devices that will be isolated) in the system are not exactly the same model; otherwise we need to use isolation by PCI bus address, because the devices will have identical IDs.
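
Whether isolation by device ID is safe can be checked by counting how many devices share an ID. This is a minimal sketch, assuming lspci -nn style lines on standard input; the function name count_devices_with_id is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper (name chosen for this example only).
# Reads "lspci -nn" style lines from standard input and prints how many
# of them carry the given vendor:device ID in square brackets.
count_devices_with_id() {
    grep -cF "[$1]"
}

# Usage on a real host:
#   lspci -nn | count_devices_with_id 10de:107c
```

A count above 1 means that binding by ID would capture every one of those devices. In the group listing on this page, for example, both I210 network controllers share the ID 8086:1533, so isolating only one of them would require selection by PCI bus address.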

sudo nano /etc/default/grub # cat /etc/default/grub | grep 'GRUB_CMDLINE_LINUX_DEFAULT'
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on iommu=pt kvm.ignore_msrs=1 kvm.report_ignored_msrs=0 irqpoll vfio-pci.ids=10de:107c,10de:0e08 vfio-pci.disable_vga=1"
#Ref | #Ex­pla­na­tions | #Ap­ply

Short ex­pla­na­tions:

  • The parameter kvm.ignore_msrs=1 is only necessary for Windows 10 versions higher than 1803 (otherwise the guest ends in a BSOD).
  • The parameter irqpoll is a workaround for an error like irq XX: nobody cared (try booting with the "irqpoll" option)…, possibly followed by a hang and restart of the host. Actually, I think this problem was solved by enabling the MSI (Message Signaled Interrupts) option in the guest OS, as described below.
  • The parameter vfio-pci.disable_vga=1 is an attempted workaround for my system, which hangs during boot while a monitor is connected to the guest GPU. In my particular case it doesn't change anything.

Up­date the boot man­ag­er con­fig­u­ra­tion and re­boot the sys­tem.

sudo update-grub
sudo systemctl reboot
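
Once the host is back up, it is worth confirming that the new parameters actually reached the kernel. A minimal sketch, assuming the running command line is read from /proc/cmdline; the helper name missing_params is hypothetical.

```shell
# Hypothetical helper: print each requested parameter that is NOT present
# as a whole word in the given kernel command line.
# Usage: missing_params "<kernel command line>" param...
missing_params() {
    cmdline=" $1 "          # pad with spaces so every parameter is space-delimited
    shift
    for p in "$@"; do
        case "$cmdline" in
            *" $p "*) ;;                     # found, nothing to report
            *) printf '%s\n' "$p" ;;         # not found
        esac
    done
}
```

Usage: missing_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt 'vfio-pci.ids=10de:107c,10de:0e08'. Empty output means all listed parameters are active.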

After this reboot the isolated GPU will be ignored by the host OS, so you have to use the other GPU for the host. Verify the isolation of the guest GPU by analyzing the output of the following command:

sudo lspci -nnv -s 02:00
  • Note the lines "Kernel driver in use: vfio-pci" in the output.
#Out­put
02:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF119 [NVS 315] [10de:107c] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation GF119 [NVS 315] [10de:102f]
        Physical Slot: 1
        Flags: bus master, fast devsel, latency 0, IRQ 255, NUMA node 0
        Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
        Memory at 23ff0000000 (64-bit, prefetchable) [size=128M]
        Memory at 23ff8000000 (64-bit, prefetchable) [size=32M]
        I/O ports at d000 [size=128]
        Expansion ROM at fb000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

02:00.1 Audio device [0403]: NVIDIA Corporation GF119 HDMI Audio Controller [10de:0e08] (rev a1)
        Subsystem: NVIDIA Corporation GF119 HDMI Audio Controller [10de:102f]
        Physical Slot: 1
        Flags: fast devsel, IRQ 255, NUMA node 0
        Memory at fb080000 (32-bit, non-prefetchable) [disabled] [size=16K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
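
The same check can be reduced to one line per device. This is a sketch only: it assumes lspci -nnv or lspci -nnk output without a PCI domain prefix (slots written as 02:00.0, as above), and drivers_in_use is a hypothetical name.

```shell
# Hypothetical helper: print "<slot> <driver>" for every device that has a
# "Kernel driver in use:" line. Input (stdin): 'lspci -nnv' or 'lspci -nnk' output.
drivers_in_use() {
    awk '
        # A device header starts in column 1 with bus:device.function
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / { slot = $1 }
        # The driver name is the last field of the indented driver line
        /Kernel driver in use:/ { print slot, $NF }
    '
}
```

Usage: sudo lspci -nnv -s 02:00 | drivers_in_use. Both functions of the guest GPU should be reported with vfio-pci.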

Con­grat­u­la­tions, the hard­est part is done!

Set­ting-up the Soft­ware En­vi­ron­ment

There are plenty of manuals on how to install and manage KVM and create VMs, so I will post a few references that I have read and explain just a few things about the process.

In­stall QE­MU, KVM, LIB­VIRT

sudo apt install qemu-system-x86 libvirt-daemon bridge-utils
sudo apt install libvirt-clients virtinst libosinfo-bin ovmf
sudo apt install virt-manager virt-viewer remmina # For desktop user

The above set of packages is for the latest Debian-based distributions. For slightly older ones, like Ubuntu 20.04, try the following set.

sudo apt install qemu qemu-kvm bridge-utils
sudo apt install libvirt-daemon libvirt-clients virtinst libosinfo-bin ovmf
sudo apt install virt-manager virt-viewer remmina # For desktop user

Ex­pla­na­tion about the pack­ages:

  • The qe­mu pack­age (quick em­u­la­tor) is an ap­pli­ca­tion that al­lows you to per­form hard­ware vir­tu­al­iza­tion.
  • The qe­mu-kvm pack­age is the main KVM pack­age.
  • The libvirt-daemon package is the virtualization daemon.
  • The bridge-utils package helps you create a bridge connection, so that a virtual machine can be accessed from systems other than the host.
  • The virtinst pack­age con­tains pro­grams to cre­ate and clone vir­tu­al ma­chines. It is a set of com­mand-line tools to cre­ate vir­tu­al ma­chines us­ing lib­virt.
  • The li­bosin­fo-bin pack­age con­tains tools for query­ing the os­in­fo data­base via li­bosin­fo… It in­cludes a data­base con­tain­ing de­vice meta­da­ta and pro­vides APIs to match/​​​identify op­ti­mal de­vices for de­ploy­ing an op­er­at­ing sys­tem on a hy­per­vi­sor.
  • The ovmf pack­age is UE­FI firmware for 64-bit x86 vir­tu­al ma­chines. Open Vir­tu­al Ma­chine Firmware is a build of EDK II for 64-bit x86 vir­tu­al ma­chines. It in­cludes full sup­port for UE­FI, in­clud­ing Se­cure Boot, al­low­ing use of UE­FI in place of a tra­di­tion­al BIOS in your VM.
  • The virt-manager package is an application for managing virtual machines through a graphical user interface.
  • The virt-viewer package is a SPICE GUI client.
  • The remmina package is a remote session manager that supports the RDP, VNC, SSH and SFTP protocols.

Verify whether the KVM module is loaded into the kernel and whether the libvirt daemon is active:

sudo systemctl is-active libvirtd
lsmod | grep -i kvm
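
The lsmod output can also be interpreted automatically. A minimal sketch, assuming the standard lsmod column layout (module name in the first column); kvm_vendor_module is a hypothetical name.

```shell
# Hypothetical helper: print the vendor-specific KVM module (kvm_intel or
# kvm_amd) found in 'lsmod' output on stdin. The exit status is non-zero
# when neither module is loaded.
kvm_vendor_module() {
    awk '
        $1 == "kvm_intel" || $1 == "kvm_amd" { print $1; found = 1 }
        END { exit !found }
    '
}
```

Usage: lsmod | kvm_vendor_module || echo "KVM is not loaded". On the Intel host described here the expected result is kvm_intel.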

Ba­sic man­age­ment – en­able, start, get the sta­tus or stop and dis­able libvirtd.service:

sudo systemctl (enable|start|status|stop|disable) libvirtd.service

Add your user to the libvirt and kvm groups in order to execute the related commands without sudo:

sudo usermod -aG libvirt $USER
sudo usermod -aG kvm $USER
grep "$USER" /etc/group
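
Group changes made with usermod only apply to sessions started after the change, so the membership is best checked from a fresh login. A small sketch; the helper in_group is hypothetical and works on the space-separated list printed by id -nG.

```shell
# Hypothetical helper: succeed when <group> occurs as a whole word in the
# space-separated <group list>.
# Usage: in_group "<group list>" <group>
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

Usage: for g in libvirt kvm; do in_group "$(id -nG)" "$g" || echo "not yet in group $g"; done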

NVIDIA Ker­nel Mod­ules and Dri­vers at the Host Lev­el

Note that this guide covers installation on a Linux server where the host won't use the guest GPU or any other NVIDIA GPU.

Remove any previously installed NVIDIA drivers. Then analyze the output of lsmod, dmesg and lspci -nnv in order to find which modules are related to the guest GPU, and blacklist them by creating a new section at the bottom of the file /etc/modprobe.d/blacklist.conf.

sudo apt remove --purge nvidia-headless-390 && sudo apt autoremove && sudo apt autoclean
dpkg -l | grep -i nvidia
sudo apt-get remove --purge '^nvidia-.*'
sudo apt autoremove && sudo apt autoclean
sudo nano /etc/modprobe.d/blacklist.conf
# Blacklist NVIDIA Modules:
# 'lsmod', 'dmesg' and 'lspci -nnv'
blacklist nvidiafb
blacklist nouveau
blacklist nvidia_drm
blacklist nvidia
blacklist rivafb
blacklist rivatv
blacklist snd_hda_intel
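
After the next reboot, the blacklist can be compared against the modules that are actually loaded. This is a sketch under assumptions: it reads only the "blacklist <module>" lines of the given file, and loaded_blacklisted is a hypothetical name. On Debian-based systems the initramfs may also need to be regenerated (sudo update-initramfs -u) before the blacklist applies to modules loaded during early boot.

```shell
# Hypothetical helper: print every module that is listed as
# "blacklist <name>" in the given file but still appears in 'lsmod'
# output supplied on stdin.
# Usage: lsmod | loaded_blacklisted /etc/modprobe.d/blacklist.conf
loaded_blacklisted() {
    awk '
        # First file: remember the blacklisted module names
        NR == FNR { if ($1 == "blacklist") bl[$2] = 1; next }
        # Second input (stdin): report loaded modules that are blacklisted
        ($1 in bl) { print $1 }
    ' "$1" -
}
```

Empty output means none of the blacklisted modules is loaded.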

De­ploy fresh OVMF Firmware for VMs

This step is no longer re­quired. The UE­FI im­ages from the lat­est ver­sions of the ovmf pack­age in­stalled above are ro­bust, sta­ble and fast.

The default OVMF images that were installed in the previous section are relatively old. In addition, I didn't succeed in setting up WSL2 with these UEFI images. Also, with the images provided below the VM is about 5% faster.

Download and deploy an appropriate package with fresh OVMF images from kraxel.org/repos/jenkins/edk2/:

mkdir -p ~/Downloads/kvm-ovmf-kraxel/
cd ~/Downloads/kvm-ovmf-kraxel/
wget https://www.kraxel.org/repos/jenkins/edk2/edk2.git-ovmf-x64-0-20210421.18.g15ee7b7689.noarch.rpm
Extract the package and copy the images to their places:
rpm2cpio edk2.git-ovmf-x64-0-20210421.18.g15ee7b7689.noarch.rpm | cpio -idmv
sudo cp -R usr/share/edk2.git /usr/share/
sudo cp usr/share/qemu/firmware/* /usr/share/qemu/firmware/
Cur­rent­ly I'm us­ing OVMF_VARS-with-csm.fd.

Ad­di­tion­al Set­up, Is­sues, So­lu­tions and Tips

Gen­er­al Ref­er­ences