QEMU/KVM and GPU Passthrough Troubleshooting: Difference between revisions

From WikiMLT
Spas (talk | contribs)
m Text replacement - "mlw-continue" to "code-continue"
 
(26 intermediate revisions by the same user not shown)
Line 1: Line 1:
==Guest GPU Audio Crackling and IRQ xx: Nobody Cared Fix ==
<noinclude>{{ContentArticleHeader/Virtual Machines|toc=off}}{{ContentArticleHeader/Proxmox|toc-limit=1}}</noinclude>
 
==Guest HDMI Audio Crackling and IRQ xx: Nobody Cared ==


=== Windows ===
=== Windows ===
The issue '''Guest GPU HDMI Audio Crackling, Broken or Lost''' is well explained by [https://forums.unraid.net/topic/40593-windows-10-vm-losing-audio/?tab=comments#comment-398133 Jonp at UNRAID Forums]. In short, we must try to '''enable the MSI Message Signaled Interrupts''' option if the device supports it. Here are detailed step-by-step instructions on how to do that.
The issue '''Guest GPU HDMI Audio Crackling, Broken or Lost''' is well explained by [https://forums.unraid.net/topic/40593-windows-10-vm-losing-audio/?tab=comments#comment-398133 Jonp at UNRAID Forums]. In short, we must try to '''enable the MSI - Message Signaled Interrupts''' option if the device supports it. Here are detailed step-by-step instructions on how to do that.


<gallery mode="slideshow" class="mlw-slideshow-center">
<gallery mode="slideshow" class="mlw-slideshow-center">
Line 24: Line 26:


=== Linux ===
=== Linux ===
Linux guests usually enable MSI by themselves. To force use of MSI for GPU audio devices, use the following command and reboot.
Linux guests usually enable MSI by themselves. To force use of '''MSI''' ('''Message Signaled Interrupts''') for GPU audio devices, use the following command and reboot.
 
<syntaxhighlight lang="shell" line="1" class="code-continue">
{{Collapse/begin}}
echo "options snd-hda-intel enable_msi=1" | sudo tee /etc/modprobe.d/snd-hda-intel.conf
</syntaxhighlight>
<syntaxhighlight lang="shell" line="1" class="mlw-shell-gray">
sudo update-initramfs -u -k all
</syntaxhighlight>Use <code>lspci -vv</code> and check for the following line on your device to see if MSI is enabled.{{Collapse/begin}}
<syntaxhighlight lang="shell" line="1">
<syntaxhighlight lang="shell" line="1">
sudo lspci -nnv -s 09:00
sudo lspci -nnv -s 09:00
</syntaxhighlight>
</syntaxhighlight>
{{Collapse/div|#Output}}
{{Collapse/div|#Output}}
<syntaxhighlight lang="yaml" class="mlw-collapsed-first-element mlw-pre-max-height-320" highlight="9,27">
<syntaxhighlight lang="yaml" class="mlw-collapsed-first-element mlw-pre-max-height-320" highlight="10,28">
09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GL [T600] [10de:1fb1] (rev a1) (prog-if 00 [VGA controller])
09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GL [T600] [10de:1fb1] (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation TU117GL [T600] [10de:1488]
Subsystem: NVIDIA Corporation TU117GL [T600] [10de:1488]
Line 65: Line 71:
Kernel modules: snd_hda_intel
Kernel modules: snd_hda_intel
</syntaxhighlight>
</syntaxhighlight>
{{collapse/end}}
{{collapse/end}}If it says <code>Enable+</code>, MSI is working; <code>Enable-</code> means it is supported but disabled; and if the line is missing, MSI is not supported by the PCIe hardware.
 
This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.


=== References ===
=== References ===
* [https://forums.unraid.net/topic/40593-windows-10-vm-losing-audio/?tab=comments#comment-398133 UNRAID Forums: Windows 10 VM Losing Audio]  
* UNRAID Forums: [https://forums.unraid.net/topic/40593-windows-10-vm-losing-audio/?tab=comments#comment-398133 Windows 10 VM Losing Audio]
* [https://forums.unraid.net/topic/44025-kernel-disabling-irq-16/page/2/ UNRAID Forums: Kernel Disabling IRQ #16]
* UNRAID Forums: [https://forums.unraid.net/topic/44025-kernel-disabling-irq-16/page/2/ Kernel Disabling IRQ #16]
* [https://listman.redhat.com/archives/vfio-users/2017-May/msg00046.html RedHat ListMAN: Kernel panic at vfio_intx_handler leads to low performance in guest VM]
* RedHat ListMAN: [https://listman.redhat.com/archives/vfio-users/2017-May/msg00046.html Kernel panic at vfio_intx_handler leads to low performance in guest VM]
* [http://vfio.blogspot.com/2014/08/vfiovga-faq.html Alex Williamson: VFIO tips and tricks (at vfio.blogspot.com): Useful Q&A]
* Alex Williamson: [http://vfio.blogspot.com/2014/08/vfiovga-faq.html VFIO tips and tricks (at vfio.blogspot.com): Useful Q&A]
* [https://pve.proxmox.com/wiki/Pci_passthrough#HDMI_Audio_crackling.2Fbroken Proxmox Wiki: Pci passthrough » HDMI Audio crackling/broken]
* Proxmox Wiki: [[pve:Pci_passthrough#HDMI_Audio_crackling.2Fbroken|'''Pci passthrough » HDMI Audio crackling/broken''']]
 
== Passthrough GPU: Windows and SPICE/VNC ==
Don't use emulated video and graphics devices while you are using GPU passthrough. It leads to performance loss.
 
SPICE may cause trouble when passing through a GPU, as it presents a "virtual" PCI graphics card to the guest and some drivers have problems with that, even when both cards show up. If something fails, it's always worth disabling SPICE and checking again.
 
===References===
* [https://pve.proxmox.com/wiki/Pci_passthrough#SPICE Proxmox Wiki: Pci passthrough]
 
== Passthrough GPU: Windows and ich6/ich9 vs ac97 audio ==

Do not use ich6/ich9 audio devices in the virtual machine. They cause awful stuttering and performance loss. '''Use ac97 audio''' devices instead. '''This information is outdated; the drivers have probably been updated since.'''
 
===References===
*[https://mathiashueber.com/qemu-troubleshooting-errors-gpu-passthrough-vm/ Mathias Hueber: Troubleshooting – Known issues, bugs and common quirks of KVM QEMU VMs with GPU passthrough]
 
== Passthrough GPU: Windows and NVIDIA GPU Error 43 ==
I've experienced this issue with the default OVMF image (<code>OVMF_CODE.ms.fd</code> -> <code>OVMF_CODE.secboot.fd</code>). With the image <code>[[QEMU/KVM_and_GPU_Passthrough_in_Details#Deploy_fresh_OVMF_Firmware_for_VMs|OVMF_VARS-with-csm.fd]]</code>, things worked well. However, here is a general solution. Apply the following changes to the Windows virtual machine's XML file. You can use either <code>virsh edit {your-windows-vm-name}</code> or the XML tab in <code>virt-manager</code>.
 
<syntaxhighlight lang="xml" highlight="9,12,15">
  ...
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  ...
</syntaxhighlight>
 
===References===
* [https://mathiashueber.com/fighting-error-43-nvidia-gpu-virtual-machine/ Mathias Hueber: Fighting error 43 – Nvidia GPU in a Virtual machine]
* [https://www.reddit.com/r/Proxmox/comments/moz34a/nvidia_nvs315_code_43/ Reddit: Nvidia NVS315 code 43]


== Hide the Warnings KVM: vcpu0 ignored rdmsr ==
== Hide the Warnings KVM: vcpu0 ignored rdmsr ==
Line 127: Line 177:
parm:          halt_poll_ns_shrink:uint
parm:          halt_poll_ns_shrink:uint
</syntaxhighlight>
</syntaxhighlight>
== ThinkServer TD350 Issues  ==
The TD350's documentation states that it doesn't support any video card. However, the NVS 315 works with minor issues, such as those shown here.
<gallery mode="slideshow" class="mlw-slideshow-center">
File:NVIDIA NVS 315 at ThinkServer RD350 Issue 1.jpg|{{File:NVIDIA NVS 315 at ThinkServer RD350 Issue 1.jpg}}
File:ThinkServer RD350 and NVIDIA NVS 315 KVM Passthrough Issue 1.jpg|{{File:ThinkServer RD350 and NVIDIA NVS 315 KVM Passthrough Issue 1.jpg}}
</gallery>
<noinclude>
<div id='devStage'>
{{devStage
| Прндл  = Virtual Machines
| Прндл1 = Proxmox
| Стадий = 6
| Фаза  = Утвърждаване
| Статус = Утвърден
| ИдтПт  = Spas
| РзбПт  = Spas
| АвтПт  = Spas
| УтвПт  = {{REVISIONUSER}}
| ИдтДт  = 2.09.2022
| РзбДт  = 5.09.2022
| АвтДт  = 22.09.2022
| УтвДт  = {{Today}}
| ИдтРв  = [[Special:Permalink/31005|31005]]
| РзбРв  = [[Special:Permalink/31150|31150]]
| АвтРв  = [[Special:Permalink/31707|31707]]
| УтвРв  = {{REVISIONID}}
}}
</div>
</noinclude>

Latest revision as of 07:29, 26 September 2022

Guest HDMI Audio Crackling and IRQ xx: Nobody Cared

Windows

The issue Guest GPU HDMI Audio Crackling, Broken or Lost is well explained by Jonp at UNRAID Forums. In short, we must try to enable the MSI (Message Signaled Interrupts) option if the device supports it. Here are detailed step-by-step instructions on how to do that.

I have seen the kernel command-line option irqpoll mentioned as a solution for this issue. However, in my opinion, and according to some other posts listed in the references section, that option is actually a solution for the problem irq 44: nobody cared (try booting with the "irqpoll" option); handlers: vfio_intx_handler; Disabling IRQ #44. That message appeared on my system when booting a VM with GPU passthrough and a display attached to it, and it caused the host system to reboot.

Linux

Linux guests usually enable MSI by themselves. To force the use of MSI (Message Signaled Interrupts) for GPU audio devices, use the following commands and reboot.

echo "options snd-hda-intel enable_msi=1" | sudo tee /etc/modprobe.d/snd-hda-intel.conf
sudo update-initramfs -u -k all

Use lspci -vv and check for the following line on your device to see if MSI is enabled.

sudo lspci -nnv -s 09:00
#Output
09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GL [T600] [10de:1fb1] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation TU117GL [T600] [10de:1488]
	Flags: bus master, fast devsel, latency 0, IRQ 103, IOMMU group 23
	Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 7fd0000000 (64-bit, prefetchable) [size=256M]
	Memory at 7fe0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at f000 [size=128]
	Expansion ROM at fc000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] Secondary PCI Express
	Capabilities: [bb0] Physical Resizable BAR
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia

09:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:1488]
	Flags: bus master, fast devsel, latency 0, IRQ 99, IOMMU group 23
	Memory at fc080000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

If it says Enable+, MSI is working; Enable- means it is supported but disabled; and if the line is missing, MSI is not supported by the PCIe hardware.

This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.
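
The check described above can be scripted. What follows is a minimal sketch rather than part of the original instructions: the function name msi_state is hypothetical, and it assumes the output of lspci -vv is piped in on stdin. Run lspci with sudo, otherwise the capability list may be shown as access denied and the result would wrongly read "unsupported".

```shell
#!/bin/sh
# Classify the MSI state of one PCI function from `lspci -vv` output on stdin.
# Prints "enabled", "disabled", or "unsupported".
msi_state() {
    # Match only the MSI capability line, not MSI-X and not
    # lines such as "Express Endpoint, MSI 00".
    line=$(grep -E 'Capabilities: \[[0-9a-f]+\] MSI: Enable[+-]' | head -n 1)
    case "$line" in
        *'MSI: Enable+'*) echo "enabled" ;;
        *'MSI: Enable-'*) echo "disabled" ;;
        *) echo "unsupported" ;;
    esac
}

# Usage (09:00.1 is the audio function from the example output above;
# adjust the address to your device):
# sudo lspci -vv -s 09:00.1 | msi_state
```

Pass a single function address (for example 09:00.1 rather than 09:00), because the sketch only looks at the first matching line.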

References

Passthrough GPU: Windows and SPICE/VNC

Don't use emulated video and graphics devices while you are using GPU passthrough. It leads to performance loss.

SPICE may cause trouble when passing through a GPU, as it presents a "virtual" PCI graphics card to the guest and some drivers have problems with that, even when both cards show up. If something fails, it's always worth disabling SPICE and checking again.

References

Passthrough GPU: Windows and ich6/ich9 vs ac97 audio

Do not use ich6/ich9 audio devices in the virtual machine. They cause awful stuttering and performance loss. Use ac97 audio devices instead. This information is outdated; the drivers have probably been updated since.

References

Passthrough GPU: Windows and NVIDIA GPU Error 43

I've experienced this issue with the default OVMF image (OVMF_CODE.ms.fd -> OVMF_CODE.secboot.fd). With the image OVMF_VARS-with-csm.fd, things worked well. However, here is a general solution. Apply the following changes to the Windows virtual machine's XML file. You can use either virsh edit {your-windows-vm-name} or the XML tab in virt-manager.

  ...
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  ...
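
After editing, it can help to confirm that the two Error 43 related settings from the XML above (the custom Hyper-V vendor_id and the hidden KVM state) are really present in the saved domain definition. The following is only a sketch under stated assumptions: the function name check_error43_xml and the domain name win10 are placeholders, and the check is a plain text match on the single-quoted form shown above, not a real XML parse.

```shell
#!/bin/sh
# Check libvirt domain XML (read from stdin) for a custom Hyper-V vendor_id
# and the hidden KVM state. Prints what is missing; returns 0 only if both
# elements are found. Note: this is a text match, it does not verify nesting.
check_error43_xml() {
    xml=$(cat)
    status=0
    if ! printf '%s\n' "$xml" | grep -Eq "<vendor_id state='on' value='[^']+'/>"; then
        echo "missing: <vendor_id state='on' .../> inside <hyperv>"
        status=1
    fi
    if ! printf '%s\n' "$xml" | grep -q "<hidden state='on'/>"; then
        echo "missing: <hidden state='on'/> inside <kvm>"
        status=1
    fi
    return "$status"
}

# Usage ("win10" is a placeholder domain name):
# sudo virsh dumpxml win10 | check_error43_xml && echo "both settings present"
```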

References

Hide the Warnings KVM: vcpu0 ignored rdmsr

sudo nano /etc/modprobe.d/kvm.conf
options kvm report_ignored_msrs=0 # Or: GRUB_CMDLINE_LINUX_DEFAULT="... kvm.report_ignored_msrs=0 ..."
sudo update-initramfs -u -k all
  • Note: in my experience, this option may cause crashes of the host system!?
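
To confirm after a reboot that the option took effect, the current value can be read back from sysfs. This is a small sketch, not part of the original steps: the helper name kvm_param is hypothetical, and its optional second argument (an alternative parameters directory) exists only so the function can be tried without a loaded kvm module.

```shell
#!/bin/sh
# Print the current value of a kvm module parameter as exposed in sysfs.
# $1: parameter name
# $2: optional parameters directory (defaults to the standard sysfs path)
kvm_param() {
    dir=${2:-/sys/module/kvm/parameters}
    if [ -r "$dir/$1" ]; then
        cat "$dir/$1"
    else
        echo "unavailable"
    fi
}

# Usage: boolean parameters are reported as Y or N.
# kvm_param report_ignored_msrs
```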

References

KVM Modinfo

modinfo kvm
filename:       /lib/modules/5.4.0-77-generic/kernel/arch/x86/kvm/kvm.ko
license:        GPL
author:         Qumranet
srcversion:     20C68083F39E14AB616D0B8
depends:
retpoline:      Y
intree:         Y
name:           kvm
vermagic:       5.4.0-77-generic SMP mod_unload modversions
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        67:66:2F:F8:26:8F:56:E9:37:7F:B7:AD:33:FA:97:31:CA:7F:FF:50
sig_hashalgo:   sha512
signature:      6C:0B:65:F9:46:AC:D8:B7:94:E8:B9:9D:A0:4B:97:E6:63:52:5A:FF:
                ...
parm:           nx_huge_pages:bool
parm:           nx_huge_pages_recovery_ratio:uint
parm:           ignore_msrs:bool
parm:           report_ignored_msrs:bool
parm:           min_timer_period_us:uint
parm:           kvmclock_periodic_sync:bool
parm:           tsc_tolerance_ppm:uint
parm:           lapic_timer_advance_ns:int
parm:           vector_hashing:bool
parm:           enable_vmware_backdoor:bool
parm:           force_emulation_prefix:bool
parm:           pi_inject_timer:bint
parm:           halt_poll_ns:uint
parm:           halt_poll_ns_grow:uint
parm:           halt_poll_ns_grow_start:uint
parm:           halt_poll_ns_shrink:uint

ThinkServer TD350 Issues

The TD350's documentation states that it doesn't support any video card. However, the NVS 315 works with minor issues, such as those shown here.