QEMU/KVM and GPU Passthrough Troubleshooting

From WikiMLT

Guest HDMI Audio Crackling and IRQ xx: Nobody Cared

Windows

The solution to guest GPU HDMI audio that crackles, breaks up, or drops out is well explained by Jonp at the UNRAID Forums. In short, we must try to enable the MSI (Message Signaled Interrupts) option if the device supports it. Here is a detailed step-by-step guide on how to do that.

The kernel parameter irqpoll was suggested to me as a solution for this issue, but in my opinion, and according to some other posts provided in the references section, this setting actually addresses a different problem: irq 44: nobody cared (try booting with the "irqpoll" option); handlers: vfio_intx_handler; Disabling IRQ #44. That message appears on my system when booting a VM with GPU passthrough and a display attached to the GPU, and it was causing the host system to reboot.

Linux

Linux guests usually enable MSI by themselves. To force the use of MSI (Message Signaled Interrupts) for GPU audio devices, use the following commands and reboot.

echo "options snd-hda-intel enable_msi=1" | sudo tee /etc/modprobe.d/snd-hda-intel.conf
sudo update-initramfs -u -k all

Use lspci -nnv (run as root, so that the capabilities are listed) and check for the MSI line in the Capabilities list of your device to see whether MSI is enabled.

sudo lspci -nnv -s 09:00
# Output
09:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU117GL [T600] [10de:1fb1] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: NVIDIA Corporation TU117GL [T600] [10de:1488]
	Flags: bus master, fast devsel, latency 0, IRQ 103, IOMMU group 23
	Memory at fb000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 7fd0000000 (64-bit, prefetchable) [size=256M]
	Memory at 7fe0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at f000 [size=128]
	Expansion ROM at fc000000 [virtual] [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [250] Latency Tolerance Reporting
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] Secondary PCI Express
	Capabilities: [bb0] Physical Resizable BAR
	Kernel driver in use: nvidia
	Kernel modules: nouveau, nvidia_drm, nvidia

09:00.1 Audio device [0403]: NVIDIA Corporation Device [10de:10fa] (rev a1)
	Subsystem: NVIDIA Corporation Device [10de:1488]
	Flags: bus master, fast devsel, latency 0, IRQ 99, IOMMU group 23
	Memory at fc080000 (32-bit, non-prefetchable) [size=16K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Kernel driver in use: snd_hda_intel
	Kernel modules: snd_hda_intel

If the line says Enable+, MSI is working. Enable- means it is supported but disabled. If the line is missing, MSI is not supported by the PCIe hardware.
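The check above can be sketched as a small shell helper that reads lspci -v style text and prints one state per device. The function name msi_status is my own and not part of any tool. It assumes the output format shown above, and that lspci was run as root: without root the capabilities are hidden and every device would be reported as unsupported.

```shell
# Summarize the MSI state of each PCI function from `lspci -v` style text on stdin.
# Prints "<address> <state>", where state is enabled, disabled or unsupported.
# msi_status is a hypothetical helper name, not an existing command.
msi_status() {
    awk '
        function emit() { if (dev != "") print dev, state }
        # A device header starts at column 0 with the PCI address, e.g. "09:00.1 ".
        /^[0-9a-f:]+\.[0-9a-f] / { emit(); dev = $1; state = "unsupported" }
        # Only the plain MSI capability is matched, not MSI-X.
        /Capabilities: \[[0-9a-f]+\] MSI: Enable\+/ { state = "enabled" }
        /Capabilities: \[[0-9a-f]+\] MSI: Enable-/  { state = "disabled" }
        END { emit() }
    '
}
```

Usage, for the device from the example above: sudo lspci -nnv -s 09:00 | msi_status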

This can potentially also improve performance for other passthrough devices, including GPUs, but that depends on the hardware being used.

References

Passthrough GPU: Windows and SPICE/VNC

Don't use emulated video and graphics devices while you are using GPU passthrough. It leads to performance loss.

SPICE may give trouble when passing through a GPU, as it presents a "virtual" PCI graphics card to the guest and some drivers have problems with that, even when both cards show up. It's always worth a try to disable SPICE and check again if something fails.
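To see which emulated display devices a VM still has, the domain XML can be scanned for its graphics and video elements. This is only a rough sketch: the function name emulated_display_devices is my own, and it assumes the pretty-printed, one-element-per-line XML that virsh dumpxml produces.

```shell
# Print the emulated display devices found in a libvirt domain XML read from stdin.
# emulated_display_devices is a hypothetical helper name, not an existing command.
emulated_display_devices() {
    awk '
        # <graphics type="spice" ...> or <graphics type="vnc" ...>
        /<graphics[ >]/ && match($0, /type=.[a-z0-9]+/) {
            print "graphics: " substr($0, RSTART + 6, RLENGTH - 6)
        }
        # Only look at <model> inside <video>, since network interfaces also have a <model>.
        /<video>/   { in_video = 1 }
        /<\/video>/ { in_video = 0 }
        in_video && /<model / && match($0, /type=.[a-z0-9]+/) {
            print "video: " substr($0, RSTART + 6, RLENGTH - 6)
        }
    '
}
```

Usage: virsh dumpxml {your-windows-vm-name} | emulated_display_devices. A line such as "video: none" means the emulated video device is already disabled.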

Passthrough GPU: Windows and ich6/ich9 vs ac97 audio

Do not use ich6/ich9 audio devices in the virtual machine. They create awful stuttering and performance loss. Use ac97 audio devices instead. This information may be outdated; the drivers have probably been updated since.

Passthrough GPU: Windows and NVIDIA GPU Error 43

I experienced this issue with the default OVMF image (OVMF_CODE.ms.fd -> OVMF_CODE.secboot.fd). With the image OVMF_VARS-with-csm.fd things worked well. However, here is a general solution. Apply the following changes to the Windows virtual machine's XML file. You can use either virsh edit {your-windows-vm-name} or the XML tab in virt-manager.

  ...
  <features>
    <acpi/>
    <apic/>
    <hyperv>
      <relaxed state='on'/>
      <vapic state='on'/>
      <spinlocks state='on' retries='8191'/>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <vmport state='off'/>
    <ioapic driver='kvm'/>
  </features>
  ...
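After editing, it can be useful to confirm that the two hiding-related elements really ended up in the domain definition. The following is a minimal sketch with a function name of my own (check_error43_settings). It only tests for the presence of the hidden and vendor_id elements from the snippet above and says nothing about whether the driver accepts them.

```shell
# Report whether the KVM "hidden" flag and the Hyper-V vendor_id are switched on
# in a libvirt domain XML read from stdin.
# check_error43_settings is a hypothetical helper name, not an existing command.
check_error43_settings() {
    local xml
    xml="$(cat)"
    if printf '%s\n' "$xml" | grep -Eq "<hidden state=['\"]on['\"]"; then
        echo "kvm hidden: on"
    else
        echo "kvm hidden: missing"
    fi
    if printf '%s\n' "$xml" | grep -Eq "<vendor_id state=['\"]on['\"]"; then
        echo "hyperv vendor_id: on"
    else
        echo "hyperv vendor_id: missing"
    fi
}
```

Usage: virsh dumpxml {your-windows-vm-name} | check_error43_settings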

Hide the Warnings KVM: vcpu0 ignored rdmsr

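sudo nano /etc/modprobe.d/kvm.conf
# Add the following line to the file (without the leading "#"):
#   options kvm report_ignored_msrs=0
# Alternatively, set it on the kernel command line instead:
#   GRUB_CMDLINE_LINUX_DEFAULT="... kvm.report_ignored_msrs=0 ..."
sudo update-initramfs -u -k all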
  • Note: in my experience, this option may cause crashes of the host system, although I have not confirmed that.
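Whether the option is active after the reboot can be read back from sysfs. A minimal sketch, assuming the usual /sys/module/kvm/parameters location; the function name kvm_param and its optional second argument are my own, added so the directory can be overridden.

```shell
# Print the current value of a kvm module parameter.
# $1: parameter name
# $2: parameters directory (optional, defaults to the usual sysfs path)
# kvm_param is a hypothetical helper name, not an existing command.
kvm_param() {
    local name="$1"
    local dir="${2:-/sys/module/kvm/parameters}"
    if [ -r "$dir/$name" ]; then
        cat "$dir/$name"
    else
        echo "unavailable"
        return 1
    fi
}
```

Usage: kvm_param report_ignored_msrs. Boolean parameters are shown as Y or N, so N here means the warnings are suppressed.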

KVM Modinfo

modinfo kvm
filename:       /lib/modules/5.4.0-77-generic/kernel/arch/x86/kvm/kvm.ko
license:        GPL
author:         Qumranet
srcversion:     20C68083F39E14AB616D0B8
depends:
retpoline:      Y
intree:         Y
name:           kvm
vermagic:       5.4.0-77-generic SMP mod_unload modversions
sig_id:         PKCS#7
signer:         Build time autogenerated kernel key
sig_key:        67:66:2F:F8:26:8F:56:E9:37:7F:B7:AD:33:FA:97:31:CA:7F:FF:50
sig_hashalgo:   sha512
signature:      6C:0B:65:F9:46:AC:D8:B7:94:E8:B9:9D:A0:4B:97:E6:63:52:5A:FF:
                ...
parm:           nx_huge_pages:bool
parm:           nx_huge_pages_recovery_ratio:uint
parm:           ignore_msrs:bool
parm:           report_ignored_msrs:bool
parm:           min_timer_period_us:uint
parm:           kvmclock_periodic_sync:bool
parm:           tsc_tolerance_ppm:uint
parm:           lapic_timer_advance_ns:int
parm:           vector_hashing:bool
parm:           enable_vmware_backdoor:bool
parm:           force_emulation_prefix:bool
parm:           pi_inject_timer:bint
parm:           halt_poll_ns:uint
parm:           halt_poll_ns_grow:uint
parm:           halt_poll_ns_grow_start:uint
parm:           halt_poll_ns_shrink:uint
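The parameter list is easier to scan when reduced to name and type. A small sketch, assuming parm lines in the plain name:type form shown above (modules whose parameters carry a description text would need a different split); the function name modinfo_params is my own.

```shell
# Print "name type" for every parameter line in `modinfo` output read from stdin.
# Assumes lines of the form "parm:   name:type", as in the kvm output above.
# modinfo_params is a hypothetical helper name, not an existing command.
modinfo_params() {
    awk '$1 == "parm:" { split($2, p, ":"); print p[1], p[2] }'
}
```

Usage: modinfo kvm | modinfo_params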

ThinkServer TD350 Issues

The TD350's documentation states that it doesn't support any video card. However, an NVS 315 works with minor issues, such as those mentioned here.