QEMU/KVM GPU Passthrough Setup Windows Guest for RDP

From WikiMLT

All of the following steps are performed on the Windows 11/10 guest's side. The examples below are given for an NVIDIA Tesla K20Xm attached to my home lab server, but they are applicable to other models and brands of video cards. The NVIDIA Tesla K20Xm is headless, and the main idea of this guide is to use it as the default video card within remote desktop sessions.

Required Software

Video 1. Proxmox QEMU/KVM GPU Passthrough Windows 10/11 Guest Demo.

Once the VM has successfully booted into Windows 11/10, the first step is to install an appropriate NVIDIA driver for the passthrough GPU adapter (accelerator). In my case, with the Tesla K20Xm, I'm using the NVIDIA driver 472.98-data-center-tesla-desktop-win10-win11-64bit-dch-international.exe.

You will also need to install GPU-Z. Optionally, you can install MSI Afterburner and some performance test software such as the tools provided by PassMark. After installing the driver and the other packages, reboot the virtual machine.

If this is a new installation, don't forget to add the QEMU/VirtIO guest tools.

Nvidia SMI Settings

Run Command Prompt as Administrator and execute the following command to list the available Nvidia devices.

nvidia-smi -L
GPU 0: Tesla K20Xm (UUID: GPU-f60379d4-45d8-e070-daa4-5fcb178579b0)
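
If you want to pick up the GPU index from a script instead of reading it by eye, the listing above can be parsed. The following is only a minimal sketch in Python; the function name parse_gpu_list and the regular expression are my own illustration, and it assumes the output has exactly the one-line-per-GPU shape shown above.

```python
import re

# Matches lines shaped like the sample above, for example:
#   GPU 0: Tesla K20Xm (UUID: GPU-f60379d4-...)
# Assumption: one GPU per line, in exactly this layout.
GPU_LINE = re.compile(r"^GPU (\d+): (.+?) \(UUID: (GPU-[0-9a-fA-F-]+)\)$")


def parse_gpu_list(text):
    """Return (index, name, uuid) tuples found in `nvidia-smi -L` output."""
    gpus = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            gpus.append((int(match.group(1)), match.group(2), match.group(3)))
    return gpus


sample = "GPU 0: Tesla K20Xm (UUID: GPU-f60379d4-45d8-e070-daa4-5fcb178579b0)"
for index, name, uuid in parse_gpu_list(sample):
    print(index, name, uuid)
```

The index in the first field is the value that is later passed to -g {id}.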

Run the following command to view verbose output about the available Nvidia devices and their general settings.

nvidia-smi
Mon Mar 14 23:39:33 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 472.98       Driver Version: 472.98       CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K20Xm         TCC  | 00000000:01:00.0 Off |                    0 |
| N/A   50C    P8    19W / 235W |      9MiB /  5696MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

In the above output we can see the GPU is in TCC mode, which is the compute mode of the accelerator. In order to use it as a GPU we must switch it to WDDM mode with the following command, where -g {id} indicates the numeric ID of the GPU (GPU 0 from the first command) and -dm 0 selects the WDDM driver model.

nvidia-smi -g 0 -dm 0
Set driver model to WDDM for GPU 00000000:01:00.0.
All done. Reboot required.
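
On a machine with several GPUs it may be handy to check which ones still need the switch. The sketch below is an illustration, not part of the original setup: it assumes you feed it the text printed by nvidia-smi --query-gpu=index,driver_model.current --format=csv,noheader (one "index, model" line per GPU), and the function names are hypothetical.

```python
def gpus_not_in_wddm(query_output):
    """Return the indices of GPUs whose current driver model is not WDDM.

    Assumption: each non-empty line looks like "0, TCC" or "1, WDDM".
    """
    indices = []
    for line in query_output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, model = [field.strip() for field in line.split(",")]
        if model.upper() != "WDDM":
            indices.append(int(index))
    return indices


def switch_commands(indices):
    """Build the same command as above for every GPU that needs it."""
    return [f"nvidia-smi -g {i} -dm 0" for i in indices]


sample = "0, TCC\n1, WDDM\n"
for command in switch_commands(gpus_not_in_wddm(sample)):
    print(command)
```

The script only prints the commands; running them (and the reboot) is left as a manual step.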

Reboot the system and check the mode again; it should now be WDDM. In addition, nvidia-smi will output a table that reports the GUI processes currently using the GPU. Some information, such as the current temperature, will disappear, but you can now read this data via GPU-Z or MSI Afterburner.

In my opinion, this step and a proper Group Policy setup for RDP are enough to use the Tesla as a GPU. However, most manuals also recommend tweaking the registry entries related to the GPU accelerator, so we will go through these steps too in the next sections.

RegEdit Settings

Figure 1. Proxmox QEMU/KVM GPU Passthrough Windows 10/11 Guest's RegEdit Settings.

As I said before, I'm not sure this step is required, because Video 1 was recorded before it was implemented, but most similar manuals recommend deleting the DWORD AdapterType:(1) and creating a new DWORD EnableMsHybrid:(1) within the Windows registry folder of the GPU (Figure 1).

I think these options are useful in a case like this one, where you have an onboard GPU and a headless accelerator such as the Tesla, and you want to render the video output via the accelerator, as is done on laptops. By performing this step, you will be able to choose the GPU to be used with certain application(s). For more details explore "Nvidia Control Panel > Manage 3D Settings" and/or "Windows Settings > System > Display > Graphics".

To perform this step, first run GPU-Z, navigate to the Advanced tab and copy the value of the entry Registry Path (for the GPU in question). Then run regedit, which you can find via the search option in the Start menu. In the address bar of the Registry Editor (located at the top of the window) type Computer\HKEY_LOCAL_MACHINE\, paste the registry path copied from GPU-Z, and press Enter. Here is how the registry path looks in my case.

regedit
Computer\HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Class\{4d36e968-e325-11ce-bfc1-08002be10318}\0017

It could take a while to find the target folder. Once it is open, find the entry AdapterType and remove it via the right mouse button's context menu. Then right-click somewhere in the white space and from the context menu choose New > DWORD (32-bit). Name the new entry EnableMsHybrid and press Enter. Then double-click the new entry, select Hexadecimal, enter 1 as the value and click OK. Finally, reboot the system to apply the changes.
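
The same two changes could presumably be made from an elevated Command Prompt with reg.exe instead of the Registry Editor. I have not used this route for the setup shown in Video 1, so treat the following Python sketch as an assumption: it only builds the reg delete and reg add command lines from the path copied from GPU-Z, and the helper names hklm_key and registry_commands are hypothetical. It assumes GPU-Z reports the path starting at SYSTEM\..., as in my case.

```python
def hklm_key(gpuz_registry_path):
    """Turn the path copied from GPU-Z into a key name that reg.exe accepts."""
    path = gpuz_registry_path.strip().strip("\\")
    # Drop prefixes that may have been pasted along with the path.
    for prefix in ("Computer\\", "HKEY_LOCAL_MACHINE\\", "HKLM\\"):
        if path.upper().startswith(prefix.upper()):
            path = path[len(prefix):]
    return "HKLM\\" + path


def registry_commands(gpuz_registry_path):
    """Build the commands that mirror the manual steps described above."""
    key = hklm_key(gpuz_registry_path)
    return [
        # Remove the AdapterType entry.
        f'reg delete "{key}" /v AdapterType /f',
        # Create the DWORD EnableMsHybrid with value 1.
        f'reg add "{key}" /v EnableMsHybrid /t REG_DWORD /d 1 /f',
    ]


# The registry path from my case; yours will differ.
sample = r"SYSTEM\ControlSet001\Control\Class\{4d36e968-e325-11ce-bfc1-08002be10318}\0017"
for command in registry_commands(sample):
    print(command)
```

Review the printed commands before running them, and export the key first so the change can be undone.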

Group Policy Settings

The video manual Use ANY Headless GPU for Gaming in a Virtual Machine! provides an example of how to use Parsec Streaming Server to get a high-speed video connection for remote gaming.

For my needs, I decided it is enough to use the setup provided in the manual How to Enable GPU rendering on Windows 10 and Windows Server 2019 for Windows Remote Desktop Protocol (RDP). Figure 2 is a synthesis of this approach.

Figure 2. Proxmox QEMU/KVM GPU Passthrough Windows 10/11 Guest's Group Policy Settings.
gpedit.msc
Local Computer Policy
 + Computer Configuration
    + Administrative Templates
       + Windows Components
          + Remote Desktop Services
             + Remote Desktop Session Host
                + Remote Session Environment | at the right side >
"Use the hardware default graphics adapter for all Remote Desktop Services sessions": "ENABLE"
gpupdate /force  # to apply the changes or reboot the system

Nvidia SMI Settings for Overclocking

Listed here are a few additional nvidia-smi commands that may help if you want to overclock the GPU. The main source of these is the article Gaming, on my Tesla, more likely than you think:

  • nvidia-smi -acp 0: this gives you admin Nvidia privilege (deprecated command),
  • nvidia-smi -pm 1, nvidia-smi --auto-boost-permission=UNRESTRICTED -i 0: for some reason you have to input these two commands to make Afterburner stick,
  • nvidia-smi -q -d SUPPORTED_CLOCKS: this will tell you what you can set. First set MSI Afterburner to the desired clocks; it won't apply them until nvidia-smi is used. So if I want 1100 MHz I'll do +225 in Afterburner, then enter the command nvidia-smi -ac 3004,1100. The first number is the RAM speed and the second number is the core clock. If I were to give +121 MHz to the memory and +250 to the core in Afterburner, I would then need to input nvidia-smi -ac 3125,1100.
  • Also, if you are memory overclocking I would disable ECC support; this can also be done through: nvidia-smi -e 0.
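
The arithmetic behind the -ac arguments in the list above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 875 MHz stock core clock is my inference from the quoted "+225 gives 1100 MHz", not a value stated in the article.

```python
def target_clock(stock_mhz, offset_mhz):
    """Clock that results from adding an Afterburner offset to the stock clock."""
    return stock_mhz + offset_mhz


def ac_command(memory_mhz, core_mhz):
    """Build the nvidia-smi call; the memory clock comes first, then the core clock."""
    return f"nvidia-smi -ac {memory_mhz},{core_mhz}"


# Memory: 3004 MHz stock plus the +121 MHz offset from the quoted example.
memory = target_clock(3004, 121)
# Core: 875 MHz is inferred from "+225 gives 1100 MHz" (assumption).
core = target_clock(875, 225)
print(ac_command(memory, core))
```

The pair you end up with presumably still has to be one that the card accepts, which is why the supported clocks are listed first.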

Currently my system does not have enough cooling and power supply capacity to perform overclocking and comprehensive performance tests, but as you can see in Video 1 the card runs at its maximum default core and RAM speed when needed. I'm not sure it is mandatory, but before the test in the mentioned video I executed the following commands.

nvidia-smi -q -d SUPPORTED_CLOCKS
==============NVSMI LOG==============

Timestamp                                 : Tue Mar 15 14:58:39 2022
Driver Version                            : 472.98
CUDA Version                              : 11.4

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Supported Clocks
        Memory                            : 2600 MHz
            Graphics                      : 784 MHz
            Graphics                      : 758 MHz
            Graphics                      : 732 MHz
            Graphics                      : 705 MHz
            Graphics                      : 666 MHz
            Graphics                      : 640 MHz
            Graphics                      : 614 MHz
        Memory                            : 324 MHz
            Graphics                      : 324 MHz
nvidia-smi -ac 2600,784
Applications clocks set to "(MEM 2600, SM 784)" for GPU 00000000:01:00.0
All done.
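
Picking the highest memory/graphics pair out of the supported-clocks listing can also be scripted. The Python sketch below is an illustration with hypothetical function names; it assumes the log keeps the indented "Memory" / "Graphics" layout shown above, where each Graphics line belongs to the Memory line before it.

```python
import re

CLOCK_LINE = re.compile(r"\s*(Memory|Graphics)\s*:\s*(\d+) MHz")


def parse_supported_clocks(log_text):
    """Map each memory clock (MHz) to the graphics clocks (MHz) listed under it."""
    clocks = {}
    current_memory = None
    for line in log_text.splitlines():
        match = CLOCK_LINE.match(line)
        if not match:
            continue
        kind, mhz = match.group(1), int(match.group(2))
        if kind == "Memory":
            current_memory = mhz
            clocks[current_memory] = []
        elif current_memory is not None:
            clocks[current_memory].append(mhz)
    return clocks


def highest_pair(clocks):
    """Return (memory, graphics) for the fastest memory clock and its fastest core clock."""
    memory = max(clocks)
    return memory, max(clocks[memory])


# Shortened copy of the listing above.
sample = """\
    Supported Clocks
        Memory                            : 2600 MHz
            Graphics                      : 784 MHz
            Graphics                      : 758 MHz
        Memory                            : 324 MHz
            Graphics                      : 324 MHz
"""
memory, graphics = highest_pair(parse_supported_clocks(sample))
print(f"nvidia-smi -ac {memory},{graphics}")
```

For the listing above this selects the same 2600,784 pair that I entered by hand.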

References