Enabling GPU passthrough on PVE 8.4.1, booting the VM, and fixing the fnOS "GPU1 not enabled" warning


Result (screenshot)

1. Environment

PVE 8.4.1
fnOS 0.9.2

1.1 GPU information from the PVE shell

root@pve:/# lspci -nn | grep -i nvidia
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
03:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)

The vendor:device IDs in the last pair of brackets are needed later:

  1. GPU: 10de:13ba
  2. Audio: 10de:0fbc
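
Rather than copying the two IDs by hand, they can be pulled out of the lspci output. The following is only a minimal sketch: the function name `extract_pci_ids` is hypothetical, and it assumes input in the `lspci -nn` format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of PVE): read `lspci -nn` lines on stdin and
# print the [vendor:device] IDs as one comma-separated list.
extract_pci_ids() {
    # Class codes such as [0300] contain no colon, so only ID pairs match.
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | paste -sd, -
}

# Usage on the host (assumes lspci is installed):
#   lspci -nn | grep -i nvidia | extract_pci_ids
```

For the two lines above this yields `10de:13ba,10de:0fbc`, which is the form the `ids=` option of vfio-pci expects.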

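1.2 Other baseline information

Run the following commands to check the current passthrough-related configuration:

    cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
    cat /etc/modules
    cat /etc/modprobe.d/vfio.conf
    cat /etc/modprobe.d/blacklist.conf
    lspci -nnk | grep -A 15 -i nvidia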
```shell
root@pve:/# cat /etc/default/grub | grep GRUB_CMDLINE_LINUX_DEFAULT
cat /etc/modules
cat /etc/modprobe.d/vfio.conf
cat /etc/modprobe.d/blacklist.conf
lspci -nnk | grep -A 15 -i nvidia
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
# Parameters can be specified after the module name.

cat: /etc/modprobe.d/vfio.conf: No such file or directory
cat: /etc/modprobe.d/blacklist.conf: No such file or directory
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
        Subsystem: NVIDIA Corporation GM107GL [Quadro K2200] [10de:1097]
        Kernel driver in use: nouveau
        Kernel modules: nvidiafb, nouveau
03:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
        Subsystem: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:1097]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
09:00.0 PCI bridge [0604]: Texas Instruments XIO2001 PCI Express-to-PCI Bridge [104c:8240]
ff:0b.0 System peripheral [0880]: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f81] (rev 02)
        Subsystem: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f81]
ff:0b.1 Performance counters [1101]: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f36] (rev 02)
        Subsystem: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f36]
        Kernel driver in use: hswep_uncore
ff:0b.2 Performance counters [1101]: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f37] (rev 02)
        Subsystem: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 R3 QPI Link 0 & 1 Monitoring [8086:2f37]
        Kernel driver in use: hswep_uncore
ff:0c.0 System peripheral [0880]: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Unicast Registers [8086:2fe0] (rev 02)
        Subsystem: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Unicast Registers [8086:2fe0]
ff:0c.1 System peripheral [0880]: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Unicast Registers [8086:2fe1] (rev 02)
        Subsystem: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 Unicast Registers [8086:2fe1]
```

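GPU passthrough has not been configured yet. The current state is:

| Item | Status | Notes |
| --- | --- | --- |
| GRUB_CMDLINE_LINUX_DEFAULT | quiet | ❌ intel_iommu=on iommu=pt not set |
| /etc/modules | comments only | ❌ vfio modules not loaded |
| vfio.conf / blacklist.conf | missing | ❌ no driver bound or blacklisted |
| GPU driver | nouveau | ❌ claimed by a host driver, so the card cannot be passed through |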
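
The four checks above could be scripted so they are easy to repeat after each configuration step. This is a rough sketch under assumptions: the function name `check_passthrough_state` and its output labels are made up for illustration, and it only looks for the Intel IOMMU flag used in this post.

```shell
#!/bin/sh
# Hypothetical pre-flight check. Paths default to the real locations but can
# be overridden, e.g. to try the function against copies of the files.
check_passthrough_state() {
    grub_file=${1:-/etc/default/grub}
    modules_file=${2:-/etc/modules}
    modprobe_dir=${3:-/etc/modprobe.d}

    # IOMMU flag present on the kernel command line template?
    if grep -Eq '^GRUB_CMDLINE_LINUX_DEFAULT=.*intel_iommu=on' "$grub_file" 2>/dev/null; then
        echo "iommu: ok"
    else
        echo "iommu: missing"
    fi

    # vfio_pci (or vfio-pci) listed for loading at boot?
    if grep -Eq '^vfio[_-]pci' "$modules_file" 2>/dev/null; then
        echo "vfio modules: ok"
    else
        echo "vfio modules: missing"
    fi

    # Device IDs handed to vfio-pci?
    if [ -f "$modprobe_dir/vfio.conf" ]; then
        echo "vfio.conf: ok"
    else
        echo "vfio.conf: missing"
    fi

    # nouveau blacklisted in any modprobe.d file?
    if grep -qs '^blacklist nouveau' "$modprobe_dir"/*.conf; then
        echo "nouveau blacklist: ok"
    else
        echo "nouveau blacklist: missing"
    fi
}

# Usage on the host: check_passthrough_state
```

On the unconfigured host described here, all four lines would be expected to read "missing".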

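2. Configure GPU passthrough

Run the following commands in order:


1. Edit the GRUB configuration to enable IOMMU

The host in this setup has an Intel CPU, hence intel_iommu=on.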

sed -i 's/GRUB_CMDLINE_LINUX_DEFAULT="quiet"/GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"/' /etc/default/grub
update-grub
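
Once the host has been rebooted with the new kernel command line, it can be useful to confirm that IOMMU groups actually exist and to see which group the GPU landed in. The sketch below walks `/sys/kernel/iommu_groups`; the function name `list_iommu_groups` is hypothetical, and the root directory is a parameter only so the function can be tried against a test tree.

```shell
#!/bin/sh
# Hypothetical helper: print one "group N: <pci address>" line per device
# found under the IOMMU groups directory in sysfs.
list_iommu_groups() {
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        # Skip the literal pattern when the directory is empty or absent.
        [ -e "$dev" ] || continue
        group=${dev%/devices/*}   # strip "/devices/<address>"
        group=${group##*/}        # keep only the group number
        echo "group $group: ${dev##*/}"
    done
}

# Usage on the host after the reboot:
#   list_iommu_groups | grep '03:00'
```

If the function prints nothing at all, IOMMU is probably not active (check the BIOS VT-d setting and the kernel command line).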

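2. Add the VFIO kernel modules

Note that the command below overwrites /etc/modules, which contained only comments on this host. On kernels 6.2 and newer (which includes the default PVE 8 kernels), vfio_virqfd has been merged into the vfio module, so that line is no longer needed.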

cat <<EOF > /etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
EOF
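
If /etc/modules already contains entries you want to keep, appending is safer than overwriting. A minimal sketch of an append-only variant follows; the function name `ensure_modules` is hypothetical, and it can be run repeatedly without producing duplicate lines.

```shell
#!/bin/sh
# Hypothetical helper: add each module name to the given file only if an
# identical line is not already present.
ensure_modules() {
    file=$1
    shift
    touch "$file"
    for mod in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string
        grep -qxF "$mod" "$file" || echo "$mod" >> "$file"
    done
}

# Usage on the host:
#   ensure_modules /etc/modules vfio vfio_iommu_type1 vfio_pci
```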

3. Bind the NVIDIA GPU (03:00.0 and 03:00.1) to vfio-pci

Substitute the device IDs you obtained from the lspci command in section 1.1.

cat <<EOF > /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:13ba,10de:0fbc
EOF

The device IDs of my card are:

  • GPU: 10de:13ba
  • Audio: 10de:0fbc

4. Disable the host driver (nouveau)

echo "blacklist nouveau" > /etc/modprobe.d/blacklist.conf
echo "options nouveau modeset=0" >> /etc/modprobe.d/blacklist.conf
update-initramfs -u

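5. Edit the VM configuration (add a raw PCI device)

In the VM's Hardware tab in the PVE web UI, add the GPU as a PCI device (raw device) and tick "All Functions" so that the audio function 03:00.1 is passed through together with 03:00.0.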
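
The same change can probably be made from the PVE shell with `qm set`. The sketch below only prints the command instead of running it; the VM ID 100 is a placeholder, and the helper name `print_hostpci_cmd` is hypothetical. According to the Proxmox documentation, giving the PCI address without its function suffix passes all functions, which corresponds to the "All Functions" checkbox.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a `qm set` command that passes every
# function of a PCI device to a VM. Dropping the ".N" function suffix from
# the address is the CLI counterpart of the "All Functions" checkbox.
print_hostpci_cmd() {
    vmid=$1
    bdf=${2%.*}          # 03:00.0 -> 03:00
    echo "qm set $vmid --hostpci0 0000:$bdf"
}

# Example with a placeholder VM ID of 100:
print_hostpci_cmd 100 03:00.0
```

Review the printed command and substitute your own VM ID before running it on the host.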


✅ Finally: reboot the host

reboot

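🔍 Verify that passthrough works

  1. After the host reboots, run the following command to confirm that vfio-pci has taken over the devices:
lspci -nnk | grep -A 15 -i nvidia

You should see:

Kernel driver in use: vfio-pci

Output on my host:
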
root@pve:~# lspci -nnk | grep -A 15 -i nvidia
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
        Subsystem: NVIDIA Corporation GM107GL [Quadro K2200] [10de:1097]
        Kernel driver in use: vfio-pci
        Kernel modules: nvidiafb, nouveau
03:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
        Subsystem: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:1097]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
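
Reading the driver lines by eye works for one card; for a repeatable check, the output can be parsed. The following is a sketch, not an official tool: the function name `all_nvidia_on_vfio` is hypothetical, and it assumes the default `lspci -nnk` layout in which device lines start at column 0 and detail lines are indented.

```shell
#!/bin/sh
# Hypothetical check: read `lspci -nnk` output on stdin and succeed only if
# every NVIDIA device in it reports vfio-pci as the driver in use.
all_nvidia_on_vfio() {
    awk '
        # A device header line, e.g. "03:00.0 VGA compatible controller ..."
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / {
            in_nv = ($0 ~ /NVIDIA/)
            if (in_nv) seen++
        }
        # Driver line belonging to the current NVIDIA device
        in_nv && /Kernel driver in use:/ {
            if ($NF == "vfio-pci") bound++
        }
        END { exit !(seen > 0 && seen == bound) }
    '
}

# Usage on the host:
#   lspci -nnk | all_nvidia_on_vfio && echo "bound to vfio-pci" || echo "not bound"
```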

3. fnOS Media (影视) reports "GPU1 not enabled"

Cause

Running the same lspci command inside the fnOS VM shows:

root@:/vol1/@appcenter/docker-xunlei/app#  lspci -nnk | grep -A 15 -i nvidia
00:10.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K2200] [10de:13ba] (rev a2)
        Subsystem: NVIDIA Corporation GM107GL [Quadro K2200] [10de:1097]
        Kernel driver in use: nouveau
        Kernel modules: nouveau
00:10.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
        Subsystem: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:1097]
        Kernel driver in use: snd_hda_intel
        Kernel modules: snd_hda_intel
00:12.0 Ethernet controller [0200]: Red Hat, Inc. Virtio network device [1af4:1000]
        Subsystem: Red Hat, Inc. Virtio network device [1af4:0001]
        Kernel driver in use: virtio-pci
        Kernel modules: virtio_pci
00:1e.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCI-PCI bridge [1b36:0001]
00:1f.0 PCI bridge [0604]: Red Hat, Inc. QEMU PCI-PCI bridge [1b36:0001]
01:01.0 SCSI storage controller [0100]: Red Hat, Inc. Virtio SCSI [1af4:1004]
        Subsystem: Red Hat, Inc. Virtio SCSI [1af4:0008]
        Kernel driver in use: virtio-pci
        Kernel modules: virtio_pci

The guest has loaded nouveau (the open-source driver) for the GPU, not the official NVIDIA driver.

Solution


1. Install the official NVIDIA driver (inside the fnOS VM)

sudo apt update
sudo apt install nvidia-driver -y

2. Disable the nouveau driver (required by the official driver)

Create the file:

sudo nano /etc/modprobe.d/blacklist-nouveau.conf

Add the following content:

blacklist nouveau
options nouveau modeset=0

Rebuild the initramfs:

sudo update-initramfs -u

Then reboot:

sudo reboot

3. Verify after installing the NVIDIA driver

After the reboot, run:

nvidia-smi
root@fnos:~# nvidia-smi
Wed May 14 16:04:42 2025       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.216.01             Driver Version: 535.216.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro K2200                   On  | 00000000:00:10.0 Off |                  N/A |
| 42%   47C    P8               1W /  39W |      1MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

If the driver version and GPU information are shown, the installation succeeded.
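
For a scripted check, `nvidia-smi` can print just the fields of interest through its `--query-gpu` and `--format` options. The wrapper below is a small sketch; the function name `gpu_driver_summary` is hypothetical, and it reports a missing binary instead of failing with a shell error.

```shell
#!/bin/sh
# Hypothetical wrapper: print "<GPU name>, <driver version>" per GPU, or a
# hint when the NVIDIA driver tools are not installed.
gpu_driver_summary() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found"
        return 1
    fi
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
}

# Usage inside the VM after the reboot:
#   gpu_driver_summary
```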
