EPGStation v2.9.1 nvidia-driver update

Summary

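EPGStation has been updated, and FFmpeg 7.0 has been out for a while, so I am upgrading my home recording environment as well. I had been running the NVIDIA driver 535 series, but a year has passed and the recommended version has changed, so I am updating the driver at the same time.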

Current state

My EPGStation is built from source with the following components.

  • EPGStation
  • Nodejs 18
  • FFmpeg 6.0
    • libvmaf
    • nasm
    • libx264
    • libx265
    • libvpx
    • libfdk-aac
    • libopus
    • libaom
    • libaribb24
    • avisynth+
  • join_logo_scp_trial
    • chapter_exe
    • logoframe
    • join_logo_scp
    • tsdivider
    • libdelogo.so
  • Intel Media SDK
  • NVIDIA Video Codec SDK
  • l-smash
  • l-smash-source
  • nvidia-container-tools

Checking the NVIDIA driver version

Check the installed packages with dpkg -l | grep -e 'cuda*' -e 'libnvidia*' -e 'nvidia*'.

root@dtv-01:~# dpkg -l | grep -e 'cuda*' -e 'libnvidia*' -e 'nvidia*'
ii  cuda-keyring                     1.1-1                  all      GPG keyring for the CUDA repository
ii  libnvidia-cfg1-535:amd64         535.161.08-0ubuntu1    amd64    NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-535             535.161.08-0ubuntu1    all      Shared files used by the NVIDIA libraries
ii  libnvidia-compute-535:amd64      535.161.08-0ubuntu1    amd64    NVIDIA libcompute package
ii  libnvidia-container-tools        1.15.0-1               amd64    NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64       1.15.0-1               amd64    NVIDIA container runtime library
ii  libnvidia-decode-535:amd64       535.161.08-0ubuntu1    amd64    NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-535:amd64       535.161.08-0ubuntu1    amd64    NVENC Video Encoding runtime library
ii  libnvidia-extra-535:amd64        535.161.08-0ubuntu1    amd64    Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-535:amd64         535.161.08-0ubuntu1    amd64    NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-535:amd64           535.161.08-0ubuntu1    amd64    NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-535         535.161.08-0ubuntu1    amd64    NVIDIA compute utilities
ii  nvidia-container-toolkit         1.15.0-1               amd64    NVIDIA Container toolkit
ii  nvidia-container-toolkit-base    1.15.0-1               amd64    NVIDIA Container Toolkit Base
ii  nvidia-dkms-535                  535.161.08-0ubuntu1    amd64    NVIDIA DKMS package
ii  nvidia-driver-535                535.161.08-0ubuntu1    amd64    NVIDIA driver metapackage
ii  nvidia-kernel-common-535         535.161.08-0ubuntu1    amd64    Shared files used with the kernel module
ii  nvidia-kernel-source-535         535.161.08-0ubuntu1    amd64    NVIDIA kernel source package
rc  nvidia-settings                  535.104.12-0ubuntu1    amd64    Tool for configuring the NVIDIA graphics driver
ii  nvidia-utils-535                 535.161.08-0ubuntu1    amd64    NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-535    535.161.08-0ubuntu1    amd64    NVIDIA binary Xorg driver
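To pull just the installed driver branch out of a listing like the one above, a small filter can help. This is a minimal sketch: the function name installed_driver_branches is made up for illustration, and it assumes the usual dpkg -l column layout (state, package name, version, ...).

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the branch
# number (e.g. 535) of every fully installed nvidia-driver-NNN metapackage.
installed_driver_branches() {
  awk '$1 == "ii" && $2 ~ /^nvidia-driver-[0-9]+(:|$)/ {
         name = $2
         sub(/^nvidia-driver-/, "", name)  # drop the package prefix
         sub(/:.*/, "", name)              # drop an optional :arch suffix
         print name
       }' | sort -u
}
```

On a real host it would be used as dpkg -l | installed_driver_branches. More than one line of output would suggest that several driver branches are installed side by side.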

Also re-check the recommended driver version. This time nvidia-driver-555 is marked as recommended, so I will update to it.

root@dtv-01:~# ubuntu-drivers devices
ERROR:root:aplay command not found
== /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001F06sv000010DEsd000013A3bc03sc00i00
vendor   : NVIDIA Corporation
model    : TU106 [GeForce RTX 2060 SUPER]
driver   : nvidia-driver-535-server - distro non-free
driver   : nvidia-driver-535 - third-party non-free
driver   : nvidia-driver-555-open - third-party non-free
driver   : nvidia-driver-515 - third-party non-free
driver   : nvidia-driver-535-open - distro non-free
driver   : nvidia-driver-520 - third-party non-free
driver   : nvidia-driver-545 - third-party non-free
driver   : nvidia-driver-555 - third-party non-free recommended
driver   : nvidia-driver-535-server-open - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-545-open - distro non-free
driver   : nvidia-driver-550 - third-party non-free
driver   : nvidia-driver-550-open - third-party non-free
driver   : nvidia-driver-525 - third-party non-free
driver   : xserver-xorg-video-nouveau - distro free builtin
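The recommended entry is easy to miss in a long list, so it can be picked out mechanically. A minimal sketch, assuming the "driver : NAME - ... recommended" line format shown above; the function name recommended_driver is hypothetical.

```shell
# Hypothetical helper: read `ubuntu-drivers devices` output on stdin and
# print the package name from the line that ends with "recommended".
recommended_driver() {
  awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}
```

On a real host: ubuntu-drivers devices 2>/dev/null | recommended_driver. The stderr redirect only hides messages such as the aplay error seen in the transcript.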

Stop the containers

docker compose stop

Remove the old driver

apt -y remove --purge '^cuda*' '^libnvidia*' '^nvidia*'
apt -y autoremove

After removal, the listing is empty.

root@dtv-01:~# dpkg -l | grep -e 'cuda*' -e 'libnvidia*' -e 'nvidia*'
root@dtv-01:~#

root@dtv-01:~# apt policy | grep nvidia
root@dtv-01:~#
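The earlier listing showed nvidia-settings in state rc, which in dpkg means the package was removed but its configuration files remain. Using --purge is what clears those entries. A minimal sketch for listing such leftovers, with a hypothetical function name and the same dpkg -l column layout assumed:

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# packages in state "rc" (removed, configuration files still present).
residual_config_packages() {
  awk '$1 == "rc" { print $2 }'
}
```

On a real host: dpkg -l | residual_config_packages. No output means nothing is left to purge.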

Reboot here.

Installing the new driver

There are several ways to do this, but the method below was the most stable for me.

  • OS: Ubuntu 22.04
  • cuda version: 12.4.1

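Note that the repository link to register differs depending on the OS.

apt -y install cuda-toolkit-12-4 is needed when CUDA is used on the host machine, but here CUDA lives inside the container, so the host does not need it.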

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
dpkg -i cuda-keyring_1.1-1_all.deb
apt update
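Since the repository path depends on the OS, the ubuntu2204 part can be derived from /etc/os-release instead of being typed by hand. This is only a sketch under assumptions: the function names are hypothetical, the ID plus VERSION_ID pattern is assumed to match NVIDIA's directory naming for your distribution, and the keyring version 1.1-1 and the x86_64 architecture are taken from the command above and may differ elsewhere.

```shell
# Hypothetical helper: build the repository slug (e.g. ubuntu2204) from an
# os-release style file. Defaults to /etc/os-release.
cuda_repo_slug() {
  (
    # Source in a subshell so ID / VERSION_ID do not leak into the caller.
    . "${1:-/etc/os-release}" &&
      printf '%s%s\n' "$ID" "$(printf '%s' "$VERSION_ID" | tr -d '.')"
  )
}

# Hypothetical helper: build the keyring URL for a given slug.
# Assumes keyring version 1.1-1 and x86_64, as in the wget command above.
cuda_keyring_url() {
  printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/cuda-keyring_1.1-1_all.deb\n' "$1"
}
```

Usage would look like wget "$(cuda_keyring_url "$(cuda_repo_slug)")". It is worth opening the resulting URL in a browser once to confirm that the path exists for your OS.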

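If you do not add the repository above, there are other ways to install, such as the one below, but be aware that they give you an older driver.
nvidia-driver-555 should be the recommended driver, but because it is not in the distro repository, nvidia-driver-535 is recommended instead.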

root@dtv-01:~# ubuntu-drivers devices
ERROR:root:aplay command not found
== /sys/devices/pci0000:00/0000:00:1c.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001F06sv000010DEsd000013A3bc03sc00i00
vendor   : NVIDIA Corporation
model    : TU106 [GeForce RTX 2060 SUPER]
driver   : nvidia-driver-545-open - distro non-free
driver   : nvidia-driver-535-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-545 - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-535 - distro non-free recommended
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-535-server-open - distro non-free
driver   : nvidia-driver-535-open - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

Install the latest driver. Installing the cuda-drivers package would also work, but the version has to match the one inside the container, so I decided to specify nvidia-driver-555 explicitly.

apt -y install nvidia-driver-555

After installation, the package state looks like this.

root@dtv-01:~# dpkg -l | grep -e 'cuda*' -e 'libnvidia*' -e 'nvidia*'
ii  cuda-keyring                     1.1-1                 all      GPG keyring for the CUDA repository
ii  libnvidia-cfg1-555:amd64         555.42.02-0ubuntu1    amd64    NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-555             555.42.02-0ubuntu1    all      Shared files used by the NVIDIA libraries
ii  libnvidia-compute-555:amd64      555.42.02-0ubuntu1    amd64    NVIDIA libcompute package
ii  libnvidia-decode-555:amd64       555.42.02-0ubuntu1    amd64    NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-555:amd64       555.42.02-0ubuntu1    amd64    NVENC Video Encoding runtime library
ii  libnvidia-extra-555:amd64        555.42.02-0ubuntu1    amd64    Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-555:amd64         555.42.02-0ubuntu1    amd64    NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-555:amd64           555.42.02-0ubuntu1    amd64    NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-555         555.42.02-0ubuntu1    amd64    NVIDIA compute utilities
ii  nvidia-dkms-555                  555.42.02-0ubuntu1    amd64    NVIDIA DKMS package
ii  nvidia-driver-555                555.42.02-0ubuntu1    amd64    NVIDIA driver metapackage
ii  nvidia-firmware-555-555.42.02    555.42.02-0ubuntu1    amd64    Firmware files used by the kernel module
ii  nvidia-kernel-common-555         555.42.02-0ubuntu1    amd64    Shared files used with the kernel module
ii  nvidia-kernel-source-555         555.42.02-0ubuntu1    amd64    NVIDIA kernel source package
ii  nvidia-utils-555                 555.42.02-0ubuntu1    amd64    NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-555    555.42.02-0ubuntu1    amd64    NVIDIA binary Xorg driver

Installing the NVIDIA Container Toolkit

To use NVIDIA GPUs from Docker, the NVIDIA Container Toolkit must be installed.

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

apt update
apt -y install nvidia-container-toolkit
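The sed step in the pipeline above is the least obvious part: it rewrites each deb line so that apt trusts only the keyring just written by gpg --dearmor. The sketch below isolates that one transformation so it can be tried on a sample line. The function name is hypothetical, and the sample input in the usage note is an assumed example of what the downloaded list contains.

```shell
# Hypothetical helper: apply the same sed rewrite as the install command,
# inserting a signed-by option into every "deb https://" line on stdin.
add_signed_by() {
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
}
```

For example, feeding it a line such as 'deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /' returns the same line with the [signed-by=...] option inserted after deb.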

Package state after installation. Everything is now up to date.

root@dtv-01:~# dpkg -l | grep -e 'cuda*' -e 'libnvidia*' -e 'nvidia*'
ii  cuda-keyring                     1.1-1                 all      GPG keyring for the CUDA repository
ii  libnvidia-cfg1-555:amd64         555.42.02-0ubuntu1    amd64    NVIDIA binary OpenGL/GLX configuration library
ii  libnvidia-common-555             555.42.02-0ubuntu1    all      Shared files used by the NVIDIA libraries
ii  libnvidia-compute-555:amd64      555.42.02-0ubuntu1    amd64    NVIDIA libcompute package
ii  libnvidia-container-tools        1.15.0-1              amd64    NVIDIA container runtime library (command-line tools)
ii  libnvidia-container1:amd64       1.15.0-1              amd64    NVIDIA container runtime library
ii  libnvidia-decode-555:amd64       555.42.02-0ubuntu1    amd64    NVIDIA Video Decoding runtime libraries
ii  libnvidia-encode-555:amd64       555.42.02-0ubuntu1    amd64    NVENC Video Encoding runtime library
ii  libnvidia-extra-555:amd64        555.42.02-0ubuntu1    amd64    Extra libraries for the NVIDIA driver
ii  libnvidia-fbc1-555:amd64         555.42.02-0ubuntu1    amd64    NVIDIA OpenGL-based Framebuffer Capture runtime library
ii  libnvidia-gl-555:amd64           555.42.02-0ubuntu1    amd64    NVIDIA OpenGL/GLX/EGL/GLES GLVND libraries and Vulkan ICD
ii  nvidia-compute-utils-555         555.42.02-0ubuntu1    amd64    NVIDIA compute utilities
ii  nvidia-container-toolkit         1.15.0-1              amd64    NVIDIA Container toolkit
ii  nvidia-container-toolkit-base    1.15.0-1              amd64    NVIDIA Container Toolkit Base
ii  nvidia-dkms-555                  555.42.02-0ubuntu1    amd64    NVIDIA DKMS package
ii  nvidia-driver-555                555.42.02-0ubuntu1    amd64    NVIDIA driver metapackage
ii  nvidia-firmware-555-555.42.02    555.42.02-0ubuntu1    amd64    Firmware files used by the kernel module
ii  nvidia-kernel-common-555         555.42.02-0ubuntu1    amd64    Shared files used with the kernel module
ii  nvidia-kernel-source-555         555.42.02-0ubuntu1    amd64    NVIDIA kernel source package
ii  nvidia-utils-555                 555.42.02-0ubuntu1    amd64    NVIDIA driver support binaries
ii  xserver-xorg-video-nvidia-555    555.42.02-0ubuntu1    amd64    NVIDIA binary Xorg driver

Update the Docker configuration.

nvidia-ctk runtime configure --runtime=docker
systemctl restart docker

Reboot here.
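Before running a GPU container, it can be worth confirming that Docker actually picked up the nvidia runtime. A minimal sketch, assuming docker info prints a "Runtimes:" line that lists the runtime names separated by spaces; the function name is hypothetical.

```shell
# Hypothetical helper: read `docker info` output on stdin and succeed
# (exit status 0) only if "nvidia" appears on the "Runtimes:" line.
has_nvidia_runtime() {
  awk '$1 == "Runtimes:" {
         for (i = 2; i <= NF; i++) if ($i == "nvidia") found = 1
       }
       END { exit !found }'
}
```

On a real host: docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime registered".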

Testing that it works from Docker

If the driver is installed correctly on the host machine, you will be greeted by the nvidia-smi output.

root@dtv-01:~# docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Fri May 24 02:49:11 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.02              Driver Version: 555.42.02      CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2060 ...    Off |   00000000:01:00.0 Off |                  N/A |
| 23%   54C    P8             17W /  175W |       1MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
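To check the result in a script rather than by eye, the driver version can be extracted from the header of the table. A minimal sketch with a hypothetical function name, assuming the "Driver Version: X.Y.Z" header format shown in the output above:

```shell
# Hypothetical helper: read nvidia-smi table output on stdin and print the
# value that follows "Driver Version:" in the header.
smi_driver_version() {
  sed -n 's/.*Driver Version: \([0-9.]*\).*/\1/p' | head -n 1
}
```

On a real host: docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi | smi_driver_version. nvidia-smi also has a query mode, nvidia-smi --query-gpu=driver_version --format=csv,noheader, which avoids parsing the table at all.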