Repairing a broken ZFS mirror on Proxmox VE 8.3

Summary ¶

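At home I run Proxmox VE as the host, with TrueNAS SCALE and a recording server running on top of it. This consolidates everything into one ridiculously large 4U chassis, which keeps both the management overhead and the power consumption down. Because everything depends on that host, I built the Proxmox VE installation to be as fault tolerant as possible at the OS level.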

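To that end, the OS disk is a ZFS mirror of two SSDs. One of the two became extremely slow, so I investigated the cause. At work this kind of case would make me nauseous, but this is a homelab, so I could relax and rule things out one at a time until the cause came into view. This post records how that went.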

How it started ¶

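The machine had been running for about a week since I built it, and I had rebooted it several times while tuning kernel parameters. During that time dmesg filled up with failed command: WRITE FPDMA QUEUED errors, and the ZFS pool appeared to fall out of sync for a moment, so I started investigating.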

Jan 11 12:39:42 nas-01 kernel: ata7.00: exception Emask 0x10 SAct 0x80000020 SErr 0x400000 action 0x6 frozen
Jan 11 12:39:43 nas-01 kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Jan 11 12:39:43 nas-01 kernel: ata7: SError: { Handshk }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:28:40:7c:a0/01:00:1f:00:00/40 tag 5 ncq dma 131072 out
                                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:f8:40:7d:a0/01:00:1f:00:00/40 tag 31 ncq dma 131072 out
                                        res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7: hard resetting link
Jan 11 12:39:43 nas-01 kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 11 12:39:43 nas-01 kernel: ata7.00: configured for UDMA/133
Jan 11 12:39:43 nas-01 kernel: ata7: EH complete
Jan 11 12:39:43 nas-01 kernel: ata7.00: exception Emask 0x10 SAct 0x8000fe02 SErr 0x400000 action 0x6 frozen
Jan 11 12:39:43 nas-01 kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Jan 11 12:39:43 nas-01 kernel: ata7: SError: { Handshk }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:08:40:85:a0/01:00:1f:00:00/40 tag 1 ncq dma 131072 out
                                        res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:48:40:83:a0/02:00:1f:00:00/40 tag 9 ncq dma 262144 out
                                        res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:50:40:87:a0/01:00:1f:00:00/40 tag 10 ncq dma 131072 out
                                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:58:40:81:a0/01:00:1f:00:00/40 tag 11 ncq dma 131072 out
                                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:60:40:8b:a0/01:00:1f:00:00/40 tag 12 ncq dma 131072 out
                                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:68:40:88:a0/01:00:1f:00:00/40 tag 13 ncq dma 131072 out
                                        res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:70:40:89:a0/01:00:1f:00:00/40 tag 14 ncq dma 131072 out
                                        res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:78:40:8a:a0/01:00:1f:00:00/40 tag 15 ncq dma 131072 out
                                        res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 12:39:43 nas-01 kernel: ata7.00: cmd 61/00:f8:40:86:a0/01:00:1f:00:00/40 tag 31 ncq dma 131072 out
                                        res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 12:39:43 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 12:39:43 nas-01 kernel: ata7: hard resetting link
Jan 11 12:39:43 nas-01 kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 11 12:39:43 nas-01 kernel: ata7.00: configured for UDMA/133
Jan 11 12:39:43 nas-01 kernel: ata7: EH complete
Jan 11 12:40:41 nas-01 kernel:  zd64: p1 p2 p3
Jan 11 12:42:44 nas-01 kernel:  zd64: p1 p2 p3
Jan 11 13:03:33 nas-01 kernel: ata7.00: exception Emask 0x10 SAct 0x80 SErr 0x400000 action 0x6 frozen
Jan 11 13:03:33 nas-01 kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Jan 11 13:03:33 nas-01 kernel: ata7: SError: { Handshk }
Jan 11 13:03:33 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 13:03:33 nas-01 kernel: ata7.00: cmd 61/48:38:e0:23:22/00:00:21:00:00/40 tag 7 ncq dma 36864 out
                                        res 40/00:01:06:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 13:03:33 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 13:03:33 nas-01 kernel: ata7: hard resetting link
Jan 11 13:03:34 nas-01 kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 11 13:03:34 nas-01 kernel: ata7.00: configured for UDMA/133
Jan 11 13:03:34 nas-01 kernel: ata7: EH complete
Jan 11 13:17:48 nas-01 kernel: ata7.00: exception Emask 0x10 SAct 0x40 SErr 0x400000 action 0x6 frozen
Jan 11 13:17:48 nas-01 kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Jan 11 13:17:48 nas-01 kernel: ata7: SError: { Handshk }
Jan 11 13:17:48 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 13:17:48 nas-01 kernel: ata7.00: cmd 61/20:30:10:7d:22/00:00:21:00:00/40 tag 6 ncq dma 16384 out
                                        res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 13:17:48 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 13:17:48 nas-01 kernel: ata7: hard resetting link
Jan 11 13:17:49 nas-01 kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 11 13:17:49 nas-01 kernel: ata7.00: configured for UDMA/133
Jan 11 13:17:49 nas-01 kernel: ata7: EH complete
Jan 11 13:18:04 nas-01 kernel: ata7.00: exception Emask 0x10 SAct 0x8400 SErr 0x400000 action 0x6 frozen
Jan 11 13:18:04 nas-01 kernel: ata7.00: irq_stat 0x08000000, interface fatal error
Jan 11 13:18:04 nas-01 kernel: ata7: SError: { Handshk }
Jan 11 13:18:04 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 13:18:04 nas-01 kernel: ata7.00: cmd 61/18:50:90:7e:22/00:00:21:00:00/40 tag 10 ncq dma 12288 out
                                        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 13:18:04 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 13:18:04 nas-01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Jan 11 13:18:04 nas-01 kernel: ata7.00: cmd 61/30:78:a8:7e:22/00:00:21:00:00/40 tag 15 ncq dma 24576 out
                                        res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
Jan 11 13:18:04 nas-01 kernel: ata7.00: status: { DRDY }
Jan 11 13:18:04 nas-01 kernel: ata7: hard resetting link
Jan 11 13:18:04 nas-01 kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Jan 11 13:18:04 nas-01 kernel: ata7.00: configured for UDMA/133
Jan 11 13:18:04 nas-01 kernel: ata7: EH complete
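
With a log this long it helps to confirm that every failure really comes from one port. The following is a minimal sketch, not something I ran during the incident: it counts failed command lines per ATA device in kernel log text read from stdin. The function name count_failed_ata_commands is made up for illustration, and the field layout is assumed to match the log above.

```shell
# Count "failed command" lines per ATA device in kernel log text on stdin.
# Prints one "device count" pair per line, sorted by device name.
count_failed_ata_commands() {
    awk '/failed command:/ {
        for (i = 1; i <= NF; i++) {
            # The device token looks like "ata7.00:" (or "ata7:").
            if ($i ~ /^ata[0-9]+(\.[0-9]+)?:$/) {
                dev = $i
                sub(/:$/, "", dev)
                count[dev]++
            }
        }
    }
    END {
        for (dev in count) print dev, count[dev]
    }' | sort
}

# Usage on a live system (assumes journalctl is available):
#   journalctl -k --no-pager | count_failed_ata_commands
```

If only one device shows up, as ata7.00 does here, the investigation can focus on that port and whatever is attached to it.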

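The obvious question at this point is which disk ata7 actually is.
After some trial and error, the command below answered it. It brute-forces every /dev/sdX name, since that is how Linux names these disks. It is crude, but it gets the job done.

In this case the disk was TEAM_T253512GB_TPBF240909XXXXXXXXXX.
Its device path contains /devices/pci0000:40/0000:40:08.1/0000:42:00.2/ata7, so the device behind ata7 is /dev/sdd.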

> udevadm info --query=all --name=/dev/sd{a..z} | grep -E '^(P|S|M)'
P: /devices/pci0000:40/0000:40:08.1/0000:42:00.2/ata7/host6/target6:0:0/6:0:0:0/block/sdd
M: sdd
S: disk/by-id/ata-TEAM_T253512GB_TPBF240909XXXXXXXXXX
S: disk/by-diskseq/12
S: disk/by-path/pci-0000:42:00.2-ata-6
S: disk/by-path/pci-0000:42:00.2-ata-6.0
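
The same mapping can probably be read straight from sysfs without udevadm. The sketch below assumes that each /sys/block/sdX symlink resolves to a path containing the ataN component, as the P: line above does. The function names ata_port_of_path and list_ata_ports are hypothetical helpers, not existing tools.

```shell
# Extract the ataN component from a sysfs device path such as the "P:" line
# printed by udevadm. Prints nothing if the path has no ataN component.
ata_port_of_path() {
    printf '%s\n' "$1" | grep -oE '/ata[0-9]+/' | head -n 1 | tr -d '/'
}

# List every sdX disk together with its ATA port by resolving the
# /sys/block symlinks. Disks that are not behind libata print "none".
list_ata_ports() {
    local dev path port
    for dev in /sys/block/sd*; do
        [ -e "$dev" ] || continue
        path=$(readlink -f "$dev")
        port=$(ata_port_of_path "$path")
        printf '%s %s\n' "${port:-none}" "/dev/${dev##*/}"
    done
}

# Usage: list_ata_ports | grep '^ata7 '
```

Note that the path above pairs ata7 with host6, so the first number of the HCTL column printed by lsblk is one lower than the ata number on this machine.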

Checking the state ¶

The investigation above identified /dev/sdd as the suspect drive, so the next step is to look at it in detail.
First, use smartctl to confirm that /dev/sdd has the suspect serial number.

> smartctl -i /dev/sdd
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.12-5-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     TEAM T253512GB
Serial Number:    TPBF240909XXXXXXXXXX
LU WWN Device Id: 0 000000 000000000
Firmware Version: HP3414B5
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        Not in smartctl database 7.3/5319
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan 11 13:45:10 2025 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

The S/N is TPBF240909XXXXXXXXXX, which matches. The drive reports SATA 3.2 with the link negotiated at 6.0 Gb/s, so it really is connected at SATA III speed. That rules out a downgraded SATA link as the reason for the slowness.
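
When several drives need the same check, the link speed comparison can be scripted. This is a small sketch that assumes the SATA Version line has exactly the format shown in the smartctl output above. The function name sata_link_speeds is made up for this example.

```shell
# Read `smartctl -i` output on stdin and print "MAX CURRENT" link speeds
# in Gb/s, taken from the "SATA Version is:" line. Prints nothing if the
# line is missing or formatted differently.
sata_link_speeds() {
    sed -nE 's/^SATA Version is:.*, ([0-9.]+) Gb\/s \(current: ([0-9.]+) Gb\/s\).*$/\1 \2/p'
}

# Usage on a live system:
#   smartctl -i /dev/sdd | sata_link_speeds
```

If the two numbers differ, the link negotiated below the speed the drive supports, which would point at cabling or the port rather than the flash itself.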

The motherboard is a Supermicro H11SSL-i. According to the manual it has SATA 0-7, SATA 8-11, and SATA 12-15, and the block diagram shows that these ports are wired directly to the CPU.

[Figure: motherboard SATA port layout]
[Figure: motherboard block diagram]

As a result, the ATA numbers run from 0 to 16. The drives are connected in two ways: some through an SFF8087 to Reverse SATA cable and the rest through SFF8087 to MiniSAS. That means I also had to confirm that a cable was not at fault.

SATA Port	HCTL	Cable
0-3	1-4	SFF8087 to Reverse SATA
4-7	5-8	SFF8087 to MiniSAS
8-11	9-12	NOT USED
12-15	13-16	SFF8087 to Reverse SATA

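Both mirror members are the same model, so I compared sdd and sdf to see whether their speeds differed. sdd was far slower at 48.6 MiB/s, while sdf was reasonable at 384 MiB/s.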

> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                SERIAL
sdd      6:0:0:0    TEAM T253512GB       TPBF240909XXXXXXXXXX
sdf      14:0:0:0   TEAM T253512GB       TPBF2410210XXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdd --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --time_based=1

Run status group 0 (all jobs):
   READ: bw=48.6MiB/s (51.0MB/s), 48.6MiB/s-48.6MiB/s (51.0MB/s-51.0MB/s), io=2918MiB (3060MB), run=60021-60021msec

Disk stats (read/write):
  sdd: ios=93169/847, merge=0/26, ticks=1914572/17850, in_queue=1933389, util=99.87%

> fio --name=test --readonly --rw=randread --filename /dev/sdf --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --time_based=1

Run status group 0 (all jobs):
   READ: bw=384MiB/s (403MB/s), 384MiB/s-384MiB/s (403MB/s-403MB/s), io=22.5GiB (24.2GB), run=60003-60003msec

Disk stats (read/write):
  sdf: ios=735683/989, merge=0/19, ticks=1909824/3140, in_queue=1913104, util=99.87%
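
To put a number on the gap, the bandwidth figures can be pulled out of the fio output and divided. This is a sketch under the assumption that fio prints the READ line in MiB/s, as it did in both runs above. The helper names fio_read_bw_mib and bw_ratio are hypothetical.

```shell
# Read fio's human-readable output on stdin and print the READ bandwidth
# in MiB/s. Assumption: the value is reported in MiB/s, as in the runs
# above; KiB/s or GiB/s lines are not handled.
fio_read_bw_mib() {
    sed -nE 's/^ *READ: bw=([0-9.]+)MiB\/s.*$/\1/p'
}

# Print how many times larger the second bandwidth is than the first.
# Returns 1 without printing if the first value is zero.
bw_ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN {
        if (slow + 0 == 0) exit 1
        printf "%.1f\n", fast / slow
    }'
}

# Usage: bw_ratio 48.6 384    (sdd versus sdf from the runs above)
```

With the values measured here, sdf comes out roughly eight times faster than sdd for the same random read workload.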

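To rule out the motherboard and the SATA cable, I swapped the two drives and tested again. ATA 6 turned out to be fine, and only TPBF240909XXXXXXXXXX behaved badly: the slow result followed the drive to its new port.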

> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                SERIAL
sdd      6:0:0:0    TEAM T253512GB       TPBF2410210XXXXXXXXX
sdf      14:0:0:0   TEAM T253512GB       TPBF240909XXXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdd --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --ti

Run status group 0 (all jobs):
   READ: bw=385MiB/s (404MB/s), 385MiB/s-385MiB/s (404MB/s-404MB/s), io=22.5GiB (24.2GB), run=60003-60003msec

> fio --name=test --readonly --rw=randread --filename /dev/sdf --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --time_based=1

Run status group 0 (all jobs):
   READ: bw=48.6MiB/s (50.9MB/s), 48.6MiB/s-48.6MiB/s (50.9MB/s-50.9MB/s), io=2915MiB (3056MB), run=60019-60019msec

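ATA 13-16 were connected through SFF8087 to Mini SAS, so I checked that cable as well. I tested each of the four positions with a spare Samsung SSD, and none of them showed a problem.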

# => 4 3 2 [1]
> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                   SERIAL
sdh      16:0:0:0   Samsung SSD 870 EVO 1TB S74XXXXXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdh --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --ti
Run status group 0 (all jobs):
   READ: bw=409MiB/s (429MB/s), 409MiB/s-409MiB/s (429MB/s-429MB/s), io=24.0GiB (25.7GB), run=60003-60003msec


# => 4 3 [2] 1
> lsblk -o NAME,HCTL,MODEL,SERIAL
sdg      15:0:0:0   Samsung SSD 870 EVO 1TB S74XXXXXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdg --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --ti
Run status group 0 (all jobs):
   READ: bw=409MiB/s (428MB/s), 409MiB/s-409MiB/s (428MB/s-428MB/s), io=23.9GiB (25.7GB), run=60003-60003msec

# => 4 [3] 2 1
> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                   SERIAL
sdf      14:0:0:0   Samsung SSD 870 EVO 1TB S74XXXXXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdf --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --ti

Run status group 0 (all jobs):
   READ: bw=412MiB/s (432MB/s), 412MiB/s-412MiB/s (432MB/s-432MB/s), io=24.1GiB (25.9GB), run=60003-60003msec

# => [4] 3 2 1
> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                   SERIAL
sdg      13:0:0:0   Samsung SSD 870 EVO 1TB S74XXXXXXXXXXXX

> fio --name=test --readonly --rw=randread --filename /dev/sdg --bs=32k \
  --ioengine=libaio --iodepth=32 --direct=1 --runtime=1m --time_based=1
Run status group 0 (all jobs):
   READ: bw=411MiB/s (431MB/s), 411MiB/s-411MiB/s (431MB/s-431MB/s), io=24.1GiB (25.9GB), run=60003-60003msec

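Given these results, and because the Erase_Fail_Count_Chip and Wear_Leveling_Count attributes reported by smartctl kept climbing, I contacted the manufacturer. In the meantime I restored the mirror with a spare of the same model that I had kept as a maintenance part.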

# => 4 3 [2] 1
> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                   SERIAL
sdg      15:0:0:0   TEAM T253512GB          TPBF240909XXXXXXXXXX

> date
Sat Jan 11 06:20:30 PM JST 2025

> smartctl -a /dev/sdf
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.8.12-5-pve] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     TEAM T253512GB
Serial Number:    TPBF240909XXXXXXXXXX
LU WWN Device Id: 0 000000 000000000
Firmware Version: HP3414B5
User Capacity:    512,110,190,592 bytes [512 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        Not in smartctl database 7.3/5319
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jan 11 18:23:29 2025 JST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x5d) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Abort Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   8) minutes.
Extended self-test routine
recommended polling time:        (  16) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       314
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       27
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
161 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       124
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       14
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       23
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       1
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       6
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       0
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       100
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       34265
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       433299
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       26
194 Temperature_Celsius     0x0032   100   100   050    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       19
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       46
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       100
241 Total_LBAs_Written      0x0032   100   100   050    Old_age   Always       -       28403
242 Total_LBAs_Read         0x0032   100   100   050    Old_age   Always       -       2212

SMART Error Log Version: 1
ATA Error Count: 2
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 97 hours (4 days + 1 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 40 a0 48 56 26 00   at LBA = 0x00265648 = 2512456

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 00 a0 48 56 26 40 08      04:45:26.120  WRITE FPDMA QUEUED
  61 00 78 48 55 26 40 08      04:45:26.120  WRITE FPDMA QUEUED
  61 00 48 48 54 26 40 08      04:45:26.120  WRITE FPDMA QUEUED
  61 00 c0 48 53 26 40 08      04:45:26.120  WRITE FPDMA QUEUED
  61 00 98 48 52 26 40 08      04:45:26.120  WRITE FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
  When the command that caused the error occurred, the device was in an unknown state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 40 50 40 9c 01 00   at LBA = 0x00019c40 = 105536

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 40 50 40 9c 01 40 08      00:03:15.780  WRITE FPDMA QUEUED
  61 40 48 00 97 01 40 08      00:03:15.780  WRITE FPDMA QUEUED
  61 40 40 c0 91 01 40 08      00:03:15.770  WRITE FPDMA QUEUED
  61 40 38 80 8c 01 40 08      00:03:15.770  WRITE FPDMA QUEUED
  61 40 30 40 87 01 40 08      00:03:15.770  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Offline             Self-test routine in progress 80%       312         -
# 2  Offline             Self-test routine in progress 80%       312         -
# 3  Offline             Self-test routine in progress 80%       312         -
# 4  Offline             Self-test routine in progress 80%       312         -
# 5  Offline             Self-test routine in progress 80%       312         -
# 6  Offline             Self-test routine in progress 80%       312         -
# 7  Offline             Self-test routine in progress 80%       312         -
# 8  Offline             Self-test routine in progress 80%       312         -
# 9  Offline             Self-test routine in progress 80%       312         -
#10  Offline             Self-test routine in progress 80%       312         -
#11  Offline             Self-test routine in progress 80%       312         -
#12  Offline             Self-test routine in progress 80%       312         -
#13  Offline             Self-test routine in progress 80%       312         -
#14  Offline             Self-test routine in progress 80%       312         -
#15  Offline             Self-test routine in progress 80%       312         -
#16  Offline             Self-test routine in progress 80%       312         -
#17  Offline             Self-test routine in progress 80%       312         -
#18  Offline             Self-test routine in progress 80%       312         -
#19  Offline             Self-test routine in progress 80%       312         -
#20  Offline             Self-test routine in progress 80%       312         -
#21  Offline             Self-test routine in progress 80%       312         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
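
Since the conclusion rests on a few attributes rising over time, it is convenient to extract just those raw values and compare them between runs. The sketch below assumes the ten-column attribute table layout shown above. The function name smart_raw_values is made up, and the choice of IDs 176, 177 and 199 in the usage line is only an example.

```shell
# Read `smartctl -A` (or -a) output on stdin and print "ID NAME RAW_VALUE"
# for each attribute whose ID appears in the comma-separated list "$1".
smart_raw_values() {
    awk -v ids="$1" '
        BEGIN {
            n = split(ids, want, ",")
            for (i = 1; i <= n; i++) keep[want[i]] = 1
        }
        # Attribute rows have the flag (0x....) in column 3 and the raw
        # value in column 10; other sections of the report are skipped.
        ($1 in keep) && $3 ~ /^0x/ && NF >= 10 { print $1, $2, $10 }
    '
}

# Usage on a live system, repeated some hours apart:
#   smartctl -A /dev/sdf | smart_raw_values 176,177,199
```

For the drive above this would print 34265 for Erase_Fail_Count_Chip and 433299 for Wear_Leveling_Count. Saving the output with a timestamp makes it easy to show the manufacturer how quickly the values move.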

Removing the disk from ZFS ¶

Proxmox VE was installed onto a ZFS mirror, so the first step is to take the failed disk, TPBF240909XXXXXXXXXX, out of the pool.

> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                SERIAL
sdd      6:0:0:0    TEAM T253512GB       TPBF2410210XXXXXXXXX
sdf      14:0:0:0   TEAM T253512GB       TPBF240909XXXXXXXXXX

> zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool   472G   124G   348G        -         -     7%    26%  1.00x    ONLINE  -

The pool is named rpool, so next check the details of that pool.

> zpool status rpool
  pool: rpool
 state: ONLINE
  scan: resilvered 17.9M in 00:00:00 with 0 errors on Sat Jan 11 17:39:05 2025
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-TEAM_T253512GB_TPBF240909XXXXXXXXXX-part3  ONLINE       0     0     0
            ata-TEAM_T253512GB_TPBF2410210XXXXXXXXX-part3  ONLINE       0     0     0

errors: No known data errors

With the by-id name confirmed, take the device offline.

> zpool offline rpool ata-TEAM_T253512GB_TPBF240909XXXXXXXXXX-part3

Checking the status again shows the following.

> zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
rpool   472G   124G   348G        -         -     7%    26%  1.00x  DEGRADED  -

> zpool status rpool
  pool: rpool
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 17.9M in 00:00:00 with 0 errors on Sat Jan 11 17:39:05 2025
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              DEGRADED     0     0     0
          mirror-0                                         DEGRADED     0     0     0
            ata-TEAM_T253512GB_TPBF2409090090201032-part3  OFFLINE      0     0     0
            ata-TEAM_T253512GB_TPBF2410210030300358-part3  ONLINE       0     0     0
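
Before pulling a tray it is worth confirming that only the intended device is out of service. The following sketch lists every entry in the config table whose state is not ONLINE. It assumes the five-column layout shown above, and the function name zpool_not_online is hypothetical.

```shell
# Read `zpool status` output on stdin and print "NAME STATE" for every
# pool, vdev or device in the config table that is not ONLINE.
zpool_not_online() {
    awk '
        # Config rows have five columns: NAME STATE READ WRITE CKSUM.
        NF >= 5 && $2 ~ /^(DEGRADED|OFFLINE|FAULTED|UNAVAIL|REMOVED)$/ {
            print $1, $2
        }
    '
}

# Usage on a live system:
#   zpool status rpool | zpool_not_online
```

After the offline command above, the expected result is the pool and mirror-0 as DEGRADED plus exactly one OFFLINE device. Any additional line would be a reason to stop before touching hardware.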

Adding the new disk to ZFS ¶

Pull the failed disk from its tray, swap in the replacement, and then write the partition table to the new disk.

> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                SERIAL
sdd      6:0:0:0    TEAM T253512GB       TPBF2410210030300358
sdh      14:0:0:0   TEAM T253512GB       TPBF2409090090200500

Here the partition table is copied from /dev/sdd to /dev/sdh, so the command is as follows.

> sgdisk /dev/sdd -R /dev/sdh
The operation has completed successfully.

Next, randomize the GUIDs on the new disk so that they are unique.

> sgdisk -G /dev/sdh
The operation has completed successfully.
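
The argument order of sgdisk -R is easy to get backwards: the positional device is the source, and the device given to -R is the target that gets overwritten. As a guard, the sketch below only prints the two commands for review and refuses identical arguments. The function name plan_partition_clone is made up for illustration, and nothing is written to disk by it.

```shell
# Print (do not run) the sgdisk commands that copy the partition table
# from SOURCE to TARGET and then randomize the GUIDs on TARGET.
plan_partition_clone() {
    local src="$1" dst="$2"
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: plan_partition_clone SOURCE TARGET (must differ)" >&2
        return 1
    fi
    # -R replicates the table of the main device onto the given device.
    printf 'sgdisk %s -R %s\n' "$src" "$dst"
    # -G gives the copy its own disk and partition GUIDs.
    printf 'sgdisk -G %s\n' "$dst"
}

# Usage: plan_partition_clone /dev/sdd /dev/sdh
```

Reading the printed lines against the lsblk serial numbers before pasting them is a cheap way to avoid overwriting the healthy half of the mirror.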

Note

Reboot at this point so that the new GUIDs are recognized.

After the reboot, check the by-id name of the new disk and the state of the ZFS pool.

> ls /dev/disk/by-id -ahl
total 0
drwxr-xr-x 2 root root 560 Jan 12 11:13 .
drwxr-xr-x 9 root root 180 Jan 12 11:13 ..
lrwxrwxrwx 1 root root   9 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY -> ../../sdf
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY-part1 -> ../../sdf1
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY-part2 -> ../../sdf2
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY-part3 -> ../../sdf3

lrwxrwxrwx 1 root root   9 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX -> ../../sdd
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part1 -> ../../sdd1
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part2 -> ../../sdd2
lrwxrwxrwx 1 root root  10 Jan 12 11:13 ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part3 -> ../../sdd3

> zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 17.9M in 00:00:00 with 0 errors on Sat Jan 11 17:39:05 2025
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              DEGRADED     0     0     0
          mirror-0                                         DEGRADED     0     0     0
            ata-TEAM_T253512GB_TPBF2409XXXXXXXXXXXX-part3  OFFLINE      0     0     0
            ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part3  ONLINE       0     0     0

The device to replace is ata-TEAM_T253512GB_TPBF2409XXXXXXXXXXXX-part3, the pool is rpool, and the replacement disk is ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY. Putting that together gives the command below.

> zpool replace -f rpool ata-TEAM_T253512GB_TPBF2409XXXXXXXXXXXX-part3 \
                         ata-TEAM_T253512GB_TPBF2409YYYYYYYYYYYY-part3

When the command succeeds, zpool status shows the resilver in progress.

> zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sun Jan 12 11:22:32 2025
        124G / 124G scanned, 12.4G / 124G issued at 396M/s
        12.5G resilvered, 9.98% done, 00:04:48 to go
config:

        NAME                                                 STATE     READ WRITE CKSUM
        rpool                                                DEGRADED     0     0     0
          mirror-0                                           DEGRADED     0     0     0
            replacing-0                                      DEGRADED     0     0     0
              ata-TEAM_T253512GB_TPBF240909XXXXXXXXXX-part3  OFFLINE      0     0     0
              ata-TEAM_T253512GB_TPBF240909YYYYYYYYYY-part3  ONLINE       0     0     0  (resilvering)
            ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part3    ONLINE       0     0     0

errors: No known data errors
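
To follow the resilver from a script rather than by rereading the status by hand, the percentage can be extracted from the scan section. This sketch assumes the wording of the progress line shown above, and resilver_progress is a hypothetical name.

```shell
# Read `zpool status` output on stdin and print the resilver progress as
# a bare percentage (for example 9.98). Prints nothing when no resilver
# is in progress.
resilver_progress() {
    sed -nE 's/^.*resilvered, ([0-9.]+)% done.*$/\1/p'
}

# Usage on a live system:
#   zpool status rpool | resilver_progress
# Recent OpenZFS releases also provide `zpool wait -t resilver rpool`,
# which blocks until the resilver has finished.
```

An empty result after the resilver has started means the progress line is gone, at which point the pool state should be checked once more.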

When the resilver finishes, the state changes to ONLINE. The replacement is now complete.

> zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 125G in 00:24:32 with 0 errors on Sun Jan 12 12:08:04 2025
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-TEAM_T253512GB_TPBF240909YYYYYYYYYY-part3  ONLINE       0     0     0
            ata-TEAM_T253512GB_TPBF2410XXXXXXXXXXXX-part3  ONLINE       0     0     0

errors: No known data errors

Next, write the bootloader to the new disk.

> lsblk -o NAME,HCTL,MODEL,SERIAL
NAME     HCTL       MODEL                SERIAL
sdd      6:0:0:0    TEAM T253512GB       TPBF2410XXXXXXXXXXXX
├─sdd1
├─sdd2
└─sdd3
sdf      14:0:0:0   TEAM T253512GB       TPBF2409YYYYYYYYYYYY
├─sdf1
├─sdf2
└─sdf3

> proxmox-boot-tool format /dev/sdf2 --force
UUID="11508983877615487063" SIZE="1073741824" FSTYPE="zfs_member" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdf" MOUNTPOINT=""
Formatting '/dev/sdf2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.

> proxmox-boot-tool init /dev/sdf2 --force
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="89FD-A1F3" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdf" MOUNTPOINT=""
Mounting '/dev/sdf2' on '/var/tmp/espmounts/89FD-A1F3'.
Installing systemd-boot..
Created "/var/tmp/espmounts/89FD-A1F3/EFI/systemd".
Created "/var/tmp/espmounts/89FD-A1F3/EFI/BOOT".
Created "/var/tmp/espmounts/89FD-A1F3/loader".
Created "/var/tmp/espmounts/89FD-A1F3/loader/entries".
Created "/var/tmp/espmounts/89FD-A1F3/EFI/Linux".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/89FD-A1F3/EFI/systemd/systemd-bootx64.efi".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/89FD-A1F3/EFI/BOOT/BOOTX64.EFI".
Random seed file /var/tmp/espmounts/89FD-A1F3/loader/random-seed successfully written (32 bytes).
Created EFI boot entry "Linux Boot Manager".
Configuring systemd-boot..
Unmounting '/dev/sdf2'.
Adding '/dev/sdf2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Copying and configuring kernels on /dev/disk/by-uuid/89FD-A1F3
        Copying kernel and creating boot-entry for 6.8.12-4-pve
        Copying kernel and creating boot-entry for 6.8.12-5-pve
WARN: /dev/disk/by-uuid/EE7D-DAA3 does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
Copying and configuring kernels on /dev/disk/by-uuid/EE7E-A82E
        Copying kernel and creating boot-entry for 6.8.12-4-pve
        Copying kernel and creating boot-entry for 6.8.12-5-pve

Next, deal with the warning WARN: /dev/disk/by-uuid/EE7D-DAA3 does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping.

This appears because the old disk's UUID is still listed in /etc/kernel/proxmox-boot-uuids, so deleting that entry from the file is enough.

> nano /etc/kernel/proxmox-boot-uuids
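
Before editing the file by hand, it helps to see which entries are actually stale. The sketch below lists every UUID in the file that has no matching entry under /dev/disk/by-uuid. The function name list_stale_uuids and its optional second argument are hypothetical. It only prints and never modifies the file. proxmox-boot-tool also has a clean subcommand intended for this situation, which is worth checking in its help output.

```shell
#!/bin/sh
# Hypothetical helper: print UUIDs listed in a proxmox-boot-uuids style file
# (one UUID per line) that do not exist as entries in the by-uuid directory.
# $1 = UUID list file, $2 = directory to check (defaults to /dev/disk/by-uuid)
list_stale_uuids() {
    file="$1"
    dir="${2:-/dev/disk/by-uuid}"
    # The `|| [ -n "$uuid" ]` part keeps a final line without a trailing newline.
    while IFS= read -r uuid || [ -n "$uuid" ]; do
        [ -n "$uuid" ] || continue               # skip empty lines
        [ -e "$dir/$uuid" ] || printf '%s\n' "$uuid"
    done < "$file"
}

# Usage sketch (read-only):
# list_stale_uuids /etc/kernel/proxmox-boot-uuids
```

On the system in this post it would be expected to print EE7D-DAA3, the UUID named in the warning.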

Running the command once more to confirm, the warning should be gone.

> proxmox-boot-tool init /dev/sdf2 --force
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="89FD-A1F3" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdf" MOUNTPOINT=""
Mounting '/dev/sdf2' on '/var/tmp/espmounts/89FD-A1F3'.
Installing systemd-boot..
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/89FD-A1F3/EFI/systemd/systemd-bootx64.efi".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/89FD-A1F3/EFI/BOOT/BOOTX64.EFI".
Random seed file /var/tmp/espmounts/89FD-A1F3/loader/random-seed successfully written (32 bytes).
Created EFI boot entry "Linux Boot Manager".
Configuring systemd-boot..
Unmounting '/dev/sdf2'.
Adding '/dev/sdf2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Copying and configuring kernels on /dev/disk/by-uuid/89FD-A1F3
        Copying kernel and creating boot-entry for 6.8.12-4-pve
        Copying kernel and creating boot-entry for 6.8.12-5-pve
Copying and configuring kernels on /dev/disk/by-uuid/EE7E-A82E
        Copying kernel and creating boot-entry for 6.8.12-4-pve
        Copying kernel and creating boot-entry for 6.8.12-5-pve
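
The same check can be done without reading the whole log. This small sketch scans the tool's output for lines starting with WARN:. The function name refresh_is_clean is an assumption for illustration. The usage line merges stderr into stdout so that warnings are caught whichever stream they are written to.

```shell
#!/bin/sh
# Reads proxmox-boot-tool output on stdin.
# Exit status 0 when no line starts with "WARN:", 1 otherwise.
refresh_is_clean() {
    if grep -q '^WARN:'; then
        return 1
    fi
    return 0
}

# Usage sketch (commented out; needs a Proxmox VE host):
# proxmox-boot-tool refresh 2>&1 | refresh_is_clean && echo "no warnings"
```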

That completes the repair of the Proxmox VE ZFS mirror.
