The log is full of messages like this:
[Tue Apr 2 12:09:46 2019] EDAC MC1: 1 CE error on CPU#1Channel#1_DIMM#0 (channel:1 slot:0 page:0x0 offset:0x0 grain:8 syndrome:0x0)
How do I determine which memory module needs to be replaced?
Here is what iLO shows:
PROC 1 DIMM 1G : not installed
PROC 1 DIMM 2D : not installed
PROC 1 DIMM 3A : 4096 MB 1333 MHz
PROC 1 DIMM 4H : not installed
PROC 1 DIMM 5E : not installed
PROC 1 DIMM 6B : 4096 MB 1333 MHz
PROC 1 DIMM 7I : 8192 MB 1333 MHz
PROC 1 DIMM 8F : 8192 MB 1333 MHz
PROC 1 DIMM 9C : 4096 MB 1333 MHz
PROC 2 DIMM 1G : not installed
PROC 2 DIMM 2D : not installed
PROC 2 DIMM 3A : 4096 MB 1333 MHz
PROC 2 DIMM 4H : not installed
PROC 2 DIMM 5E : not installed
PROC 2 DIMM 6B : 4096 MB 1333 MHz
PROC 2 DIMM 7I : 8192 MB 1333 MHz
PROC 2 DIMM 8F : 8192 MB 1333 MHz
PROC 2 DIMM 9C : 4096 MB 1333 MHz
Some more information:
# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count
/sys/devices/system/edac/mc/mc0/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch1_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow1/ch2_ce_count:0
/sys/devices/system/edac/mc/mc0/csrow2/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow0/ch0_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow0/ch1_ce_count:2595
/sys/devices/system/edac/mc/mc1/csrow0/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow1/ch2_ce_count:0
/sys/devices/system/edac/mc/mc1/csrow2/ch2_ce_count:0
# dmidecode -t memory | grep 'Locator: PROC'
Locator: PROC 1 DIMM 1G
Locator: PROC 1 DIMM 2D
Locator: PROC 1 DIMM 3A
Locator: PROC 1 DIMM 4H
Locator: PROC 1 DIMM 5E
Locator: PROC 1 DIMM 6B
Locator: PROC 1 DIMM 7I
Locator: PROC 1 DIMM 8F
Locator: PROC 1 DIMM 9C
Locator: PROC 2 DIMM 1G
Locator: PROC 2 DIMM 2D
Locator: PROC 2 DIMM 3A
Locator: PROC 2 DIMM 4H
Locator: PROC 2 DIMM 5E
Locator: PROC 2 DIMM 6B
Locator: PROC 2 DIMM 7I
Locator: PROC 2 DIMM 8F
Locator: PROC 2 DIMM 9C
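To make the counter output easier to read, here is a minimal sketch that keeps only the non-zero entries. The function name nonzero_ce is my own (hypothetical), and it assumes input in the same "path:count" format that the grep command above prints. It does not map the controller/channel to an iLO slot name; that is still the open question.

```shell
# Hypothetical helper: reads "path:count" lines (the format printed by the
# grep command above) on stdin and prints only counters greater than zero
# as "<mc> <csrow> <channel> <count>".
nonzero_ce() {
    awk -F: '$NF > 0 {
        n = split($1, p, "/")              # split the sysfs path into components
        ch = p[n]                          # last component, e.g. ch1_ce_count
        sub(/_ce_count$/, "", ch)          # keep just the channel name, e.g. ch1
        print p[n-2], p[n-1], ch, $NF      # mc, csrow, channel, error count
    }'
}

# Usage on a live system (same paths as in the grep command above):
# grep "[0-9]" /sys/devices/system/edac/mc/mc*/csrow*/ch*_ce_count | nonzero_ce
```

On the output pasted above this would leave a single line, mc1 csrow0 ch1 2595, which agrees with the MC1 / channel:1 / slot:0 reported in the log message.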