A bit of background
I'm running a Nexenta server (OpenSolaris kernel plus a number of Debian tools) with a RAID-Z pool of 3x 1TB SATA2 drives (essentially ZFS's equivalent of RAID5, for those who aren't familiar with zpools).
When I run the ZFS scrub command (roughly ZFS's equivalent of fsck), I get a number of checksum errors. As I'm almost certain the drives are healthy, and these errors only started appearing after I installed some new RAM, I'm pretty sure the errors are RAM related.
So my question is this:
Are there any UNIX/Linux tools that can scan the RAM for damaged blocks and then blacklist them, thus preventing future memory-related checksum errors?
The server has 6GB RAM (3x 2GB DDR2 sticks), so there's more than enough RAM even after blacklisting whole chunks, and I'd rather not pay for new chips unless I really have to.
Also, if said tool can run in real time on a live system, that would be a bonus (though I understand that the obvious dangers of doing so might make such an option impossible or impractical).
Specs:
# prtdiag
System Configuration: MSI MS-7390
BIOS Configuration: American Megatrends Inc. V1.1 03/26/2008
==== Processor Sockets ====================================
Version                          Location Tag
-------------------------------- --------------------------
AMD Athlon(tm) 64 X2 Dual Core Processor 4400+ CPU 1
==== Memory Device Sockets ================================
Type    Status Set Device Locator      Bank Locator
------- ------ --- ------------------- --------------------
DDR2    in use 0   DIMM0               BANK0
DDR2    in use 0   DIMM1               BANK1
DDR2    in use 0   DIMM2               BANK2
unknown empty  0   DIMM3               BANK3
==== On-Board Devices =====================================
To Be Filled By O.E.M.
==== Upgradeable Slots ====================================
ID  Status    Type             Description
--- --------- ---------------- ----------------------------
0   in use    PCI Express      PCIE
1   in use    PCI              PCI1
# uname -a
SunOS Primus 5.11 NexentaOS_20080312 i86pc i386 i86pc Solaris
# prtconf
System Configuration: Sun Microsystems i86pc
Memory size: 6144 Megabytes
System Peripherals (Software Nodes):
i86pc
scsi_vhci, instance #0
isa, instance #0
asy, instance #0
fdc, instance #0
fd, instance #0
i8042, instance #0
keyboard, instance #0
motherboard (driver not attached)
pit_beep, instance #0
pci, instance #0
pci1462,7390 (driver not attached)
pci1462,7390 (driver not attached)
pci1462,7390 (driver not attached)
pci1462,7390 (driver not attached)
pci1462,7390, instance #0
pci1462,7390, instance #0
pci1462,390c, instance #0
pci1462,7390, instance #0
pci10de,449, instance #0
display, instance #0
pci-ide, instance #0
ide, instance #0
cmdk, instance #0
cmdk, instance #5
ide (driver not attached)
pci-ide, instance #1
ide, instance #2
cmdk, instance #1
cmdk, instance #2
ide, instance #3
cmdk, instance #3
pci10de,45b (driver not attached)
pci10de,45a (driver not attached)
pci10de,458 (driver not attached)
pci10de,459 (driver not attached)
pci1022,1100, instance #0
pci1022,1101, instance #1
pci1022,1102, instance #2
pci1022,1103, instance #0
iscsi, instance #0
pseudo, instance #0
options, instance #0
agpgart, instance #0
xsvc, instance #0
used-resources (driver not attached)
cpus, instance #0
cpu (driver not attached)
cpu (driver not attached)
Any help is appreciated.