Data Access Error

Dear Reader,

My Sun machine comes to a halt with the message 'Data Access Error'. What could be wrong, and where should I look?

Thanks in advance.

A few questions:

  1. When does this message appear: while running an application, or while accessing a database?

  2. Or does it appear during installation?

Please give us some details so that we can analyse and fix the problem.

Regards,

vrk

Hi,

The machine had been working fine for months, and then all of a sudden this happened; to be precise, twice in a week.

This is SunOS 5.5.1 with a Sun StorEdge A1000, arraymgr 6.1.1.

After a power-off and reboot, /var/adm/messages shows:
'panic[cpu0]/thread=0x3001fec0; trap.....

sched:data access exception:
MMU sfsr=11009b: Privilege Violation on ASI 0x11 E 0 CID 1 PRIV 1 W 0 OW 1 FV 1
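
For readers wondering how the raw sfsr value relates to the decoded text after it, here is a minimal Python sketch. The bit positions are an assumption taken from the UltraSPARC data-side fault status register layout as I understand it; they do reproduce every field the kernel printed for 11009b, but treat the function as an illustration rather than a reference decoder. The function name decode_sfsr is made up for this example.

```python
def decode_sfsr(sfsr):
    """Split a raw SFSR value into the fields shown in the panic message.

    Bit positions are assumed (UltraSPARC-style layout), not taken from
    the thread itself.
    """
    return {
        "ASI": (sfsr >> 16) & 0xFF,  # address space identifier
        "FT": (sfsr >> 7) & 0x7F,    # fault type bits; 0x01 is assumed to mean privilege violation
        "E": (sfsr >> 6) & 0x1,      # side-effect bit
        "CID": (sfsr >> 4) & 0x3,    # context ID
        "PRIV": (sfsr >> 3) & 0x1,   # access was made in privileged mode
        "W": (sfsr >> 2) & 0x1,      # access was a write
        "OW": (sfsr >> 1) & 0x1,     # an earlier fault record was overwritten
        "FV": sfsr & 0x1,            # fault record is valid
    }


fields = decode_sfsr(0x11009B)
for name, value in fields.items():
    print(f"{name} = {value:#x}")
```

With the value from the log, the decoded ASI, E, CID, PRIV, W, OW and FV fields agree with the text the kernel printed, and the fault type field has only its lowest bit set.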

Thanks for your post.

Your computer is complaining that it has a problem accessing a memory card. It may be correct, and you may have a bad memory card. However, some Suns also issue this error message erroneously, and there is a firmware patch to correct this.

This error is basically a segmentation violation that happens in the kernel. It is usually caused by a bug in a device driver; in Solaris 2.5.1 there were several bug reports of drivers causing this error. If there is a stack traceback after the CPU panic, please post it here and we can get a better idea of which driver is causing this.

You can also create the directory /var/crash and then create a directory under /var/crash named after the machine's hostname. The next time the machine crashes, you will get a crash dump. You can then send the crash dump to Sun if you have a support contract. If you don't, please say so and I can give you an ftp site to send the dump to.
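
The crash-dump directory described above can be sketched as follows. This is only an illustration in Python, which a stock Solaris 2.5.1 box would not ship with; on the machine itself a plain mkdir from a root shell does the same job. The function name ensure_crash_dir and its parameters are hypothetical, and the sketch assumes the hostname reported by socket.gethostname() is the name the crash-dump tooling expects.

```python
import os
import socket


def ensure_crash_dir(base="/var/crash", hostname=None):
    """Create <base>/<hostname> if it is missing and return its path.

    hostname defaults to the local host name; pass it explicitly if the
    machine's node name differs from what socket.gethostname() reports.
    """
    hostname = hostname or socket.gethostname()
    path = os.path.join(base, hostname)
    os.makedirs(path, exist_ok=True)  # no error if the directory already exists
    return path


# Example (needs root for the real location):
#   ensure_crash_dir()
```

The base directory is a parameter only so the function can be tried against a scratch location before touching /var/crash.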

An alternative, shotgun approach is to download the latest recommended patch cluster for 2.5.1 from sunsolve.sun.com and install it. Let us know how you want to proceed.

Lastly, this could be marginal hardware getting ready to fail. The stack trace or the crash dump is the only way to really tell.

Thanks.

Dear all,

Here are the traceback details (from /var/adm/messages):

sched: data access exception:
unix: MMU sfsr=11009b: Privilege Violation on ASI 0x11 E 0 CID 1 PRIV 1 W 0 OW 1 FV 1
unix: pid=0, pc=0x1002b3f8, sp=0x0, tstate=0x3002396000000044, context=0x1e03
unix: g1-g7: 0, 0, 0, 24, 0, ef5b8f94, 30023ec0
unix: Begin traceback... sp = 30023960
unix: Called from 10025ad0, fp=300239c8, args=30023b50 30023a3c 1 1 611d8658 0
unix: Called from 1001b110, fp=30023a68, args=30023b50 0 0 6 1 ec63c000
unix: Called from 10006a88, fp=30023af0, args=ec63c000 4 0 1 10467bb8 0
unix: Called from 611d8610, fp=30023be0, args=ec63cb71 6125af48 ec63cb00 52 30023c40 6024e768
unix: Called from 611d7180, fp=30023c50, args=61211000 ec63cb0c 71 600e8170 612112a4 61aa6ac0
unix: Called from 60248c4c, fp=30023cb0, args=80000000 611daf7c 6125ad48 611d9c00 0 61211298
unix: Called from 10008c74, fp=30023d20, args=602f9a58 7d9 10412690 30023ec0 7d90 60248be0
unix: Called from 100353ec, fp=3002bcc0, args=0 1 61a5ecd0 10416c88 10412690 0
unix: End traceback...
unix: BAD TRAP: cpu=0 type=0x30 rp=0x3001f8d0 addr=0x611d8000 mmu_fsr=0x11009b
unix: sched: data access exception:
unix: MMU sfsr=11009b: Privilege Violation on ASI 0x11 E 0 CID 1 PRIV 1 W 0 OW 1 FV 1
unix: pid=0, pc=0x1002b3f8, sp=0x0, tstate=0x3001f96000000044, context=0x1e03
unix: g1-g7: 0, 0, 0, 24, 0, ef5b8f94, 3001fec0
unix: panic[cpu0]/thread=0x3001fec0: trap
unix: syncing file systems...panic[cpu0]/thread=0x3001fec0: md: writer lock is held
unix: 6802 static and sysmap kernel pages

It would be helpful if you could explain how to link this information to other system components.
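
One possible first step toward linking the traceback to system components is to pull the caller addresses out of the "Called from" lines, since those addresses are what a symbol lookup needs. The Python sketch below does only that extraction; mapping each address to a kernel function or driver module still requires the kernel's symbol information (for example from a crash dump), which is not part of this thread. The function name traceback_addresses is made up for illustration, and the sample is a shortened copy of the log above.

```python
import re

# Matches lines such as: "unix: Called from 10025ad0, fp=300239c8, args=..."
CALL_RE = re.compile(r"Called from ([0-9a-f]+), fp=[0-9a-f]+")


def traceback_addresses(log_text):
    """Return the caller addresses from a panic traceback, innermost first."""
    return [int(m.group(1), 16) for m in CALL_RE.finditer(log_text)]


sample = """\
unix: Begin traceback... sp = 30023960
unix: Called from 10025ad0, fp=300239c8, args=30023b50 30023a3c 1 1 611d8658 0
unix: Called from 1001b110, fp=30023a68, args=30023b50 0 0 6 1 ec63c000
unix: Called from 611d8610, fp=30023be0, args=ec63cb71 6125af48 ec63cb00 52 30023c40 6024e768
unix: End traceback...
"""

callers = traceback_addresses(sample)
for address in callers:
    print(f"{address:#x}")
```

In the full log, the BAD TRAP line reports addr=0x611d8000, which lies close to the 611d8610 frame, so the code around that address may be a reasonable place to start looking once symbols are available.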

As Mr. Perderabo suspected, the problem seems to be related to a memory card. The latest failure gave this output:

WARNING: correctable error from pci0 (upa mid 1f) during dvma read transaction.
unix:
unix: AFSR=44080000.5f800000 AFAR=00000000.05721000,
unix: double word offset=2, SIMM U0702 id 31.
unix: syndrome bits 8
unix: secondary error from dvma write transaction.
unix: WARNING: uncorrectable error from pci0 (upa mid 1f) during dvma read transaction.
unix: Transaction was a block operation, UPA bytemask 0.
unix: AFSR=4c000000.5f800000 AFAR=00000000.0634f040,
unix: double word offset=2, SIMM U0701 U0702 U0703 U0704 id 31.
unix: secondary error from dvma read transaction.
unix: secondary error from dvma write transaction.

unix: panic[cpu0]/thread=0x30023ec0: CPU0 Priv. UE Error: AFSR 0x00000000 80200000 AFAR 0x00000000 0631b5d0 SIMM U0701 U0702 U0703 U0704
unix: syncing file systems...WARNING: /pci@1f,4000/scsi@3 (glm0):
unix: invalid reselection (0.0)
unix: panic[cpu0]/thread=0x3001fec0: CPU0 Priv. UE Error: AFSR 0x00000000 80200000 AFAR 0x00000000 066fa8d0 SIMM U0701 U0702 U0703 U0704
unix: 6786 static and sysmap kernel pages
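
To see which SIMM sockets the messages above keep naming, the identifiers can be tallied. The Python sketch below is a rough aid under the assumption that every report follows the "SIMM U0701 U0702 ..." pattern seen here; the function name simm_counts is hypothetical, and a higher count is only a hint about where to look, not a diagnosis.

```python
import re
from collections import Counter

# One or more socket labels (U followed by digits) after the word SIMM.
SIMM_RE = re.compile(r"SIMM((?: U\d+)+)")


def simm_counts(log_text):
    """Count how often each SIMM socket label appears in error reports."""
    counts = Counter()
    for match in SIMM_RE.finditer(log_text):
        counts.update(match.group(1).split())
    return counts


sample = """\
unix: double word offset=2, SIMM U0702 id 31.
unix: double word offset=2, SIMM U0701 U0702 U0703 U0704 id 31.
unix: panic[cpu0]/thread=0x30023ec0: CPU0 Priv. UE Error: AFSR 0x00000000 80200000 AFAR 0x00000000 0631b5d0 SIMM U0701 U0702 U0703 U0704
"""

counts = simm_counts(sample)
for label, count in counts.most_common():
    print(label, count)
```

In this log the correctable error names U0702 alone, while the uncorrectable errors name all four sockets U0701 to U0704 together, so U0702 comes out with the highest count.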

Thank you all.