I have two Solaris 10 servers. The first server crashed last week (Monday) and the second one crashed over the weekend. I have checked the logs such as /var/adm/messages, syslog and dmesg. So far I have found nothing. My management wants to know why the servers crashed, and I need to come up with some kind of reason.
I also searched for a core file and didn't find any. Can someone guide me on what else I can do to figure out why the servers crashed?
Did both systems reboot OK? Did you look in the older /var/adm/messages log files and not just the current messages file? Is crash dump enabled? If not, you should enable it if possible.
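A minimal sketch of how the crash dump check might look, assuming a root shell such as bash or ksh on the Solaris 10 host. The function name show_dump_config is hypothetical; dumpadm itself is the standard Solaris tool for this.

```shell
# Hypothetical helper: print the crash dump configuration if dumpadm exists.
show_dump_config() {
    if command -v dumpadm >/dev/null 2>&1; then
        # With no arguments, dumpadm prints the current settings
        # (dump device, savecore directory, whether savecore is enabled).
        dumpadm
    else
        echo "dumpadm not found; this check applies to Solaris hosts"
    fi
}

show_dump_config
```

If the output shows that savecore is not enabled, running `dumpadm -y` as root turns it on for the next reboot, and `dumpadm -s <directory>` sets where the dump files are saved.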
I have a Sun Fire E6900, which has four domains, but I only have access to one. The other three are used by different groups; I think they have currently taken them offline.
The second one is a Sun Fire E2900.
On the E6900, I went to the console and got the hostname-sc:D prompt. I typed "help" and saw this:
history -- show command history
password -- set the domain password
poweroff -- powers off components
poweron -- powers on components
reset -- reset the domain
resume -- return to domain console
setdate -- set the date and time for the domain
setdefaults -- set default configuration values
setkeyswitch -- set the keyswitch position
setls -- set FRU location status
setupdomain -- configure the domain
showboards -- show board information
showcodusage -- show COD resource usage
showcomponent -- show state of a component
showdate -- show the current date and time for the domain
showdomain -- show domain configuration and status
showenvironment -- show environmental information
showkeyswitch -- show the keyswitch position
showlogs -- show the logs
showresetstate -- show CPU registers after reset
testboard -- test a CPU/Memory board
I then typed "showlogs":
Jan 17 10:28:47 dev-sc Domain-D.SC: [ID 384869 local0.error] Domain watchdog timer expired.
Jan 17 10:28:47 dev-sc Domain-D.SC: [ID 180029 local0.notice] Using default hang-policy (RESET).
Jan 17 10:28:47 dev-sc Domain-D.SC: [ID 838382 local0.error] Saving reset state data before XIR.
Jan 17 10:28:50 dev-sc Domain-D.SC: [ID 580408 local0.notice] Resetting (XIR) domain.
Jan 17 10:28:50 dev-sc Domain-D.SC: [ID 815168 local0.error] Saving reset state data after XIR.
Can you advise what else I can look at?
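The log lines above show a watchdog timeout followed by an XIR reset. For a longer log, a small filter can pull out just those events. This is only a sketch, assuming the showlogs output has been saved to a text file from the console session; the function name sc_reset_events and the file path in the usage line are hypothetical, and the patterns come from the lines quoted above.

```shell
# Hypothetical helper: print only the reset-related lines from a saved
# copy of the system controller log. $1 is the path to that saved file.
sc_reset_events() {
    # Patterns are taken from the messages seen in this thread:
    # watchdog expiry, the hang-policy notice, and the XIR reset itself.
    awk '/watchdog|hang-policy|XIR/' "$1"
}
```

Usage would be something like `sc_reset_events /tmp/domain-d-showlogs.txt`. The timestamps on the matching lines can then be compared against the last entries in /var/adm/messages before the reboot.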
---------- Post updated at 06:34 PM ---------- Previous update was at 06:29 PM ----------
Yes, both systems are online now. Both of the servers have Sybase running, a very important DB for the company. I did check the /var/adm/messages file. It has a lot of data, but I didn't find anything useful as to why the system crashed.
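Since the messages files hold a lot of data, a keyword scan across the current and rotated files can narrow things down. A rough sketch, assuming bash or ksh; the function name scan_messages is hypothetical and the keyword list is an assumption, not an exhaustive set of crash indicators.

```shell
# Hypothetical helper: search messages, messages.0, messages.1, ... for a
# few keywords. $1 is the log directory (defaults to /var/adm).
scan_messages() {
    dir=${1:-/var/adm}
    for f in "$dir"/messages*; do
        [ -f "$f" ] || continue
        # "SunOS Release" is the boot banner, so it marks each reboot;
        # the entries just before it are the last ones logged before the crash.
        for pat in panic 'SunOS Release' watchdog; do
            # /dev/null forces grep to prefix each match with the file name;
            # "|| :" keeps a pattern with no match from counting as an error.
            grep "$pat" "$f" /dev/null || :
        done
    done
}
```

Running `scan_messages` with no argument scans /var/adm. The matches are grouped by file and keyword rather than by time, so the boot banner timestamps are the thing to line up against the crash dates.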