My Sun Fire 6800 system hung and later rebooted after logging the messages below. Can anyone help me review these messages?
nfssrv: [ID 694464 kern.warning] WARNING: nfsauth upcall failed: RPC: Operation in progress
mountd[664]: [ID 676604 daemon.error] cannot accept connection: 19: error unknown (current state -1)
KAVE00166-W The Store service is delayed and the load on the service is high. Revise the collection items and the collection intervals. (queue length=1)
^Mpanic[cpu16]/thread=2a10004fcc0:
using kernel phase-lock loop 0041, drift correction 0.00000
I am unable to diagnose the problem with the system. Could anyone please share their remarks on these messages?
As far as I can tell from the system log, there is no hardware error. What puzzles me is the panic message that mentions a CPU:
^Mpanic[cpu16]/thread=2a10004fcc0:
using kernel phase-lock loop 0041, drift correction 0.00000
What does this mean? Also, a dump has been generated under /var/crash/. Is there a simple method to understand these crashes? I don't know how to work through them with the mdb or adb commands.
Can you help me with this? I am looking forward to your reply.
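A small sketch of how the panic line can be read, in case it helps. The "^M" in front of "panic" is just how a carriage return is displayed, so the message is "panic[cpu16]/thread=...", naming CPU 16 and the address of the thread that panicked. The function name parse_panic_header is made up for this illustration.

```python
import re

# Console line copied from the post; "^M" is a displayed carriage return.
line = "^Mpanic[cpu16]/thread=2a10004fcc0:"

def parse_panic_header(text):
    """Return (cpu, thread_address) from a 'panic[cpuN]/thread=ADDR:' line, or None."""
    # Drop the carriage return, whether shown as "^M" or present as a real \r.
    cleaned = text.replace("^M", "").replace("\r", "").strip()
    match = re.match(r"panic\[cpu(\d+)\]/thread=([0-9a-fA-F]+):", cleaned)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)

print(parse_panic_header(line))  # (16, '2a10004fcc0')
```

The thread address extracted here, 2a10004fcc0, is the same value that panic_thread/K reports from the crash dump later in this thread.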
It is a little complicated to lead you through this because my own knowledge of this topic is limited, but we can try.
First of all we have to open the core files:
/var/crash/<name of the host>/
mdb -k unix.0 vmcore.0
$r    <- type this and press Return; press Space to page to the next screen.
Find the %pc register.
It looks something like: %pc = 0x00008732873ff56 open_+0x66
The underlined part is important.
Now we have to disassemble the instruction that caused the problem:
0x00008732873ff56/ai
panic_thread/K    <- simply type it, and don't ask me why!
You get a line with a hex address. Put that value in place of the word "address":
address$<thread
Look for a row containing procp and take the second hex address for the next command:
address$<proc2u
This shows you the command and the arguments that were running at the moment the system crashed.
But this is not necessarily the root cause!
It can be a hardware (CPU or memory) error that caused the crash at the moment the process started. In most cases the problem is a memory error!
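To make the substitution in the steps above concrete, here is a minimal sketch. The word "address" in those commands is a placeholder, not something to type literally. The helper names address_from_mdb_line and macro_command are invented for this illustration, and the example line is the panic_thread/K output that appears later in this thread.

```python
import re

def address_from_mdb_line(output_line):
    """Extract the hex value from a line such as 'panic_thread: 2a10004fcc0'."""
    match = re.search(r":\s*([0-9a-fA-F]+)\s*$", output_line)
    if match is None:
        raise ValueError("no hex value found in: " + output_line)
    return match.group(1)

def macro_command(address, macro):
    """Build the text to type at the mdb prompt: the address, then $<macro."""
    return f"{address}$<{macro}"

# Step 1: the value printed by panic_thread/K replaces "address".
thread_addr = address_from_mdb_line("panic_thread: 2a10004fcc0")
print(macro_command(thread_addr, "thread"))  # 2a10004fcc0$<thread
```

The same pattern applies to the last step: the value shown next to procp in the $<thread output (not shown in this thread, so it is left out here) is what goes in front of $<proc2u.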
%canrestore = 0x00 %otherwin = 0x00
%wstate = 0x00 %cleanwin = 0x00
> panic_thread/K
panic_thread:
panic_thread: 2a10004fcc0
> address$<2a10004fcc0
mdb: failed to dereference symbol: unknown symbol name
> 2a10004fcc0
0x2a10004fcc0: 2a100047cc0
> address$<thread
mdb: failed to dereference symbol: unknown symbol name
>
Can you guide me through this?
> 2a10004fcc0$<proc2u
{
mdb: failed to read u_execsw pointer at 2a100050100: no mapping for address
}
>
This is what comes up if I type address$<proc2u. What might be the issue?
I'm confused!
OK, please provide me the output of this command:
2a100047cc0$<proc2u
It is important that you run all the commands in order.
Short explanation:
This locates the panic thread:
> panic_thread/K
panic_thread:
panic_thread: 2a10004fcc0
This is the pointer to the memory where the thread structure is:
2a10004fcc0$<thread
Inside the output we are looking for the procp pointer. This is a pointer to the proc structure, so we can identify the command or program which caused the panic.
The problem for me is that Sun changed the format of the output to XML.
At the moment I have no core files to test this in the lab.
> 2a100047cc0$<proc2u
{
mdb: failed to read u_execsw pointer at 2a100048100: no mapping for address
}
> 2a100057cc0$<proc2u
{
mdb: failed to read u_execsw pointer at 2a100058100: no mapping for address
}
It says "failed to read". Does that mean there was no process running at that time?
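One observation that may help with that question: in all three attempts above, the address in the "failed to read u_execsw pointer" message is the address that was typed plus the same fixed amount. A quick check of the arithmetic, using only the addresses shown in this thread:

```python
# (address given to $<proc2u, address in the "failed to read" message)
attempts = [
    ("2a10004fcc0", "2a100050100"),
    ("2a100047cc0", "2a100048100"),
    ("2a100057cc0", "2a100058100"),
]

# Difference between the failing read address and the address that was typed.
offsets = [int(failed, 16) - int(given, 16) for given, failed in attempts]

for (given, failed), offset in zip(attempts, offsets):
    print(f"{given} -> {failed}: offset {offset:#x}")
```

Each line prints an offset of 0x440. That suggests the macro simply reads at a fixed offset from whatever address it is given, so the error probably says more about the address than about whether a process was running. None of the three addresses is shown here to be the procp value from the $<thread output (the first is the thread address itself), which would be the thing to try next; treat this as a likely explanation, not a confirmed one.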