Can anyone recommend any good guides on how to investigate what a hanging process is doing?
In fact, I would be interested in any online guides that would improve my forensic skills on Linux.
I have a script that occasionally hangs. Attaching strace to the hung process shows:
[root@cfg01o ~]# strace -p 32370
Process 32370 attached - interrupt to quit
select(14, [3 6 8 11 13], [], NULL, NULL) = 1 (in [3])
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
read(3, "\344u\245%\6U\216\307\276\355\213p\376\203}\2617\251\320\301}\5\376Y(\235]K\222\301\304\370"..., 16384) = 64
write(3, "K\t\26O\323\344\214\341\247W\346\\*e\330\304\372\323O\356q\34\360\327\350\345*\274\35(Q'", 32) = 32
select(14, [3 6 8 11 13], [], NULL, NULL^C <unfinished ...>
I think the select refers to a case statement inside a while loop that reads from a file. The loop is looking for an exit message, and that is the point where the script seems most likely to be hanging.
I don't know how to drill down further than this, so any suggestions or pointers to good guides online would be appreciated.
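The only idea I have so far is to map the descriptors listed in that select() call (3, 6, 8, 11, 13) to whatever they actually point at. Here is a minimal sketch of what I mean, assuming a Linux system with /proc mounted. The name show_fds is just something I made up for illustration, and I have not confirmed this is the right way to go about it:

```shell
#!/bin/sh
# Print what each open file descriptor of a process points at.
# Assumes Linux with /proc mounted; show_fds is a hypothetical helper name.
show_fds() {
    pid="$1"
    for fd in /proc/"$pid"/fd/*; do
        # Skip entries that vanished between the glob and the check.
        [ -e "$fd" ] || [ -L "$fd" ] || continue
        # Each entry is a symlink to the file, pipe, or socket behind it.
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}

# With no argument, inspect this shell itself; otherwise use the given PID.
show_fds "${1:-$$}"
```

Run as root or as the owner of the process, this would be called with the PID of the hung script (32370 in the strace session above). Regular files show up as paths, while sockets and pipes show up as socket:[inode] and pipe:[inode].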
Thanks