df command hanging

vivek.goel.piet · January 9, 2012, 6:52am

Hi Folks,

When I execute the command

df -kh

on my system, the output hangs.
The command runs fine but takes a long time before returning to the # prompt.
Can anyone please suggest a possible cause and solution?

amitranjansahu · January 9, 2012, 7:15am

Is there any external file system mounted on your machine? If so, one of the remote file systems may be taking time to respond.

What is the output of df -kh?

zaxxon · January 9, 2012, 8:01am

Maybe you have some NFS mounts in there that take some time to respond? Can you make out which file system/mount point seems to hang, or does it display all of them and hang afterwards?
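To see which mounts are NFS without running df at all, the mount table can be read directly. This is a minimal sketch, assuming a table whose third field is the file system type, which is the layout of /etc/mnttab on Solaris and /proc/mounts on Linux. The function name list_nfs_mounts is made up for this example.

```shell
# Print every NFS mount point and its remote source from a mount table.
# Assumes field 1 = source, field 2 = mount point, field 3 = fs type.
list_nfs_mounts() {
    table=$1
    if [ -z "$table" ]; then
        # No table given: prefer the Solaris file, fall back to the Linux one.
        if [ -r /etc/mnttab ]; then
            table=/etc/mnttab
        else
            table=/proc/mounts
        fi
    fi
    # Match "nfs", "nfs3", "nfs4" and so on, but not unrelated types like "nfsd".
    awk '$3 ~ /^nfs[0-9]*$/ { print $2 " (" $1 ")" }' "$table"
}

# Usage:
#   list_nfs_mounts               # use the system mount table
#   list_nfs_mounts /etc/mnttab   # or name the table explicitly
```

Reading the table does not touch the remote server, so it returns immediately even when a mount is hung.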

jim_mcnamara · January 9, 2012, 8:03am

One way to find the slowly responding file system: suppose your df completes and gives you this meaningless example:

Filesystem            Size  Used Avail Use% Mounted on
/foo/fah              452G   59G  394G  13% /usr/bin
/                     452G   59G  394G  13% /
/foo/bar              452G   59G  394G  13% /foo

try:

for mpoint in / /foo /usr/bin
do
    time df -h "$mpoint"
done
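If one of the mount points really hangs, a plain timing loop blocks on it and never reaches the rest. A possible workaround is to run each df in the background and give up after a few seconds. The sketch below assumes a POSIX-style shell such as bash or ksh. The function name run_with_limit and the 5 second limit are illustrative choices, not anything from df itself.

```shell
# Run a command in the background and give up after LIMIT seconds.
# Returns 0 if the command finished in time, 1 if it was abandoned.
run_with_limit() {
    limit=$1
    shift
    "$@" >/dev/null 2>&1 &
    pid=$!
    waited=0
    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # A df stuck on a hard NFS mount may ignore this signal;
            # the function returns either way instead of blocking.
            kill "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage, with the mount points from the example output:
#   for mpoint in / /foo /usr/bin
#   do
#       if run_with_limit 5 df -k "$mpoint"
#       then echo "OK      $mpoint"
#       else echo "TIMEOUT $mpoint"
#       fi
#   done
```

The check only reports whether df came back in time, not whether it succeeded, so a mistyped mount point also shows up as OK.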

What you are seeing is probably some overloaded directories, directories that have thousands of files in them. Performance on those is usually slow.

Then try this on the filesystem that is slow

find [filesystem name goes here] -type d |
while IFS= read -r dir
do
   # quote the name so directories containing spaces are counted correctly
   cnt=$(ls "$dir" | wc -l)
   echo "$dir has $cnt entries"
done

From there on you need to clean up, and sometimes re-create, the directory files that are a problem.

Ex-SUN · January 9, 2012, 2:39pm

Also, if you have a DNS resolution issue, the df command will take a long while to return. If your NFS mount points are all current, consider restarting the nscd process manually. Good luck.

vivek.goel.piet · January 9, 2012, 2:48pm

Thanks for all the replies... But how to do it?

Ex-SUN · January 9, 2012, 5:25pm

How to do it?

sh /etc/init.d/nscd stop
sh /etc/init.d/nscd start

That should work even for Solaris 10.

Cheers.

vivek.goel.piet · January 10, 2012, 6:06am

Please find the extract from

dmesg

of my server

Jan 10 13:53:44 aremarn11 nfs: [ID 733954 kern.info] NOTICE: [NFS4][Server: 10.77.64.23][Mntpt: /backup]NFS server 10.77.64.23 not responding; still trying

Hope it helps.
However, the ping response from the said IP 10.77.64.23 is fine.

# ping 10.77.64.23 
10.77.64.23 is alive
# ping -s 10.77.64.23 
PING 10.77.64.23: 56 data bytes
64 bytes from 10.77.64.23: icmp_seq=0. time=0.801 ms
64 bytes from 10.77.64.23: icmp_seq=1. time=0.725 ms
64 bytes from 10.77.64.23: icmp_seq=2. time=0.698 ms
64 bytes from 10.77.64.23: icmp_seq=3. time=0.704 ms
64 bytes from 10.77.64.23: icmp_seq=4. time=0.662 ms
64 bytes from 10.77.64.23: icmp_seq=5. time=0.648 ms
64 bytes from 10.77.64.23: icmp_seq=6. time=0.609 ms
64 bytes from 10.77.64.23: icmp_seq=7. time=0.744 ms
^C
----10.77.64.23 PING Statistics----
8 packets transmitted, 8 packets received, 0% packet loss
round-trip (ms)  min/avg/max/stddev = 0.609/0.699/0.801/0.060
#
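A successful ping only shows that the host answers ICMP; it says nothing about whether the NFS service itself is answering. As a rough sketch, the server and mount point can be pulled out of the kernel message and then used to query the NFS service over RPC. The function name nfs_log_fields is made up here. The rpcinfo call in the usage comment assumes the server registers NFS with rpcbind and that nothing between the two hosts blocks it.

```shell
# Extract "server mountpoint" pairs from kernel messages of the form
#   ... [Server: 10.77.64.23][Mntpt: /backup]NFS server ... not responding ...
# Reads log lines on standard input; lines that do not match are dropped.
nfs_log_fields() {
    sed -n 's/.*\[Server: \([^]]*\)]\[Mntpt: \([^]]*\)].*/\1 \2/p'
}

# Usage:
#   dmesg | nfs_log_fields | sort -u
#
# Then, for each server printed, ask its NFS program to answer over TCP
# (version 4 here because the message above is tagged NFS4):
#   rpcinfo -t 10.77.64.23 nfs 4
```

If rpcinfo times out while ping still works, the problem is more likely in the NFS service or a firewall than in basic network reachability.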

---------- Post updated at 04:36 PM ---------- Previous update was at 04:34 PM ----------

Please find the output:

bash-3.00$ svcs -a |grep -i nscd
bash-3.00$

No process named nscd is running on either the good or the bad server.

vivek.goel.piet · January 11, 2012, 3:42pm

The issue is now solved.
Corrective action: Reboot of the node.
Fault Cause: Not known. Will be provided later

ravijanjanam12 · January 12, 2012, 3:38am

Check for errors in /var/adm/messages. There might be a disk issue that is causing I/O operations to be slow. Check the iostat output as well.

aixlover · January 12, 2012, 1:45pm

Are you using ZFS? If so, it can be a zpool issue.