NFS problem AIX6.1

Hello,

I have a problem with an NFS file system. Both servers run AIX 6.1 and use NFS version 3.
The problem is that at some point the client can no longer reach the NFS server, and when I run

df -k

it displays the message "NFS server <server> not responding still trying".
I then run the command

/usr/sbin/rmnfsmnt -f '/nim/mksysb/' '-B'

and it returns the following error:
umount: 1831-015 16 error while unmounting <server>:/nim/mksysb/ -
The requested resource is busy.
rmnfsmnt: 1831-362 Error in unmounting /nim/mksysb/
Although I checked that the entry in /etc/filesystems has been removed, I still see the client trying to reach the NFS server unsuccessfully.
So I tried to forcefully unmount the file system using the command

umount -f /nim/mksysb/

but it hangs. The strange thing is that if I check my file systems while the command is running, the NFS file system appears to be unmounted and

df -k

works fine without reporting any warnings or errors.
I let the

umount -f 

run for many hours, but it never seems to finish. When I stopped the command, the NFS mount reappeared, and when I run

df -k

I get the same message again: "NFS server <server> not responding still trying".

How can I get rid of this?
Thank you.

First off: it looks like you are trying to handle NIM shares by hand: don't do that! Most NIM problems come from exporting the NIM root tree (or a superset of it) via NFS and then trying to allocate a NIM resource (which will create the entry in "/etc/exports" and run the export automatically). This sometimes fails for no apparent reason. Honestly, I believe this to be a weak spot in an otherwise fantastic OS.

Second: it lies in the nature of NFS to have very, very long timeouts. If, for instance, a DNS server becomes unresponsive and your exports use domain names instead of raw IP addresses, it might look like NFS got stuck while in fact the DNS server is responsible and name resolution is just retrying for a very long time before giving up. Something like this might be the case here.
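
To see whether name resolution could be involved at all, a rough sketch like the one below lists the clients in an exports file that are given as host names rather than IPv4 addresses. It assumes AIX-style entries with access=, root= or rw= host lists separated by colons; export_hostnames is a made-up helper name, not an AIX command, and it does not handle IPv6 addresses.

```shell
# Sketch only: export_hostnames is a hypothetical helper, not part of AIX.
# Reads exports-file text on stdin and prints every client in an
# access=, root= or rw= list that is not a dotted IPv4 address,
# i.e. every client whose export depends on name resolution.
export_hostnames() {
    awk '
    /^[ \t]*#/ { next }                          # skip comment lines
    {
        n = split($0, opts, /[ \t,]+/)           # directory plus one option per element
        for (i = 1; i <= n; i++) {
            if (opts[i] !~ /^-?(access|root|rw)=/) continue
            sub(/^[^=]*=/, "", opts[i])          # keep only the host list
            m = split(opts[i], hosts, ":")
            for (j = 1; j <= m; j++)
                if (hosts[j] != "" && hosts[j] !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/)
                    print hosts[j]
        }
    }' | sort -u
}
```

On the NFS server this could be run as "export_hostnames < /etc/exports"; each name it prints can then be looked up by hand (for example with the host command) to see whether it resolves quickly.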

Unfortunately, the only reasonable way to get rid of a really stuck NFS mount is a reboot; otherwise you might have to wait a really long time (sometimes even hours) before it clears.

I hope this helps.

bakunin

To stop all client activity you can use:

# nfs.clean
# sleep 5
# rc.nfs

The first command stops all NFS daemons, and the last one restarts them (the sleep in between just gives them time to exit). In other words, do NOT use

stopsrc -g nfs; sleep 5; startsrc -g nfs

And, you may want to do this on the (NIM) server instead/also/first.
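
After the restart it can be worth checking that the daemons really came back. The sketch below assumes the NFS daemons are registered in the SRC group "nfs" (as the stopsrc/startsrc line above suggests) and that "lssrc -g nfs" prints one header line followed by one line per subsystem with the status in the last column; inactive_nfs_subsystems is a made-up helper name.

```shell
# Sketch only: inactive_nfs_subsystems is a hypothetical helper.
# Reads "lssrc -g nfs" style output on stdin and prints the name of
# every subsystem whose status (last column) is not "active".
inactive_nfs_subsystems() {
    awk 'NR > 1 && NF > 0 && $NF != "active" { print $1 }'
}
```

Usage would be "lssrc -g nfs | inactive_nfs_subsystems"; no output means every listed subsystem is reported as active.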

Note:
# rmnfsmnt -f ...
will remove the entry from /etc/filesystems

  • To check on an NFS server:

# rpcinfo -p nfs_server | grep nfs
This will tell you which nfs protocols are supported AND active
# showmount -e nfs_server
This will tell you what is (still) exported.

The message you are getting (NFS server ... not responding, still trying) implies that the RPC communication is "down" for whatever reason - OR - the directory is no longer exported.
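
The two checks can be turned into a small sketch that separates those two causes. It only parses the output of rpcinfo -p and showmount -e, relying on 100003 being the RPC program number of NFS and on showmount -e printing one header line followed by "directory clients" lines; nfs_versions and is_exported are made-up helper names.

```shell
# Sketch only: nfs_versions and is_exported are hypothetical helpers.

# Reads "rpcinfo -p <server>" output on stdin and prints each registered
# NFS version/transport pair (RPC program 100003), e.g. "v3/tcp".
nfs_versions() {
    awk '$1 == 100003 { print "v" $2 "/" $3 }' | sort -u
}

# Reads "showmount -e <server>" output on stdin; succeeds (exit 0) if the
# directory given as $1 is in the export list. A trailing slash is ignored.
is_exported() {
    dir=${1%/}
    [ -n "$dir" ] || dir=/
    awk -v d="$dir" '
        NR > 1 {
            p = $1
            if (length(p) > 1) sub(/\/$/, "", p)
            if (p == d) found = 1
        }
        END { exit !found }'
}
```

For example, "rpcinfo -p nfs_server | nfs_versions" printing nothing points at the RPC side, while "showmount -e nfs_server | is_exported /nim/mksysb/" failing points at a missing export.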
