NFS write error on host xyz: Stale NFS file handle - Solaris 10

Oct 13 12:19:15 xyz nfs: [ID 626546 kern.notice] NFS write error on host xyz: Stale NFS file handle.
Oct 13 12:19:15 xyz nfs: [ID 702911 kern.notice] (file handle: 68000000 1bc5492e 20000000 377c5e 1ce9395c 720a6203 40000000 bdfb0400)
Oct 13 12:19:15 xyz nfs: [ID 626546 kern.notice] NFS write error on host zyz: Stale NFS file handle.
Oct 13 12:19:15 xyz nfs: [ID 702911 kern.notice] (file handle: 68000000 1bc5492e 20000000 377c5e 1ce9395c 720a6203 40000000 bdfb0400)
Oct 13 12:19:15 xyz nfs: [ID 626546 kern.notice] NFS write error on host xyz: Stale NFS file handle.
Oct 13 12:19:15 xyz nfs: [ID 702911 kern.notice] (file handle: 68000000 1bc5492e 20000000 377c5e 1ce9395c 720a6203 40000000 bdfb0400)

Is it possible to determine which file this file handle refers to?

This may be caused by access to a single file failing, but it's more likely that the NFS-shared directory on the remote side has gone offline or been unshared, renamed or deleted. The client has been holding the NFS handle open, and when it tried to read or write something, the object had disappeared.

Try changing directory (cd) to each of the remotely mounted resources and see if you can list the files (ls) on it. If you can't, try to remount that remote NFS resource. If the remount fails too, there's your problem.
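With lots of mount points, the cd/ls check above could be scripted. The following is only a rough sketch in Python, not a tested tool: it assumes the tab-separated Solaris /etc/mnttab layout (special, mount point, fstype, options, time), and the function names are made up for illustration. Note that a hard mount whose server is unreachable may simply block instead of returning an error, so this only helps with the stale-handle case.

```python
import errno
import os


def nfs_mount_points(mnttab_text):
    """Return the mount points whose filesystem type starts with 'nfs'."""
    points = []
    for line in mnttab_text.splitlines():
        fields = line.split("\t")
        # Assumed field order: special, mount point, fstype, options, time
        if len(fields) >= 3 and fields[2].startswith("nfs"):
            points.append(fields[1])
    return points


def check_mount(path):
    """Try to list a directory; return 'ok', 'stale', or 'error: <reason>'."""
    try:
        os.listdir(path)
        return "ok"
    except OSError as exc:
        if exc.errno == errno.ESTALE:
            return "stale"
        return "error: " + os.strerror(exc.errno)


def report(mnttab_path="/etc/mnttab"):
    """Print one status line per NFS mount point found in the mount table."""
    with open(mnttab_path) as f:
        for mount_point in nfs_mount_points(f.read()):
            print(mount_point, check_mount(mount_point))
```

Calling report() on the client should print each NFS mount point followed by its status; anything marked "stale" is a candidate for an unmount and remount.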

Sorry, forgot to say: the host (i.e. xyz) is actually a storage array.

Also, there are lots of mount points on this particular client. I'll try them all...

Would you expect this error to recur? It would be useful for future reference if there was a way to decode the file handle too (although I guess that's not so easy if the host is not Solaris).

If it's a reputable storage array/SAN then, no, it shouldn't happen, barring power failures and the like. Dare I suggest that the network or storage team accidentally did something that they shouldn't have?

NFS has an inherent problem. It is not very good at knowing whether the mounted resource is available or not, so it depends on a response from the host providing the resource, but only after some application asks for it.

Example: NFSv4 on a Solaris 10 machine used to hang shutdown when a user had done a cd to the NFS directory and then just left the process sitting there. The machine had zones (virtuals) and some NFS connections were required from zone to zone. Not a great idea.

This meant that if NFS went zone1 -> zone2 and zone2 was taken down, the machine would not reboot because NFS could not figure out how to close the open file, and NFS will wait for long periods of time for a resource.

You may have what amounts to a similar problem. A user on one machine has an NFS mounted directory open, and the directory lives on another virtual or machine. Something happens on the remote and the local one hangs. NFS has problems; avoid it if you can. NFS can negatively affect simple commands like pwd and system calls like realpath() if the parent directory of your current directory has an external NFS mount and that directory is no longer available.

Thanks all.

Out of interest - is it even possible to do anything with those file handles?
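
One thing that can be done purely on the client: the file handle is opaque to the NFS client by design, so mapping it to a path needs knowledge of the server's own handle layout, which on a storage array is vendor-specific. What the client can do is group the log messages by handle to see how many distinct objects are affected. A minimal sketch in Python, assuming the "(file handle: ...)" message format shown in the log above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the hex words inside "(file handle: ...)" as printed in the log above.
HANDLE_RE = re.compile(r"\(file handle: ([0-9a-fA-F ]+)\)")


def count_handles(log_text):
    """Count how often each distinct file handle appears in syslog text."""
    counts = Counter()
    for match in HANDLE_RE.finditer(log_text):
        # Keep the handle as a tuple of hex words; no attempt is made to
        # interpret them, since their meaning is private to the server.
        words = tuple(match.group(1).split())
        counts[words] += 1
    return counts
```

In the log at the top of this thread all three messages carry the same handle, so they point at a single object on the server rather than at several files.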