NFS Problems

I am having a really bad day today.

I am trying to get an NFS mount to work. I want to mount machinea:/home on /home on machineb. I can mount machinea:/home on any mount point EXCEPT /home and see the files. When I mount it on /home, I cannot see the files or list the directory (it hangs). Machinea and machineb both run Slackware.

I am at a loss.

I tried deleting mount point /home, mounting to mount point /home0 (I can see the files there) and doing a symbolic link to /home. I still can't see the files at /home but now I also can't see them at /home0.

Perhaps it is simply not responding, as an ls -las hangs. I dunno. I would appreciate any help or pointers on this. My web server has been down for two days now and I am out of ideas as well as sleep.

The command I was using is "mount -t nfs machinea:/home /home". On my Solaris box, the command "mount -F nfs -o soft,rw machinea:/home /home" worked fine.

Mike

Here's a little more info:

the fstab reads like this - machinea:/home /home nfs defaults,nolock 0 0

rpcinfo

program vers proto   port
100000    2   tcp    111  portmapper
100000    2   udp    111  portmapper
100011    1   udp    698  rquotad
100011    2   udp    698  rquotad
100011    1   tcp    701  rquotad
100011    2   tcp    701  rquotad
100003    2   udp   2049  nfs
100003    3   udp   2049  nfs
100021    1   udp  32770  nlockmgr
100021    3   udp  32770  nlockmgr
100021    4   udp  32770  nlockmgr
100005    1   udp  32768  mountd
100005    1   tcp  32771  mountd
100005    2   udp  32768  mountd
100005    2   tcp  32771  mountd
100005    3   udp  32768  mountd
100005    3   tcp  32771  mountd
100024    1   udp  32772  status
100024    1   tcp  32769  status

and a mount command gives:

machinea:/home on /home type nfs (rw,nolock,addr=xxx.xxx.xxx.xxx) where xxx.xxx.xxx.xxx is the ip address for machinea

Hope this sheds more light...

  1. When you mount machinea:/home on any other mount point on Machine B, can you list the files without problems?
  2. Also, can you give me the output of ls -ld /home on Machine B?
  3. Please post the output of cat /etc/exports on Machine A (which I assume is the NFS server).

Thanks!

Yes, anywhere else the list works perfectly (it's a large filesystem, so it takes a sec to get started, but then it bangs right down).

ls -ld /home stops that session from responding. - wait a minute! - it came back after about five minutes.

drwxr-xr-x 1118 root root 86016 2008-09-13 16:13 /home/

Here is the contents of /etc/exports

cat /etc/exports
# See exports(5) for a description.
# This file contains a list of all directories exported to other computers.
# It is used by rpc.nfsd and rpc.mountd.

/home machined(rw,no_root_squash) machinec(rw,no_root_squash) machineb(rw,no_root_squash) machinee(rw,no_root_squash)

Please also paste or attach the contents of dmesg for Machine A and B.

I don't know what happened, but here it is again:

Machinea Server

Machineb Client

I posted it, but the mods need to approve it - I guess it is too big?

and you posted it 3 times....

Actually only twice - the first time it was in two parts. The second time it was combined into one.

Does anyone see anything that might help me out here? I'm pretty desperate.

Please don't "bump" up your post. It's against the rules of unix.com!

OK, I tried increasing the time out using timeo and raised it to 5 seconds (timeo=50). It looked like it worked at first, I was able to list and all, but then after a reboot (I put the timeout into /etc/fstab) it was back to its old tricks.

I also noticed a couple (2 each of 2) ethernet cards sharing IRQs (9 & 10) - though this machine supports IRQ sharing, I thought I'd remove them in case. No change, so I put them back into the machine.

Looks like timeo worked after all. I brought it up to 250 (25 seconds) and it is responding. Now all I have to do is get the response time down. I suspect the 10/100 ethernet card and the Cisco 2900 switch it plugs into, probably some incompatibility between the two. I'll try setting both to 100 full duplex and see what happens.
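For anyone following along, here is a minimal sketch of the timeo arithmetic and the resulting fstab entry. NFS takes timeo in tenths of a second, which matches the figures in this thread (50 is 5 seconds, 250 is 25 seconds). The helper names timeo_to_seconds and fstab_line are made up for illustration; the fstab line is the one posted earlier with timeo appended.

```shell
#!/bin/sh
# timeo is expressed in tenths of a second; convert it to whole seconds.
# (Hypothetical helper, not part of any NFS tooling.)
timeo_to_seconds() {
    echo $(( $1 / 10 ))
}

# Print the fstab entry from this thread with a chosen timeo value added.
# (Hypothetical helper; host and paths are the ones used in the thread.)
fstab_line() {
    echo "machinea:/home /home nfs defaults,nolock,timeo=$1 0 0"
}

timeo_to_seconds 250
fstab_line 250
```

A large timeo only hides the slow link; it does not make the transfers themselves any faster.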

Thanks All!

Final Report:

Origin: Hub fried, moved ethernet from hub to switch. Both are 10/100 autonegotiating

Symptom: I can mount nfs file system to any mount point on the machine EXCEPT the mount point it needs to be on and see the file system as it should be. When I mount to that correct mount point, the machine hangs when I try to list files or perform any other action within that file system.

Initial response: I looked into many red herrings, but the one that brought success was increasing timeo to 250. When I did that, I could perform any tasks I needed to in that file system.

Probable Cause: My suspicion is that the 10/100 autonegotiation did not occur correctly, slowing down nfs communication between the file server and client. When I mounted on a mount point that did not connect to the service it needed to connect to, the load was light and I was able to do most things with it. When on the proper mount point, the load was sufficient to overload the ethernet connection and cause nfs timeouts.

Corrective Action: Disable autonegotiation and set both sides to 100 full duplex.
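On the Linux side, one way to pin the link is ethtool. This is a sketch under assumptions: the thread never names the interface, so eth0 is a placeholder, and it assumes ethtool is installed and the NIC driver supports forcing the mode. The helper force_100_full is hypothetical and only prints the command so it can be reviewed before running it as root. The switch port has to be set to the same speed and duplex, or the mismatch just moves.

```shell
#!/bin/sh
# Hypothetical helper: print the ethtool command that forces an interface
# to 100 Mb/s full duplex with autonegotiation off.
# The interface name defaults to eth0, which is an assumption.
force_100_full() {
    iface=${1:-eth0}
    echo "ethtool -s $iface speed 100 duplex full autoneg off"
}

# Show the command for review; run the printed line as root to apply it.
force_100_full eth0
```

The setting does not survive a reboot, so it would also need to go into a startup script.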

Mike

To save yourself some grief decoding network connectivity issues in the future, try this script (available from Sun):
Note: it will not provide information on 10G Ethernet, as that uses a different mechanism.

#!/bin/ksh
################################################################################
# Simple script to GET stats about network cards
# Should work on hme and qfe. Will NOT change anything.
# Will report on speed and config of all network interfaces.
# Paul Bates 27.03.2000
# James Council 26.09.2001
#       - Changed output to one liners.
#       - Added IPversion check.
# James Council 10.10.2002 (jamescouncil@yahoo.com)
#       - Added test for Cassini Gigabit-Ethernet card (ce_).
#       - Added test for GEM Gigabit-Ethernet (ge_)
#       - Added test for eri Fast-Ethernet (eri_).
#       - Added "Ethernet Address" field.
#       - Removed "IPversion" field.
#       - Removed checking of a port more than once (i.e. qfe0 qfe0:1)
# James Council 10.25.2002 (jamescouncil@yahoo.com)
#       - Fixed 1GB check on ge device.
# James Council 04.02.2003 (jamescouncil@yahoo.com)
#       - Added dmfe check (suggested by John W. Rudick, & Erlend Tronsmoen)
# Octave Orgeron 02.06.2004 (unixconsole@yahoo.com)
#       - Added bge check (bge_).
# Octave Orgeron 05.18.2005 (unixconsole@yahoo.com)
#       - Corrected CE check to use kstat, which is required in Solaris 10.
# Octave Orgeron 12.13.2005 (unixconsole@yahoo.com)
#       - Corrected CE and DMFE check. Added IPGE check. Special thanks to
#         Paul Bates, Christian Jose, and Bill Qualye for suggesting fixes and
#         for keeping me on my toes;)
# Octave Orgeron 02.07.2007 (unixconsole@yahoo.com)
#       - Added support for the Intel e1000g interfaces.
#       - Cleaned up script. Housecleaning.
#       - Tested against Fujitsu Quad GigE Nic's (FJGI)
# Paul Bates 10.03.2008  (sun@paulbates.org)
#       - included NXGE interfaces, Thanks Jorg Weiss and Randy Latimer !!
#       - Just tidied up code a little more, removed some fluff
#
# NOTE: For further updates or comments please feel free to contact me via
#       email.  James Council or Octave Orgeron or Paul Bates
#
################################################################################

NDD=/usr/sbin/ndd
KSTAT=/usr/bin/kstat
IFC=/sbin/ifconfig
DLADM=/usr/sbin/dladm

typeset -R10 LINK
typeset -R8 AUTOSPEED
typeset -R8 STATUS
typeset -R8 SPEED
typeset -R8 MODE
typeset -R18 ETHER

# Function to test that the user is root.

Check_ID()
{
ID=$(/usr/ucb/whoami)
if [ $ID != "root" ]; then
   echo "$ID, you must be root to run this program."
   exit 1
fi
}

# Function to test Quad Fast-Ethernet, Fast-Ethernet, and
# Gigabit-Ethernet (i.e. qfe_, hme_, ge_, fjgi_)

Check_NIC()
{
${NDD} -set /dev/${1} instance ${2}

if [ $type = "ge" ];then
   autospeed=`${NDD} -get /dev/${1} adv_1000autoneg_cap`
else
   autospeed=`${NDD} -get /dev/${1} adv_autoneg_cap`
fi

case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${NDD} -get /dev/${1} link_status`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${NDD} -get /dev/${1} link_speed`
case $speed in
   1000) SPEED=1GB      ;;
   1) SPEED=100MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${NDD} -get /dev/${1} link_mode`
case $mode in
   1) MODE=FDX          ;;
   0) MODE=HDX          ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test the Davicom Fast Ethernet, DM9102A, devices
# on the Netra X1 and SunFire V100 (i.e. dmfe_)

Check_DMF_NIC()
{
autospeed=`${NDD} -get /dev/${1}${2} adv_autoneg_cap`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${NDD} -get /dev/${1}${2} link_status`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${NDD} -get /dev/${1}${2} link_speed`
case $speed in
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${NDD} -get /dev/${1}${2} link_mode`
case $mode in
   2) MODE=FDX          ;;
   1) MODE=HDX          ;;
   0) MODE=UNKNOWN      ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test a Cassini Gigabit-Ethernet (i.e. ce_).

Check_CE()
{
autospeed=`${KSTAT} -m ce -i $num -s cap_autoneg | grep cap_autoneg | awk '{print $2}'`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${KSTAT} -m ce -i $num -s link_up | grep link_up | awk '{print $2}'`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${KSTAT} -m ce -i $num -s link_speed | grep link_speed | awk '{print $2}'`
case $speed in
   1000) SPEED=1GB      ;;
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${KSTAT} -m ce -i $num -s link_duplex | grep link_duplex | awk '{print $2}'`
case $mode in
   2) MODE=FDX          ;;
   1) MODE=HDX          ;;
   0) MODE=UNKNOWN      ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test Sun BGE interface on Sun Fire V210 and V240.
# The BGE is a Broadcom BCM5704 chipset. There are four interfaces
# on the V210 and V240. (i.e. bge_)

Check_BGE_NIC()
{
autospeed=`${NDD} -get /dev/${1}${2} adv_autoneg_cap`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${NDD} -get /dev/${1}${2} link_status`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${NDD} -get /dev/${1}${2} link_speed`
case $speed in
   1000) SPEED=1GB      ;;
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${NDD} -get /dev/${1}${2} link_duplex`
case $mode in
   2) MODE=FDX          ;;
   1) MODE=HDX          ;;
   0) MODE=UNKNOWN      ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test a Intel 82571-based ethernet controller port (i.e. ipge_).

Check_IPGE()
{
autospeed=`${KSTAT} -m ipge -i $num -s cap_autoneg | grep cap_autoneg | awk '{print $2}'`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${KSTAT} -m ipge -i $num -s link_up | grep link_up | awk '{print $2}'`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${KSTAT} -m ipge -i $num -s link_speed | grep link_speed | awk '{print $2}'`
case $speed in
   1000) SPEED=1GB      ;;
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${KSTAT} -m ipge -i $num -s link_duplex | grep link_duplex | awk '{print $2}'`
case $mode in
   2) MODE=FDX          ;;
   1) MODE=HDX          ;;
   0) MODE=UNKNOWN      ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test a Intel 82571-based ethernet controller port (i.e. e1000g_).

Check_E1KG()
{
autospeed=`${KSTAT} -m e1000g -i $num -s cap_autoneg | grep cap_autoneg | awk '{print $2}'`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${KSTAT} -m e1000g -i $num -s link_up | grep link_up | uniq |awk '{print $2}'`
case $status in
   1) STATUS=UP         ;;
   0) STATUS=DOWN       ;;
   *) STATUS=ERROR      ;;
esac

speed=`${KSTAT} -m e1000g -i $num -s link_speed | grep link_speed | awk '{print $2}'`
case $speed in
   1000) SPEED=1GB      ;;
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${KSTAT} -m e1000g -i $num -s link_duplex | grep link_duplex | awk '{print $2}'`
case $mode in
   2) MODE=FDX          ;;
   1) MODE=HDX          ;;
   0) MODE=UNKNOWN      ;;
   *) MODE=ERROR        ;;
esac
}

# Function to test Sun NXGE interface on Sun Fire Tx000.

Check_NXGE_NIC()
{
autospeed=`${NDD} -get /dev/${1}${2} adv_autoneg_cap`
case $autospeed in
   1) AUTOSPEED=ON      ;;
   0) AUTOSPEED=OFF     ;;
   *) AUTOSPEED=ERROR   ;;
esac

status=`${DLADM} show-dev ${1}${2} 2> /dev/null | awk '{print $3;}'`
case $status in
   up) STATUS=UP            ;;
   down) STATUS=DOWN        ;;
   unknown) STATUS=UNKNOWN  ;;
   *) STATUS=ERROR          ;;
esac

speed=`${DLADM} show-dev ${1}${2} 2> /dev/null | awk '{print $5;}'`
case $speed in
   1000) SPEED=1GB      ;;
   100) SPEED=100MB     ;;
   10) SPEED=10MB       ;;
   0) SPEED=10MB        ;;
   *) SPEED=ERROR       ;;
esac

mode=`${DLADM} show-dev ${1}${2} 2> /dev/null | awk '{print $NF;}'`
case $mode in
   full) MODE=FDX     ;;
   half) MODE=HDX     ;;
   unknown) MODE=---  ;;
   *) MODE=ERROR      ;;
esac
}

#############################################
# Start
#############################################

Check_ID

echo "\n      Link:  Auto-Neg:   Status:   Speed:    Mode:  Ethernet Address:"
echo "---------------------------------------------------------------------"

# Create a uniq list of network ports configured on the system.
# NOTE: This is done to avoid multiple references to a single network port
# (i.e. qfe0 and qfe0:1).

NICS=`${IFC} -a| egrep -v "lo|be|dman|lpfc|jnet"| awk -F: '/^[a-zA-Z]/ {print $1}'| sort -u`

for LINK in $NICS
do
   if [ `echo $LINK | grep e1000g` ]
   then
      type=e1000g
      num=$(echo $LINK | cut -f2 -d"g")
   else
      type=$(echo $LINK | sed 's/[0-9]//g')
      num=$(echo $LINK | sed 's/[a-z,A-Z]//g')
   fi

# Here we call the functions above to set the variables for each port, which
# will be output below.

   case ${type} in
      bge)      Check_BGE_NIC $type $num  ;;
      ce)       Check_CE $type $num       ;;
      dmfe)     Check_DMF_NIC $type $num  ;;
      ipge)     Check_IPGE $type $num     ;;
      e1000g)   Check_E1KG $type $num     ;;
      nxge)     Check_NXGE_NIC $type $num ;;
      *)        Check_NIC $type $num      ;;
   esac

# Set ethernet variable and output all findings for a port to the screen.

   ETHER=`$IFC $LINK| awk '/ether/ {print $2}'`
   echo "$LINK   $AUTOSPEED  $STATUS $SPEED $MODE $ETHER"
done

#############################################
# End
#############################################

Thanks - it looks like a great program, but it looks like this is a Solaris program. The server in question is a Linux machine.
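For a Linux box, a rough equivalent of the check can be put together from ethtool output. The sketch below is not a port of the Solaris script: it assumes ethtool is installed and that its report contains the usual "Speed:", "Duplex:" and "Auto-negotiation:" lines. The function name link_summary is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `ethtool <iface>` on stdin and
# print one line of the form "speed duplex autoneg".
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/            { speed = $2 }
        /^[[:space:]]*Duplex:/           { duplex = $2 }
        /^[[:space:]]*Auto-negotiation:/ { autoneg = $2 }
        END { printf "%s %s %s\n", speed, duplex, autoneg }
    '
}

# Usage (as root; eth0 is an assumed interface name):
#   ethtool eth0 | link_summary
```

A half-duplex result on a port that should be running full duplex would fit the mismatch suspected in the final report.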

Mike

Have you checked whether /home is being used by autofs?

Thanks, but I am not using autofs

Mike