Losing connection after authentication

Hi!

I am having a problem logging in to a Solaris 10 server: after typing one or two commands, I lose connectivity, with the following message:

server unexpectedly closed network connection

I have checked the following:

 grep `uname -n` /etc/inet/hosts /etc/inet/ipnodes
/etc/inet/hosts:190.54.1.60     AIOPTSVR        loghost aioptsvr.mcel.co.mz
/etc/inet/ipnodes:190.54.1.60   AIOPTSVR        loghost aioptsvr.mcel.co.mz
 more /etc/hostname.e1000g0
190.54.1.60
 grep -v ^# /etc/defaultrouter
190.54.1.1
ifconfig e1000g0
e1000g0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
        inet 190.54.1.60 netmask ffff0000 broadcast 190.54.255.255
        ether 0:14:4f:c6:49:1e
 dladm show-dev e1000g0
e1000g0         link: up        speed: 100   Mbps       duplex: full
 netstat -rn

Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              190.54.1.1           UG        1     550240
10.0.0.0             10.0.0.60            U         1        959 e1000g1
190.54.0.0           190.54.1.60          U         1        405 e1000g0
224.0.0.0            190.54.1.60          U         1          0 e1000g0
127.0.0.1            127.0.0.1            UH       11      12189 lo0
 netstat -ni -I e1000g0
Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue
e1000g0 1500 190.54.0.0    190.54.1.60    77166417 0     42951194 0     0      0
egrep '(^hosts|^services)' /etc/nsswitch.conf
hosts:      files
services:   files
inetadm | grep -v disabled
ENABLED   STATE          FMRI
enabled   online         svc:/application/x11/xfs:default
enabled   online         svc:/application/font/stfsloader:default
enabled   online         svc:/application/print/rfc1179:default
enabled   online         svc:/network/rpc/cde-calendar-manager:default
enabled   online         svc:/network/rpc/cde-ttdbserver:tcp
enabled   online         svc:/network/rpc/gss:default
enabled   online         svc:/network/rpc/smserver:default
enabled   online         svc:/network/rpc/rstat:default
enabled   online         svc:/network/rpc/rusers:default
enabled   online         svc:/network/cde-spc:default
enabled   online         svc:/network/security/ktkt_warn:default
enabled   online         svc:/network/telnet:default
enabled   online         svc:/network/nfs/rquota:default
enabled   online         svc:/network/swat:default
enabled   online         svc:/network/ftp:default
enabled   online         svc:/network/finger:default
enabled   online         svc:/network/login:rlogin
enabled   online         svc:/network/shell:default
enabled   online         svc:/network/rpc-100235_1/rpc_ticotsord:default
enabled   online         svc:/network/stdiscover:default
enabled   online         svc:/network/stlisten:default

I really don't know what else to check. I did all these checks while connected through the management port of the chassis.

Is it a production server?
Do you lose connection as root?

I have been in bed for a long time (and completely disconnected from the true UNIX world), so I don't remember now whether that means you came in through the equivalent of a serial port. Is that so?

Yes, it's a production server, and yes, I am root, but user oracle has the same issue.

The server (a blade) is located in a chassis with 6 other blades. What I did, and I can log in fine this way, is connect to the management port of the chassis and then connect from there to the particular server. Once there, I can do all my work and the server will not kick me out.

---------- Post updated at 12:32 PM ---------- Previous update was at 10:40 AM ----------

I have issued

snoop -D

and I had the following results:

 190.54.1.62 -> AIOPTSVR     drops: 0 TCP D=2049 S=1019 Ack=1829873131 Seq=3565022256 Len=1460 Win=49640
 190.54.1.62 -> AIOPTSVR     drops: 0 TCP D=2049 S=1019 Push Ack=1829873131 Seq=3565023716 Len=380 Win=49640
    AIOPTSVR -> 190.54.1.62  drops: 0 TCP D=1019 S=2049 Ack=3565023716 Seq=1829873131 Len=0 Win=46720
    AIOPTSVR -> 190.54.1.62  drops: 0 NFS R 4 (write       ) NFS4_OK PUTFH NFS4_OK WRITE NFS4_OK 3115 (ASYNC)
 190.54.1.62 -> AIOPTSVR     drops: 0 NFS C 4 (commit      ) PUTFH FH=7205 COMMIT at 0 for 8192

What does the "drops: 0" field indicate? A dropped connection or not?
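
For what it's worth, going by the snoop(1M) man page, -D adds to each summary line the number of packets dropped during capture, so "drops: 0" would mean snoop itself missed nothing, not that a connection was or wasn't dropped. A minimal sketch, assuming the snoop output has been saved or piped as text, that reports the largest drops value seen (max_drops is a made-up helper name, not part of snoop):

```shell
#!/bin/sh
# Print the largest "drops: N" value found in snoop -D summary lines
# read from standard input (prints 0 if no such field is found).
max_drops() {
    awk '
        {
            for (i = 1; i < NF; i++)
                if ($i == "drops:" && $(i + 1) + 0 > max)
                    max = $(i + 1) + 0
        }
        END { print max + 0 }
    '
}

# Shortened summary lines taken from the capture above.
max_drops <<'EOF'
 190.54.1.62 -> AIOPTSVR     drops: 0 TCP D=2049 S=1019 Ack=1829873131
    AIOPTSVR -> 190.54.1.62  drops: 0 TCP D=1019 S=2049 Ack=3565023716
EOF
```

Note that the packets shown are NFS traffic on port 2049, so they probably say little about the login session itself; narrowing the capture to the client that gets disconnected would likely be more telling.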

I'd be suspicious of the network infrastructure rather than the Solaris box.

I'd be tempted to connect a client via a crossover cable directly to the normal network port of the Solaris box and login from there to see whether it is really Solaris that is closing the connection.

What network boxes (eg, switches) are between your users and Solaris? Could the network be faulty? Anyone changed anything very recently?

My number one suspect here really is the network infrastructure, but the network guys deny it, as always...
If I can connect via the chassis with no issues (same as connecting directly), I really can't treat this as a Solaris problem.
I have no knowledge of what network equipment sits between my workstation and the server, but I can say that I am on a different network, and that the connectivity I have with the server via the management port of the blade chassis is fine.

Well, if the system is disconnecting users after only a few seconds then this "production" system is no good to them, is it!! So put a client box on the main network connection (even if you use a mini switch of your own if you haven't got a crossover cable) and test it yourself, bypassing the network infrastructure. That will tell you if Solaris is actually kicking people out or not.

You're not dealing with a Mickey Mouse desktop OS here, Solaris runs banks worldwide and is unlikely to be the culprit here. A simple test will prove it or not.

---------- Post updated at 12:48 PM ---------- Previous update was at 12:46 PM ----------

The fact that you can use the management port without getting kicked out doesn't prove the main network port is okay. Connect your own client to the main network port.

---------- Post updated at 12:50 PM ---------- Previous update was at 12:48 PM ----------

And don't forget that you'll have to set your client box to a suitable static ip address.
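
A suitable address here means one on the server's own network that is not already in use. A small sketch of the first check, assuming the address and netmask from the ifconfig output earlier in the thread (190.54.1.60, ffff0000); the candidate 190.54.1.61 and the helper names ip_to_int and same_network are made up for illustration, and the script cannot tell whether the address is already taken:

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if both addresses fall in the same network, given a hex
# netmask written the way Solaris ifconfig prints it (e.g. ffff0000).
same_network() {
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    mask=$(( 0x$3 ))
    [ $(( a & mask )) -eq $(( b & mask )) ]
}

server=190.54.1.60      # from the ifconfig output in this thread
netmask=ffff0000        # from the ifconfig output in this thread
candidate=190.54.1.61   # hypothetical: choose an address known to be free

if same_network "$server" "$candidate" "$netmask"; then
    echo "$candidate is on the server's network"
else
    echo "$candidate is NOT on the server's network"
fi
```

With a ffff0000 netmask, anything in 190.54.x.x passes the check, so the remaining work is confirming with whoever manages the address plan that the chosen address is free.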

I have done what you suggested: I configured my laptop with the same IP of the server, connected, and had no ISSUES!!!
What next??

So you've logged in, it doesn't kick you out, and you can do stuff whilst logged in?

If that's so then you should tell the network boys just that and tell them to get their act together and fix this disconnection problem.

Avoid ANY temptation to mess with the Solaris OS irrespective of anything they might say. Solaris is not the issue.

---------- Post updated at 01:38 PM ---------- Previous update was at 01:34 PM ----------

You could also get one of your users who is getting this disconnection issue to bring their workstation into the computer room and test that the application can be used on a direct connection (just in case the application is kicking them out). Again, most unlikely.

I have done that: one user was in the computer room with me, and he managed to do some work...

Case closed, thanks a lot