Question about error

While attempting to use ftp to retrieve a file from a remote server (running SCO Unix 5.0.2), the transfer eventually fails with a timeout error and the following message: netin: connection reset by peer.

I can't find any reference to this error. Has anyone encountered it before, or know what causes it?

The peer sent a reset packet. This might happen if it rebooted. Once it comes back up it won't remember the connection. So when your box tries to continue the conversation, the remote box just aborts the connection.

Another cause that I have seen is when some other box takes the same IP address. You can test that by shutting down the remote server and trying to ping it. If something still answers, another machine is using that address.

Perderado, your thoughts on my problem make sense, but when tested, were not the cause.

Both servers stayed up during the ftp process, as verified with the who -b command. I then had the remote site power down their server and tried to ping its IP address. The ping failed, as it should.

I know this is outside the scope of Unix, but could it be a timing issue with their router / ISP / phone lines?

It is possible for an application program to abort a connection. It must turn the SO_LINGER option on and set the linger time to zero, then close(2) (as opposed to shutdown(2)) the socket. Actually, an explicit close call is not required; the automatic close that occurs when the process exits will do. And the process can exit either due to an explicit exit(2) or the default action of an uncaught signal.

Putting that all together, some programmers turn on SO_LINGER after establishing the connection. Then if the program aborts, so does the socket. This gives the remote end a clue that something went very wrong.
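Here is a rough sketch of that abortive close, run entirely over the loopback interface so it can be tried on one machine. The function name is made up for illustration, and the "ii" packing assumes struct linger is two C ints, which holds on Linux and most Unix systems but not on Windows.

```python
import socket
import struct

def abortive_close_demo():
    """Return the exception the client sees after the server aborts the connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port))
    client.settimeout(5.0)               # don't hang forever if no reset arrives
    server_side, _ = listener.accept()

    # struct linger { l_onoff = 1, l_linger = 0 }
    # Assumption: two C ints, as on Linux and most Unix systems.
    linger = struct.pack("ii", 1, 0)
    server_side.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    server_side.close()                  # zero linger time: RST instead of FIN

    try:
        client.recv(1024)
        return None                      # no reset was seen
    except ConnectionResetError as exc:
        return exc                       # "Connection reset by peer"
    finally:
        client.close()
        listener.close()

if __name__ == "__main__":
    print(abortive_close_demo())
```

Without the setsockopt call, the same close would be an orderly shutdown and the client's recv would just return an empty result.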

You can see if the ftpd on the remote system behaves this way. Establish a connection, and then have the remote sysadmin "kill -9" the appropriate instance of ftpd. If you get the "reset by peer" message, then this is a possibility. There may be a bug on the remote ftpd which is causing it to abort.
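For a test like this, it helps to tell an abort from a normal close on the client side. The helper below is only a sketch: the function name is hypothetical, and the host name in the usage comment is a placeholder, not something from this thread. It reads from an already connected socket until the connection ends and reports how it ended.

```python
import socket

def how_connection_ended(sock, timeout=30.0):
    """Read until the connection ends; return 'closed', 'reset', or 'timeout'."""
    sock.settimeout(timeout)
    try:
        while True:
            data = sock.recv(4096)
            if not data:
                return "closed"      # orderly shutdown: the peer sent FIN
    except ConnectionResetError:
        return "reset"               # the peer, or something in between, sent RST
    except socket.timeout:
        return "timeout"             # nothing arrived within the timeout

# Hypothetical usage against the ftp control port (placeholder host name):
#   sock = socket.create_connection(("remote.example.com", 21))
#   print(how_connection_ended(sock, timeout=120.0))
```

If the remote ftpd is killed and this reports "reset" rather than "closed", that is consistent with the SO_LINGER behavior described above, though it does not prove which box sent the reset.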

There could also be a bug in the remote kernel which is causing the abort.

Since this seems to be a WAN issue, you need to examine your network topology. It is very likely that your peer in this case is not the remote system. It could be a firewall, for instance. In fact, you could have two firewalls and a proxy server daisy-chained between you and the remote ftp server. If so, the abort could be coming from any of them.