Hi! I need some help understanding a little more about the behaviour of sockets and TCP connections...
Here is my problem: I have a client and a server, both written in Python. The server waits
until a message arrives and then prints it, but if no message arrives within a second (a one-second timeout)
the connection is reset. On the other side, the client sends a message at a random time, plus a "Still alive" message about every 0.4 seconds, to the server.
For reasons I can't explain, a timeout randomly happens on the server even though I send the message just in time,
and I don't understand why. I tried disabling Nagle's algorithm, but the only change I saw was
how long it took for the next timeout to appear (in fact it took longer). I thought that if I can't avoid the timeouts
I must resend any message that was not received, but this raises a few questions:
Where do the messages that were lost in the timeout go?
Can I get duplicate messages after the timeout? I mean, if the timeout occurs, can the message I sent still arrive later?
Why do timeouts occur if I send the message correctly and on time?
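To make the setup concrete, here is a simplified sketch of what I described (this is not my actual code; the address, message contents, and heartbeat count are placeholders):

```python
import socket
import threading
import time

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
srv.listen(1)
PORT = srv.getsockname()[1]

received = []

def server():
    """Accept one client and print its messages; reset on 1 s of silence."""
    conn, _ = srv.accept()
    conn.settimeout(1.0)          # the one-second timeout
    try:
        while True:
            data = conn.recv(1024)
            if not data:
                break             # client closed the connection normally
            received.append(data)
            print(data)
    except socket.timeout:
        print("timeout - resetting connection")
    finally:
        conn.close()
        srv.close()

def client():
    """Send a 'Still alive' heartbeat about every 0.4 seconds."""
    sock = socket.create_connection(("127.0.0.1", PORT))
    for _ in range(3):
        sock.sendall(b"Still alive")
        time.sleep(0.4)
    sock.close()

t = threading.Thread(target=server)
t.start()
client()
t.join()
```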
Just going to take a couple of guesses at your questions:
When you say you are having a "timeout" I'm going with the listener has timed out, in which case the messages are gone - nothing was listening for them.
Once the listener has timed out, it won't matter how many messages that are sent - nothing is listening for them.
Why the time out occurs before the message arrives could be down to a number of reasons, but most likely traffic on the network or load on the system.
The time out seems to be very short at a second, can you post some more detail about what you are trying to do?
*When you say you are having a "timeout" I'm going with the listener has timed out, in which case the messages are gone - nothing was listening for them.
--->OK, yes, I thought the same, but maybe, just maybe, the message is stuck in some
"magical" buffer and arrives afterwards.
*Once the listener has timed out, it won't matter how many messages that are sent -
nothing is listening for them.
---> OK, I did not explain that, but the recv is in a while loop; when a timeout happens the
connection is closed and the server waits for a new connection. After that, the server is
again listening for new messages.
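In sketch form, the structure is roughly this (not my exact code; in the real server the outer loop is `while True`, bounded here only so the demo terminates):

```python
import socket
import threading
import time

def run_server(srv, max_clients=1):
    """Outer loop: accept a client. Inner loop: recv messages until
    one second of silence, then close and go back to accept()."""
    messages = []
    for _ in range(max_clients):       # 'while True' in the real server
        conn, _ = srv.accept()
        conn.settimeout(1.0)
        try:
            while True:
                data = conn.recv(1024)
                if not data:
                    break              # client closed the connection
                messages.append(data)
        except socket.timeout:
            pass                       # timeout: reset, back to accept()
        finally:
            conn.close()
    srv.close()
    return messages

# demo: one client sends a message, then goes silent past the timeout
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))             # port 0: OS picks a free port
srv.listen(1)
port = srv.getsockname()[1]

result = []
t = threading.Thread(target=lambda: result.extend(run_server(srv)))
t.start()
c = socket.create_connection(("127.0.0.1", port))
c.sendall(b"hello")
time.sleep(1.5)                        # silent for longer than the timeout
c.close()
t.join()
```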
*Why the time out occurs before the message arrives could be down to a number of reasons, but most likely traffic on the network OR LOAD ON THE SYSTEM.
--->I started to suspect that. I was debugging this on two computers, and when the timeout happened the server machine was doing heavy disk activity (I could hear the disk).
About the network traffic, that raises a question: how can I know whether the traffic on the
network is high or not?
*The time out seems to be very short at a second, can you post some more detail about what you are trying to do?
---> I just want to know as quickly as possible when the server or the connection has dropped. I thought of using TCP keepalive. This link (https://www.digi.com/wiki/developer/index.php/Handling_Socket_Error_and_Keepalive) explains how to activate TCP keepalive, but it also says this:
"Do NOT try to use TCP Keepalive to detect TCP socket failure more quickly than a few
minutes. People who try to set it for 5 seconds (or for milliseconds) invariably cause
serious compatibility issues with other products - and invariably fail to be satisfied. If you
truly require detecting a TCP socket failure in 1 second or less, which implies your TCP
peers normally send data many times per second, then use non-blocking sockets with the
"socket.timeout" exception to detect when no data had been received in your required
time-frame. And if you accept that a TCP peer quiet for 1 second is bad, then close the
socket manually and attempt recovery directly. Do not use TCP Keepalive for such short-period detection."
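For reference, this is roughly how TCP keepalive is enabled from Python. The per-connection tuning options below are Linux-specific (other platforms expose different names or none at all, which is part of the portability problem the quote warns about), so they are guarded with `hasattr`:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # turn keepalive on

# Linux-specific tuning; these constants may not exist on other platforms
if hasattr(socket, "TCP_KEEPIDLE"):
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # seconds idle before probing
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before dropping

enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
print(enabled)  # non-zero when keepalive is on
sock.close()
```

Even with these knobs turned down, detection still takes tens of seconds at best, which is why the quote recommends an application-level `settimeout` (as in the server above) for sub-second detection instead.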