Problem with socket binding - "system" call

Hi,

I am having an issue with using sockets.

I have a program that binds a socket to a port and listens on it. Later I spawn a thread to handle some work. In that new thread I need to run a shell script that performs the actual task, and I use system() to call the script.

In the new thread I created I am not using the socket for either reading/writing.

  1. First, I am not sure why the shell script that I am calling ends up holding the listening socket.
    The following is an extract from lsof:

socket1 15920 janardha 3u IPv4 60551055 TCP *:4500 (LISTEN)
runtest 15922 janardha 3u IPv4 60551055 TCP *:4500 (LISTEN)

socket1 is my main program, where I bind to port 4500. In the new thread I call a shell script, "runtest", which also shows up as bound to port 4500.

  2. Is there a way to keep the script started from the new thread from holding port 4500?

The problem got worse in the following situation.

For some reason the shell script hung for a long time. In the meantime the main process was restarted because of a core dump in another thread.
When the process came back up, it found the port already in use and failed to bind to it.

The bind failure keeps happening until the shell script is killed or the script exits on its own.
[I have found a solution to the hung thread issue: in the shell script, I spawn a shell that sleeps for some time and kills the main thread if it still exists. If the main thread is not hung, I kill the newly spawned shell (this piece of code is not in the example I have mentioned).]

Now I am looking for a way to keep the socket from being used by whatever the new thread spawns. I mean that while the shell script is running, it should not hold the listening socket that was created in the main thread.

Please help me resolve this issue.

Your help is appreciated.

Thanks in advance.

Here I am attaching my example program.

----------------------------------------------------------------

#include<stdio.h>
#include<string.h>
#include<netinet/in.h>
#include<arpa/inet.h>
#include<sys/types.h>
#include <sys/uio.h>
#include <sys/socket.h>
#include<unistd.h>
#include<pthread.h>
#include<errno.h>
#include<stdlib.h>

#define SA struct sockaddr
#define LISTENQ 15

#define MAXLINE 1000

void * start_func(void *val);

int main(int argc, char **argv)
{
  int listenfd, connfd;
  struct sockaddr_in servaddr;
  char buff[MAXLINE];
  time_t ticks;
  pthread_t thr_id;
  int val=1;

  listenfd=socket(AF_INET,SOCK_STREAM,0);

  memset(&servaddr,0,sizeof(servaddr));
  servaddr.sin_family=AF_INET;
  servaddr.sin_addr.s_addr=htonl(INADDR_ANY);
  servaddr.sin_port=htons(4500);

  setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, (int*)&val, sizeof(val));

  if (bind(listenfd, (SA *) &servaddr, sizeof(servaddr)) != 0) {
    printf ("Bind failed , errno %d\n", errno);
    return 0;
  }

  listen(listenfd, LISTENQ);

  if (pthread_create(&thr_id, NULL, start_func, NULL) != 0){
    printf("failed to create a thread\n");
    return 0;
  }

  pthread_detach(thr_id);

  /* For testing I am putting main thread in sleep. */
  printf("Main thread in sleep for 300 secs\n");
  sleep(300);

}

/* thread function */
void *start_func(void *val) {

  int retVal=0;

  char *cmd="/home/janardha/code/runtest;";

  printf("calling system function\n");

  retVal = system(cmd);

  if (retVal != 0) {
    printf ("failed to execute the command\n");
    return NULL;
  }

  /* A thread function declared as void * must return a value on every path. */
  return NULL;
}

Shell script runtest.
-------------------

#!/usr/bin/ksh

ls -lrt
print "In sleep"
sleep 100

----------------------------------------------------------------

The extracts from the output are:

bash-3.1$ ./socket1
Main thread in sleep for 300 secs
calling system function
total 12
-rwxrwxr-x 1 janardha janardha 53 May 7 06:49 runtest
-rw-rw-r-- 1 janardha janardha 1536 May 9 07:21 socket1.c
-rwxrwxr-x 1 janardha janardha 7096 May 9 07:22 socket1
In sleep

Output of lsof:
--------------
socket1 15920 janardha 3u IPv4 60551055 TCP *:4500 (LISTEN)
runtest 15922 janardha 3u IPv4 60551055 TCP *:4500 (LISTEN)

output of process list:
---------------------
janardha 15920 32001 0 07:24 pts/25 00:00:00 ./socket1
janardha 15922 15920 0 07:24 pts/25 00:00:00 /usr/bin/ksh /home/janardha/code/runtest

thanks,
Janardhan

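Just a guess, but: a thread isn't a "new" part of a program, but a part of the code that executes in parallel inside the same process, so it shares the process's open file descriptors. A call to system() is, basically, a fork(), exec(), and wait(). With fork() the complete context of the process is copied, including your socket descriptor, and exec() keeps that descriptor open in the new program. That is why lsof shows runtest listening on port 4500 even though the script never called bind().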
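
To make the inheritance point concrete, here is a minimal sketch (not the original poster's code) that checks whether a forked child sees a descriptor the parent opened. The helper name child_inherits_fd is made up for illustration, and it only covers the fork() half of what system() does.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if a forked child finds `fd` open,
 * 0 if the child finds it closed, and -1 if fork/wait fails.
 */
int child_inherits_fd(int fd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: F_GETFD fails when the descriptor is not open here. */
        _exit(fcntl(fd, F_GETFD) == -1 ? 1 : 0);
    }

    /* Parent: collect the child's verdict from its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Called with the listening socket from the example program, this would be expected to return 1, which matches what lsof reports for runtest.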

One option that I can think of would be to do the fork() in the thread, clean up the environment, and then call system().

Thanks for the reply.

How do I clean up the environment?

In your case, it should be enough to just close that copy of the socket. But I can't guarantee that it won't also close the parent's socket, so you should test that first.
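
A rough sketch of that fork-and-close approach, assuming the command is started through /bin/sh the same way system() does it. run_without_fd is a hypothetical helper, not a library function. The close() happens after fork(), so it only releases the child's copy of the descriptor.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs `cmd` via /bin/sh like system(), but closes
 * `fd_to_close` in the child before the exec.
 * Returns the command's exit status, or -1 if fork/wait fails.
 */
int run_without_fd(const char *cmd, int fd_to_close)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: drop our copy of the socket; the parent's stays open. */
        close(fd_to_close);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127); /* reached only if the exec itself failed */
    }

    /* Parent: wait for the command, as system() would. */
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In the example program the thread would call something like run_without_fd("/home/janardha/code/runtest", listenfd), which means listenfd has to be made visible to the thread, for instance through the pthread_create() argument. Only close(), execl() and _exit() run in the child, which keeps it to async-signal-safe calls in a multithreaded process.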

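Alternatively, call fcntl() on the socket with the F_SETFD command and the FD_CLOEXEC flag (which has the value 1). This sets the close-on-exec flag for the socket, so the descriptor is closed automatically in the child when it calls exec(). The parent's copy of the descriptor remains open.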
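
A minimal sketch of the close-on-exec approach. The wrapper name set_cloexec is made up here; fcntl(), F_GETFD, F_SETFD and FD_CLOEXEC are the standard POSIX names.

```c
#include <fcntl.h>

/*
 * Hypothetical wrapper: marks `fd` close-on-exec while keeping any other
 * descriptor flags. Returns 0 on success, -1 on failure (errno is set).
 */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```

In the example program this would be called as set_cloexec(listenfd) right after socket() returns, before the thread is created. The system() call can then stay as it is. Any sockets returned by accept() would presumably need the same treatment. On Linux, passing SOCK_CLOEXEC in the type argument of socket() sets the flag at creation time and avoids the small window between socket() and fcntl().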

Thanks for the solution. It worked.