Count Number Of Threads in a Process

I am trying to find out how many threads created through the POSIX threads interface currently exist in a process, whether they are running or in any other state.
First I defined a variable called proc_var of type proc, as defined in sys/proc.h. Next I open the directory /proc and, for each entry, perform an ioctl operation with command PIOCPSINFO, passing the address of proc_var as the target. But when I print the value of the member p_lwpcnt of proc_var (I am not sure whether I am referring to the correct one), I get '6488064' as its value. The process is a user-created one, and its code creates only four threads.
Kindly guide me on how to find the correct number of threads currently executing in a process, along with their details.
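
As far as I can tell, PIOCPSINFO fills in a prpsinfo_t rather than a struct proc, which would explain a garbage value in p_lwpcnt. A minimal sketch of an alternative, assuming the structured /proc described in proc(4) on Solaris 2.6 and later, where /proc/[pid]/psinfo holds a psinfo_t whose pr_nlwp member is the LWP count. The helper name lwp_count_from_psinfo is hypothetical, and the value is a count of LWPs, which is not necessarily the number of pthreads:

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#if defined(__sun)
#include <procfs.h>   /* psinfo_t for the structured /proc (assumed Solaris 2.6+) */
#endif

/*
 * Hypothetical helper: return the number of LWPs in process `pid`,
 * or -1 if it cannot be determined (always -1 on non-Solaris systems).
 */
int lwp_count_from_psinfo(pid_t pid)
{
#if defined(__sun)
    char path[64];
    psinfo_t info;
    int fd;
    ssize_t got;

    snprintf(path, sizeof path, "/proc/%ld/psinfo", (long)pid);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The psinfo file is a single psinfo_t image; read it in one go. */
    got = read(fd, &info, sizeof info);
    close(fd);
    if (got != (ssize_t)sizeof info)
        return -1;

    return info.pr_nlwp;   /* LWP count as reported by the kernel */
#else
    (void)pid;             /* no psinfo_t layout assumed elsewhere */
    return -1;
#endif
}
```

Calling lwp_count_from_psinfo(getpid()) from inside the program under test avoids having to look up the pid first.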

What OS are you using? With SunOS 5.6, a simple "ls /proc/*/lwp" will show you all the threads.

We are working on SunOS Rel 5.8. In fact, I have already tried that method, counting the number of subdirectories in /proc/[pid]/lwp with the following algorithm:

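.......
/* After I have found the pid of the required process */
sprintf ( Dir , "/proc/%d/lwp" , (int) pid ) ;
if ( ! chdir ( Dir ) ) {
    Counter = 0 ;
    if ( ( dp = opendir ( Dir ) ) != NULL ) {
        /* Skip "." and ".." ; every other entry is one LWP */
        while ( ( dirp = readdir ( dp ) ) != NULL )
            if ( dirp->d_name[0] != '.' ) Counter ++ ;
        closedir ( dp ) ;   /* release the directory stream */
    }
}
.......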
I get the right count of subdirectories in the variable Counter. But the number of threads created in the program does not match the number of subdirectories in /proc/[pid]/lwp. For example, in an application where I created two threads using pthread_create, the lwp directory for that process contains five subdirectories. I cannot figure out why this difference exists.
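
For reference, the directory walk in the fragment above can be packaged as a self-contained function. This is only a sketch: the names count_visible_entries and count_lwps are made up for illustration, and the /proc/[pid]/lwp layout is assumed to be the Solaris one discussed in this thread.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>

/* Count directory entries whose names do not start with '.'; -1 on error. */
int count_visible_entries(const char *path)
{
    DIR *dp = opendir(path);
    struct dirent *ent;
    int count = 0;

    if (dp == NULL)
        return -1;
    while ((ent = readdir(dp)) != NULL)
        if (ent->d_name[0] != '.')   /* skips "." and ".." */
            count++;
    closedir(dp);
    return count;
}

/* Hypothetical wrapper: number of LWPs of `pid`, assuming /proc/<pid>/lwp exists. */
int count_lwps(pid_t pid)
{
    char dir[64];

    snprintf(dir, sizeof dir, "/proc/%ld/lwp", (long)pid);
    return count_visible_entries(dir);
}
```

Note that this still counts LWPs, so it reproduces the mismatch described above rather than resolving it.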

In a simple C program like the following :
/* program :- One.c */
#include <stdio.h>

int main ( void )
{
    printf ( "Hi To Everyone !\n" ) ;
    for ( ; ; ) ;   /* spin forever so the process stays visible in /proc */
}
which is compiled with the following command :
cc One.c
and then executed, I found that /proc/[pid]/lwp contains only one subdirectory. But when the same program is compiled with the following option:
cc One.c -lpthread
and then executed, the lwp directory for its pid in the /proc filesystem contains three subdirectories.
Why does this happen?

I don't exactly have an answer, but I may be able to shed some light here.

When you use fopen(), you get a stream. But this is built on open(), and open() gives you an fd. The fd is a kernel thing and the stream is a library thing built on top of it.

In the same way, a lwp is a kernel thing and a thread (or a pthread) is a library thing.

What I have just discovered this morning is that the lwp's and the threads are not in a one-to-one correspondence.

Look at "man pthread_attr_init". You will see language like 'This thread is not "bound" to a LWP, and is also called an unbound thread.'
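
To make the bound/unbound distinction concrete, here is a minimal sketch of how a bound thread can be requested through the POSIX attribute interface. It assumes that PTHREAD_SCOPE_SYSTEM is the contention scope that gives a bound thread on this platform, which is my reading of the man page; the names echo_arg and run_bound_thread are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body: hand the argument straight back. */
static void *echo_arg(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: start one thread with system contention scope
 * (a "bound" thread in Solaris terms) and wait for it to finish.
 * Returns 0 on success, the error number of a failing pthread call,
 * or -1 if the thread returned an unexpected value.
 */
int run_bound_thread(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    void *result = NULL;
    int token = 42;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for a bound thread instead of the default scope. */
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, echo_arg, &token);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_join(tid, &result);
    if (rc != 0)
        return rc;
    return (result == &token) ? 0 : -1;
}
```

If every thread in a program is created this way, I would expect the LWP count to track the thread count much more closely, though I have not verified that on 5.8.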

I gotta read up on threads sometime. Since you are trying to count them, maybe you should do the same. In any event, it looks like counting LWP's is not going to help you count threads.

Here is my guess as to why 3 lwps:

Consider what would happen if you did an fopen(), but all of the fd's were in use. If you had not yet reached the limit on fd's, when the open() system call occurred, the kernel would allocate another chunk of fd's to the process, not just one.

Or if you need more stack, your stack grows by a page, not just the few bytes that you need.

Or if you write another byte to a disk file, the file may grow by a full block.

The kernel may not be easily able to give you just one lwp, or it may just be attempting to be efficient.

The size of the LWP pool has a critical impact on the performance of the many-to-many model: if the number of LWPs in the pool is nearly equal to the number of threads, the implementation will act much like the one-to-one model. Conversely, if there are very few LWPs in the pool, the implementation will act like the many-to-one model.
Of particular concern is the risk of deadlock with an excessively small pool: one thread may block on a resource in the kernel and go to sleep, and by so doing block the LWP needed to run the resource-holder. To solve this problem, the threads package makes a minimal guarantee to the threads programmer: progress will always be made. This is implemented through the use of the SIGWAITING signal. When the kernel realizes that all of a process's LWPs are blocked at the kernel level, it drops a SIGWAITING on the process. Upon receipt of the signal, the user-level threads package decides whether or not to create a new LWP, on the basis of the number of runnable threads. The SIGWAITING mechanism makes no guarantees about optimal use of LWPs on a multiprocessor. Specifically, a process may have many more runnable user-level threads than it has LWPs, but it does not receive a SIGWAITING until all LWPs are blocked. Thus, even if there are processors available and work to be done, the SIGWAITING mechanism does not guarantee that there is a sufficient number of LWPs to run the user threads on the available processors. If the programmer wishes to use unbound threads and take advantage of all available processors, he or she is required to advise the library on the number of LWPs required.

That is interesting. I've been looking for some info on SIGWAITING. Do you have any info on SIGLWP as well?

Hmmm, it's not obvious to me how the SIGWAITING signal handler is run if all the lwp's are blocked. Does the thread library keep one lwp to itself? That would explain the 3 lwp's created as a default. One for the library itself and two for the user.

The SIGLWP signal is used as an inter-LWP signaling mechanism when directed to particular LWPs within the process via the _lwp_kill() interface. It is reserved for the threads packages.

The SIGWAITING signal is generated by the kernel when it detects that all the LWPs in the process have blocked in indefinite waits. It is used by threads packages to ensure that processes don't deadlock indefinitely due to lack of execution resources.