pthread_cancel failure

John_S · February 8, 2012, 7:48pm

I'm running a simple web server and seem to be having a problem canceling sessions.

When a new request is received I start a thread to handle that session's requests. Since I want to keep the pipe open for a long time - 10 minutes or maybe 2 hours - I also have a session manager that checks for inactivity. When the time is up I want to cancel the session thread.

The problem is I don't think the thread is getting canceled and I'm leaking memory with zombie sessions. I've set the threads for asynchronous cancellation but I'm getting a return code of 3 and I should get all zeroes per the manual.

Server code:
pthread_create( &NewSessionThread, NULL, &HTTP_Server_Session, (void *) (NewSessionListItem));

Session code:
pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, oldstate);
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, oldtype);

Session manager code:
CancelResult = pthread_cancel(SessionPointer->SessionThreadID);

fprintf(CrashLog, "HTTP pthread cancellation result (0, success - other not): %d\n", CancelResult);
if (CancelResult == 3)
fprintf(CrashLog, "HTTP pthread cancellation error: %s\n", strerror(errno));

Logfile sample result:
HTTP pthread cancellation result (0, success - other not): 3
HTTP pthread cancellation error: Success

I guess I have three questions:
Why isn't the cancellation working?
Why is the error string "Success" when the result is not 0?
How should I try to fix it?

Thanks!

jim_mcnamara · February 8, 2012, 11:04pm

If the sessions are blocking on a read() they should be at a cancellation point, no problem.
Are you invoking any kind of cleanup -- pthread_cleanup_push & pop? Are you leaving malloc'd memory behind?

I think you ARE cancelling but not cleaning up something. I dunno what precisely.
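To make the cleanup question concrete, here is a minimal sketch of a cleanup handler on a thread that blocks in read(). It assumes the default deferred cancellation type (not the asynchronous type set in the original post), and the struct session, session_cleanup, session_thread and demo_cancel_blocked_reader names are hypothetical, invented for illustration rather than taken from the poster's server.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-session state, not the structure from the post. */
struct session {
    int   fd;       /* descriptor the session blocks on */
    char *buffer;   /* memory owned by the session thread */
    int   cleaned;  /* set by the cleanup handler so callers can check it */
};

/* Runs when the thread is cancelled, or when pthread_cleanup_pop(1) executes. */
static void session_cleanup(void *arg)
{
    struct session *s = arg;
    free(s->buffer);
    s->buffer = NULL;
    s->cleaned = 1;
}

static void *session_thread(void *arg)
{
    struct session *s = arg;
    char byte;

    s->buffer = malloc(4096);
    pthread_cleanup_push(session_cleanup, s);
    /* Deferred cancellation is the default; read() is a cancellation point. */
    while (read(s->fd, &byte, 1) > 0)
        ;
    pthread_cleanup_pop(1);   /* also run the handler on a normal exit */
    return NULL;
}

/* Returns 1 if the thread was cancelled and its handler ran, -1 on setup error. */
int demo_cancel_blocked_reader(void)
{
    int fds[2];
    struct session s = { -1, NULL, 0 };
    pthread_t tid;
    void *result = NULL;

    if (pipe(fds) != 0)
        return -1;
    s.fd = fds[0];            /* nothing is ever written, so read() blocks */
    if (pthread_create(&tid, NULL, session_thread, &s) != 0)
        return -1;
    if (pthread_cancel(tid) != 0 || pthread_join(tid, &result) != 0)
        return -1;
    close(fds[0]);
    close(fds[1]);
    assert(s.buffer == NULL); /* the handler released the buffer */
    return result == PTHREAD_CANCELED && s.cleaned;
}
```

With deferred cancellation the thread can only be interrupted at known points such as read(), so the handler always sees the session in a consistent state. Asynchronous cancellation gives no such guarantee, which is one reason it is usually avoided for threads that own resources.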

John_S · February 9, 2012, 7:32am

Hi Jim,

Thanks for your reply.

It's good to hear that threads blocked on a read() are at a cancellation point.

Some threads also flow to their exit and mark themselves for immediate cancellation. Should I *not* cancel these since they've exited? Maybe I'm canceling threads that have exited and that is giving me the '3' return code.

I'm not doing any cleanup in the session itself since I don't allocate any memory there. I do free its associated structure in the Session Manager. This structure was created by the server - the NewSessionListItem - and was passed to the session.

I didn't originally mention it, but I do the same things (Server, Session, SessionManager) for SMTP and FTP as well. And I've got code to generate any number of FTP sessions so I can perform high-volume testing to flush out the leaks. Do you think I should try the shotgun approach of making hundreds of sessions per minute or continue with the rifle approach of trying to find the precise problem?

Thanks,

John

jim_mcnamara · February 9, 2012, 10:58am

Already "dead" threads can be reaped with pthread_join, if they are not set to detached.
And yes, you should not cancel a dead thread; it will return an error.

You can also call pthread_cancel on the value returned by pthread_self, which is probably what you want to do. This cancels the calling thread; then, if it was not detached, another thread (your session manager, say) calls pthread_join to clean up. Otherwise you leave OS memory allocated to the LWP that was the now-defunct thread.

In all honesty, I'm not getting what you are doing exactly. And yes, to solve your problem, work on one problem only until you clear it. This kind of stuff can crash the system when no more process masthead header slots (LWPs use them too) are left because they have not been cleaned up after. What is PTHREAD_THREADS_MAX on your system?
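On the "Success" message from the original question: the pthread functions return the error number directly and do not set errno, so the return value itself is what should be passed to strerror(). On Linux a return value of 3 is ESRCH, "No such process", which fits cancelling a thread that has already exited. The sketch below shows one way to log the code and then reap the thread. It assumes the thread is joinable and has not been joined yet; pthread_error_text, idle_session, cancel_and_reap and demo_cancel_and_reap are hypothetical names made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* pthread functions report errors through their return value, not errno. */
const char *pthread_error_text(int rc)
{
    return rc == 0 ? "no error" : strerror(rc);
}

/* Stand-in for an idle session: pause() is a cancellation point. */
static void *idle_session(void *arg)
{
    (void)arg;
    for (;;)
        pause();
}

/*
 * Cancel a joinable, not-yet-joined thread and reap it.
 * Returns 1 if it ended by cancellation, 0 if it had already
 * finished on its own, and -1 if the join failed.
 */
int cancel_and_reap(pthread_t tid, FILE *log)
{
    void *result = NULL;
    int rc;

    assert(log != NULL);
    rc = pthread_cancel(tid);
    if (rc != 0)
        fprintf(log, "pthread_cancel: %d (%s)\n", rc, pthread_error_text(rc));

    /* Join regardless: an exited but unjoined thread still holds resources. */
    rc = pthread_join(tid, &result);
    if (rc != 0) {
        fprintf(log, "pthread_join: %d (%s)\n", rc, pthread_error_text(rc));
        return -1;
    }
    return result == PTHREAD_CANCELED;
}

/* Starts an idle session thread, then cancels and reaps it. */
int demo_cancel_and_reap(void)
{
    pthread_t tid;

    if (pthread_create(&tid, NULL, idle_session, NULL) != 0)
        return -1;
    return cancel_and_reap(tid, stderr);
}
```

Joining after the cancel is what actually releases the thread's stack and bookkeeping. If the sessions are never joined and never detached, each finished session keeps those resources, which would look like the zombie sessions described earlier in the thread.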

John_S · February 13, 2012, 9:44pm

Thanks again, Jim.

Just wanted to let you know I'm still working the problem and your points have been very helpful.

I'll get back with substantive results soon...

Thanks,

John