Process loses its parent, then consumes high CPU ...

Hello.

In an Informix context, on AIX 5.3 TL12, we are running into this problem:

At various times during the day (probably when users exit their sessions), a child process loses its parent (its PPID becomes 1) and then consumes a lot of "user" CPU.

I tried " truss -p <mypid> " on several of these processes: for some it reports " truss: 0915-023 Cannot control process #950712 ", for others it waits and prints nothing.

I also tried procstack <mypid>:

513736: xxxxxx.4ge 15/11/2016 myuser 35 FI7:B p011 p011 p011 p011 p011 p011 p011 p011
0x09000000000ab03c  exit(??) + 0x64
0x09000000005e03d4  ibm_efm_exit() + 0x1c
0x09000000005e0418  ix_sigcleanup() + 0x18
0x09000000005e0aec  ix_sigproc() + 0x60
0x09000000005e07dc  efsignal_sigHup() + 0x28
<signal>
0x09fffffff0002990  usl_exit_fini_mods(??) + 0xfc
0x0900000000074dbc  __modfini64() + 0x1c
0x09000000000ab1c0  exit(??) + 0x1e8
0x09000000005e03d4  ibm_efm_exit() + 0x1c
0x09000000005e0418  ix_sigcleanup() + 0x18
0x09000000005e0aec  ix_sigproc() + 0x60
0x09000000005e15a0  rfetchch() + 0xe8
0x09000000005e97a4  ibm_efm_rGetKey() + 0x7c
0x09000000005e937c  _mbrgetkey_() + 0x54
0x090000000060756c  ibm_efm_getUserDataCopyToBuffer() + 0x574
0x0900000000603364  efinput_getValidatedDataField() + 0x578
0x0900000000603bc0  efinput_redisplayFieldWithAttr() + 0x234
0x09000000006046d0  ibm_efm_processBATable() + 0x434
0x0900000000615710  efinar_inputArray() + 0x6cc
0x0000000101d742e4  isitcg(0x0) + 0x12348
0x000000010001e9d0  menu207(0x0) + 0xbf4
0x0000000100034bdc  main(0xf0000000f, 0x203fe898) + 0xa50c
0x00000001000002d8  __start() + 0x98

Do you have any clue what happened?

I suspect the way users are ending their sessions, but I don't know how it could cause these symptoms.

The most recent changes were: upgrading AIX 5.3 to TL12 (before upgrading to 7) and moving the partition from Power 6 to Power 7.

Thank you !

Nobody has a magic crystal ball, but it seems that your user just pressed the close button of the window. Because the terminal was lost, the system sent a SIGHUP (hangup) signal to all applications that were running in the session. Your application can handle this signal and should be able to exit cleanly. Your stack shows that it called the standard C function exit(). After this call the process should die.
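
To illustrate the clean-exit idea, here is a minimal Python sketch (not the Informix 4GL runtime's code, which we cannot see). Reading the posted stack from the bottom up, exit() appears to be entered from ix_sigcleanup(), then interrupted by <signal>, after which efsignal_sigHup() reaches exit() a second time. exit() is not async-signal-safe, so one common pattern is for the handler to only record the hangup and let the main loop run cleanup exactly once. The Session class, release_resources() and the read_key callback are hypothetical names invented for this sketch.

```python
import signal


class Session:
    """Hypothetical stand-in for an interactive program's main loop."""

    def __init__(self):
        self.hangup_seen = False
        self.cleaned_up = False
        self.releases = 0  # counts how often cleanup work really ran

    def on_hangup(self, signum, frame):
        # The handler does the minimum: record the event and return.
        # No cleanup and no exit call happens inside the handler.
        self.hangup_seen = True

    def install(self):
        # Register the handler; must be called from the main thread.
        signal.signal(signal.SIGHUP, self.on_hangup)

    def release_resources(self):
        # Placeholder for closing files, database connections, etc.
        self.releases += 1

    def cleanup(self):
        # Guard so that cleanup can never be started a second time.
        if self.cleaned_up:
            return
        self.cleaned_up = True
        self.release_resources()

    def run(self, read_key):
        # Keep reading input until a hangup has been recorded,
        # then clean up from normal (non-handler) context.
        while not self.hangup_seen:
            read_key()
        self.cleanup()
```

A real program would call install() once at startup and pass its own input routine to run(); whether the 4GL runtime can be configured to behave this way is a separate question.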

Is the stack dump from the process that is running out of control? I do not think so. As agent_kgb said, the process should end on an exit() call. A process running amok just burns CPU until somebody kills it.

It definitely is a user training issue. Closing the desktop window will cause a SIGHUP to be sent to the UNIX processes. The process you are seeing is probably running under whatever account the Informix user has, not the original user account. Check the owning username to be sure, because it affects where to look for the code that traps the SIGHUP signal. That trapping seems to be your problem, though this is a guess.
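
For checking the owner of such processes in bulk, here is a small sketch that filters ps output for processes whose PPID is 1 and whose CPU share is high. It assumes the text comes from "ps -e -o user,pid,ppid,pcpu,args" and that Python is available on the machine, which may not be the case on AIX 5.3. The function name find_orphans and the 50% threshold are arbitrary choices for the example.

```python
def find_orphans(ps_output, min_cpu=50.0):
    """Return (user, pid, pcpu, command) for busy processes adopted by init.

    ps_output is the text of `ps -e -o user,pid,ppid,pcpu,args`,
    including its header row.
    """
    suspects = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 4)  # the command may contain spaces
        if len(fields) < 5:
            continue  # blank or truncated line
        user, pid, ppid, pcpu, args = fields
        try:
            # Some locales print a decimal comma instead of a point.
            cpu = float(pcpu.replace(",", "."))
        except ValueError:
            continue  # unparsable CPU column, ignore the row
        if ppid == "1" and cpu >= min_cpu:
            suspects.append((user, int(pid), cpu, args))
    return suspects
```

The user column of each returned tuple shows which account owns the runaway process.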

I agree with you, but I don't understand why the behavior changed.

We updated AIX 5.3 TL9 to 5.3 TL12, and we also moved from a physical server to a VIO client. Before this migration, we didn't have the problem.

It now happens with multiple users on 5.3 TL12 (more than 10 different users in the last period I checked), but nothing similar happens on another server running the same application on AIX 5.3 TL9.

Not a happy thought, but maybe it behaved well before because of a bug that got fixed with the update, or worse - a bug got introduced.

As to running on Power7: is that in native mode, or in a vWPAR? In either case I suspect you need more than just the standard upgrade to AIX 5.3 TL12 (from memory, SP7). Under extended support there were several more updates (which I have never seen).

Maybe among the "extended support" updates there is something extra you need.

Wish I could have brought a happier thought :)