Profiling results and SMP

The SCO OSR 5.7 system was migrated from an older HP DL360 to a new DL380 G7. SMP was not activated on the older box; it is now activated on this 4-core Xeon.

An application we maintain was copied over to the new box without any changes. I noticed that the application profile no longer shows any Seconds or Cumsecs values for our functions.

Results from the old system:

 %Time Seconds Cumsecs  #Calls   msec/call  Name
   3.8    0.32    4.21 4464168      0.0001  my_func1
   3.5    0.30    4.51 4315824      0.0001  my_func2
   3.4    0.29    4.80 1973552      0.0001  my_func4

Results from the new system:

 %Time Seconds Cumsecs  #Calls   msec/call  Name
   0.0    0.00    0.00 4464168      0.0001  my_func1
   0.0    0.00    0.00 4315824      0.0001  my_func2
   0.0    0.00    0.00 1973552      0.0001  my_func4

One more note: on the old system, the last Cumsecs line of the prof output showed roughly the total execution time. On the new system it shows a number roughly ten times smaller than the actual execution time.

It is a static COFF binary built from the same makefile with the same compiler. Nothing changed between the old and new environments except SMP. Both systems are essentially single-user and idle 99.9% of the time.

I guess the SMP feature is the reason for the skewed profiling. Does anybody know about this issue? Any ideas?

Thanks in advance.

The msec/call number is awfully small to start with, and it doesn't change between the two systems. How are you timing each function call, and to what precision? You may be running fast enough on the new server that your measurement isn't precise enough, so your code keeps adding zero time for each execution.
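To put a number on how small the msec/call column is, here is a quick arithmetic check against the old-system table. It is only a sketch and assumes msec/call is simply Seconds * 1000 / #Calls, which the thread does not confirm. The function name msec_per_call is made up for illustration.

```python
# Rows copied from the old-system prof output: (name, Seconds, #Calls).
rows = [
    ("my_func1", 0.32, 4464168),
    ("my_func2", 0.30, 4315824),
    ("my_func4", 0.29, 1973552),
]


def msec_per_call(seconds, calls):
    """Average time per call in milliseconds (assumed formula)."""
    return seconds * 1000.0 / calls


for name, seconds, calls in rows:
    exact = msec_per_call(seconds, calls)
    # Show the full-precision value next to the 4-decimal form prof prints.
    print(f"{name}: {exact:.7f} msec/call, printed as {exact:.4f}")
```

Under that assumption each function averages well under a microsecond per call, and all three rows round to the same 0.0001 at four decimals, so the column has almost no resolution left at this scale.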

I don't think it's the SMP that's causing the problem, at least not directly.

Thanks for your input. To answer your questions about how I am timing and to what precision: I just compiled the app with the -p flag. Then I ran the app through the test routine, which created the mon.out file. Then I ran prof app-name, which gave the results shown above. I don't think I can control the precision of the output.
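On the precision question: with -p there is no per-call timing to tune. Traditional Unix prof takes #Calls from counting code the compiler inserts, and builds Seconds from a histogram of program-counter samples taken on clock ticks. The sketch below shows that accounting. The 100 Hz tick rate and the sample counts are assumptions chosen to match the old-system table, not values read from this system.

```python
HZ = 100  # assumed clock tick rate in samples per second (not stated in the thread)


def seconds_from_samples(samples, hz=HZ):
    """Time attributed to a function: one tick's worth per PC sample that landed in it."""
    return samples / hz


# Hypothetical sample counts, picked so the result matches the old-system Seconds column.
old_samples = {"my_func1": 32, "my_func2": 30, "my_func4": 29}

for name, samples in old_samples.items():
    print(f"{name}: {samples} samples -> {seconds_from_samples(samples):.2f} s")

# If no samples are credited to a function, Seconds reads 0.00 no matter
# how many calls were counted, because the two numbers come from different sources.
print(f"no samples -> {seconds_from_samples(0):.2f} s")
```

In this model, a Seconds column of all zeros next to unchanged #Calls would mean the samples are not being credited to the process, rather than the functions taking no time.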

As for the new machine being much faster than the old one: yes, the app runs roughly twice as fast on the new h/w, but definitely not 10 times faster.

Can you disable CPUs and/or hyperthreading as necessary via BIOS settings on your new hardware to limit the OS to only one CPU thread? That seems to me to be a really easy test to run that will provide good information.

In fact, that was exactly what we just did.

We disabled it in the BIOS first, and that did not help. We then removed SMP from the OS, and now profiling works as expected.