High CPU Utilization

Hi Experts,

I need to understand a few basic things about the top output I collected from one of our nodes:

Cpu0 : 4.6%us, 2.0%sy, 0.0%ni, 91.4%id, 1.3%wa, 0.3%hi, 0.3%si, 0.0%st
Cpu1 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 3.3%sy, 0.0%ni, 96.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 1.0%us, 0.3%sy, 0.0%ni, 98.3%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.7%us, 0.3%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.7%sy, 0.0%ni, 99.0%id, 0.3%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.7%us, 0.0%sy, 0.0%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.3%us, 0.0%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 49432636k total, 10601592k used, 38831044k free, 277300k buffers
Swap: 24575992k total, 150852k used, 24425140k free, 1934080k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14539 root 5 -20 48316 6680 1572 R 100.1 0.0 7356:25 /opt/perf/bin/perfd

Looking at the above output, can we conclude that Cpu9 was at 100% sy (system) usage:
Cpu9 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

due to:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14539 root 5 -20 48316 6680 1572 R 100.1 0.0 7356:25 /opt/perf/bin/perfd

If not, what else should we look at to find the root cause?

Regards,
MackJack

Yes, the 100% corresponds to one full core. If a process ran across multiple CPUs, you could expect to see 500%, for example.
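To make the arithmetic concrete, here is a minimal sketch of how a per-process figure above 100% can arise, assuming top is in its default per-core ("Irix") mode. The function name and the sample numbers are made up for illustration.

```python
def process_cpu_percent(cpu_seconds, wall_seconds):
    """CPU time consumed over an interval, as a percentage of ONE core.

    cpu_seconds is the total CPU time the process (all threads together)
    used during the interval; wall_seconds is the interval length.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return 100.0 * cpu_seconds / wall_seconds


# One thread spinning for a 3-second refresh interval: one full core.
print(process_cpu_percent(3.0, 3.0))

# Five threads, each busy for the whole 3-second interval: 15 CPU-seconds.
print(process_cpu_percent(15.0, 3.0))
```

So a single-threaded process that never sleeps tops out at about 100%, which matches one core (Cpu9 here) showing 0% idle.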

You need to find out whether perfd is even required. If not, you can turn it off. If it is required, you may need to delve deeper into what it is doing and why it's taking up a full CPU.
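One way to start delving deeper is to check whether the process's CPU time is mostly system (kernel) time, which would match the 100%sy on Cpu9. The sketch below assumes a Linux /proc filesystem; the helper names are hypothetical, and the sample stat line in the usage is invented, not taken from the node in question.

```python
import os


def parse_proc_stat(stat_line):
    """Return (utime_ticks, stime_ticks) from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces, so split only what follows the LAST closing parenthesis.
    rest = stat_line[stat_line.rfind(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def system_time_share(utime, stime):
    """Percentage of the process's CPU time that was spent in the kernel."""
    total = utime + stime
    return 100.0 * stime / total if total else 0.0


def read_cpu_seconds(pid):
    """Read user and system CPU seconds for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        utime, stime = parse_proc_stat(f.read())
    ticks = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    return utime / ticks, stime / ticks


# Invented sample line, for illustration only.
sample = "14539 (perfd) R 1 14539 14539 0 -1 4202496 100 0 0 0 250 44000 0 0 5 -20 1"
utime, stime = parse_proc_stat(sample)
print(f"system share: {system_time_share(utime, stime):.1f}%")
```

If nearly all the time is system time, tracing the process's system calls (for example with strace) is a reasonable next step.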

Thanks, Tommy, for the clarification.

Now there is one more scenario I want to add here:
Say that along with perfd at 100%, I also see 100% utilization from a few other processes:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14539 root 5 -20 48316 6680 1572 R 100.1 0.0 7356:25 /opt/perf/bin/perfd

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14344 root 5 -20 48316 6680 1572 R 100.1 0.0 56:25 oracle

But my system is normal, i.e.:

Cpu9 : 0.0%us,3.5%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

I am adding this because my client complained that perfd was taking 100% CPU and that their system ended up in a bad state as a result.

Now that the perfd process has been killed, we still find other processes at 100%, but Cpu9 is at 3.x% sy.

So my point is that when perfd was taking 100% CPU, there must have been some other factor that degraded system performance.

Is this an emotional problem, not a performance one? You buy CPUs so they can be used, not just to consume power while idle. I used to pin the needles regularly with nice -19 processes, and the system was responding fine, but the system admins were upset because it was unusual and had the appearance of a problem. Top is nice for keeping an eye on things.

Well put, lol.

When you have 16 CPUs and 15 of them are idle, there is no problem.
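To put a number on that, here is a rough sketch that parses per-CPU lines in the format top printed above and averages the busy percentage across cores. The function names are hypothetical, and the parser only assumes the "CpuN : x%us, y%sy, ..." layout shown in this thread.

```python
import re

CPU_LINE = re.compile(r"^Cpu(\d+)\s*:\s*(.*)$")
FIELD = re.compile(r"([\d.]+)%(\w+)")


def parse_cpu_lines(text):
    """Map CPU number -> {"us": ..., "sy": ..., "id": ...} from top output."""
    cpus = {}
    for line in text.splitlines():
        match = CPU_LINE.match(line.strip())
        if match:
            fields = FIELD.findall(match.group(2))
            cpus[int(match.group(1))] = {name: float(val) for val, name in fields}
    return cpus


def overall_busy(cpus):
    """Average of (100 - idle) over all parsed CPUs."""
    return sum(100.0 - c["id"] for c in cpus.values()) / len(cpus)


def saturated(cpus, idle_below=5.0):
    """CPU numbers whose idle percentage is under the given threshold."""
    return sorted(n for n, c in cpus.items() if c["id"] < idle_below)


sample = """\
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
"""
cpus = parse_cpu_lines(sample)
print(saturated(cpus), overall_busy(cpus))
```

With one core fully busy and the other 15 completely idle, the machine as a whole is only 100/16 = 6.25% busy.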

You might get Oracle to defragment tables and indexes, compute statistics, etc. to give them something to do until real work shows up! :)

I had a regional manager question using our last sense amp to fix a memory once, because then we would have no spare parts! (The "We keep them for real emergencies" mentality!) Luckily, the customer did not hear this! :)