Cooler GPU

In one of our computers a Zotac GPU was installed (a Zotac GeForce GTX 780 3GB AMP!, to be exact), while the processor remained an Intel Core i7 2600. New after-market coolers were installed on both the processor and the GPU, together with fans on all case grilles.
Initially the temperatures measured on the CPU were around 70C to 75C. However, after a software upgrade of the install stack, temperatures suddenly dropped to 40C to 45C, with no noticeable drop in performance. No other changes were made; the machine is connected as it was before, in the same location, etc.

What could have caused the drop in operating temperatures?

75C is dangerously high. I wouldn't be surprised if it was thermal throttling at that point.

Reducing the clock speed could let it run at a similar pace without thermal throttling.

It also depends what was in the 'software upgrade'. Any operating system components in there? Could have been misreading the temperature. Or you might be misreading it now, reading case/board temp while your cores are still cooked to the edge of oblivion.

We did encounter shutdowns a few times, particularly when the temperature entered the 80C+ region.
The upgrade was indeed from CentOS 6.5 to CentOS 7, so an answer may lie in there. No shutdowns as of yet either.

Also, I just confirmed that we are actually measuring core temperature, i.e. 4 measurements every 5 minutes, one for each core.
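
For anyone who wants to cross-check the per-core readings, here is a minimal sketch of reading them straight from the kernel. It assumes the `coretemp` driver is loaded and exposes its sensors under /sys/class/hwmon; the function name `read_core_temps` is made up for this example, and the `device/` fallback is an assumption about where older kernels place the files.

```python
from pathlib import Path


def read_core_temps(hwmon_root="/sys/class/hwmon"):
    """Return {sensor label: degrees C} for coretemp sensors under hwmon_root."""
    temps = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return temps
    for hwmon in sorted(root.glob("hwmon*")):
        # Assumption: newer kernels keep the files in hwmonN/, older ones in hwmonN/device/.
        for sensor_dir in (hwmon, hwmon / "device"):
            name_file = sensor_dir / "name"
            if not name_file.is_file() or name_file.read_text().strip() != "coretemp":
                continue
            for input_file in sorted(sensor_dir.glob("temp*_input")):
                label_file = sensor_dir / input_file.name.replace("_input", "_label")
                if label_file.is_file():
                    label = label_file.read_text().strip()
                else:
                    label = input_file.name
                # The kernel reports millidegrees Celsius.
                temps[label] = int(input_file.read_text()) / 1000.0
    return temps


if __name__ == "__main__":
    for label, celsius in read_core_temps().items():
        print(f"{label}: {celsius:.1f}C")
```

If this agrees with the monitoring numbers, the sensor path is probably not where the before/after difference comes from.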

I am not sure if we can adjust the clockspeed, given that the CPU is boxed.

Newer kernels include more support for CPU throttling and C-states. It is probably stepping down the CPU when it can. What kind of load is the system usually under? Honestly, even if your CPU is pegged at 100% 24/7 you shouldn't be at 75-80C. That's way too hot. 60C is where you want to be for a non-overclocked CPU, even at 100% duty cycle, and 30-45C for idle or normal use.

There is an entry in the kernel CPU config to enable or disable turbo speeds (not overclocking specifically). Maybe the old config happened to have that enabled, and in the new one it is disabled by default?
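
A quick way to see what the running system says about turbo is to read the sysfs switches. This is only a sketch: which file exists depends on the frequency driver in use (`intel_pstate/no_turbo` for intel_pstate, `cpufreq/boost` for drivers such as acpi-cpufreq), and the function name `turbo_status` is made up for this example.

```python
from pathlib import Path


def turbo_status(cpu_root="/sys/devices/system/cpu"):
    """Return 'enabled', 'disabled' or 'unknown' based on the sysfs turbo switches."""
    root = Path(cpu_root)
    no_turbo = root / "intel_pstate" / "no_turbo"
    if no_turbo.is_file():
        # intel_pstate: 1 means turbo is turned off.
        return "disabled" if no_turbo.read_text().strip() == "1" else "enabled"
    boost = root / "cpufreq" / "boost"
    if boost.is_file():
        # Generic cpufreq boost switch: 1 means boost is allowed.
        return "enabled" if boost.read_text().strip() == "1" else "disabled"
    return "unknown"


if __name__ == "__main__":
    print("turbo:", turbo_status())
```

An "unknown" result only means neither file was found; turbo may still be controlled in the BIOS.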

Not that you can compare to pre-upgrade values, but there are two things to look at that might be of use. The first is to check the physical power usage of the box with a power meter inline between the wall socket and the PSU. They can be had for $20-30. Again, that is not all that useful without "before" values to compare to, but more info on your system is always good to have. The second is to install "turbostat". It will show you the clock rates and the percentage of time each core spends in the various C-states (full power through idle/power-save modes). There are a handful of other tools that show the same info. That's just the one I happened to use recently.
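
If turbostat is not available, a rough approximation of the C-state picture can be pulled from the kernel's cpuidle counters. The sketch below assumes the cpuidle sysfs interface is present (cpuN/cpuidle/stateM/name and time, with time in microseconds); the helper names are made up for this example. Note that it reports each state's share of the idle time between two samples, not of wall-clock time, so it is not a replacement for turbostat's numbers.

```python
from pathlib import Path


def cstate_times(cpu=0, cpu_root="/sys/devices/system/cpu"):
    """Return {state name: cumulative microseconds spent in it} for one CPU."""
    times = {}
    idle_dir = Path(cpu_root) / f"cpu{cpu}" / "cpuidle"
    if not idle_dir.is_dir():
        return times
    for state in sorted(idle_dir.glob("state*")):
        name = (state / "name").read_text().strip()
        times[name] = int((state / "time").read_text())
    return times


def residency_share(before, after):
    """Return each state's percentage of the idle time accumulated between two samples."""
    deltas = {name: after[name] - before.get(name, 0) for name in after}
    total = sum(deltas.values())
    return {name: (100.0 * d / total if total else 0.0) for name, d in deltas.items()}


# Usage sketch: sample twice, some seconds apart, then compare.
#   first = cstate_times(0)
#   time.sleep(10)
#   print(residency_share(first, cstate_times(0)))
```

A core that spends most of its idle time in the deepest states would fit the theory that the newer kernel is simply idling the CPU more aggressively.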

For the record, the upgrade from CentOS 6.5 to CentOS 7 was performed again, and we have seen no temperature higher than 45C since, with readings routinely below 40C. This has been the case for the last two months, so the problem seems solved.