Tuning Cool'n'Quiet: Maximize Power And Performance, Part 2

Benchmark Results: SuperPi 1M And 8M

SuperPi is quite popular throughout the benchmarking world. As you can see, this benchmark is not really capable of fully utilizing two cores, let alone more than two. As with 3DMark 2001, a single- or dual-core processor looks like the better solution for this benchmark.

Or is it?

Do you remember the differences between these three processors? If you guessed synchronous clock changes, you're right. Since they can only fully utilize a single core, 3DMark 2001 and SuperPi repeatedly “jump” between cores in Windows Vista, wasting time, and performance suffers as a result. Just look at the processor utilization graph again; it's as clear as day. Notice how the peaks and valleys on both lines mirror each other throughout the benchmark.

Let's see which processor completed the benchmark with the least amount of power.

As with 3DMark 2001, we incur a performance penalty when we enable power management. It’s a result of Windows Vista bouncing each single-threaded workload between physical cores, which idle independently of each other. Notice how the differences are smaller with the Phenom II than with the Athlon X2 and Athlon II X2? Once we enabled synchronous clock changes in K10Stat (with the Optimized setting), we regained some of that lost performance, though not all.

By enabling synchronous clock changes and shorter p-state transition times, we cut the performance loss from 8-18% to just 2%. The benefit of shorter p-state transitions is about 1%, which is easiest to see in the Phenom II results. In contrast, the penalty from asynchronous clock changes is about 17%, which is quite significant.
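For readers who want to work out this kind of penalty from their own runs, here is a minimal sketch. It assumes the penalty is measured on completion time (SuperPi reports seconds, lower is better) relative to a run with power management disabled. The function name and the sample times are hypothetical and are not taken from our charts.

```python
def penalty_percent(baseline_seconds: float, measured_seconds: float) -> float:
    """Return how much slower a run is than the baseline, in percent.

    Assumption: the baseline is the run with power management disabled,
    and both values are completion times (lower is better).
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline_seconds must be positive")
    return (measured_seconds - baseline_seconds) / baseline_seconds * 100.0


# Hypothetical completion times in seconds, for illustration only.
baseline = 20.0        # power management off
stock_cnq = 23.4       # stock Cool'n'Quiet, asynchronous clock changes
optimized = 20.4       # synchronous clock changes, shorter transitions

print(f"Stock penalty:     {penalty_percent(baseline, stock_cnq):.1f}%")
print(f"Optimized penalty: {penalty_percent(baseline, optimized):.1f}%")
```

With these made-up times, the stock run is 17% slower than the baseline and the optimized run is 2% slower, which is the same kind of gap described above.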

As a side note, we were able to compare single-threaded performance on the Phenom II, Athlon II, and the older Athlon X2. Running 400 MHz faster, the Athlon II offers the same single-threaded performance as the Phenom II X3 710. So, that 6MB L3 cache can make up for a 400 MHz clock deficiency. The presence of L3 cache also kept the older Athlon X2 from falling too far behind the newer Athlon II X2 250.

Let’s look at the power savings. With Optimized settings, not only did we regain some lost performance, we also saved around 16% of our power budget with the Athlon X2 7750. That's quite a difference compared to the default settings (4%). Now, if you’re looking for a good reason to buy the Athlon II X2 250, just look at the power consumption numbers. It draws about 8 watts less than the Athlon X2 7750 and is slightly faster.

Of course, the final measure is total power consumed by these processors during the benchmark. If you’re using stock Cool'n'Quiet settings (or no power management), there’s really not that much of a difference between the 45nm-based processors. But tweak the voltages and the Phenom II X4 945 steps ahead of the rest of the pack. You actually consume less power with this processor. This is pretty interesting, since it gives us an idea of what power consumption looks like in single-threaded applications on these processors.
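Total consumption over a benchmark run depends on both the average draw and how long the run takes, so a processor that draws more can still come out ahead if it finishes sooner. The sketch below shows that arithmetic, assuming a steady average draw for the whole run. The function name and the wattage and time figures are hypothetical and are not our measurements.

```python
def energy_watt_hours(average_watts: float, run_seconds: float) -> float:
    """Energy used over a run: average power (W) x time (s), converted to Wh."""
    if average_watts < 0 or run_seconds < 0:
        raise ValueError("power and time must be non-negative")
    return average_watts * run_seconds / 3600.0


# Hypothetical systems, for illustration only.
faster_higher_draw = energy_watt_hours(average_watts=100.0, run_seconds=30.0)
slower_lower_draw = energy_watt_hours(average_watts=90.0, run_seconds=40.0)

print(f"Faster, higher draw: {faster_higher_draw:.3f} Wh")
print(f"Slower, lower draw:  {slower_lower_draw:.3f} Wh")
```

In this made-up case, the system drawing 100 W finishes in 30 seconds and uses less energy overall than the 90 W system that needs 40 seconds.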

3 comments
  • Anonymous
    I would have liked to see how the quad 810 processor would have gone, considering it is just one peg above the triple core 720, and a few pegs below the 945 and 955 - perhaps it yields a nice balance/performance/voltage ratio? Certainly interesting given that it is a 95w processor, but possessing four cores, I guess that it would have trumped the 720. Maybe AMD sponsors these articles, and have to sell a few more 720s first.
  • chispas
    AMD 810 FTW
  • silverblue
    I doubt AMD sponsors these. Have you seen how poorly the 710 does at DivX? I doubt that DivX will change their code for triple core CPUs, so the 720 will behave similarly to the 710. In addition, this guide is an AMD-only guide.