
AMD's Bulldozer Architecture: Overclocking Efficiency Explored

By Achim Roos

AMD says it designed the FX processor line-up with efficiency in mind. We're putting that claim to the test, assessing the Bulldozer architecture at a number of different clock rates and comparing the results to Intel's CPUs.

AMD's new FX family was highly anticipated, but its performance underwhelmed us. Rather than leap-frogging Intel's mainstream CPUs, it managed to match them at best and fall behind them at worst. Of course, this is the result of a ground-up redesign, which involved some decisions that affected performance and others made with power efficiency in mind. In theory, those decisions should make the FX family more efficient than its predecessor, and the move to a 32 nm manufacturing node should help, too.

Just how does the design fare with regard to power as you move frequency around? That's what we're aiming to find out.

There are seven models based on the Bulldozer architecture, presenting a range of clock rates and prices to the folks interested in dropping one of these chips into a Socket AM3+-based motherboard. For more information about them, check out AMD Bulldozer Review: FX-8150 Gets Tested.

Better Utilization Thanks To Second-Gen Turbo Core

Turbo Core, similar to Intel's Turbo Boost technology, tries to optimize processor performance by evaluating several power-related variables in real time and adjusting clock rate in response. When thermal headroom permits, the feature increases frequency, completing workloads faster and ideally dropping you back to idle more quickly.

From our FX launch story:

"Application Power Management (APM) describes Zambezi/Valencia/Interlagos’ ability to monitor (in real-time) the amount of power each core consumes. Rather than taking thermal or current measurements, the activity of each Bulldozer module is tracked. AMD knows how much power each operation requires and is able to come up with instantaneous power use on a per-module basis. A quick comparison between real consumption and maximum TDP indicates whether or not there’s headroom to increase performance. In an example where you’re running an application that doesn’t tax the processor’s resources, Turbo Core dithers between the processor’s base frequency and a higher clock rate, jumping between them to average better overall performance at the defined TDP.

Turbo Core isn’t limited to just a base and some arbitrarily higher frequency, either. It’s actually implemented in three p-states: the base (referred to as P2), an intermediate state (P1), and a higher state (P0). That’s an improvement over the first-gen version of Turbo Core, which AMD says only switched between two p-states. And it’s significant, too, because you can enter P1 with all eight cores active, so long as the headroom is there. Stepping up to P0 requires at least two of four modules to idle. AMD does allow the chip’s TDP to be exceeded instantaneously, but of course it can’t hold that for any thermally significant amount of time.

As such, when you look at the specs for an FX processor and see CPU Base, CPU Turbo Core, and CPU Max. Turbo, you are guaranteed to always get at least that base frequency. You’ll see the Turbo Core clock rate so long as TDP is in check (as it would be in a well-threaded workload that doesn’t exceed the processor’s thermal ceiling). And, whenever half of the chip’s cores are idle, it’s possible to realize maximum Turbo Core speeds."
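The three-state behavior described in that excerpt can be sketched in a few lines of Python. This is only an illustration of the rules as quoted (base P2, intermediate P1 when headroom exists, top P0 when at least half the modules idle), not AMD's actual control logic. The `ChipState` fields, the `select_pstate` name, and the wattage figures in the usage lines are hypothetical, and the sketch ignores the brief TDP overshoot and the dithering between states that the excerpt mentions.

```python
from dataclasses import dataclass


@dataclass
class ChipState:
    """Hypothetical snapshot of the values an APM-style controller tracks."""
    estimated_power_w: float  # activity-based power estimate, not a sensor reading
    tdp_w: float              # rated thermal design power
    idle_modules: int         # number of modules currently idle
    total_modules: int = 4    # an eight-core FX part has four modules


def select_pstate(state: ChipState) -> str:
    """Pick a p-state name using only the rules quoted from the launch story."""
    has_headroom = state.estimated_power_w < state.tdp_w
    if not has_headroom:
        return "P2"  # base frequency is always guaranteed
    # P0 requires at least half of the modules (two of four) to be idle.
    if state.idle_modules * 2 >= state.total_modules:
        return "P0"
    # Headroom with all (or most) modules busy allows the intermediate state.
    return "P1"


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from this article.
    print(select_pstate(ChipState(estimated_power_w=100.0, tdp_w=125.0, idle_modules=0)))
    print(select_pstate(ChipState(estimated_power_w=100.0, tdp_w=125.0, idle_modules=2)))
```

Because the decision depends on estimated power rather than temperature, anything that lowers the estimate for a given workload leaves more room for the higher states.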

How Efficient is Bulldozer?

Although a superficial examination of AMD's architecture implies some pretty lofty expectations on the efficiency front, enthusiasts only really care about how they translate to the real world. We answered a lot of questions in AMD FX: Energy Efficiency Compared To Eight Other CPUs. However, in that story, we limited ourselves to stock clocks. Here, we're expanding our analysis to overclocking.

We also want to find out where the Bulldozer architecture achieves a balance between low voltage, low power, and decent performance. It's particularly convenient, then, that all of the FX-based processors feature unlocked multiplier ratios. Combined with our test bench's firmware, which lets us easily modify voltage and performance, we're able to fine-tune performance very flexibly. We have six different combinations of clock rate and voltage to explore, so let's get to it.
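As a rough guide to why voltage matters so much in this kind of tuning, the standard first-order approximation for CMOS dynamic power scales with voltage squared times frequency. The sketch below applies that approximation and converts average power and runtime into energy. It is a simplification that ignores leakage and platform power, the function names are our own, and the voltage and clock values in the usage lines are hypothetical rather than settings from our test bench.

```python
def relative_dynamic_power(voltage_v: float, freq_ghz: float,
                           ref_voltage_v: float, ref_freq_ghz: float) -> float:
    """Dynamic power relative to a reference setting, using P ~ C * V^2 * f.

    The capacitance term cancels out because the same chip is compared
    against itself. Leakage power is not modeled.
    """
    return (voltage_v / ref_voltage_v) ** 2 * (freq_ghz / ref_freq_ghz)


def energy_wh(avg_power_w: float, runtime_s: float) -> float:
    """Energy used to complete a task, in watt-hours."""
    return avg_power_w * runtime_s / 3600.0


if __name__ == "__main__":
    # Hypothetical example: same clock rate, lower core voltage.
    ratio = relative_dynamic_power(1.00, 3.6, ref_voltage_v=1.25, ref_freq_ghz=3.6)
    print(f"Relative dynamic power: {ratio:.2f}")

    # Hypothetical example: a task drawing 150 W on average for 10 minutes.
    print(f"Energy used: {energy_wh(150.0, 600.0):.1f} Wh")
```

Under this approximation, lowering voltage at a fixed clock rate cuts dynamic power quadratically, while raising frequency at a fixed voltage only adds power linearly. Real chips usually need more voltage to hold higher clocks, which is why efficiency tends to fall off as overclocks climb.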


  • tranzz, 27 December 2011 16:55
    With a CPU that boosts performance based on real-time monitoring of power consumption, it is not surprising that an undervolted CPU performs slightly faster. The CPU knows it's using less power, so it can use speed boosts and stay at elevated clocks for longer periods.
  • tranzz, 27 December 2011 16:58
    ^ should have read the final page b4 posting
  • mikeyw, 7 January 2012 03:57
    Nothing much to add, just want to say thanks for including the MATLAB benchmarks. It's exactly what I need and it's very rare information!
  • ser1, 7 January 2012 21:45
    It would have been nice to include an undervoltage map for all P-states, not just a sparse CPU-Z picture.
    Also, what program did you use to perform voltage and frequency settings?