Homework: How Turbo Mode Works

AMD and Intel have both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores, the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up all of that power budget. This is where we get very high clock speed single core CPUs from. A 95W dual core processor means that individually the cores have to use less power than the single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core uses less power than the cores in either a single or dual core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual core processor.

The diagram below helps illustrate this:

[Diagram: TDP and the resulting tradeoff for Single Core, Dual Core, Quad Core and Hex Core processors]

The TDP is constant; you can't ramp power indefinitely, because you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today): if you increase one, you have to decrease the other.
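To put rough numbers on that tradeoff, here is a minimal sketch in Python. It is a toy model, not anything Intel or AMD publishes: it assumes the TDP is split evenly across cores, ignores everything on the chip that isn't a core, and assumes per-core power grows with the cube of clock speed (power scales with frequency times voltage squared, and voltage has to rise roughly with frequency). The function names, the 3.0GHz reference clock and the cubic exponent are all illustrative assumptions.

```python
# Toy model of the fixed-TDP tradeoff between core count and clock speed.
# Assumptions (not from the article): even power split, no uncore power,
# and per-core power proportional to frequency cubed.

def per_core_budget(tdp_watts: float, cores: int) -> float:
    """Watts available to each core when the TDP is split evenly."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tdp_watts / cores


def sustainable_clock(
    tdp_watts: float,
    cores: int,
    ref_clock_ghz: float = 3.0,   # hypothetical clock of a core given ref_core_watts
    ref_core_watts: float = 95.0, # hypothetical power of that reference core
    exponent: float = 3.0,        # assumed power-vs-frequency exponent
) -> float:
    """Estimated clock each core can hold inside its share of the TDP."""
    budget = per_core_budget(tdp_watts, cores)
    return ref_clock_ghz * (budget / ref_core_watts) ** (1.0 / exponent)


if __name__ == "__main__":
    for cores in (1, 2, 4, 6):
        budget = per_core_budget(95.0, cores)
        clock = sustainable_clock(95.0, cores)
        print(f"{cores} core(s): {budget:.1f}W per core, ~{clock:.2f}GHz each")
```

The absolute clock speeds it prints mean nothing; the point is the shape of the curve. Every time you double the core count inside the same 95W, each core's share of the budget halves and its sustainable clock drops.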

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose whether to buy a faster dual core or a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act like light switches: allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still drawing current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption; a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off, with extremely low leakage current. This is why nothing can touch Nehalem's idle power.

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
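The idea can be sketched in a few lines of Python. This is only an illustration of the logic described above, not Intel's actual algorithm: it assumes power-gated cores draw nothing, reuses an assumed cubic power-versus-frequency curve, and sets aside a made-up 15W for the rest of the chip. The 133MHz step, the 2800MHz base clock, the `max_bins` cap and every function name are assumptions chosen for the example; a real PCU also weighs current and temperature, which this sketch ignores.

```python
# Sketch of turbo logic: gate idle cores, then raise the clock of the active
# cores one step at a time while estimated chip power stays within the TDP.
# All constants and the power model are illustrative assumptions.

BIN_MHZ = 133  # assumed size of one turbo step


def core_power(freq_mhz: float, base_mhz: float, base_core_watts: float) -> float:
    """Assumed per-core power: cubic in frequency relative to the base clock."""
    return base_core_watts * (freq_mhz / base_mhz) ** 3


def turbo_frequency(
    active_cores: int,
    total_cores: int = 4,
    tdp_watts: float = 95.0,
    uncore_watts: float = 15.0,     # hypothetical power for everything but the cores
    base_mhz: int = 2800,           # hypothetical base clock
    max_bins: int = 5,              # hypothetical cap on turbo steps
    gated_core_watts: float = 0.0,  # power of a core behind a closed power gate
) -> int:
    """Highest clock (MHz) the active cores can run without exceeding the TDP."""
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")

    # Size the per-core budget so that all cores at base clock fill the TDP.
    base_core_watts = (tdp_watts - uncore_watts) / total_cores
    idle_cores = total_cores - active_cores

    freq = base_mhz
    for _ in range(max_bins):
        candidate = freq + BIN_MHZ
        power = (
            uncore_watts
            + idle_cores * gated_core_watts
            + active_cores * core_power(candidate, base_mhz, base_core_watts)
        )
        if power > tdp_watts:
            break  # the next step would exceed the TDP
        freq = candidate
    return freq


if __name__ == "__main__":
    for active in (4, 2, 1):
        print(f"{active} active core(s): {turbo_frequency(active)}MHz")
```

With all four cores busy there is no headroom in this model, so the chip stays at its base clock. Gate two or three cores and the freed-up power budget goes straight into extra clock speed for the cores that are still working.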

On every single Nehalem (Lynnfield included) lie around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This logic turns cores off, underclocks them and is generally charged with making sure that power usage is kept to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.

343 Comments


  • lordmetroid - Tuesday, September 8, 2009 - link

    I am using Linux!
  • andrenb91 - Wednesday, September 9, 2009 - link

    c'mon, you're probably still running Windows for some things... Wine doesn't work with everything... I've run Linux on dual boot for years and am still trying to make Wine run Flight Simulator X, which is the only game I play... remember, these benchmarks are only for Windows-based PCs; on Linux the story is different, see it at phoronix.com...
  • james jwb - Tuesday, September 8, 2009 - link

    is turbo boost on for the benchmarks?
  • snakeoil - Tuesday, September 8, 2009 - link

    yes they benchmarked with turbo boost, that is cheating because thats overclocking the processor at least 600 mhz and presenting the results as it were at stock speeds.
    that's abusing the reader's trust.
  • maxxcool - Tuesday, September 8, 2009 - link

    Hahaha, you are just as much of a idiot here as on techreport snake! ... did you come here and claim to have proof that i5 will not run xp-mode to?

    hahahaha, your just sad that Amd did not come up with this feature 1st.
  • Jarp Habib - Tuesday, September 8, 2009 - link

    "yes they benchmarked with turbo boost, that is cheating because thats overclocking the processor at least 600 mhz and presenting the results as it were at stock speeds.
    that's abusing the reader's trust. "

    This statement is a load of bullcrap. Anand's intent is to present the benchmarks in a way reflective of the chip's standard performance in normal use- hence not manually overclocking for maximized performance. The processor's very design revolves on itself automatically shutting down inactive cores and boosting the speed of active cores, *regardless* of what the end user does to the chip in BIOS or what apps he's running. Since all you need to do to use Turbo Boost is just *install the CPU in your system* then benchmarks should be run with it enabled.

    If you want to COMPLETELY level the playing field, then TurboBoost should be shut down, for both Bloomfield i7 chips and Lynnfield i5 AND Lynnfield i7, as well as future i3 and i9. Also, HyperThreading must be disabled from all chips, 3DNow!, SpeedStep, Cool N' Quiet, MMX and the entire SSE instruction sets. After all, each different type of CPU executes those standard instruction sets differently. And since the SpeedStep and Cool N Quiet instructions force the chip to underclock and shut off cores while at idle, they must be eliminated from testing as well, or they'll throw off your idle power consumption benchmarks.

    Since you will be normalizing the clock frequencies as well, you can save time by only needing to test just one chip from each product line. I'm not sure just how you will normalize the clock frequencies of your test units *without overclocking or underclocking* some of them though. Perhaps you'll let me know?

    Meanwhile, back in the real world...
  • Voo - Tuesday, September 8, 2009 - link

    The difference is, that turbo mode impairs the possible benefit of overclocking the chip, while most things you enumerated do not.

    If you want to get the maximum out of the 860 you've got to disable turbo mode as we see in the review, so for everyone who'd want to overclock their CPU the most interesting test would be a comparison between the two chips both at their maximum stable performance. Which at the moment means disabling turbo mode as we can see.
  • erple2 - Tuesday, September 8, 2009 - link

    A-HA! So really, you're just interested in the benchmark "What does the maximum overclock do", not "How does the CPU perform at normal operations". BTW, disabling HT does improve overclocking a little bit, so should that also be disabled? Cool-n-Quiet plus SpeedStep may also affect overclocking capabilities. Should those be disabled? I fail to see the difference between what the GP said and your justifications.
  • MadMan007 - Tuesday, September 8, 2009 - link

    I'm not bothered by enabled Turboboost in a 'stock speed' review either but I would really like to see more sites run their benchmark suite with 3.6-4.0GHz (or higher) C2D, C2Q and Phenom II versus overclocked but non-Turboboost i5/i7. The reason is that this type of comparison would be most directly useful for the site's enthusiast readerships to know what the actual difference between *their rig* and an i5/i7 would be.
  • Kaleid - Thursday, September 10, 2009 - link

    Seconded.
