Homework: How Turbo Mode Works

AMD and Intel both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores - the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up all of that power budget. This is where very high clock speed single core CPUs come from. In a 95W dual core processor, each core has to use less power than the lone core of the single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core use less power than the cores in either a single or dual core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual core processor.

The diagram below helps illustrate this:

[Diagram: TDP and Tradeoff rows compared across Single Core, Dual Core, Quad Core and Hex Core processors]

The TDP is constant: you can't ramp power indefinitely, because you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today); if you increase one, you have to decrease the other.
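As a rough illustration of that tradeoff, the sketch below splits a fixed TDP evenly across cores. The even split and the function name `per_core_budget` are simplifying assumptions made for illustration only; a real chip also spends power on cache, the memory controller and other shared logic, so actual per-core budgets are lower.

```python
def per_core_budget(tdp_watts: float, cores: int) -> float:
    """Watts available to each core if the TDP is split evenly.

    The even split is an assumption for illustration, not how any
    particular CPU actually allocates its power budget.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tdp_watts / cores


if __name__ == "__main__":
    # The core counts from the diagram, all at the same 95W TDP.
    for cores in (1, 2, 4, 6):
        budget = per_core_budget(95, cores)
        print(f"{cores} core(s) at 95W: {budget:.2f}W per core")
```

Less power per core is what forces the lower clock speed on each core as the core count goes up.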

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still using current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption: a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off - with extremely low leakage current. This is why nothing can touch Nehalem's idle power.

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
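The idea of boosting active cores until the TDP limit is reached can be sketched as a simple loop. This is only a toy model under stated assumptions, not Intel's actual PCU logic: the function name `turbo_multiplier`, the caller-supplied `core_power` estimate and the 1W-per-multiplier-step numbers in the demo are all hypothetical, and power gated idle cores are assumed to draw nothing.

```python
from typing import Callable


def turbo_multiplier(
    base_mult: int,
    max_bins: int,
    active_cores: int,
    tdp_watts: float,
    core_power: Callable[[int], float],
) -> int:
    """Raise the clock multiplier one step at a time while the estimated
    power of the active cores stays within the TDP.

    core_power(mult) is a hypothetical estimate of the watts drawn by one
    active core at that multiplier. Idle cores are assumed to be power
    gated and to contribute nothing to the total.
    """
    mult = base_mult
    while (
        mult < base_mult + max_bins
        and active_cores * core_power(mult + 1) <= tdp_watts
    ):
        mult += 1
    return mult


if __name__ == "__main__":
    # Made-up power model: each active core draws 1W per multiplier step.
    def toy_power(mult: int) -> float:
        return 1.0 * mult

    for active in (4, 2, 1):
        mult = turbo_multiplier(20, 4, active, 95, toy_power)
        print(f"{active} active core(s): multiplier {mult}")
```

With fewer active cores the loop runs out of allowed steps before it runs out of power budget, which is the headroom the power gates free up.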

On every single Nehalem (Lynnfield included) there are around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This logic turns cores off, underclocks them and is generally charged with making sure that power usage is kept to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.


341 Comments


  • jnr0077 - Thursday, July 26, 2012 - link

    Well, I have the better model, an i5 750 (1156 socket). The gaming score is 5.9 on a basic 500GB 7200 HD; with an SSD it hit 7.9, on a Gigabyte GA-P55A-UD6 with 12GB RAM. As for the price,
    the cost was cheap: the Intel(R) quad core(TM) i5 750 @ 2.66GHz 2.67GHz cost around £100 and the mobo cost me £100. I thought it was a very cheap upgrade considering the price. I would like to hear what score any Phenom II X4 965 hits.
  • Milleman - Sunday, September 13, 2009 - link

    The article itself is good. But why on earth compare a standard clocked CPU (AMD) against overclocked ones (Intel)? It makes no objective sense at all. It's like having a car test between a standard car and a tuned racecar. Of course the racecar will win in performance. The overclock results shouldn't be there at all. Maybe as a remark that tells what will happen if one would like to overclock. Looks rather unfair and biased.

    So... why??
  • Nich0 - Sunday, September 13, 2009 - link

    All I saw in this article is comparison of CPUs in their stock configuration. What's wrong with that?
  • Bozo Galora - Friday, September 11, 2009 - link

    I must say this was a very good logical coherent review with just about all the info one would require

    Good job - I had no intention of getting one of these, but now I may change my mind
  • IntelUser2000 - Thursday, September 10, 2009 - link

    http://www.intel.com/support/processors/sb/CS-0299...

    According to Intel...

    Core i7 870:

    5/4/2/2

    Core i7 860:

    5/4/1/1

    Core i5 750:

    4/4/1/1

    So the i7 870 has higher Turbo mode for 3 and 4 cores than 860 does.
  • Nich0 - Friday, September 11, 2009 - link

    Yeah and that means that the OC numbers for the 750 with Turbo don't make sense. For example 4160 / 160 = 26 which would be a Turbo of 6 BCLK.
    Same thing for the 860 OC 3C/4C Turbo number.

    Am I missing something?
  • IntelUser2000 - Friday, September 11, 2009 - link

    It's likely Anand has ES versions or such which allow multiplier adjustments. But at stock, the linked speeds are the Turbo Boost grades.
  • Nich0 - Friday, September 11, 2009 - link

    Yeah obviously I am not disputing the stock OC with Turbo enabled (that sounds weird: stock OC?), i.e. 160*20 = 3200, but just what it means in terms of Turbo: it 'should' read 3.36 for 3/4C and 3.84 for 1/2C if the 1/1/4/4 Turbo spec is correct.
  • rdkone - Thursday, September 10, 2009 - link

    I don't like the fact that the BCLK directly and synchronously communicates with the PCIe bus, thus affecting the videocard negatively (among other PCIe cards)... This is like overclocking years ago, where the PCI bus would be affected in the same way, causing headaches... This is a major issue I feel for those wanting to push a fairly big overclock on these CPUs... Intel screwed the pooch for us overclockers I feel... Just more justification to limp along with my Core 2 Quad at 4.1GHz rock solid... Like others have said, it's funny how the articles don't show older CPU overclocks against all this new garb... In the past they used to... But that hurts sales : )
  • SnowleopardPC - Thursday, September 10, 2009 - link

    Ok, so what type of boost do I get over a Q6600 with 8gb of ram and windows 7 64?

    Is it worth upgrading or waiting for that 6 core 32nm to come out next year?

    To upgrade to any of these I will need to replace a motherboard and ram with the processor.
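For readers following the multiplier discussion in the comments above, the arithmetic can be sketched as follows. The turbo bins are the ones quoted in the thread, read as 1/2/3/4 active cores; the base multiplier of 20 and the 160 BCLK come from the 160*20 = 3200 figure in the same thread. The names `TURBO_BINS` and `turbo_mhz` are hypothetical and exist only for this illustration.

```python
# Turbo bins for 1, 2, 3 and 4 active cores, as quoted in the comment thread.
TURBO_BINS = {
    "Core i7 870": (5, 4, 2, 2),
    "Core i7 860": (5, 4, 1, 1),
    "Core i5 750": (4, 4, 1, 1),
}


def turbo_mhz(bclk_mhz: float, base_mult: int, bins: tuple, active_cores: int) -> float:
    """Clock speed in MHz: BCLK times (base multiplier + turbo bins).

    bins[0] applies with one active core, bins[3] with four.
    """
    if not 1 <= active_cores <= len(bins):
        raise ValueError("active_cores is out of range for these bins")
    return bclk_mhz * (base_mult + bins[active_cores - 1])


if __name__ == "__main__":
    # Overclocked Core i5 750 from the discussion: 160 BCLK, base multiplier 20.
    bins = TURBO_BINS["Core i5 750"]
    for active in (1, 2, 3, 4):
        print(f"{active} active core(s): {turbo_mhz(160, 20, bins, active):.0f} MHz")
```

By the same arithmetic, the 4160 figure questioned in the thread works out to a multiplier of 26, six bins above 20, which is more than the quoted bins for the Core i5 750 allow.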
