Homework: How Turbo Mode Works

AMD and Intel both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores - the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up all of that power budget. This is where we get very high clock speed single core CPUs from. A 95W dual core processor means that each core individually has to use less power than the lone core of a single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core use less power than a core in either a single or dual core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual core processor.

The diagram below helps illustrate this:

[Diagram: TDP and the resulting tradeoff for Single Core, Dual Core, Quad Core and Hex Core processors]

The TDP is constant; you can't ramp power indefinitely - you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today): if you increase one, you have to decrease the other.
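
To put rough numbers on that tradeoff, here is a minimal Python sketch. It assumes the whole 95W budget is split evenly across the cores and that per-core power grows with roughly the cube of clock speed (a common rule of thumb, since voltage tends to rise along with frequency). Neither assumption comes from Intel or AMD, and the function names are made up for illustration.

```python
TDP_WATTS = 95.0  # upper mainstream TDP quoted in the article


def per_core_budget(tdp_watts: float, cores: int) -> float:
    """Split the chip's power budget evenly across its cores (assumption)."""
    return tdp_watts / cores


def relative_clock(cores: int, exponent: float = 3.0) -> float:
    """Clock speed relative to a single core chip at the same TDP.

    Assumes per-core power ~ frequency ** exponent. The cubic default is a
    rule of thumb, not a measured value.
    """
    return (1.0 / cores) ** (1.0 / exponent)


if __name__ == "__main__":
    for cores in (1, 2, 4, 6):
        budget = per_core_budget(TDP_WATTS, cores)
        print(f"{cores} core(s): {budget:.1f} W per core, "
              f"{relative_clock(cores):.2f}x the single core clock")
```

Under these assumptions a quad core lands at roughly 0.63x the clock of a single core chip with the same TDP. Real processors won't follow the cubic model exactly, but the direction of the tradeoff is the same.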

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still using current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption; a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off - with extremely low leakage current. This is why nothing can touch Nehalem's idle power.

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
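
The headroom argument can be sketched in a few lines of Python. All wattages here are hypothetical placeholders chosen only to show the arithmetic (they are not measurements of any real chip), and the model ignores everything outside the cores. A power gated idle core is treated as drawing 0W, an idealization of the "near-zero" figure described above.

```python
from dataclasses import dataclass


@dataclass
class Core:
    active: bool
    active_watts: float = 20.0   # hypothetical power of a busy core
    leakage_watts: float = 4.0   # hypothetical leakage of an idle, ungated core


def chip_power(cores, power_gated: bool) -> float:
    """Sum core power; gated idle cores are treated as drawing ~0 W."""
    total = 0.0
    for core in cores:
        if core.active:
            total += core.active_watts
        elif not power_gated:
            total += core.leakage_watts  # idle but still leaking
    return total


def headroom(cores, tdp_watts: float, power_gated: bool) -> float:
    """TDP left over that could be spent on higher clocks."""
    return tdp_watts - chip_power(cores, power_gated)


if __name__ == "__main__":
    quad = [Core(True), Core(True), Core(False), Core(False)]
    for gated in (False, True):
        print(f"power gated={gated}: {headroom(quad, 95.0, gated):.1f} W of headroom")
```

With two of four cores idle, the gated chip in this toy model has more unused budget than the leaky one, and that unused budget is what the next step spends.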

On every single Nehalem (Lynnfield included) lie around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This logic turns cores off, underclocks them and is generally charged with the task of making sure that power usage is kept to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.
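
As a rough illustration of the kind of decision the PCU makes, the Python sketch below steps the clock up in 133MHz increments (Nehalem's base clock) for as long as an estimated power figure stays within the TDP. This is not Intel's algorithm: the cubic power model, the per-core wattage, the base frequency and the cap on the number of steps are all assumptions picked for the example, and the function names are hypothetical.

```python
def estimate_power(active_cores: int, freq_mhz: float,
                   base_freq_mhz: float, base_core_watts: float) -> float:
    """Hypothetical model: per-core power scales with the cube of frequency."""
    return active_cores * base_core_watts * (freq_mhz / base_freq_mhz) ** 3


def pick_turbo_frequency(active_cores: int, tdp_watts: float,
                         base_freq_mhz: int = 2660,   # illustrative base clock
                         bin_mhz: int = 133,          # one step = one base clock
                         max_bins: int = 4,           # assumed cap on steps
                         base_core_watts: float = 20.0) -> int:
    """Raise the clock one bin at a time until the next bin would exceed TDP."""
    freq = base_freq_mhz
    for _ in range(max_bins):
        candidate = freq + bin_mhz
        power = estimate_power(active_cores, candidate,
                               base_freq_mhz, base_core_watts)
        if power > tdp_watts:
            break  # the next step would blow the power budget
        freq = candidate
    return freq


if __name__ == "__main__":
    for active in (4, 2, 1):
        print(f"{active} active core(s): {pick_turbo_frequency(active, 95.0)} MHz")
```

In this toy model fewer active cores leave more budget, so the loop climbs further before stopping. A real PCU also has to account for temperature and current limits, which are left out here.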


341 Comments


  • Seramics - Wednesday, September 09, 2009

    So what's the big deal here? I dun tink its that impressive, just good. While $196 of 750 look to outcompete the "way" more expensive $245 of AMD's 965, the truth is that the mobo that you need to pair with the 750/860/870 is far from being competitive. P55 is severely stripped down and it is only slightly cheaper than their X58 counterpart. So wht if 750 is cheaper than 965 by about $50? Did you just buy the cpu only? Ppl shud at least look at the CPU+mobo price because they both come together. Truth is, when you take into account mobo price, 750 is far from outcompete 965. Added up, I think its only about balanced. The 750 is a better CPU, but it also cost more. In comparison to their socket 1366 partner, socket 1156 system cost a little less, but they are also inferior a little bit. So what's special about them? Sure, there are better turbo and better thermal performance. For me, that is all that is good about the 1156 CPU. For enthusiast, socket 1366 is the way to go.
  • jnr0077 - Friday, July 27, 2012

    i have a i5 750 chip cost £100 a gigabyte GA-P55A-UD6 cost £100 as it has six ram slots 16gb max radeon hd 4850 i love this mobo i cant fault it for the price i find it is a brilliant upgrade for cost i spent £250 considering the price of shops build you own pc you get what you put in :) very happy with the i5 750 1156 socket windows score on basic 500gb 7200 is 5.9 sweet 7.9 with a ssd :) can anyone tell me what the amd 965 hit on base score as i will never DV8 to amd intel 4 me allways :)
  • hob196 - Wednesday, September 09, 2009

    Hi,
    Thanks for another great article.
    I figure that having PCI-e on chip would be great to reduce the latency. Any thoughts about plugging non graphics PCI-e cards into the second PCI-e slot?
    I've heard some motherboards cripple the 2nd slot's performance down to x1 if you plug an x1 card in the other slot (in a shared x8 environment). Any evidence of this?

    In case you're curious I work with digital audio in a studio environment and I'm always striving to reduce the latency of audio going through the CPU.
    These days, the latency (in streaming audio) is down to how fast the CPU can push floating point plus any overhead for the buffers in the various busses you go through. e.g. A firewire sound interface adds a few ms because of the inherent buffers between CPU -> Northbridge -> Southbridge -> Firewire -> Interface.
  • tempestor - Wednesday, September 09, 2009

    Another great article Anand!

    You should consider a 2nd job as a novel writer! :D

    lp, M.
  • AndyKH - Wednesday, September 09, 2009

    I don't really get it:
    It is stated that most PCIe cards don't work well with higher frequencies and that the BCLK frequency should be kept at multiples of 133 MHz, and then they overclock it using a BCLK of ~200 MHz in one instance???
    Doesn't the 133 MHz requirement make it pretty much impossible to overclock?

    Someone please enlighten me.
  • Anand Lal Shimpi - Wednesday, September 09, 2009

    It doesn't make it impossible to overclock, just impossible to overclock (very high) without additional voltage.

    Take care,
    Anand
  • AndyKH - Thursday, September 10, 2009

    Thank you for the response!

    I see how using a higher voltage will increase switching speed of the buffers driving the PCIe bus. However, I fail to see why it would make it any less difficult for PCIe cards to cope with the increased clock frequency, unless the increased voltage is also fed to the PCIe cards (is this the case?). Otherwise I assume they would surely experience the same problems driving communication to the CPU?

    Also, you write multiples of 133 MHz but overclock to 200 MHz BCLK. Shouldn't it read multiples of 33 MHz?
  • TotalLamer - Wednesday, September 09, 2009

    I really, really don't understand why Anand is so obsessed with Turbo Modes. Any enthusiast who dares call himself such is going to clock this chip to the moon, at which point Turbo doesn't do anything. So with a 4.2GHz i7 870, all you're really left with is an i7 920 with worse multi-GPU gaming performance and a less-certain upgrade path.
  • coconutboy - Wednesday, September 09, 2009

    You're assuming all enthusiasts think like you do, but the heavy majority of people (enthusiast or not) want nothing to do with a $500+ i7 870 cpu. The i7 920, 860, and i5 750 are much more attractive options.

    There are plenty of "enthusiasts" who instead prefer silent computers that use no fans, or people living in hot climates who focus on very low temps, or all manner of different things. On top of that, the overwhelming majority of people simply do not care about any of the aforementioned, and those people buy the heavy majority of computers.

    I started OCing in 1996, and used to OC pretty heavily, but got tired of constant tweaking or seeing my well-worn parts die prematurely. Now I tend to focus on very quiet computers that have a small/moderate overclock. So taking an i5 750 or i7 860 and raising it up 200-400 MHz and leaving turbo on is very appealing to me. Also of note is the extra heat generated and the extra money I'll spend on my electric bill by having a 24/7 overclock versus turbo modes. Dig the link and scroll to the bottom-

    http://www.guru3d.com/article/core-i5-750-core-i7-...review-test/10

    The 13 watt increase at idle is no big deal, but 133 extra watts under load, well... it's worth the performance boost and heat to some folks, but other people (like me) look at those things as tradeoffs that need to be weighed versus reliability, cost for extra cooling, noise, my electric bill etc.
  • Skiprudder - Thursday, September 10, 2009

    I think that some of us are quite honestly getting more green conscious these days too. It's nice to have a CPU this fast that's also this energy efficient. We can get similar to OCed performance at a much smaller power envelope. I know it doesn't add up to a lot over the course of a year (less than $100 I assume), but these things add up and it saves me some dinero on the power bills!
