Homework: How Turbo Mode Works

AMD and Intel both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores - the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up all of that power budget. This is where we get very high clock speed single core CPUs from. A 95W dual core processor means that individually the cores have to use less power than the single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core uses less power than the cores in either a single or dual core 95W processor, resulting in more tradeoffs. Each core runs at a lower clock speed than in the 95W dual core processor.

The diagram below helps illustrate this:

[Diagram: TDP and the clock speed tradeoff, compared across Single Core, Dual Core, Quad Core and Hex Core processors]

The TDP is constant. You can't ramp power indefinitely - you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today): if you increase one, you have to decrease the other.
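To put rough numbers on the tradeoff, here is a minimal Python sketch that splits a fixed TDP evenly across cores. The even split and the function name `per_core_budget` are assumptions made for illustration; a real chip also spends power on cache, the memory controller and I/O, which this ignores.

```python
# Illustrative only: share a fixed package TDP equally between cores.
# The equal split is an assumption; uncore power (cache, memory controller,
# I/O) is ignored.

def per_core_budget(tdp_watts: float, cores: int) -> float:
    """Return the power each core may draw if the TDP is shared equally."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return tdp_watts / cores


if __name__ == "__main__":
    for cores in (1, 2, 4, 6):
        budget = per_core_budget(95, cores)
        print(f"{cores} core(s) in a 95W envelope: {budget:.2f}W per core")
```

The more cores share the same envelope, the less power each one gets, which is why each core has to run at a lower clock speed.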

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still drawing current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption; a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off - with extremely low leakage current. This is why nothing can touch Nehalem's idle power.
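As a rough illustration of the difference power gating makes, the Python sketch below compares package power for a quad core chip with two idle cores, with and without a power gate. All three wattage constants are hypothetical values picked for the example, not measurements of any Phenom II or Nehalem part.

```python
# Hypothetical per-core figures chosen for illustration, not measurements.
ACTIVE_W = 20.0   # assumed power of a core doing work
LEAKAGE_W = 4.0   # assumed power leaked by an idle core without a power gate
GATED_W = 0.1     # assumed residual power of an idle core behind a power gate


def package_power(active_cores: int, total_cores: int, power_gated: bool) -> float:
    """Estimate core power for a chip with some cores idle."""
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    idle_cores = total_cores - active_cores
    idle_w = GATED_W if power_gated else LEAKAGE_W
    return active_cores * ACTIVE_W + idle_cores * idle_w


if __name__ == "__main__":
    without_gate = package_power(2, 4, power_gated=False)
    with_gate = package_power(2, 4, power_gated=True)
    print(f"Two idle cores, no power gate: {without_gate:.1f}W")
    print(f"Two idle cores, power gated:   {with_gate:.1f}W")
    print(f"TDP headroom freed up:         {without_gate - with_gate:.1f}W")
```

Whatever the real figures are, the shape of the result is the same: the idle cores stop counting against the power budget once they sit behind a power gate.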

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
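The idea of boosting active cores until the TDP limit is reached can be sketched as a simple loop. This is not Intel's algorithm. It assumes idle cores are power gated to zero, that the all-cores-active base clock exactly fills the TDP, that core power grows with the cube of frequency (voltage rising with clock), and that clocks step up in 133MHz bins. The function names and the 2660MHz base clock in the usage lines are likewise made up for the example.

```python
# A toy model of turbo mode, not Intel's actual PCU logic.
BIN_MHZ = 133  # assumed size of one turbo step


def core_power(freq_mhz: float, base_mhz: float, base_core_w: float) -> float:
    """Assumed model: per-core power scales with the cube of frequency."""
    return base_core_w * (freq_mhz / base_mhz) ** 3


def turbo_frequency(active_cores: int, total_cores: int, tdp_w: float,
                    base_mhz: int, max_bins: int) -> int:
    """Raise the clock one bin at a time while the active cores fit in the TDP."""
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    # Assumption: with every core active at base clock the chip uses the full TDP.
    base_core_w = tdp_w / total_cores
    freq = base_mhz
    for _ in range(max_bins):
        candidate = freq + BIN_MHZ
        # Idle cores are assumed to be power gated and to draw nothing.
        if active_cores * core_power(candidate, base_mhz, base_core_w) > tdp_w:
            break
        freq = candidate
    return freq


if __name__ == "__main__":
    for active in (4, 2, 1):
        mhz = turbo_frequency(active, total_cores=4, tdp_w=95,
                              base_mhz=2660, max_bins=5)
        print(f"{active} active core(s): {mhz}MHz")
```

Under these assumptions a fully loaded chip gets no boost at all, while a chip with idle cores climbs until it hits either the power limit or the bin cap.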

Every single Nehalem (Lynnfield included) carries around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This unit turns cores off, underclocks them and is generally charged with the task of making sure that power usage is kept to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.


343 Comments


  • erple2 - Tuesday, September 8, 2009 - link

    [quoting]
    Not only did the feature that provided the least benefit (triple vs. dual channel) drive the reason for the socket/pin count difference, they gimp the platform with superior tech by cutting PCIE lanes in half[/quoting]

    I thought that the X58 has the PCIe controller on the mobo, and the P55 doesn't? That the Lynnfield CPU's had a built-in PCIe controller, whereas the Bloomfields lacked the built-in PCIe controller? That appears to be another reason why intel had to make 2 separate sockets/platforms.

    Now, whether that was made intentionally to force this issue with multiple platforms is a side issue (IMO). I don't necessarily think that it's a problem.
  • JonnyDough - Tuesday, September 8, 2009 - link

    "Personally, from a consumer standpoint, I feel Intel botched the whole X58/P55 design and launch starting with the decision to go with 2 sockets. Not only did the feature that provided the least benefit (triple vs. dual channel) drive the reason for the socket/pin count difference, they gimp the platform with superior tech by cutting PCIE lanes in half."

    I believe it was intentional and not a botch. Intel was trying to separate a high and low end and to sell more chipsets. It's Intel being boss. It's what they do. Confuse the consumer, sell more crap, and hope that AMD stays a step behind. This is why we need AMD.

    Intel is good at marketing and getting consumers to jump on the latest trend. Remember the Pentium 4? Why buy a lower ghz chip when the P4 clocks higher right?

    The educated consumer waits and pounces when the price is right, not when the tech is new and seems "thrilling". This review is great but no offense it still almost seems to come with a "buy this" spin - which may be the only way a tech journalist can stay privy to getting new information ahead of the curve.
  • Comdrpopnfresh - Tuesday, September 8, 2009 - link

    You purposefully placing the possibility of overclocking solely in the hands of the lower chip, while completely disregarding the history and facts. This-or-that logical fallacy. Third option: You can overclock the higher-clocked chip too.
    Granted, I see your point about the hardware being of the same generation of the architecture; that Lynnfield is not the tock to Bloomfield's tick (or the other way around if how you hear clocks starts mid-cycle) and therefore the silicon has the same ceiling for OC.
    But Bloomfield is like a D.I.N.K. household; dual-income-no-kids. When you overclock Bloomfield, not only do you have the physical advantage of lower heat-density due to a large die, but you also don't have the whiny pci-e controller complaining how timmy at school doesn't have to be forced into overclocking. The on-die pci-e controller will hinder overclocking - period.
    Just like trying to overclock cpu's in nearly identical s775 motherboards/systems. The system with the igp keeps the fsb from overclocking too much. So then what- you buy a dedicated gpu, negate your igp you spent good money on, just to have your cpu scream?
    Except in this case, if one were able to disable the on-die pci-e controller and plop a gpu in a chipset-appointed slot (sticking with the igp mobo situation in s775) you'd be throwing away the money on the on-die goodies, and also throwing away the reduced latency it provides.

    Has it occurred to anyone that this is going to open an avenue for artificial price inflation of ddr-3? Now the same products will be sold in packages of 3's and 2's. Sorry - just figured I'd change the subject from your broken record still stuck on overclocking.
  • chizow - Tuesday, September 8, 2009 - link

    quote:

    You purposefully placing the possibility of overclocking solely in the hands of the lower chip, while completely disregarding the history and facts. This-or-that logical fallacy. Third option: You can overclock the higher-clocked chip too.

    Actually in the real world, overclockers are finding the 920 D0s clock as well and often better than the 965s for sure (being C0), and even the 975s D0. You're certainly not going to see a 5x proportionate return in MHz on the difference spent between a $200 920 and a $1000 975. There is no third option because their maximum clock thresholds are similar and limited by uarch and process. The only advantage the XE versions enjoy is multiplier flexibility, a completely artificial restriction imposed by Intel to justify a higher price tag.
  • philosofool - Tuesday, September 8, 2009 - link

    Not seeing it dude. A little overvoltage and LGA 1156 overclocks with 1366.
  • chizow - Tuesday, September 8, 2009 - link

    Yes and early reports indicate they will overclock to equivalent clockspeeds, negating any Turbo benefit Lynnfield enjoys in the review. That leaves less subtle differences like multi-GPU performance where the X58 clearly shines and clearly outperforms P55.
  • puffpio - Tuesday, September 8, 2009 - link

    In the article you refer to x264 as an alternative to h264
    in fact, h264 is just the standard (like jpeg or png) and x264 is an encoder that implements the standard. i wouldn't call it an alternative.

    that would be like saying photoshop is an alternative to jpeg, because it can save in jpeg format
  • puffpio - Tuesday, September 8, 2009 - link

    "You'd think that Intel was about to enter the graphics market or something with a design like this."

    dun dun dun! foreshadowing?

    ----

    and since these parts consume less power yet are built on the same process, I assume they run at lower voltage? If so, since they ARE built on the same process, I'd assume they can survive the voltages of the original Bloomfield and beyond? eg for overclocking...
  • Anand Lal Shimpi - Tuesday, September 8, 2009 - link

    Yes, Lynnfield shouldn't have a problem running at the same voltages as Bloomfield. The only unknown is the PCIe circuitry. I suspect that over time we'll figure out the tricks to properly overclocking Lynnfield.

    As far as Larrabee goes, I wouldn't expect much from the first generation. If Intel is *at all* competitive in gaming performance it'll be a win as far as they're concerned. It's Larrabee II and ultimately Larrabee III that you should be most interested in.

    The on-die PCIe controller is a huge step forward though. CPU/GPU integration cometh.

    Take care,
    Anand
  • Comdrpopnfresh - Tuesday, September 8, 2009 - link

    Have you seen bios implementations allowing for the controller to be disabled? Know if anyone intends to do this?
