Homework: How Turbo Mode Works

AMD and Intel have both figured out the practical maximum power consumption of a desktop CPU; Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W, and for the upper mainstream market it's 95W. That's why all high end CPUs ship with 120-140W TDPs.

Regardless of whether you have one, two, four, six or eight cores, the entire chip has to fit within that power envelope. A single core 95W chip gets to have one core eating up that entire power budget; this is where very high clock speed single core CPUs come from. A 95W dual core processor means that each core individually has to use less power than the single core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad core processor requires that each core use less power than in either the single or dual core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual core processor.

The diagram below helps illustrate this:

[Diagram: TDP is constant across single core, dual core, quad core and hex core configurations; the per-core clock speed tradeoff grows as core count increases.]
The TDP is constant; you can't ramp power indefinitely without eventually running into cooling and thermal density issues. The variables (at least today) are core count and clock speed: if you increase one, you have to decrease the other.
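To put rough numbers to that tradeoff, here's a quick back-of-the-envelope sketch (purely illustrative figures, not measured values) of how per-core clock speed has to fall as a fixed 95W budget is split across more cores, assuming dynamic power scales roughly with the cube of frequency (power ~ frequency x voltage squared, with voltage scaling roughly alongside frequency):

```python
# Back-of-the-envelope model of the TDP / core count / clock speed tradeoff.
# All numbers are illustrative; real CPUs don't follow a clean cube law.

TDP_WATTS = 95.0          # fixed package power budget
SINGLE_CORE_GHZ = 3.6     # hypothetical clock of one core using the full budget

def max_clock_ghz(core_count: int) -> float:
    """Per-core clock that keeps the whole chip inside the TDP."""
    per_core_budget = TDP_WATTS / core_count
    # If one core at SINGLE_CORE_GHZ consumes the full TDP, a core limited to
    # per_core_budget runs at roughly f_max * (budget / TDP) ** (1/3).
    return SINGLE_CORE_GHZ * (per_core_budget / TDP_WATTS) ** (1.0 / 3.0)

for cores in (1, 2, 4, 6):
    print(f"{cores} core(s): ~{max_clock_ghz(cores):.2f} GHz each")
```

Under this crude model the single core runs at 3.6GHz, the dual core at ~2.9GHz and the quad core at ~2.3GHz - the same direction of tradeoff the diagram above describes.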

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still drawing current, power efficiency suffers. You can reduce leakage current, but doing so also impacts performance; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption; a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed those cores their supply current are turned off, and thus the two cores are almost completely shut off, with extremely low leakage current. This is why nothing can touch Nehalem's idle power.

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
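To make that concrete, here's a hedged sketch reusing the same cube-law assumption and illustrative numbers as the earlier snippet - it illustrates the concept, not Lynnfield's actual algorithm. Power-gate the idle cores (their share of the budget drops to roughly zero) and hand the freed headroom to the cores still doing work:

```python
# Conceptual turbo model: idle cores are gated off and their power budget is
# redistributed to the active cores. Illustrative numbers, not Intel's algorithm.

TDP_WATTS = 95.0
BASE_GHZ_QUAD = 2.26      # hypothetical base clock of each core in a 95W quad

def turbo_clock_ghz(total_cores: int, active_cores: int) -> float:
    """Clock the active cores can reach once the idle cores are power gated."""
    base_per_core_watts = TDP_WATTS / total_cores
    per_active_budget = TDP_WATTS / active_cores   # idle cores give up their share
    # Same cube-law assumption as before: frequency scales with the cube root of power.
    return BASE_GHZ_QUAD * (per_active_budget / base_per_core_watts) ** (1.0 / 3.0)

for active in (4, 2, 1):
    print(f"{active} active core(s): ~{turbo_clock_ghz(4, active):.2f} GHz")
```

With two of the four cores gated off, the active pair picks up roughly a 26% clock bump in this model; with three gated off, the last core ends up near the hypothetical single-core clock from the earlier snippet. Actual turbo steps are fixed multiplier bins rather than a continuous curve, but the direction is the same.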

On every single Nehalem (Lynnfield included) lie around one million transistors (roughly the complexity of a 486) whose sole task is managing power. This logic turns cores off, underclocks them, and is generally charged with keeping power usage to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield: the architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.
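For a sense of what that logic is doing conceptually, here's a highly simplified, hypothetical control loop: each sampling interval, check which cores are busy, power-gate the idle ones, and split the TDP among the cores that are still working. The real PCU is dedicated hardware with much finer-grained telemetry and fixed turbo bins; everything below is invented for illustration:

```python
# Hypothetical, highly simplified PCU-style control loop. Constants are invented
# for illustration; this is not how the real Power Control Unit is implemented.

from dataclasses import dataclass
from typing import List

@dataclass
class Core:
    busy: bool
    clock_ghz: float

def pcu_step(cores: List[Core], tdp_watts: float,
             watts_per_ghz: float, max_turbo_ghz: float) -> None:
    """One sampling interval: gate idle cores, share the budget among busy ones."""
    busy_cores = [c for c in cores if c.busy]
    for c in cores:
        if not c.busy:
            c.clock_ghz = 0.0                          # power gated: near-zero power
    if busy_cores:
        per_core_watts = tdp_watts / len(busy_cores)   # redistribute freed headroom
        for c in busy_cores:
            # Crude linear power model, capped at a maximum turbo clock.
            c.clock_ghz = min(per_core_watts / watts_per_ghz, max_turbo_ghz)

cores = [Core(True, 2.26), Core(True, 2.26), Core(False, 2.26), Core(False, 2.26)]
pcu_step(cores, tdp_watts=95.0, watts_per_ghz=15.0, max_turbo_ghz=3.2)
print([round(c.clock_ghz, 2) for c in cores])   # busy cores boosted, idle cores gated
```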


343 Comments


  • strikeback03 - Tuesday, September 8, 2009 - link

    How would you have graphics then? You would be limited to the 4xPCIe off the P55 on motherboards which support it, as there are no integrated graphics (yet)
  • MX5RX7 - Tuesday, September 8, 2009 - link

    I'm not sure that CPU/GPU integration is a good thing, from a consumer standpoint. At least in the short term.

    For example, in the article you mention how the majority of modern games are GPU, not CPU limited. The current model allows us to purchase a very capable processor and pair it with a very capable GPU. Then, when the ultra competitive GPU market has provided us with a choice of parts that easily eclipse the performance of the previous generation, we either swap graphics cards for the newer model, or purchase a second now cheaper identical card and (hopefully) double our game performance with SLI or Crossfire. All without having to upgrade the rest of the platform.

    With the current model, a new graphics API requires a new graphics card. With Larrabee, it might very well require a whole new platform.

  • Ben90 - Tuesday, September 8, 2009 - link

    Yea, im really excited for Larrabee, who knows if it will be good or not... but with intel kicking ass in everything else, it will at least be interesting

    With overclocking performance seemingly being limited by the PCI-E controller, it seems like an unlocked 1156 would be pretty sweet

    All in all i gotta admit i was kinda bitter with this whole 1156 thing because i jumped on the 1366 bandwagon and it seemed that Intel was mostly just jacking off with the new socket... but this processor seems to bring a lot more innovation than i expected (just not in raw performance, still great performance though)
  • chizow - Tuesday, September 8, 2009 - link

    Was worried no one was going to properly address one of the main differences between P55 and X58, thanks for giving it a dedicated comparison. Although I would've liked to have seen more games tested, it clearly indicates PCIe bandwidth becoming an issue with current generation GPUs. This will only get worse with the impending launch of RV8x0 and GT300.
  • Anand Lal Shimpi - Tuesday, September 8, 2009 - link

    PCIe bandwidth on Lynnfield is only an issue with two GPUs, with one you get the same 16 lanes as you would on X58 or AMD 790FX.

    If I had more time I would've done more games, I just wanted to focus on those that I knew scaled the best to see what the worst case scenario would be for Lynnfield.

    In the end 2 GPUs are passable (although not always ideal on Lynnfield), but 4 GPUs are out of the question.

    Take care,
    Anand
  • JumpingJack - Thursday, September 10, 2009 - link

    Anand, a few other sites have attempted SLI/Xfire work ... one in particular shows 4 GPUs having no impact at all on gaming performance in general -- well, 3 or 4 FPS, but nothing more than a few percentage points over norm.

    Could your configuration with beta or just bad first release drivers be an issue?

    Jack
  • JonnyDough - Tuesday, September 8, 2009 - link

    Would it be possible to incorporate two GPU controllers onto a die instead of one or is that what they'll be doing with future procs? I would think that two controllers with a communication hub might supply the needed bandwidth of x16 + x16.
  • Comdrpopnfresh - Tuesday, September 8, 2009 - link

    with two gpu's being passable - do you foresee that applying to both two independent gpus, as well as single-card dual-gpu boards?
  • Ryan Smith - Tuesday, September 8, 2009 - link

    Yes. The only difference between the two is where the PCIe bridge chip is. In the former it's on the mobo, in the latter it's on the card itself.
  • Eeqmcsq - Tuesday, September 8, 2009 - link

    Talk about bringing a bazooka to a knife fight. AMD better be throwing all their innovation ideas and the kitchen sink into Bulldozer, because Intel is thoroughly out-innovating AMD right now.
