Homework: How Turbo Mode Works

AMD and Intel have both figured out the practical maximum power consumption of a desktop CPU. Intel actually discovered it first, through trial and error, in the Prescott days. At the high end that's around 130W; for the upper mainstream market it's 95W. That's why all high-end CPUs ship with 120 - 140W TDPs.

Regardless of whether you have one, two, four, six or eight cores, the entire chip has to fit within that power envelope. A single-core 95W chip gets to have its one core eating up all of that power budget. This is where very high clock speed single-core CPUs come from. A 95W dual-core processor means that each core individually has to use less power than the core in the single-core 95W processor, so tradeoffs are made: each core runs at a lower clock speed. A 95W quad-core processor requires that each core use less power than a core in either the single-core or dual-core 95W processor, resulting in more tradeoffs: each core runs at a lower clock speed than in the 95W dual-core processor.

The diagram below helps illustrate this:

[Diagram: TDP and Tradeoff compared across Single Core, Dual Core, Quad Core and Hex Core chips]

The TDP is constant; you can't ramp power indefinitely, because you eventually run into cooling and thermal density issues. The variables are core count and clock speed (at least today): if you increase one, you have to decrease the other.
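To put rough numbers on that tradeoff, here is a minimal Python sketch. It is not how Intel or AMD actually bin their chips. It assumes the whole TDP is split evenly across the cores (the `uncore_watts` parameter is a hypothetical allowance for everything else on the die) and that per-core power grows with roughly the cube of clock speed, a common rule of thumb when voltage has to rise along with frequency. Both function names are made up for illustration.

```python
def per_core_budget_watts(tdp_watts: float, cores: int, uncore_watts: float = 0.0) -> float:
    """Power each core may draw if the TDP is shared evenly (simplifying assumption)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return (tdp_watts - uncore_watts) / cores


def relative_clock(cores: int, exponent: float = 3.0) -> float:
    """Clock speed relative to a single-core chip at the same TDP.

    Assumes per-core power scales as frequency ** exponent; the cubic
    default is a rule of thumb, not a measured figure.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return (1.0 / cores) ** (1.0 / exponent)


if __name__ == "__main__":
    for cores in (1, 2, 4, 6):
        budget = per_core_budget_watts(95.0, cores)
        print(f"{cores} cores: {budget:.2f} W per core, "
              f"about {relative_clock(cores):.0%} of the single-core clock")
```

Under these assumptions each doubling of the core count halves the per-core power budget but costs only about a fifth of the clock speed, which is why adding cores at a fixed TDP is usually still worth it for threaded work.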

Here's the problem: what happens if you're not using all four cores of the 95W quad core processor? You're only consuming a fraction of the 95W TDP because parts of the chip are idle, but your chip ends up being slower than a 95W dual core processor since it's clocked lower. The consumer thus has to choose between a faster dual core and a slower quad core processor.

A smart processor would realize that its cores aren't frequency limited, just TDP limited. Furthermore, if half the chip is idle then the active cores could theoretically run faster.

That smart processor is Lynnfield.

Intel made a very important announcement when Nehalem launched last year. Everyone focused on cache sizes, performance or memory latency, but the most important part of Nehalem was far more subtle: the Power Gate Transistor.

Transistors are supposed to act as light switches - allowing current to flow when they're on, and stopping the flow when they're off. One side effect of constantly reducing transistor feature size and increasing performance is that current continues to flow even when the transistor is switched off. It's called leakage current, and when you've got a few hundred million transistors that are supposed to be off but are still drawing current, power efficiency suffers. You can reduce leakage current, but you also impact performance when doing so; the processes with the lowest leakage can't scale as high in clock speed.

Using some clever materials engineering, Intel developed a very low resistance, low leakage transistor that can effectively drop any circuits behind it to near-zero power consumption: a true off switch. This is the Power Gate Transistor.

On a quad-core Phenom II, if two cores are idle, blocks of transistors are placed in the off-state but they still consume power thanks to leakage current. On any Nehalem processor, if two cores are idle, the Power Gate transistors that feed the cores their supply current are turned off and thus the two cores are almost completely turned off, with extremely low leakage current. This is why nothing can touch Nehalem's idle power.
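The difference between the two approaches comes down to simple arithmetic. The sketch below is only an illustration: the function name and every wattage in it are hypothetical placeholders, not measurements of Phenom II or Nehalem.

```python
def package_core_power_watts(
    active_cores: int,
    idle_cores: int,
    active_core_watts: float,
    idle_leakage_watts: float,
    power_gated: bool,
    gated_residual_watts: float = 0.0,
) -> float:
    """Total core power for a chip with some cores busy and some idle.

    Without power gating, every idle core still burns its leakage power.
    With power gating, an idle core only draws a small residual (taken as
    zero by default here, which is an idealization).
    """
    idle_each = gated_residual_watts if power_gated else idle_leakage_watts
    return active_cores * active_core_watts + idle_cores * idle_each


if __name__ == "__main__":
    # Hypothetical quad core: two busy cores at 20 W each, two idle cores
    # that would leak 5 W each if they were merely switched to the off-state.
    without_gate = package_core_power_watts(2, 2, 20.0, 5.0, power_gated=False)
    with_gate = package_core_power_watts(2, 2, 20.0, 5.0, power_gated=True)
    print(f"off-state only: {without_gate} W, power gated: {with_gate} W")
```

With these made-up figures, the power-gated chip saves the full leakage of both idle cores, and that saving is power the rest of the chip could otherwise spend.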

Since Nehalem can effectively turn off idle cores, it can free up some of that precious TDP we were talking about above. The next step then makes perfect sense. After turning off idle cores, let's boost the speed of active cores until we hit our TDP limit.
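As a rough sketch of that idea (not Intel's actual algorithm), the Python below raises the clock of the active cores one step at a time and stops just before the package would exceed its TDP. Several things here are assumptions: the cubic power model, the idea that the chip uses exactly its full TDP with all cores at base clock, the `max_bins` cap, and all function and parameter names. The 133MHz step matches Nehalem's base clock, which is the increment its turbo modes use.

```python
BCLK_MHZ = 133  # Nehalem base clock; one turbo step raises the multiplier by one


def core_power_watts(freq_mhz: float, base_mhz: float, base_core_watts: float,
                     exponent: float = 3.0) -> float:
    """Hypothetical power model: core power grows as (frequency ratio) ** exponent."""
    return base_core_watts * (freq_mhz / base_mhz) ** exponent


def max_turbo_bins(tdp_watts: float, total_cores: int, active_cores: int,
                   base_mhz: float, uncore_watts: float = 0.0,
                   gated_core_watts: float = 0.0, max_bins: int = 5,
                   exponent: float = 3.0) -> int:
    """Largest number of 133MHz steps the active cores can take within the TDP.

    Assumes all cores running at base clock consume exactly the TDP, and that
    each power-gated idle core draws gated_core_watts (zero by default).
    """
    if not 0 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 0 and total_cores")
    base_core_watts = (tdp_watts - uncore_watts) / total_cores
    idle_cores = total_cores - active_cores
    bins = 0
    for candidate in range(1, max_bins + 1):
        freq_mhz = base_mhz + candidate * BCLK_MHZ
        total = (uncore_watts
                 + idle_cores * gated_core_watts
                 + active_cores * core_power_watts(freq_mhz, base_mhz,
                                                   base_core_watts, exponent))
        if total > tdp_watts:
            break  # the next step would exceed the power envelope
        bins = candidate
    return bins


if __name__ == "__main__":
    for active in (4, 3, 2, 1):
        bins = max_turbo_bins(95.0, 4, active, base_mhz=2933, max_bins=10)
        print(f"{active} active cores: +{bins} bins ({2933 + bins * BCLK_MHZ} MHz)")
```

In this toy model a fully loaded chip gets no headroom at all, because it was assumed to sit exactly at its TDP at base clock. A real chip also weighs temperature and current draw, so it can sometimes boost even with every core busy.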

Every single Nehalem (Lynnfield included) carries around 1 million transistors (about the complexity of a 486) whose sole task is managing power. This block turns cores off, underclocks them and is generally charged with keeping power usage to a minimum. Lynnfield's PCU (Power Control Unit) is largely the same as what was in Bloomfield. The architecture remains the same, although it has a higher sampling rate for monitoring the state of all of the cores and the demands on them.

The PCU is responsible for turbo mode.


343 Comments


  • jonup - Tuesday, September 8, 2009 - link

    Unfortunately people in the corporate world do not make a distinction between a HD4500 and a GX790. As long as the Intel can display spreadsheets it's as good as (or better than) a GTX295/HD4890X2, because it is Intel. You can change ignorance when it works.
  • PassingBy - Tuesday, September 8, 2009 - link

    My horizons are broad enough, thank you. The needs of many corporate desktops/laptops will be met by Clarkdale/Arrandale and no, nobody will go blind or suffer eyestrain (by virtue of the IGP anyway).
  • PassingBy - Tuesday, September 8, 2009 - link

    No edit function, so, as I point out later in the thread, people reading this review presumably won't be interested in IGPs anyway, given that these processors now have no IGP market. Wait for Clarkdale before trying to compare IGPs.
  • dragunover - Tuesday, September 8, 2009 - link

    Thanks for the review, if not as soon as I wanted it!
  • Boobs McGee - Tuesday, September 8, 2009 - link

    Do you guys have plans to do a motherboard review roundup for P55?
    If not, please do.
  • Gary Key - Tuesday, September 8, 2009 - link

    I actually have three roundups planned, we have 15 boards here ranging from the $100 uATX items up to the $300 EVGA Classified series. We are only testing with retail products, released BIOS', and retail processors so the delivery of more than 70% of the boards late last week has created a small logjam. ;)
    The first article should be up on Thursday with a couple of my favorite boards and then a rather large one up on Monday and the last one a few days after that. Raja is working on a separate roundup of the top three boards targeted for the more extreme OC community. We will also have a P55 memory specific article shortly.
  • ClagMaster - Tuesday, September 8, 2009 - link

    Looking forward to reading these P55 motherboard roundups.
  • Anand Lal Shimpi - Tuesday, September 8, 2009 - link

    Yes, Gary is nearly complete with his. Give him another day and it'll be up :)

    Take care,
    Anand
  • Comdrpopnfresh - Tuesday, September 8, 2009 - link

    By creating a new socket- they're providing a disincentive for early adopters of bloomfield. This chip is literally a humpty-dumpty that stands to benefit intel with everyone suffering a small loss of their own. The benefits of lynnfield vs bloomfield come from shuffling the architectural deck of nehalem. In reality, it only shows the possibilities of an inflexible architecture.

    The turbo mode isn't cutting it in day-to-day power consumption reduction. On the scale of a day, the average shmoe who is ass enough to leave a computer on for no reason gains no benefit. Lower the reach of a voltage plane, and reduce the number of components sucking juice, that only present benefits under certain situations (a third memory channel), and shmoe is happier.

    If it was in the article, I apologize, but with the pci-e controller being on the un-core... what happens on a chipset with integrated graphics? Will the igp be linked to the processor now, rather than a bridge chip? If ati or nvidia made their own supporting chipsets with an igp- would the igp represent a chip onto itself, solely connected to the cpu, or would it have to work through dmi, and leave those on-die pci-e lanes for domestic usage?

    It seems this is the warning rattle to nvidia that they chose their place with ion, and are stuck in it. When the change to 32nm comes, and the gpu is integrated into the cpu- what kind of robust 3rd party chipsets could exist in the budget end? Sure, you can always add a dedicated, off-die, gpu... but for budget boards used to eons of making room for a cpu and working a bridge chip around an igp- either horrible inefficiencies will creep up, or higher prices.
    My money is on westmere having at least three power planes.
    I'd like to know: with the pci-e controller on-die now... what impact does this have on graphics cards with higher on-card memory? Does it strengthen or minimize it?
    And, can the cpu now share the gpu's memory as a way to extend cache- after years of being forced to share the system pool? That 16gb/s link to gddr5 looks mouthwatering. I'd like to see performance tests run with the pci-e variant ssds floating around out there saddled to the on-die pci-e lanes, and a graphics card running off of the chipset. Rather than elevating a horse-power driven graphics subsystem, I think the benefits of supplying more 'torque' by freeing mass storage ssds from the SATA interface would be far more substantial, and in all applications of the PC. You already have the means for nearly 2+2/3 times the theoretical bandwidth of SATA-6- which up til now seems rather bug-ridden and defunct.

    Also interested in the outcomes of usb3 with this- as usb is built on the foundations of pci-e, is it not? If usb3 can allow for pci-e externally, and you remove the latency issue of usb signaling traveling from some peripheral bridge chip to the cpu, and just jack the usb3 communications into the cpu... could one use usb3 as a computer-to-computer pseudo qpi teaming/networking bridge for inter-desktop cpu communication? Skip the entire bottleneck of client-level software implementation, and the subsystem communication buses for out-of-box signaling too...
  • plague911 - Tuesday, September 8, 2009 - link


    The market just got a little more crowded so hopefully this will bring a reduction in prices of the 920. but..

    “The Core i7 870 gets close enough to the Core i7 975 that I'm having a hard time justifying the LGA-1366 platform at all. As I see it, LGA-1366 has a few advantages:
    1) High-end multi-GPU Performance
    2) Stock Voltage Overclocking
    3) Future support for 6-core Gulftown CPUs”

    You're exactly right. 1366, I think, is going to be the best option to “future proof” my system; however, the new chips make the 920 seem a little low on features. With the goal of “performance on a budget” I feel like we are stuck either getting a board with a socket which won't compete in the future, or a chip which is weaker than its lower class cousins. Unfortunately I don't see any of this being fixed in the next few cycles. I'd like to see a low clocked Gulftown (to save cost), feature rich with good OC potential, that's on the lower end of the price scale. To me this would be a good follow up to the 920, but it doesn't seem like that will be coming out for several cycles. Unless ofc I'm missing something, which is probably the case.
