Lynnfield's Turbo Mode: Up to 17% More Performance

Turbo on Bloomfield (the first Core i7) wasn't all that impressive. If you look back at our Core i7 article from last year, you'll see that it's responsible for a 2 - 5% increase in performance depending on the application. All Bloomfield desktop CPUs had 130W TDPs, so each individual core had a bit more breathing room for how fast it could run. Lynnfield brings the TDP down around 27% (to 95W), meaning each core gets less TDP to work with (the lower the TDP, the greater potential there is for turbo). That, combined with almost a full year of improving yields on Nehalem, means that Intel can be much more aggressive with Turbo on Lynnfield.

Configuration                    | SYSMark 2007: Overall | Dawn of War II | Sacred 2 | World of Warcraft
Intel Core i7 870 Turbo Disabled | 206                   | 74.3 fps       | 84.8 fps | 60.6 fps
Intel Core i7 870 Turbo Enabled  | 233                   | 81.0 fps       | 97.4 fps | 70.7 fps
% Increase from Turbo            | 13.1%                 | 9.0%           | 14.9%    | 16.7%
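As a quick sanity check, the "% Increase from Turbo" row can be recomputed from the two rows of scores. This is only a minimal sketch: the numbers are copied from the table, while the dictionary layout and the `percent_increase` helper are our own illustrative names.

```python
# Benchmark results for the Core i7 870, copied from the table:
# (Turbo disabled, Turbo enabled)
results = {
    "SYSMark 2007: Overall": (206, 233),
    "Dawn of War II (fps)": (74.3, 81.0),
    "Sacred 2 (fps)": (84.8, 97.4),
    "World of Warcraft (fps)": (60.6, 70.7),
}

def percent_increase(turbo_off, turbo_on):
    """Relative gain from enabling Turbo, in percent."""
    return (turbo_on - turbo_off) / turbo_off * 100.0

# Round to one decimal place, matching the precision used in the table.
gains = {name: round(percent_increase(off, on), 1)
         for name, (off, on) in results.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain}%")
```

The largest gain belongs to World of Warcraft, which is where the "up to 17%" figure comes from.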

 

Turbo on Lynnfield can yield up to an extra 17% performance depending on the application. The biggest gains will be when running one or two threads as you can see from the table below:

Max Speed         | Stock   | 4 Cores Active | 3 Cores Active | 2 Cores Active | 1 Core Active
Intel Core i7 870 | 2.93GHz | 3.20GHz        | 3.20GHz        | 3.46GHz        | 3.60GHz
Intel Core i7 860 | 2.80GHz | 2.93GHz        | 2.93GHz        | 3.33GHz        | 3.46GHz
Intel Core i5 750 | 2.66GHz | 2.80GHz        | 2.80GHz        | 3.20GHz        | 3.20GHz
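The turbo frequencies above can also be read as "bins", where one bin is one multiplier step of the roughly 133MHz base clock. The sketch below converts the table into bins above stock. It assumes a 133.3MHz step and rounds to the nearest whole bin, since the published frequencies are themselves rounded; the names `turbo_table` and `bins_above_stock` are illustrative only.

```python
BCLK_GHZ = 0.1333  # assumed size of one turbo bin (one multiplier step)

# Frequencies in GHz from the table: stock, then turbo with 4, 3, 2 and 1 cores active.
turbo_table = {
    "Core i7 870": (2.93, [3.20, 3.20, 3.46, 3.60]),
    "Core i7 860": (2.80, [2.93, 2.93, 3.33, 3.46]),
    "Core i5 750": (2.66, [2.80, 2.80, 3.20, 3.20]),
}

def bins_above_stock(stock_ghz, turbo_ghz):
    """Number of whole bins between the stock and turbo frequency."""
    return round((turbo_ghz - stock_ghz) / BCLK_GHZ)

bins = {
    cpu: [bins_above_stock(stock, freq) for freq in freqs]
    for cpu, (stock, freqs) in turbo_table.items()
}

for cpu, per_core_count in bins.items():
    # Order matches the table: 4, 3, 2, 1 cores active.
    print(cpu, per_core_count)
```

Read this way, the i7 870 gets two bins with three or four cores active and up to five bins with a single core active, while the i5 750 tops out at four.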

If Intel had Turbo mode back when dual-cores first started shipping we would've never had the whole single vs. dual core debate. If you're running a single thread, this 774M transistor beast will turn off three of its cores and run its single active core at up to 3.6GHz. That's faster than the fastest Core 2 Duo on the market today.


WoW doesn't stress more than 2 cores; Turbo mode helps ensure the i7 870 is faster than Intel's fastest dual-core CPU

It's about more than just individual application performance, however. Lynnfield's turbo modes can kick in when just interacting with the OS or an application. Single threads, regardless of nature, can now execute at 3.6GHz instead of 2.93GHz. It's the epitome of Intel's hurry up and get idle philosophy.

The ultimate goal is to always deliver the best performance regardless of how threaded (or not) the workload is. Buying more cores shouldn't get you lower clock speeds, just more flexibility. The top end Lynnfield is like buying a 3.46GHz dual-core processor that can also run well threaded code at 2.93GHz.

Take this one step further and imagine what happens when you have a CPU/GPU on the same package or, better yet, on the same die. Need more GPU power? Underclock the CPU cores. Need more CPU power? Turn off half the GPU cores. It's always available, real-time-configurable processing power. That's the goal and Lynnfield is the first real step in that direction.

Speed Limits: Things That Will Keep Turbo Mode from Working

As awesome as it is, Turbo doesn't work 100% of the time; its usefulness depends on a number of factors, including the instruction mix of active threads and processor cooling.

The actual instructions being executed by each core determine the amount of current drawn and the total power consumed by the processor. For example, video encoding uses a lot of SSE instructions, which in turn keep the SSE units busy on the chip; the front end remains idle and is clock gated, so power is saved there. The resulting power savings are translated into higher clock frequency. Intel tells us that video encoding should see the maximum improvement of two bins (two 133MHz steps) with all four cores active.

Floating point code stresses both the front end and back end of the pipe; here we should expect to see only a 133MHz (one bin) increase from turbo mode, if any at all. In short, you can't simply look at whether an app uses one, two or more threads. It's what the app does that matters.

There's also the issue of background threads running in the OS. Although your foreground app may only use a single thread, there are usually dozens (if not hundreds) of active threads on your system at any time. Just a few of those being scheduled on sleeping cores will wake them up and limit your max turbo frequency (Windows 7 is allegedly better at not doing this).

You can't really control the instruction mix of the apps you run or how well they're threaded, but this last point you can control: cooling. The sort-of trump-all feature that you have to respect is Intel's thermal throttling. If the CPU ever gets too hot, it will automatically reduce its clock speed in order to avoid damaging the processor; that includes giving up any clock speed increase from turbo mode.
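To make the interaction between these limits concrete, here is a toy model for the Core i7 870 using the turbo table from earlier in the article. It is a deliberately simplified sketch, not Intel's actual algorithm: it ignores instruction mix entirely, assumes every background wake-up lands on a sleeping core, and assumes a thermally throttled chip simply falls back to its stock clock (real throttling can go lower). All function and variable names are hypothetical.

```python
TOTAL_CORES = 4
STOCK_GHZ = 2.93
# Max turbo frequency (GHz) by number of awake cores, from the i7 870 table.
TURBO_BY_AWAKE_CORES = {1: 3.60, 2: 3.46, 3: 3.20, 4: 3.20}

def awake_cores(foreground_threads, background_wakeups):
    """Cores kept awake, assuming each runnable thread occupies its own core."""
    return max(1, min(TOTAL_CORES, foreground_threads + background_wakeups))

def max_clock_ghz(foreground_threads, background_wakeups=0, too_hot=False):
    """Upper bound on clock speed under this toy model."""
    if too_hot:
        # Assumption: throttling removes the turbo headroom entirely.
        return STOCK_GHZ
    return TURBO_BY_AWAKE_CORES[awake_cores(foreground_threads, background_wakeups)]

print(max_clock_ghz(1))                        # single thread, quiet system
print(max_clock_ghz(1, background_wakeups=2))  # background threads wake two more cores
print(max_clock_ghz(1, too_hot=True))          # cooling can't keep up
```

Even in this crude form, the model shows why a single-threaded app on a busy or poorly cooled system may never see the top 3.6GHz bin.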


Lynnfield and its retail cooler

The retail cooler that ships with the Core i7 is tiny and while it's able to remove heat well enough to allow the chip to turbo up, we've seen instances where it doesn't turbo as well due to cooling issues. Just like we recommended in the Bloomfield days, an aftermarket cooler may suit you well.

Lynnfield: Made for Windows 7 (or vice versa)

Core Parking is a feature included in Windows 7 and enabled on any multi-socket machine or any system with Hyper Threading enabled (e.g. Pentium 4, Atom, Core i7). The feature looks at the performance penalty from migrating a thread from one core to another; if the fall looks too dangerous, Windows 7 won't jump - the thread will stay parked on that core.

What this fixes are a number of the situations where enabling Hyper Threading will reduce performance thanks to Windows moving a thread from a physical core to a logical core. This also helps multi-socket systems where moving a thread from one core to the next might mean moving it (and all of its data) from one memory controller to another one on an adjacent socket.

Core Parking can't help an application that manually assigns affinity to a core. We've still seen situations where HT reduces performance under Windows 7, for example with AutoCAD 2010 and World of Warcraft.
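For readers unfamiliar with the term, "manually assigning affinity" means the application itself tells the OS which logical CPUs it may run on, which takes that decision away from the scheduler. On Windows this is done through Win32 calls such as SetProcessAffinityMask or SetThreadAffinityMask. The sketch below illustrates the same idea with Python's standard library, which only exposes affinity on Linux, so treat it as an analogy rather than what any of the applications mentioned actually do; the helper name `pin_to_single_cpu` is hypothetical.

```python
import os

def pin_to_single_cpu():
    """Pin the current process to one logical CPU.

    Returns the CPU number chosen, or None where the platform
    does not expose os.sched_setaffinity (e.g. Windows, macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = sorted(os.sched_getaffinity(0))  # CPUs we may currently run on
    target = allowed[0]
    os.sched_setaffinity(0, {target})          # the scheduler can no longer move us
    return target

# Remember the original mask so the pinning can be undone afterwards.
original = os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else None
pinned = pin_to_single_cpu()

if pinned is not None:
    print(f"pinned to logical CPU {pinned}")
    os.sched_setaffinity(0, original)          # hand control back to the OS
else:
    print("CPU affinity is not exposed by the standard library on this platform")
```

Once a process is pinned like this, the OS has no freedom to keep it on the best core, which is why OS-level scheduling improvements cannot help such applications.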

With support in the OS however, developers should have no reason to assign affinity in software - the OS is now smart enough to properly handle multi-socket and HT enabled machines.


341 Comments


  • Seramics - Wednesday, September 09, 2009 - link

    So what's the big deal here? I dun tink its that impressive, just good. While $196 of 750 look to outcompete the "way" more expensive $245 of AMD's 965, the truth is that the mobo that you need to pair the 750/860/870 is far from being competitive. P55 is severely stripped down and it is only slightly cheaper than their X58 counterpart. So wht if 750 is cheaper than 965 by about $50? Did you just buy the cpu only? Ppl shud at least look at the CPU+mobo price because they both come together. Truth is, when you take into account mobo price, 750 is far from outcompete 965. Added up, I think its only about balanced. The 750 is a better CPU, but it also cost more. In comparison to their socket 1366 partner, socket 1156 system cost a little less, but they are also inferior a little bit. So what's special about them? Sure, there are better turbo and better thermal performance. For me, that is all that is good about the 1156 CPU. For enthusiast, socket 1366 is the way to go.
  • jnr0077 - Friday, July 27, 2012 - link

    i have a i5 750 chip cost £100 a gigabyte GA-P55A-UD6 cost £100 as it has six ram slots 16gb max radeon hd 4850 i love this mobo i cant fault it for the price i find it is a brilliant upgrade for cost i spent £250 considering the price of shops build you own pc you get what you put in :) very happy with the i5 750 1156 socket windows score on basic 500gb 7200 is 5.9 sweet 7.9 with a ssd :) can anyone tell me what the amd 965 hit on base score as i will never DV8 to amd intel 4 me allways :)
  • hob196 - Wednesday, September 09, 2009 - link

    Hi,
    Thanks for another great article.
    I figure that having PCI-e on chip would be great to reduce the latency. Any thoughts about plugging non graphics PCI-e cards into the second PCI-e slot?
    I've heard some motherboards cripple the 2nd slot's performance down to x1 if you plug an x1 card in the other slot (in a shared x8 environment). Any evidence of this?

    In case you're curious I work with digital audio in a studio environment and I'm always striving to reduce the latency of audio going through the CPU.
    These days, the latency (in streaming audio) is down to how fast the CPU can push floating point plus any overhead for the buffers in the various busses you go through. e.g. A firewire sound interface adds a few ms because of the inherent buffers between CPU -> Northbridge -> Southbridge -> Firewire -> Interface.
  • tempestor - Wednesday, September 09, 2009 - link

    Another great article Anand!

    You should consider a 2nd job as a novel writer! :D

    lp, M.
  • AndyKH - Wednesday, September 09, 2009 - link

    I don't really get it:
    It is stated that most PCIe cards don't work well with higher frequencies and that the BCLK frequency should be kept at multiples of 133 MHz, and then they overclock it using a BCLK of ~200 MHz in one instance???
    Doesn't the 133 MHz requirement make it pretty much impossible to overclock?

    Someone please enlighten me.
  • Anand Lal Shimpi - Wednesday, September 09, 2009 - link

    It doesn't make it impossible to overclock, just impossible to overclock (very high) without additional voltage.

    Take care,
    Anand
  • AndyKH - Thursday, September 10, 2009 - link

    Thank you for the response!

    I see how using a higher voltage will increase switching speed of the buffers driving the PCIe bus. However, I fail to see why it would make it any less difficult for PCIe cards to cope with the increased clock frequency, unless the increased voltage is also fed to the PCIe cards (is this the case?). Otherwise I assume they would surely experience the same problems driving communication to the CPU?

    Also, you write multiples of 133 MHz but overclock to 200 MHz BCLK. Shouldn't it read multiples of 33 MHz?
  • TotalLamer - Wednesday, September 09, 2009 - link

    I really, really don't understand why Anand is so obsessed with Turbo Modes. Any enthusiast who dares call himself such is going to clock this chip to the moon, at which point Turbo doesn't do anything. So with a 4.2GHz i7 870, all you're really left with is an i7 920 with worse multi-GPU gaming performance and a less-certain upgrade path.
  • coconutboy - Wednesday, September 09, 2009 - link

    You're assuming all enthusiasts think like you do, but the heavy majority of people (enthusiast or not) want nothing to do with a $500+ i7 870 cpu. The i7 920, 860, and i5 920 are much more attractive options.

    There are plenty of "enthusiasts" who instead prefer silent computers that use no fans, or people living in hot climates who focus on very low temps, or all manner of different things. On top of that, the overwhelming majority of people simply do not care about any of the aforementioned, and those people buy the heavy majority of computers.

    I started OCing in 1996, and used to OC pretty heavily, but got tired of constant tweaking or seeing my well-worn parts die prematurely. Now I tend to focus on very quiet computers that have a small/moderate overclock. So taking an i5 750 or i7 860 and raising it up 200-400 MHz and leaving turbo on is very appealing to me. Also of note is the extra heat generated and the extra money I'll spend on my electric bill by having a 24/7 overclock versus turbo modes. Dig the link and scroll to the bottom-

    http://www.guru3d.com/article/core-i5-750-core-i7-...
    review-test/10

    The 13 watt increase at idle is no big deal, but 133 extra watts under load, well... it's worth the performance boost and heat to some folks, but other people (like me) look at those things as tradeoffs that need to be weighed versus reliability, cost for extra cooling, noise, my electric bill etc.
  • Skiprudder - Thursday, September 10, 2009 - link

    I think that some of us are quite honestly getting more green conscious these days too. It's nice to have a CPU this fast that's also this energy efficient. We can get similar to OCed performance at a much smaller power envelope. I know it doesn't add up to a lot over the course of a year (less than $100 I assume), but these things add up and it saves me some dinero on the power bills!
