Final Words

First, keep in mind that these performance numbers are early, and they were run on a partly crippled, very early platform. With that preface, the fact that Nehalem is still able to post these 20 - 50% performance gains says only one thing about Intel's tick-tock cadence: they did it.

We've been told to expect a 20 - 30% overall advantage over Penryn, and it looks like Intel is on track to deliver just that in Q4. At 2.66GHz, Nehalem is already faster than the fastest 3.2GHz Penryns on the market today. At 3.2GHz, I'd feel comfortable calling it baby Skulltrail in all but the most heavily threaded benchmarks. This thing is fast, and this is on a very early platform; keep in mind that Nehalem doesn't launch until Q4 of this year.

One valid concern is performance in applications that don't scale well beyond two or four cores: what will Nehalem offer us then? Our DivX test doesn't scale well beyond four cores, and even then Nehalem's performance was in the 20 - 30% faster range that we've been expecting. The other thing to keep in mind is that none of these tests are really stressing Nehalem's integrated memory controller (IMC). When AMD made the move to an IMC, we saw an instant 20% performance boost in most applications. I suspect that the applications that don't benefit from Hyper Threading will at least benefit from the IMC. We've only scratched the surface of Nehalem here, looking at the benefits of Hyper Threading and its lower latency unaligned cache accesses. We've hinted at what's to come with the extremely well balanced and low latency memory hierarchy of Intel's new baby. Once this thing gets closer to launch, we should be able to fill in the rest of the puzzle.

Over six years ago I had dinner with Intel's Pat Gelsinger (back when he was Intel's CTO), and I asked him the same question I always do: "what are you excited about?" Back then his response was "threading": Intel was about to launch Hyper Threading, and Pat was convinced that it was absolutely necessary for the future of microprocessors.

It was at the same dinner that Pat mentioned Intel may do a chip with an integrated memory controller much like AMD, but that an IMC wouldn't solve the problem of idle execution units - only indirectly mitigate it. With Nehalem, Intel managed to combine both - and it only took 6 years to pull it off.

Pat also brought up another very good point at that dinner. He turned to me and said that you can only integrate a memory controller once; what do you do next to improve performance? Intel has managed to keep increasing performance, but what I really want to see is what happens at the next tock. Intel proved its ability with Conroe, and with Nehalem it shows that the tick-tock model can work, but more than anything, looking at Nehalem today makes me excited about what Sandy Bridge will bring.

The fact that we're able to see these sorts of performance improvements despite being faced with a dormant AMD says a lot. In many ways Intel is doing more to improve performance today than when AMD was on top during the Pentium 4 days.

AMD never really caught up to the performance of Conroe. Through some aggressive pricing we got competition in the low end, but it could never touch the upper echelon of Core 2 performance. With Penryn, Intel widened the gap. And now with Nehalem it's going to be even tougher to envision a competitive high-end AMD CPU at the end of this year. 2009 should hold a new architecture for AMD, which is the only thing that could possibly come close to achieving competition here. It's months before Nehalem's launch and there's already no equal in sight; it will take far more than Phenom to make this thing sweat.

108 Comments

  • kilkennycat - Thursday, June 5, 2008 - link

    Isn't 6GB of RAM a pretty sweet spot for desktop 64-bit applications, whatever about servers?
  • jimmysmitty - Thursday, June 5, 2008 - link

    Well I have been waiting for Nehalem. I gave in and decided to build a rig with the Q6600 but kinda sad now.

    Anyways. Crank the Planet, he's not showing fanboyism. He stated Intel has been promising a 20-30% increase with Nehalem. They are seeing 20-50% from these benchmarks. Take 21 and divide it by 14; that gives you 1.5. That means that the AMD Phenom's latency is about 50% slower.

    If anything, you are showing fanboyism. Nehalem is showing itself to be one hell of a chip and you are just angry that AMD has nothing to compare to it. Even after AMD finishes absorbing ATI, what's next, K10.5 aka Deneb? That's just a 45nm refresh (just like Penryn was for Conroe). Unless there are some major changes in the architecture it will just, hopefully, make Phenom run at higher clocks and cooler.

    Other than that, I can't wait to see what this does for games. I know that most games are more GPU dependent, but I myself play mainly Valve games using Source, and that's very CPU dependent and already runs great on my Q6600, but I want to see what this game will do for their particle and physics system...
  • Nehemoth - Thursday, June 5, 2008 - link

    Please, please, please Intel, I would like to have this monster chip in our servers without the annoying FBD. I don't want hot FBD; bring me normal DDR2 (without FBD) or DDR3.

    Just what I ask.

  • Griswold - Thursday, June 5, 2008 - link

    I'm a big fan of multi-core systems, but I'm not blind to reality: Why no single threaded benchmarks, but only benchmarks that scale very well with more cores/SMT? By the time these things are on the market, most applications will still be single threaded and you know it...

    I just want to know how much faster it is per clock per core.
  • Anand Lal Shimpi - Thursday, June 5, 2008 - link

    Interestingly enough, none of our standard CPU benchmarks are single threaded at all - even the most benign ones are multithreaded (including the games). I did run some single thread Cinebench numbers though:

    Nehalem - 3015
    Q9450 - 2396
  • bradley - Thursday, June 5, 2008 - link

    Why is there such a large discrepancy with the single-threaded Cinebench tests from six months ago, where the Q9450 scored 2944 (a mere 2.4% below Nehalem's 3015), compared to the current 2396 (a more substantial 20.5% below)?

    http://www.anandtech.com/printarticle.aspx?i=3153

    I too believe single-threaded benches should be the foundation of any meaningful and relevant cpu review, if time indeed was permitting. To me this is the greatest objective real-world equalizer. There just isn't enough multi-threaded software out there, much less software able to run all eight cores. I would also like to emphasize that unlike server chips, desktop Nehalems will only have two memory channels. And as I understand, hyper threading also will only make an appearance in server and enthusiast chipsets. So already this makes an accurate comparison difficult enough.

    Finally, I understand the average visitor will treat this like any good entertainment, where one is meant to suspend his or her disbelief. Still, I have a hard time believing anyone has the ability to abscond with such important chips from a huge corporation like Intel. "Without Intel's approval, supervision, blessing or even desire - we went ahead and snagged us a Nehalem (actually, two) and spent some time with them." That initial premise does make anything coming after less impactful, or seemingly less than straightforward.

    Certainly if history has taught us anything, we know final shipping silicon is sometimes quite different from test chips. We should also assume it's a lot easier to create one chip than to manufacture hundreds of thousands on a large scale. Nothing is ever a given, which makes it hard to draw much of a conclusion. Interesting preview nonetheless.
  • SiliconDoc - Monday, July 28, 2008 - link

    Shhhhh... gosh we have to have core hype ... and the multicore testers have to optimize for the coming chips... geeze they have to make a living somehow...
    ( You sir, are exactly correct, but we live in a strange world nowadays where the truth is so evident it must be hidden most of the time for various other reasons... )
    Gosh, you want to crash the whole economy with that sane and rational talk ?
    What are you an anarchist ? ( yes I'm kidding, that was a big high five to you)
  • Anand Lal Shimpi - Thursday, June 5, 2008 - link

    Ignore those numbers (check page 6 of the comments for an explanation), the Q9450 comes in at 2931 vs. Nehalem's 3015.

    -A
  • pnyffeler - Thursday, June 5, 2008 - link

    I'm not a Mac person, but I think Macs may benefit from this technology even more than Vista. As I recall from a previous Anandtech article, Macs have an excellent memory management system, which would benefit very directly from increased memory size. The increased bandwidth could make the snazzy OS even better...
  • Visual - Thursday, June 5, 2008 - link

    It is great that your "clock for clock" comparisons to the Penryn in encoding and rendering are showing an improvement... but could that improvement be from the doubled number of virtual processors that are visible? Are all of these benchmarks using eight or four threads on the Nehalem?
