This is a very volatile time for Intel. In an ARM-less vacuum, Intel’s Haswell architecture would likely be the most amazing thing to happen to the tech industry in years. In mobile, Haswell is slated to bring about the single largest improvement in battery life in Intel history. In graphics, Haswell completely redefines the expectations for processor graphics. There are even some versions that come with an on-package 128MB L4 cache. And on the desktop, Haswell is the epitome of polish and evolution of the Core microprocessor architecture. Everything is better, faster and more efficient.

There’s very little to complain about with Haswell. Sure, the days of insane overclocks without touching voltage knobs are long gone. With any mobile-first, power optimized architecture, any excess frequency at default voltages is viewed as wasted power. So Haswell won’t overclock any better than Ivy Bridge, at least without exotic cooling.

You could also complain that, for a tock, the CPU performance gains aren’t large enough. Intel promised 5 - 15% gains over Ivy Bridge at the same frequencies, and most of my tests agree with that. It’s still forward progress, without substantial increases in power consumption, but it’s not revolutionary. We compare the rest of the industry to Intel’s excellent single threaded performance and generally come away disappointed. The downside to being on the top is that virtually all improvements appear incremental.

The fact of the matter is that the most exciting implementations of Haswell exist outside of the desktop parts. Big gains in battery life, power consumption and even a broadening of the types of form factors the Core family of processors will fit into all apply elsewhere. Over the coming weeks and months we’ll be seeing lots of that, but today, at least in this article, the focus is on the desktop.

Haswell CPU Architecture Recap

Haswell is Intel’s second 22nm microprocessor architecture, a tock in Intel’s nomenclature. I went through a deep dive on Haswell’s Architecture late last year after IDF, but I’ll offer a brief summary here.

At the front end of the pipeline, Haswell improved branch prediction. It’s the execution engine where Intel spent most of its time, however. Intel significantly increased the sizes of buffers and data structures within the CPU core. The out-of-order window grew, to feed an even more parallel set of execution resources.

Intel added two new execution ports (8 vs 6), a first since the introduction of the Core microarchitecture back in 2006.

On the ISA side, Intel added support for AVX2 and, alongside it, FMA instructions that considerably increase FP throughput of the machine. With a doubling of peak FP throughput, Intel doubled L1 cache bandwidth to feed the beast. Intel also added support for transactional memory instructions (TSX) on some Haswell SKUs.
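
The doubling of peak FP throughput can be sanity-checked with some quick arithmetic. The sketch below is illustrative only: the unit counts (one 256-bit multiply unit plus one 256-bit add unit per Ivy Bridge core, two 256-bit FMA units per Haswell core) are assumptions that are not stated above, and peak_flops_per_cycle is a hypothetical helper written for this example.

```python
# Back-of-the-envelope peak single-precision FLOPs per cycle per core.
# ASSUMPTION: the unit counts passed in below are not taken from the article.

def peak_flops_per_cycle(vector_bits, element_bits, flops_per_unit):
    """Lanes per vector times the FLOPs each unit performs per lane per cycle.

    flops_per_unit is a list with one entry per FP unit:
    1 for a plain add or multiply unit, 2 for an FMA unit (multiply + add).
    """
    lanes = vector_bits // element_bits
    return lanes * sum(flops_per_unit)

# Assumed: one 256-bit multiply unit and one 256-bit add unit.
ivy_bridge = peak_flops_per_cycle(256, 32, [1, 1])

# Assumed: two 256-bit FMA units, each counted as two FLOPs per lane.
haswell = peak_flops_per_cycle(256, 32, [2, 2])

print(ivy_bridge, haswell, haswell / ivy_bridge)
```

Under those assumptions the peak figure doubles per core, which is consistent with the doubled L1 bandwidth mentioned above. Real code only approaches this peak when it is recompiled to use FMA and can keep both units busy.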

The L3 cache is now back on its own power/frequency plane, although most of the time it seems to run in lockstep with the CPU cores. There appears to be a 2 - 3 cycle access penalty as a result of decoupling the L3 cache.

Power Improvements

209 Comments

  • jeffkibuule - Saturday, June 01, 2013

    I wouldn't say that Pentium 4 was terrible, but their 2004-2006 exercise of continually pumping up clocks was misguided.
  • Nfarce - Saturday, June 01, 2013

    Exactly. As someone who still has my P4 Northwood 3.06GHz (with HT) as a general use PC, I loved it. It served as my main gaming and photo/video editing PC back in the day, and was only replaced with a C2D E8400 overclock build four and a half years ago (which was replaced two years ago with a SB 2500k build). Anyone who says the P4 was terrible is either an AMD fanboy trolling or never had one at the time.
  • bji - Saturday, June 01, 2013

    By any reasonable metric, P4s were pretty bad. Glad you like yours but that's mostly because even back in the P4 days CPUs were already "fast enough" most of the time for most tasks and you probably would have liked a Pentium M or Athlon just as well. P4s started out with very weak performance and were improved a decent amount during the lifetime of the architecture, but they were never spectacular performers vs. the competition and they were always extremely hot and power hungry. Also Rambus memory was a joke.

    More on topic, I'm not surprised that Haswell isn't significantly faster than Ivy Bridge. I said when Sandy Bridge came out that the x86 architecture would never get 50% faster per core than Sandy Bridge. With the combination of nearing the end of the road for process shrinking, the architecture itself already having been optimized to such a degree that any additional significant gains come at an extremely high transistor and R&D cost, the declining of importance of the x86 market as mobile devices become more prominent, and the "already much more than fast enough" aspect of modern CPUs for the vast majority of what they're used for, it's pretty clear that we'll never see significant increases in x86 speed again. There just isn't enough money available in the market to fund the extremely high costs necessary to significantly increase speed in a market where fast enough was achieved years ago.

    I'll stand by my statement of ~2 years ago: x86 will top out at 50% faster than Sandy Bridge per core.
  • nunomoreira10 - Saturday, June 01, 2013

    Maybe not on the common instruction set, which Intel has already addressed on Haswell; just wait for the software to update to AVX2 and you will see how slow Sandy Bridge is by comparison.
  • klmccaughey - Monday, June 03, 2013

    @bji: Totally agree. We are in the halcyon days and I can't see the likes of the 4770k getting significantly more powerful any time soon. I believe it will take a huge technology breakthrough in terms of fab materials, along the lines of optical or biological chips. At least 10 years away.

    The corollary to this is that we don't actually really need any more power. We already have the level of "good enough" for the GPU (in gaming terms). In terms of compute power, that is definitely continuing in the concurrency paradigm - which is where it should be, it makes sense. Programmers (like myself) are proceeding along these lines to get more power.

    I think we are at either a pivotal point or a point of divergence again in computer technology. It's very exciting and interesting for me :)
  • jmelgaard - Sunday, June 02, 2013

    Wait what... I must be an AMD fanboy then (although I love Intel and never owned an AMD >.<, lol)...

    Honestly, the P4 platform was terrible in many aspects, and yes I did own one, several actually (2.266, 2.4, 2.8)... But having a Dual Pentium III 1GHz at the time as well made it pretty obvious to me how bad the P4 really was... Granted all those P4s were at lower clocks than yours...

    But nothing is so bad that it isn't good for something; after all, Intel's generations after the P4 have all been pretty amazing...

    More on topic, though, I am a bit dismayed and disappointed that the power consumption goes up compared to the last generation under load... Great that the idle power goes that much down, but I would rather see the exact same performance as 3rd gen and a huge power reduction... After all, performance wise I am still over satisfied with my i970... I don't feel like I need more juice, so I would rather save some bucks on the electrical bill... Obviously there will be different minds about that part... Just saying what I feel...
  • Donkey2008 - Monday, June 03, 2013

    Weird how you keep saying how "bad" it was in its time, yet you present no actual facts to back that up. About the only bad thing I ever saw with the P4 were high temps, which any decent HSF fixed.
  • bji - Monday, June 03, 2013

    It was so bad that Intel had to pay vendors not to buy the competitor's chips, an action that they were later sued for and settled to the tune of $1.25 billion.

    The P4 started out very badly; it was very power hungry and had weak performance compared to the competition. Intel was also the only company able to make chip sets for it (can't remember if there were technical or legal reasons behind this or both), and they refused to support any memory but Rambus (for a long time), further hurting their cause by propping up a company that is pretty much the dregs of submarine patent lawsuit filth.

    I can't think of any way in which the P4 was better than its competition of the day except that it had Intel's sleazy business practices behind it, if you consider that "better". It certainly played better in the marketplace, ethics notwithstanding.

    You may have been happy with your P4 because it did what you needed it to do. Awesome. Nobody is saying that the P4 didn't work or that it couldn't actually fulfill the duties of a CPU, we're just saying that compared to its contemporaries, it kinda blew chunks.
  • superjim - Wednesday, June 05, 2013

    I had two P4 chips (2.4 Northwood and 3.0 Prescott) along with many Athlon XP systems (Palomino, Thoroughbred and Barton) and the Athlons beat the P4s in nearly every metric. Then came the Athlon 64 to solidify AMD's crown. It wasn't until the original Core (Conroe) chips when Intel came screaming back and have held it since.
  • Donkey2008 - Monday, June 03, 2013

    "Anyone who says the P4 was terrible is either an AMD fanboy trolling or never had one at the time. "

    +5

    My Northwood 3GHz was as fast, stable and solid as any CPU I have ever owned. Performed slightly slower than an equivalent A64, but nothing noticeable to the human eye. Maybe these people who bag on it have bionic eyes.
