The GPU: Intel HD 5000 (Haswell GT3)

Hire enough ex-ATIers and you’ll end up really caring about GPU performance apparently. It’s good to see that Apple still views increasing GPU performance as non-negotiable, even at the MacBook Air level. Discrete GPUs are out of the question in the MacBook Air, so all models ship with Intel’s on-die processor graphics. More importantly, all CPU choices integrate the largest GPU offering: Intel’s HD 5000 (aka GT3).

Clock speeds alone prevent the 40 EU GPU implementation from being called an Iris 5100. Given the 15W TDP limit, Intel wouldn’t be able to do the Iris name justice even if it tried.

It’s hilarious that Intel refused to give out die photos for anything other than quad-core Haswell GT2, citing competitive concerns, yet at Apple’s WWDC launch of the new MacBook Airs we got to see the first die shot of a dual-core Haswell GT3. Update: I stand corrected. Intel posted its own shot here.

From the die photo it’s obvious that, like the quad-core Haswells with Iris Pro, the dual-core GT3 parts are over half GPU. By comparison, here's the only Haswell die shot Intel PR officially released, a quad-core GT2 part that's mostly made up of CPU cores:

Similar to the CPU discussion, on the GPU front Haswell has to operate under more serious thermal limits than Ivy Bridge did. Previously the GPU could take the lion’s share of a 17W TDP with 16 EUs; now it has 15W to share with the PCH as well as the CPU, and 2.5x the number of EUs to boot. As both chips are built on the same 22nm (P1270) process, either power has to go up or clocks have to come down. Intel rationally chose the latter. What you get from all of this is a much larger GPU that can deliver similar performance at much lower frequencies. Lower frequencies can run at lower voltages, which in turn has a dramatic impact on power consumption.

Take the power savings you get from all of this machine width, frequency and voltage tuning and you can actually end up with a GPU that uses less power than before, while still delivering incrementally higher performance. It’s a pretty neat idea. Lower cost GPUs tend to be smaller, but here Intel is trading off die area for power - building a larger GPU so it can be lower power, instead of just being higher performance.
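To see why going wide and slow can pay off, here's a minimal Python sketch using the standard first-order approximation that dynamic power scales with switched capacitance times voltage squared times frequency. The EU counts (16 vs. 40) come from the discussion above; the voltages and the 500MHz operating point are made-up illustrative values, not measurements of either GPU, and the model ignores leakage (which grows with die area).

```python
def relative_dynamic_power(eus, voltage, frequency_ghz):
    """First-order dynamic power in arbitrary units: P ~ width * V^2 * f."""
    return eus * voltage ** 2 * frequency_ghz


def relative_throughput(eus, frequency_ghz):
    """Idealized throughput in arbitrary units: width * frequency."""
    return eus * frequency_ghz


# Hypothetical operating points, chosen only to illustrate the trade-off.
narrow_fast = {"eus": 16, "voltage": 1.00, "frequency_ghz": 1.05}
wide_slow = {"eus": 40, "voltage": 0.65, "frequency_ghz": 0.50}

power_ratio = (
    relative_dynamic_power(**wide_slow) / relative_dynamic_power(**narrow_fast)
)
throughput_ratio = (
    relative_throughput(wide_slow["eus"], wide_slow["frequency_ghz"])
    / relative_throughput(narrow_fast["eus"], narrow_fast["frequency_ghz"])
)

print(f"Wide/slow throughput vs. narrow/fast: {throughput_ratio:.2f}x")
print(f"Wide/slow dynamic power vs. narrow/fast: {power_ratio:.2f}x")
```

With these made-up inputs the wider design comes out at roughly 1.19x the throughput for about half the dynamic power. The exact figures mean nothing; the point is that the squared voltage term is what makes the area-for-power trade work.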

A Historical Look at MacBook Air GPU Performance
                        2011            2012            2013
GPU                     Intel HD 3000   Intel HD 4000   Intel HD 5000
Manufacturing Process   32nm            22nm            22nm
Frequency (base/max)    350/1150MHz     350/1050MHz     200/1000MHz
Cores (EUs)             12              16              40
Peak GFLOPS             165.6           268.8           640
TDP                     17W             17W             15W
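
The peak GFLOPS figures follow directly from the EU count and the maximum clock. A quick Python sketch, assuming 12 FLOPs per EU per clock for HD 3000 and 16 for HD 4000/5000 (per-EU rates that aren't stated in the table, but are what its numbers imply):

```python
def peak_gflops(eus, flops_per_eu_per_clock, max_clock_mhz):
    """Theoretical peak = EUs * FLOPs per EU per clock * clock (in GHz)."""
    return eus * flops_per_eu_per_clock * max_clock_mhz / 1000.0


# (EUs, assumed FLOPs per EU per clock, max clock in MHz)
gpus = {
    "HD 3000": (12, 12, 1150),
    "HD 4000": (16, 16, 1050),
    "HD 5000": (40, 16, 1000),
}

for name, (eus, flops_per_clock, clock_mhz) in gpus.items():
    print(f"{name}: {peak_gflops(eus, flops_per_clock, clock_mhz):.1f} GFLOPS")
```

These are theoretical peaks at the maximum turbo clock; under a 15W limit the GPU won't necessarily sustain that frequency.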

This is an even bigger deal because few of the other OEMs seem interested in paying for the larger die. Acer’s S7 uses Intel’s HD 4400 (Haswell GT2, 20 EUs), as do most of the other Haswell Ultrabooks that have been announced thus far. Armed with a 2011, 2012 and 2013 MacBook Air as well as Acer’s 2nd generation S7, we now have the ability to compare everything from Intel’s HD 3000 (Sandy Bridge) all the way up to HD 5000. It’s important to keep in mind that with the exception of HD 3000, everything here is built on the same 22nm process, and with HD 4400/5000 TDPs actually went down. In other words, post Ivy Bridge, any GPU performance gains were very hard to come by.

I’m splitting up the GPU performance data into three sections. The first is a look at some games/settings that are actually playable on processor graphics. The second is comparison data for laptop Bench. The deltas here are academic at best since nothing slower than Iris Pro can really deliver playable frame rates in our high-end notebook GPU tests. The final section focuses on synthetic performance, which should help characterize the peak theoretical gains you can expect from HD 5000.

All of the gaming tests were run under Boot Camp/Windows 8. I think it’s time to retire the HL2/Portal testing under OS X.

Playable Gaming Performance

There's a surprising number of games that are actually playable on Intel's HD 5000 in the MacBook Air. You have to be ok with the fan spinning quite loudly, but it's possible to get some ultra portable gaming in if you're up for it.

For all of these tests I stuck with 1366 x 768 so I could run comparable data on the only HD 3000 equipped MBA I had, an 11-inch model. I also threw in data from Intel's HD 4400 using the new Haswell equipped Acer S7. I'll start with GRID 2, a brand new racer, running at relatively low quality settings.

GRID 2

GRID 2 is absolutely playable on the new MacBook Air. At 43.1 fps it's 16% faster than last year's HD 4000 model. A 16% gain without increasing TDP on the same manufacturing process is pretty impressive. The gains over the 2011 MBA are substantial. GRID 2 goes from almost playable to fast enough that you can actually turn up some of the quality settings if you want to.

Next up is Borderlands 2. Again, a fairly modern title, but one that's really optimized for current generation consoles - making high-end processor graphics more than up for the task. While a higher TDP implementation of Haswell's integrated graphics wouldn't have an issue here, things are a little more difficult with a 15W TDP.

Borderlands 2

We see a marginal improvement over the HD 4000; we're clearly thermally bound at this point. What's interesting is that the HD 4400 on the S7 is actually quicker here. The difference could be cooling or how Apple decides to scale back on GPU frequency when faced with thermal limits. A quick look at Haswell's power reporting confirms that while running my Borderlands 2 test the GPU was already exceeding the PL1 (Power Limit 1) of 15W.

Remember, with Sandy Bridge Intel introduced Turbo Boost 2.0 that effectively allowed for two separate power limits - one equal to the processor's TDP (PL1) and one higher than the processor's TDP (PL2) that could be hit as long as the die temperature doesn't get too high.
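
The two-limit behavior can be sketched in a few lines of Python. This is a toy model of the description above, not Intel's actual algorithm: it ignores the time window over which power is averaged, and the 25W PL2 and 100C temperature threshold are hypothetical placeholders. Only the 15W PL1 comes from this review.

```python
def active_power_limit(die_temp_c, pl1_w=15.0, pl2_w=25.0, temp_limit_c=100.0):
    """Return the power limit (watts) that applies at this die temperature.

    pl2_w and temp_limit_c are hypothetical values used for illustration.
    """
    # Above TDP is only allowed while the die stays below the threshold.
    return pl2_w if die_temp_c < temp_limit_c else pl1_w


def clamp_power(requested_w, die_temp_c):
    """Cap a requested power draw at whichever limit is currently active."""
    return min(requested_w, active_power_limit(die_temp_c))


# A cool die can run above TDP; a hot one is pulled back to PL1.
print(clamp_power(20.0, die_temp_c=70.0))
print(clamp_power(20.0, die_temp_c=100.0))
```

How long a chip can sit above PL1 depends on the cooling around it, which is one plausible reason two 15W designs can behave differently in the same test.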

Despite the sub 30fps frame rate in this benchmark, Borderlands 2 was definitely playable on the HD 5000. It wasn't always smooth but if you need your single player fix, it'll suffice.

I've had a few requests to bring back our Minecraft benchmark. We ditched it from our higher end GPU reviews since it's no longer stressful enough, but for 15W TDP iGPUs it's perfect.

Minecraft

Once again we see almost a 17% increase over Intel's HD 4000. The HD 4400 comparison is also very impressive with a 12% increase in performance vs. what most MBA competitors will be using.

When I was a kid all I wanted was a console that could play arcade quality ports of Street Fighter II and Mortal Kombat II. These days, even the latest Street Fighter title has no issue playing on free graphics:

Super Street Fighter IV: Arcade Edition

16% seems to be the magic number as that's exactly how much faster HD 5000 is compared to HD 4000. Given the lower TDP this year, that's a pretty reasonable gain. Looking at the sheer number of transistors that had to be used to get there, however, gives you good insight into just how hard it is to improve performance without a corresponding process node shrink.
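
For a rough sense of that trade-off, the sketch below normalizes the 16% gain against the EU counts and TDPs from the table earlier. It treats TDP as a stand-in for actual GPU power, which is an assumption (the GPU can exceed PL1, as noted above), so read the results as ballpark ratios only.

```python
# HD 5000 relative to HD 4000, using figures quoted in this review.
perf_gain = 1.16      # measured frame rate ratio
eu_ratio = 40 / 16    # EU count ratio
tdp_ratio = 15 / 17   # TDP ratio (assumed proxy for power, not a measurement)

perf_per_eu = perf_gain / eu_ratio
perf_per_tdp_watt = perf_gain / tdp_ratio

print(f"Performance per EU:       {perf_per_eu:.2f}x of HD 4000")
print(f"Performance per TDP watt: {perf_per_tdp_watt:.2f}x of HD 4000")
```

In other words, each EU contributes less than half of what it did on HD 4000, while performance per TDP watt improves by roughly 30%. That's the area-for-power trade in two numbers.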

 


233 Comments

  • seapeople - Tuesday, June 25, 2013 - link

    Brightness is pretty much the number one power consumer in a laptop like this (which is actually mentioned in the review). If you expect to run anything at 100% brightness and get anywhere near ideal battery life then you are bound to be disappointed.
  • name99 - Monday, June 24, 2013 - link

    "802.11ac ... better spatial efficiency within those channels (256QAM vs. 64QAM in 802.11n). Today, that means a doubling of channel bandwidth and a 4x increase in data encoded on a carrier"

    This is a deeply flawed statement in two ways.

    (a) The modulation form describes (essentially) how many bits can be packed into a single up/down segment of a sinusoid wave form, ie how many bits/Hz. It is constrained by the amount of noise in the channel (ie the signal to noise ratio) which smears different amplitudes together so that you can't tell them apart.
    It can be improved somewhat over 802.11n performance by using a better error correcting code (which essentially distributes the random noise level over a number of bits, so that a single large amount of noise rather than destroying that bit information gets spread into a smaller amount of noise over multiple bits).
    802.11ac uses LDPC, a better error correcting code, which allows it to use more aggressive modulation.

    Point is, in all this the improved modulation has nothing to do with spatial encoding and spatial efficiency.

    (b) The QAM64 and QAM256 refer to the number of possible states encoded per symbol, not in any way to the number of bits encoded. So QAM64 encodes 6 bits per Hz, QAM256 encodes 8 bits per Hz. The improvement is 8/6=1.33 which is nice, but is not "a 4x increase in data encoded on a carrier".

    We are close to the end of the line with fancy modulation. From now on out, pretty much all the heavy lifting comes from
    (1) wider spectrum (see the 80 and 160MHz of 802.11ac) and
    (2) smaller, more densely distributed base stations.
    We could move from 3 up to 4 spatial streams (perhaps using polarization to help out) but that's tough to push further without much larger antennas (and a rapidly growing computational budget).

    There is one BIG space for a one-time 2x improvement, namely tossing the 802.11 distributed MAC, which wastes half the time waiting randomly for one party or another to talk, and switching to a centrally controlled MAC (like the telcos) along with a very narrow RACH (random access channel) for lightweight tasks like paging and joining.
    My guess/hope is that the successor to 802.11ac will consist primarily of the two issues I've described above (and so will look a lot more like new SW than new DSP algorithms), namely a central arbiter for a network along with the idea that, from the start, the network will consist of multiple small low-power cells working together, about one per room, rather than a single base station trying to reach out to 100 yards or more.
  • bittwiddler - Monday, June 24, 2013 - link

    • The keyboard key size and spacing is the same on the 11 and 13" MBAs.
    • The 11" MBA is exempt from being removed from luggage during TSA screenings, unlike the 13.
    • The 11" screen is lower height than most and doesn't get caught by the clip for the airplane seat tray table.
    • When it comes to business travel computing, I'm not interested in a race to the bottom.
  • Sabresiberian - Monday, June 24, 2013 - link

    One thing I would NOT like is for Apple to make a move to a 16:9 screen. I'd certainly rather have 1440x900 on a 13" screen than anything denser that was 16:9. I mean, I'm one of the guys that has been harping on pixel density and refresh rates since before we had modern smart phones (the move to LCDs set us back a decade or more in that regard), but on a screen smaller than 27", 16:9 is just bad. In my not-so-humble opinion.

    4:3 is better for something smaller than 17", but I can live with 16:10. :)
  • Kevin G - Monday, June 24, 2013 - link

    Re-reading through the review I have a question about the display: does it use panel self refresh? I recall Intel hyping up this technology several years ago and the Haswell slides in this review indicate support for it. The question is, does Apple take advantage of it?
  • Kevin G - Monday, June 24, 2013 - link

    I think that I can answer my own question. I couldn't find the data sheet for the review panel LSN133BT01A02 but references on the web point towards an early 2012 release for it. Thus it looks like it appeared on the market before panel self refresh was slated for widespread introduction alongside Haswell.
  • hobagman - Monday, June 24, 2013 - link

    Hi Anand & all -- could I ask a more CPU related question I've been wondering about a lot -- how come the die shots always look so colorful and diverse, when isn't the top layer all just interconnects? Or are the die shots actually taken before they do the interconnects, consisting in the top 10-15 layers? Would really appreciate an explanation of this ...
  • hobagman - Monday, June 24, 2013 - link

    I mean, what are we actually seeing when we look at the die shot? Are those all different transistor regions, and if so, we must be looking at the bottom layers. Or is it that the interconnects in the different regions look different ... or ... ?
  • SkylerSaleh - Tuesday, June 25, 2013 - link

    When making the ASIC, thin layers of glass are grown on the silicon, etched, and filled with metal to build the interconnects. This leaves small sharp geometric shapes in the glass, which reacts with the light similarly to how a prism would, causing the wafer to appear colorful.
  • cbrownx88 - Monday, June 24, 2013 - link

    Please please please revisit with the i7 config - been wanting to make a purchase but have been waiting for this review (and now waiting on the update lol).
