An Update on Apple’s A7: It's Better Than I Thought

When I reviewed the iPhone 5s I didn’t have much time to do the sort of in-depth investigation into Cyclone (Apple’s 64-bit custom ARMv8 core) that I did with Swift (Apple’s custom ARMv7 core from A6) the year before. I had heard rumors that Cyclone was substantially wider than its predecessor, but I didn’t have any proof other than hearsay, so I left it out of the article. Instead I surmised in the 5s review that the A7 was likely an evolved Swift core rather than a brand new design. After all, what sense would it make to design a new CPU core and then do it all over again for the next one? It turns out I was quite wrong.

Armed with a bit of custom code and a bunch of low level tests I think I have a far better idea of what Apple’s A7 and Cyclone cores look like now than I did a month ago. I’m still toying with the idea of doing a much deeper investigation into A7, but I wanted to share some of my findings here.

The first task is to understand the width of the machine. With Swift I got lucky in that Apple had left a bunch of public LLVM documentation uncensored, referring to Swift’s 3-wide design. It turns out that although the design might be capable of decoding, issuing and retiring up to three instructions per clock, in most cases it behaved like a 2-wide machine. Mix FP and integer code and you’re looking at a machine that’s more like 1.5 instructions wide. Obviously Swift did very well in the market and its competitors at the time, including Qualcomm’s Krait 300, were similarly capable.

With Cyclone Apple is in a completely different league. As far as I can tell, peak issue width of Cyclone is 6 instructions. That’s at least 2x the width of Swift and Krait, and at best more than 3x the width depending on instruction mix. Limitations on co-issuing FP and integer math have also been lifted as you can run up to four integer adds and two FP adds in parallel. You can also perform up to two loads or stores per clock.
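To make the "2x to more than 3x" range concrete, here is a minimal sketch of the arithmetic. It uses only the widths quoted above (6 for Cyclone; 3, 2 and 1.5 for Swift depending on instruction mix). The names `CYCLONE_PEAK`, `SWIFT_WIDTHS` and `width_ratios` are made up for illustration and are not measurements in their own right.

```python
# Issue widths quoted in the text, in instructions per clock.
CYCLONE_PEAK = 6.0
SWIFT_WIDTHS = {
    "nominal (decode/issue/retire)": 3.0,
    "typical code": 2.0,
    "mixed FP + integer": 1.5,
}

def width_ratios(peak, baselines):
    """Return how many times wider `peak` is than each baseline width."""
    return {label: peak / width for label, width in baselines.items()}

if __name__ == "__main__":
    for label, ratio in width_ratios(CYCLONE_PEAK, SWIFT_WIDTHS).items():
        print(f"Cyclone vs. Swift, {label}: {ratio:.1f}x")
```

Against Swift's nominal 3-wide figure the ratio is 2x; against the 1.5-wide mixed FP/integer case it is 4x, which is where the "more than 3x" end of the range comes from.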

I don’t yet have a good understanding of the number of execution ports and how they’re mapped, but Cyclone appears to be the widest ARM architecture we’ve ever seen at this point. I’m talking wider than Qualcomm’s Krait 400 and even ARM’s Cortex A15.

I did have some low level analysis in the 5s review, where I pointed out the significantly reduced memory latency and increased bandwidth to the A7. It turns out that I was missing a big part of the story back then as well…

A Large System Wide Cache

In our iPhone 5s review I pointed out that the A7 now featured more computational GPU power than the 4th generation iPad. For a device with roughly a quarter of the iPad’s pixels (1136 x 640 vs. 2048 x 1536), the A7’s GPU either meant that Apple had an application that needed tons of GPU performance or it planned on using the A7 in other, higher resolution devices. I speculated it would be the latter, and it turns out that’s indeed the case. For the first time since the iPad 2, Apple once again shares common silicon between the iPhone 5s, iPad Air and iPad mini with Retina Display.

As Brian found out in his investigation after the iPad event last week, all three devices use the exact same silicon with the exact same internal model number: S5L8960X. There are no extra cores, no change in GPU configuration and, the biggest one, no increase in memory bandwidth.

Previously both the A5X and A6X featured a 128-bit wide memory interface, with half of it seemingly reserved for GPU use exclusively. The non-X parts by comparison only had a 64-bit wide memory interface. The assumption was that a move to such a high resolution display demanded a substantial increase in memory bandwidth. With the A7, Apple takes a step back in memory interface width - so is it enough to hamper the performance of the iPad Air with its 2048 x 1536 display?
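As a rough sanity check on what halving the interface width means, the sketch below computes theoretical peak bandwidth from bus width and data rate, and compares it with the bandwidth needed just to scan out a 2048 x 1536 display. The 1600 MT/s data rate, 32-bit color and 60Hz refresh are assumptions for illustration; none of them comes from the text above, and real sustained bandwidth will be lower than the theoretical peak.

```python
ASSUMED_MTS = 1600  # assumption: LPDDR3-1600-class memory, not stated in the article

def peak_bandwidth_gbs(bus_bits, mega_transfers_per_s):
    """Theoretical peak in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * mega_transfers_per_s * 1e6 / 1e9

def scanout_gbs(width, height, bytes_per_pixel=4, refresh_hz=60):
    """Bandwidth in GB/s to read one full frame per refresh (assumed 32-bit color)."""
    return width * height * bytes_per_pixel * refresh_hz / 1e9

if __name__ == "__main__":
    for bits in (64, 128):
        print(f"{bits}-bit interface: {peak_bandwidth_gbs(bits, ASSUMED_MTS):.1f} GB/s peak")
    print(f"2048 x 1536 scan-out: {scanout_gbs(2048, 1536):.2f} GB/s")
```

Under these assumptions a 64-bit interface peaks at 12.8GB/s and a 128-bit one at 25.6GB/s, while display scan-out alone needs about 0.75GB/s. Rendering traffic is much larger than scan-out, so this only bounds the problem from below.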

The numbers alone tell us the answer is no. In all available graphics benchmarks the iPad Air delivers better performance at its native resolution than the outgoing 4th generation iPad (as you'll soon see). Now many of these benchmarks are bound more by GPU compute rather than memory bandwidth, a side effect of the relative lack of memory bandwidth on modern day mobile platforms. Across the board though I couldn’t find a situation where anything was smoother on the iPad 4 than the iPad Air.

There’s another part of this story. Something I missed in my original A7 analysis. When Chipworks posted a shot of the A7 die many of you correctly identified what appeared to be a 4MB SRAM on the die itself. It's highlighted on the right in the floorplan diagram below:


A7 Floorplan, Courtesy Chipworks

While I originally assumed that this SRAM might be reserved for use by the ISP, it turns out that it can do a lot more than that. If we look at memory latency (from the perspective of a single CPU core) vs. transfer size on A7 we notice a very interesting phenomenon between 1MB and 4MB:

That SRAM is indeed some sort of a cache before you get to main memory. It’s not the fastest thing in the world, but it’s appreciably quicker than going all the way out to main memory. Available bandwidth is also pretty good:

We’re only looking at bandwidth seen by a single CPU core, but even then we’re talking about 8GB/s. Lookups in this third level cache don’t happen in parallel with main memory requests, so unfortunately the impact on worst case memory latency is additive (a tradeoff of speed vs. power).
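For readers curious how a latency vs. transfer size curve like the one described above is typically produced, the usual technique is a pointer chase: build one random cycle through a buffer of a given size, then time a long run of dependent loads. The sketch below shows the structure only. It is not the custom code used for this article, the function names are hypothetical, and in Python the interpreter overhead swamps the actual cache latency, so a real test would be written in C or assembly.

```python
import random
import time

def build_chain(n, seed=0):
    """Return a list `nxt` where following nxt[i] visits all n slots in one random cycle."""
    rng = random.Random(seed)
    nxt = list(range(n))
    # Sattolo's algorithm: shuffles into a permutation that is a single cycle,
    # so the chase cannot get stuck in a short loop that fits in a smaller cache.
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt

def chase(nxt, steps, start=0):
    """Follow the chain for `steps` dependent hops and return the final index."""
    i = start
    for _ in range(steps):
        i = nxt[i]  # each load depends on the previous one
    return i

def ns_per_hop(n, steps=200_000):
    """Average time per dependent hop, in nanoseconds, for a chain of n slots."""
    nxt = build_chain(n)
    t0 = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - t0) / steps * 1e9

if __name__ == "__main__":
    for n in (1 << 10, 1 << 14, 1 << 18):
        print(f"{n:>8} slots: {ns_per_hop(n):.1f} ns/hop")
```

Sweeping the buffer size and plotting time per hop is what exposes each level of the hierarchy as a step in the curve: every hop is a dependent load, so the core cannot overlap them, and the random order defeats prefetching.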

I don’t yet have the tools needed to measure the impact of this on-die memory on GPU accesses, but in the worst case scenario it’ll help free up more of the memory interface for use by the GPU. It’s more likely that some graphics requests are cached here as well, with intelligent allocation of bandwidth depending on what type of application you’re running.

That’s the other aspect of what makes A7 so very interesting. This is the first Apple SoC that’s able to deliver good amounts of memory bandwidth to all consumers. A single CPU core can use up 8GB/s of bandwidth. I’m still vetting other SoCs, but so far I haven’t come across anyone in the ARM camp that can compete with what Apple has built here. Only Intel is competitive.


444 Comments


  • michal1980 - Wednesday, October 30, 2013 - link

    Did you read? I was talking about windows 8.1, you know the big upgrade given away 2 weeks ago

    I know that in iSheep land windows doesn't exist.
  • abazigal - Saturday, November 2, 2013 - link

    He's probably writing it even as we speak, and it will likely be posted in a matter of time.

    Anand has a ton of devices to review, so they have to set a priority. Not to mention that he has pretty much stated that he works on an iMac, so I imagine using Win8 actually takes away from his productivity time.
  • algalli - Wednesday, October 30, 2013 - link

    You're right, Windows 8.1 will be used by tens of people, not hundreds of millions of people, at least in the tablet world
  • jecastejon - Wednesday, October 30, 2013 - link

    And by the same standard, objectivity and evidence you present, I say you are paid by Apple's competitors.
  • darwiniandude - Wednesday, October 30, 2013 - link

    Really?? Strange... The detailed review I just read complained (politely) about Apple not letting them dissect (cut open) review (loan) units. It also mentioned GUI performance frame rate drops in the multitasking UI and complained that due to 64bit they really need to ship with 2GB ram rather than the 1GB they come with. I can guarantee most other reviews out there will not mention these particular technical negatives. Anandtech reviews are thorough.
  • Rickschwar - Thursday, October 31, 2013 - link

    Although I haven't seen too much Apple bias from AnandTech in the past, this is one of the most biased reviews I have ever read. This is surprising because normally AnandTech is the “gold standard” for all things technical. The reviewer talks about the iPad Air like it’s a revolutionary product, when there is little new about it. Apple was playing catch up in many ways and other tablets have many advantages over it. For example, the iPad 4 was thicker than many Android tablets. In fact, at least ten Android tablets were thinner than the iPad 4. The Air is only 1.9 mm thinner than the iPad 4 and tablets like the older Sony Xperia Z are still significantly thinner than the iPad Air is (7.5mm vs. 6.9mm). Of course this wasn't mentioned in the article. Even when it comes to weight, the iPad Air isn’t dramatically lighter than the Xperia Z (469g vs. 495g). That’s not mentioned in the article either.

    Since I’m in the market for a new tablet and I’ve owned two iPads in the past, I was hoping for big things with the new iPad, but for me and others it was a “meh” release. The same old display, the same A7 processor, and little real innovation. CNET agrees saying “Functionally, the iPad Air is nearly identical to last year’s model, offering only faster performance and better video chatting.”

    The most ironic part for me is the fact that more than 90% of this article is based on benchmarks -- even though AnandTech has made it clear how easy it is to game benchmarks and others (including an article I wrote over a year ago published at Mostly-tech.com) have made compelling cases that benchmarks do not predict real world performance. This article mostly ignores real world performance and pretends that Android tablets don’t exist.

    After this article I will never look at AnandTech the same again. At least the CNET and Engadget reviews covered some of the limitations of this product.

    - Rick
  • abazigal - Saturday, November 2, 2013 - link

    The ironic thing about your statement is that for the moment at least, only Android OEMs have been found guilty of gaming benchmarks, not Apple. So doing any benchmark tests at this juncture would actually favour Apple's competitors, despite this being a review of an Apple product. So I don't see what reason you have to complain, when the odds are stacked in Android's favour anyways.

    Besides, Anand has thoroughly dissected the A7 chip in his 5s review, and concluded that it is actually faster and more power-efficient compared to the higher-clocked, quad-core processors found in Android phones and tablets. The fact remains that Android and most mobile apps generally aren't optimised with 4-cores in mind either. So for all intents and purposes, Android tablets may as well not exist, since they will likely lose to the iPad in terms of real-world performance anyways.

    Also, the thing with these products is that they are ultimately a package deal. People don't just look at 1 single defining factor and buy a device based solely on that. Likewise, I am definitely not going to blindly buy the thinnest tablet in the market without first considering other factors like specs and availability of apps. You are not going to find that mythical Android tablet which is thinner, lighter, has a longer battery life, better screen, while boasting a larger market of apps and content.

    I am sorry, but you are not going to find a more objective and detailed review anywhere else. You want the iPad Air to be bashed, go hang out at some pro-Android forum instead.
  • ADGrant - Saturday, November 2, 2013 - link

    "Same old A7 processor". You look like a complete idiot when you post something like that. The A7 was announced less than a month ago.
  • sunflowerfly - Thursday, October 31, 2013 - link

    Anand biased? I do not believe that. They have no trouble pointing out Apple's flaws, and every product has them, nothing is perfect. The best products should win, and right now that happens to be Apple a lot of the time.
  • ssiu - Tuesday, October 29, 2013 - link

    No 2GB RAM; hope dashed :(

    Does that mean an iPad 4 (which can only run 32-bit code) will end up "less RAM starved" than iPad Air running 64-bit code?
