A couple of weeks ago I began this series on ARM with a discussion of the company’s unique business model. In covering semiconductor companies we’ve come across many that are fabless, but it’s very rare that you come across a successful semiconductor company that doesn’t even sell a chip. ARM’s business entirely revolves around licensing IP for its instruction set as well as its own CPU (and now GPU and video) cores.

Before we get into discussions of specific cores, it’s important to talk about ARM’s portfolio as a whole. In the PC space we’re used to focusing on Intel’s latest and greatest microarchitectures, which are then scaled in various ways to hit lower price targets. We might see different core counts, cache sizes, frequencies and maybe even some unfortunate instruction set tweaking but for the most part Intel will deliver a single microarchitecture to cover the vast majority of the market. These days, this microarchitecture is simply known as Core.

Back in 2008, Intel introduced a second microarchitecture under the Atom brand to target lower cost (and lower power) markets. The combination of Atom and Core spans the overwhelming majority of the client computing market for Intel. The prices of these CPUs range from the low double digits with Atom to many hundreds of dollars for the highest end Core processors (the most expensive desktop Haswell is $350, however mobile now extends up to $1100). There are other designs that target servers (which are then repurposed for ultra high-end desktops), but those are beyond the scope of this discussion for now.

If we limit our discussion to personal computing devices (smartphones, tablets, laptops and desktops), where Intel uses two microarchitectures ARM uses three. The graphic below illustrates the roadmap:

You need to somewhat ignore the timescale on the x-axis since those dates really refer to when ARM IP is first available to licensees, not when products are shipping to consumers, but you get an idea for the three basic vectors of ARM’s Cortex A-series of processor IP. Note that there are also Cortex R (embedded) and Cortex M (microcontroller) series of processor IP offered as well, but once again those are beyond the scope of our discussion here.

If we look at currently available cores, there’s the Cortex A15 on the high end, Cortex A9 for the mainstream and Cortex A7 for entry/low cost markets. If we’re to draw parallels with Intel’s product lineup, the Cortex A15 is best aligned with ultra low power/low frequency Core parts (think Y-series SKUs), while the Cortex A9 vector parallels Atom. Cortex A7 on the other hand targets a core size/cost/power level that Intel doesn’t presently address. It’s this third category labeled high efficiency above that Intel doesn’t have a solution for. This answers the question of why ARM needs three microarchitectures while Intel only needs two: in mobile, ARM targets a broader spectrum of markets than Intel.

Dynamic Range

If you’ve read any of our smartphone/tablet SoC coverage over the past couple of years you’ll note that I’m always talking about an increasing dynamic range of power consumption in high-end smartphones and tablets. Each generation performance goes up, and with it typically comes a higher peak power consumption. Efficiency improvements (either through architecture, process technology or both) can make average power in a reasonable workload look better, but at full tilt we’ve been steadily marching towards higher peak power consumption regardless of SoC vendor. ARM provided a decent overview of the CPU power/area budget as well as expected performance over time of its CPU architectures:

Looking at the performance segment alone, we’ll quickly end up with microarchitectures that are no longer suited for mobile, either because they’re too big/costly or they draw too much power (or both).

The performance vector of ARM CPU IP exists because ARM has its sights set higher than conventional smartphones. Starting with the Cortex A57, ARM hopes to have a real chance in servers (and potentially even higher performance PCs, Windows RT and Chrome OS being obvious targets).

Although we see limited use of ARM’s Cortex A15 in smartphones today (some international versions of the Galaxy S 4), it’s very clear that for most phones a different point on the power/performance curve makes the most sense.

The Cortex A8 and A9 were really the ARM microarchitectures that drove smartphone performance over the past couple of years. The problem is that while ARM’s attentions shifted higher up the computing chain with Cortex A15, there was no successor to take the A9’s place. ARM’s counterpoint would be that Cortex A15 can be made suitable for lower power operation, however its partners (at least to date) seemed to be focused on extracting peak performance from the A15 rather than pursuing a conservative implementation designed for lower power operation. In many ways this makes sense. If you’re an SoC vendor that’s paying a premium for a large die CPU, you’re going to want to get the most performance possible out of the design. Only Apple seems to have embraced the idea of using die area to deliver lower power consumption.

The result of all of this is that the Cortex A9 needed a successor. For a while we’d been hearing about a new ARM architecture that would be faster than Cortex A9, but lower power (and lower performance) than Cortex A15. Presently, the only architecture in between comes from Qualcomm in the form of some Krait derivative. For ARM to not let its IP licensees down, it too needed a solution for the future of the mainstream smartphone market. Last month we were introduced to that very product: ARM’s Cortex A12.

Slotting in numerically between A9 and A15, the initial disclosure unfortunately didn’t come with a whole lot of actual information. Thankfully, we now have some color to add.

Introduction to Cortex A12 & The Front End
POST A COMMENT

65 Comments

View All Comments

  • haukionkannel - Wednesday, July 17, 2013 - link

    I am interested in how this A12 compares to A53 in speed wise... A53 has wider registers (64 bit) but A12 run in higher freguensis and has more cores? Reply
  • Wilco1 - Wednesday, July 17, 2013 - link

    A12 will beat A53 by about 40%: A53 delivers performance comparable to A9 and A12 is 40% faster than A9. Note that 64-bit is not relevant and certainly doesn't provide a big speedup - even on x86 almost all software is still 32-bit as there is little to gain from going to 64-bit. Reply
  • jwcalla - Wednesday, July 17, 2013 - link

    Well this isn't entirely accurate as some of us are running almost completely pure 64-bit systems. And it does appear that video encoding and decoding are common operations that can benefit from 64-bit software from some of the benchmarks I've seen, but that might actually have been from compiler options so maybe not. But otherwise yes, the 64-bit ARM chips are only really important for server type workloads where people already have 64-bit software that they don't want to rework. Reply
  • Wilco1 - Wednesday, July 17, 2013 - link

    64-bit has pros and cons. x64 provides more registers so some applications run faster when built for 64-bit - that may well the video codecs you mention. However there are downsides as well, as all your pointers double in size, which slows down pointer heavy code. On 64-bit ARM things will be similar.

    Note the main reason for 64-bit is allowing more than 4GB of memory. The latest 32-bit ARMs already support this, so for mobiles/tablets etc there is no need to change to 64-bit.
    Reply
  • wumpus - Thursday, July 18, 2013 - link

    Er, no. Just no.
    From what I can tell, 32 bit ARM chips use (from a developer's view) the exact same mechanism as x86 and PAE. This might use 4G of RAM efficiently (and for those OSs that like to leave all apps in RAM, it might work well for a bit more).
    Trying to address more memory than an integer register can map to is always going to be an unholy kludge (although I would personally recommend all computer architects design such a system into the architecture because it *will* happen if the architecture succeeds at all). Since ARM chips tend to go into machines that rarely allow memory to be upgraded, no vendor really should be selling machines with >4G RAM and 32 bit accessing. The size/power/cost tradeoff isn't worth it

    google "Support for ARM LPAE". All my links go straight to the pdfs and I wind up with all the google code inbedded in my links.
    Reply
  • Calinou__ - Thursday, July 18, 2013 - link

    PAE doesn't allow for more than 3 GB per process. ;) Reply
  • Wilco1 - Thursday, July 18, 2013 - link

    A15 based servers will have 8-32GB. So yes it does go well over 4GB, that's the whole point of PAE. Mobiles will end up with 4GB RAM soon and because of PAE there is no need to use 64-bit CPUs (which would be way overkill for a mobile).

    Yes I know about ARM LPAE and that it is supported in Linux.
    Reply
  • fteoath64 - Friday, July 19, 2013 - link

    "PAE there is no need to use 64-bit CPUs". Agreed. More effort to be spend on optimizing for speed and power efficiency. Reply
  • jensend - Friday, July 19, 2013 - link

    "Cortex A12 retains the two integer pipelines of the Cortex A9, but adds support for integer divides (like the A7 and A15, other A-series architectures generally lacked support for hardware int divides)"

    The parenthetical material is very unclear. It sounds like it's saying "The A12 has hardware divides, the A15 and A7 don't, and other A-series archs likewise don't." A simple edit makes the sentence much more clear and slightly more concise:

    "Cortex A12 retains the two integer pipelines of the Cortex A9; however, like the A7 and A15, it adds support for hardware integer divides (which previous A-series architectures generally lacked)."
    Reply
  • phoenix_rizzen - Friday, July 19, 2013 - link

    An even simpler edit is to just change the parenthetical comma into a semi-colon:

    "Cortex A12 retains the two integer pipelines of the Cortex A9, but adds support for integer divides (like the A7 and A15; other A-series architectures generally lacked support for hardware int divides)"
    Reply

Log in

Don't have an account? Sign up now