The untold story of Intel's desktop (and notebook) CPU dominance after 2006 isn't just about novel approaches to chip design or the billions spent keeping its army of fabs up to date. While both are critical components of the formula, it's Intel's internal performance modeling team that plays a major role in providing targets for both the architects and fab engineers to hit. After losing face (and sales) to AMD's Athlon 64 in the early 2000s, Intel adopted a "no more surprises" policy: it would never again be caught off guard by a performance upset.

Over the past few years, however, the definition of meaningful performance has shifted. Power consumption is now just as important as absolute performance. Intel has been going through a slow awakening as it adapts to the new ultra mobile world, and one of the first things to change was the scope and focus of its internal performance modeling. User experience (quantified by mapping high-speed camera footage of frame rates to user survey data) and power efficiency are now both incorporated into all architecture targets going forward. Building a next-generation CPU core no longer means picking a SPEC CPU performance target and working toward it; it means delivering a specific user experience as well.

Intel's role in the industry has started to change. It worked very closely with Acer on bringing the W510, W700 and S7 to market. With Haswell, Intel will work even more closely with its partners, going as far as to specify other, non-Intel components on the motherboard in pursuit of ultimate battery life. The pieces are beginning to fall into place, and if all goes according to Intel's plan we should start to see the fruits of its labor next year. The goal is to bring Core down to very low power levels, and to take Atom even lower. Don't underestimate the significance of Intel's 10W Ivy Bridge announcement. Although desktop and mobile Haswell will appear in mid to late Q2 2013, the exciting ultra mobile parts won't arrive until Q3. Intel's 10W Ivy Bridge should at least bring some more exciting form factors to market between now and then. While we're not exactly at Core-in-an-iPad levels of integration, we are getting very close.

To kick off what is bound to be an exciting year, Intel made a couple of stops around the country showing off that even its existing architectures are quite power efficient. Intel carried around a pair of Windows tablets, wired up to measure power consumption at both the device and component level, to demonstrate what many of you will find obvious at this point: that Intel's 32nm Clover Trail is more power efficient than NVIDIA's Tegra 3.

We've demonstrated this in our battery life tests already. Samsung's ATIV Smart PC uses an Atom Z2760 and features a 30Wh battery with an 11.6-inch 1366x768 display. Microsoft's Surface RT uses NVIDIA's Tegra 3 powered by a 31Wh battery with a 10.6-inch, 1366x768 display. In our 2013 wireless web browsing battery life test we showed Samsung with a 17% battery life advantage, despite the 3% smaller battery. Our video playback battery life test showed a smaller advantage of 3%.
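The two relative figures above can be combined into a rough, capacity-normalized efficiency estimate. A minimal sketch (the arithmetic ignores display and platform differences, so treat it as a first-order sanity check only):

```python
# Figures quoted above: the ATIV's 30Wh battery lasted 17% longer on the
# web browsing test than the Surface RT's 31Wh battery.
ativ_battery_wh = 30.0
surface_battery_wh = 31.0
runtime_advantage = 1.17  # ATIV runtime / Surface RT runtime

# Runtime per watt-hour normalizes away the battery size difference.
efficiency_ratio = runtime_advantage * (surface_battery_wh / ativ_battery_wh)
print(f"Capacity-normalized advantage: {(efficiency_ratio - 1) * 100:.0f}%")  # ~21%
```

In other words, per watt-hour of battery, the Clover Trail platform delivered roughly a fifth more web browsing runtime in this particular test.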

AnandTech Tablet Bench 2013 - Web Browsing Battery Life

For us, the power advantage made a lot of sense. We've already proven that Intel's Atom core is faster than ARM's Cortex A9 (even four of them under Windows RT). Combine that with the fact that NVIDIA's Tegra 3 features four Cortex A9s on TSMC's 40nm G process and you get a recipe for worse battery life, all else being equal.

Intel's method of hammering this point home isn't all that unique in the industry. Rather than measuring power consumption at the application level, Intel chose to do so at the component level. This is commonly done by taking the device apart and either replacing the battery with an external power supply that you can measure, or by measuring current delivered by the battery itself. Clip the voltage input leads coming from the battery to the PCB, toss a resistor inline and measure voltage drop across the resistor to calculate power (good ol' Ohm's law).
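The Ohm's law step amounts to two lines of arithmetic. Here's a minimal sketch; the helper name and the example values are mine, not Intel's:

```python
def rail_power_w(v_drop_mv: float, shunt_mohm: float, v_rail: float) -> float:
    """Power drawn on a rail instrumented with an inline shunt resistor.

    v_drop_mv:  voltage drop measured across the shunt, in millivolts
    shunt_mohm: shunt resistance, in milliohms
    v_rail:     nominal rail voltage, in volts
    """
    current_a = (v_drop_mv / 1000.0) / (shunt_mohm / 1000.0)  # I = V / R
    return current_a * v_rail                                 # P = I * V

# Hypothetical example: a 5 mV drop across a 10 mOhm shunt on a 3.7 V
# battery rail implies 0.5 A of current, i.e. 1.85 W of platform draw.
print(rail_power_w(5.0, 10.0, 3.7))  # 1.85
```

The shunt has to be small enough (milliohms) that its own voltage drop doesn't meaningfully disturb the rail it's measuring.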

Where Intel's power modeling gets a little more aggressive is what happens next. Measuring power at the battery gives you an idea of total platform power consumption including display, SoC, memory, network stack and everything else on the motherboard. This approach is useful for understanding how long a device will last on a single charge, but if you're a component vendor you typically care a little more about the specific power consumption of your competitors' components.

What follows is a good mixture of art and science. Intel's power engineers will take apart a competing device and probe whatever looks to be a power delivery or filtering circuit while running various workloads on the device itself. By correlating the type of workload to spikes in voltage in these circuits, you can figure out what components on a smartphone or tablet motherboard are likely responsible for delivering power to individual blocks of an SoC. Despite the high level of integration in modern mobile SoCs, the major players on the chip (e.g. CPU and GPU) tend to operate on their own independent voltage planes.
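That correlation step can be sketched as a simple signal-matching problem. The code below is purely illustrative (the function names and traces are my assumptions, not Intel's actual tooling): toggle a known workload on and off and pick the rail whose measured trace best tracks the pattern.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length traces."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0.0 or vy == 0.0:
        return 0.0  # a flat trace tells us nothing about the workload
    return cov / (vx * vy) ** 0.5

def likely_supply_rail(workload, rail_traces):
    """Return the rail whose trace correlates best with the workload pattern."""
    return max(rail_traces, key=lambda name: pearson(workload, rail_traces[name]))

# Toggling a CPU-heavy workload on and off: rail_b clearly tracks it.
workload = [0, 1, 0, 1, 0, 1]
traces = {
    "rail_a": [1.0, 1.0, 1.1, 1.0, 1.1, 1.0],  # roughly constant
    "rail_b": [0.2, 0.9, 0.2, 0.9, 0.2, 0.9],  # follows the toggling
}
print(likely_supply_rail(workload, traces))  # rail_b
```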


A basic LC filter

What usually happens is you'll find a standard LC filter (inductor + capacitor) supplying power to a block on the SoC. Once the right LC filter has been identified, all you need to do is lift the inductor, insert a very small resistor (2 - 20 mΩ) and measure the voltage drop across the resistor. With voltage and resistance values known, you can determine current and power. Using good external instruments you can plot power over time and now get a good idea of the power consumption of individual IP blocks within an SoC.


Basic LC filter modified with an inline resistor
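Once a rail is instrumented, the sampled power trace from the data acquisition box can be reduced to a single energy-per-task number by numerical integration. A minimal sketch, assuming timestamps in seconds and power samples in watts (my illustration of the principle, not Intel's analysis code):

```python
def energy_joules(times_s, power_w):
    """Trapezoidal integration of a sampled power trace (watts over seconds -> joules)."""
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total

# A block drawing a steady 2 W for 3 seconds consumes 6 joules:
print(energy_joules([0.0, 1.0, 2.0, 3.0], [2.0, 2.0, 2.0, 2.0]))  # 6.0
```

Energy per task, rather than instantaneous power, is often the fairer comparison: a faster chip that races to idle can draw more watts yet consume fewer joules overall.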

Intel brought along one of its best power engineers, a couple of tablets and a National Instruments USB-6289 data acquisition box to demonstrate its findings: Microsoft's Surface RT using NVIDIA's Tegra 3, and Acer's W510 using Intel's own Atom Z2760 (Clover Trail). Both were retail samples running the latest software/drivers available as of 12/21/12. The Acer unit in particular featured the latest driver update from Acer (version 1.01, released on 12/18/12), which improves battery life on the tablet. Remember my pointing out that the W510 seemed to have a problem that caused it to underperform in battery life compared to Samsung's ATIV Smart PC? It looks like this driver update fixes that problem.

I personally calibrated both displays to our usual 200 nits setting and ensured the software and configurations were as close to equal as possible. Both tablets were purchased by Intel, but I verified their performance against my own review samples and noticed no meaningful deviation. I've also attached diagrams of where Intel is measuring CPU and GPU power on the two tablets:


Microsoft Surface RT: The yellow block is where Intel measures GPU power, the orange block is where it measures CPU power


Acer's W510: The purple block is a resistor from Intel's reference design used for measuring power at the battery. Yellow and orange are inductors for GPU and CPU power delivery, respectively.

The complete setup is surprisingly mobile, even relying on a notebook to run SignalExpress for recording output from the NI data acquisition box:

Wiring up the tablets is a bit of a mess. Intel wired up far more than just the CPU and GPU; depending on the device and what was easily exposed, you could get power readings on the memory subsystem and NAND as well.

Intel only supplied the test setup; for everything you're about to see, I picked and ran whatever I wanted, however I wanted. Comparing Clover Trail to Tegra 3 is nothing new, but the data I gathered is at least interesting to look at. We typically don't get to break out CPU and GPU power consumption in our tests, which makes this experiment a bit more illuminating.

Keep in mind that we are looking at power delivery on voltage rails that spike with CPU or GPU activity. It's not uncommon to run multiple things off the same voltage rail. In particular, I'm not super confident in what's going on with Tegra 3's GPU rail, although the CPU rails are likely fairly comparable. One last note: unlike under Android, NVIDIA doesn't use its 5th/companion core under Windows RT. Microsoft still doesn't support heterogeneous computing environments, so NVIDIA had to disable the companion core there.

Idle Power

  • dealcorn - Wednesday, December 26, 2012 - link

    I applaud the mention of Intel's internal performance modeling team at the start of the article but where was the picture? This is a disturbing article and a picture helps humanize the story and soften the blow. Large portions of the readership have bought into the ARM mythology regarding efficiency so the factual content of the article is disturbing and will trigger a denial response. However, there are some rather obvious conclusions that should have been stated to assist readers in assessing the mythology. I may overshoot slightly.

    5 years ago the idea that Intel could compete effectively in the non-laptop mobility market was laughable because Intel was "clueless" about all that SoC stuff. Intel's process advantage gave it a big competitive advantage, but its intimate knowledge of how to tweak x86 to achieve varying performance targets was worthless as long as ARM wielded substantial advantages in efficiency and cost. Clover Trail is proof that Intel has learned a lot in the last 5 years. Today, even without exploiting Intel's advantages in process technology and x86 tuning, Clover Trail is roughly comparable to ARM in efficiency. This gain in efficiency is solely the result of Intel being smarter today than it was 5 years ago. Clover Trail is built on a nearly obsolete (by Intel standards) process geometry using a 5-year-old core designed during Intel's "clueless" era. This should be profoundly disturbing to ARM supporters, because ARM has had a near mono-maniacal focus on efficiency for 22 years. Constant, incremental improvement is the name of the game, and ARM is a well-funded, adequately staffed old hand at this game with some of the finest talent and best IP in the industry. That the newcomer (Chipzilla) can reach rough parity with ARM in the space of 5 years based solely on getting smarter, rather than some fancy process advantage, means that Intel is on a steeper learning curve than the gang over at ARM and the rest of the eco-sludge system. This is scary because Intel is not going to hit its stride until 22 nm, when it gets to combine what it has learned with some of its process advantage and a long-overdue redesign of the Atom core (i.e., OoO execution). The full process advantage does not hit until 14 nm, which Intel should achieve about a year after hitting 22 nm. Today all the ARM fabs can talk a good game about reaching process parity with Intel, because talk is cheap. Let's count how many ARM fabs actually achieve mass SoC production at 14 nm within 4 years of Intel hitting this milestone.

    People like to talk about eco-sludge, but it is understood that 14 nm fabs are expensive toys that not everyone can afford. The market is not big enough to buy 14 nm fabs for every ARM player. There is a mass extinction event coming, and the stench of rotting fabs will soon permeate the ecosystem. Basically everyone other than ARM and its fabs will seek other opportunities (i.e., Chipzilla) as soon as the stench gets unbearable. Other than the surviving fabs and ARM, the remaining eco-sludge system should transition to Intel fairly easily.

    Intel's goal in addressing this market is domination, not rough parity. ARM is likely to clearly lose its efficiency advantage at either 22 nm or 14 nm, so its only playable card is that it is cheap, which is a credible strategy; just ask Rory. Intel plays in a different league, so this is a transitional strategy at best. By that time, Rory should be a free agent, so if ARM wants the benefit of his perspective, he may be available.

    It is wired deep in the Intel mojo that each process geometry step should achieve perfect economic scaling, which means that the cost to produce one transistor in the new fab should be half the cost of producing that transistor in the old fab. You never achieve perfect economic scaling, but it is a point of pride at Intel that its newest transistors are always the cheapest to produce, and every time you move production from an old fab to a new, fully utilized fab, your production costs drop. Recall all the jabber when FinFET was introduced about how cool it was and that the incremental cost to achieve FinFET was basically a rounding error. Now contrast that Intel jabber with the jabber coming from assorted ARM fabs that the newest process technologies will be more expensive. Some of that jabber is just warming up for the impending death moans associated with any mass extinction event. However, non-Intel fabs are making a sincere attempt to reach rough parity with Intel's process technology, and it is expensive, which will be reflected in the cost to produce contemporary ARM chips using a contemporary process technology. It will work, but the ARM chips will not be as relatively cheap as they were before, and they will be less power efficient than Atom.

    That is the environment in which Intel plans to play several trump cards that are already known. If Intel is able to incorporate the radio technology it has already demoed into Atom at 14 nm, and it does favorably affect BOM and efficiency, is there any doubt that Intel can ride that to a 50% market share or more in segments that demand radio? If Intel is able to incorporate vPro into devices for the corporate market, which already values vPro as a known-good technology, is there any doubt that Intel will be able to ride that technology to a 50% market share in the corporate device market? Of course ARM is working on a "me too" technology that is unproven. However, there is a joke waiting to be told, and the punch line is: "Nobody ever got fired for buying vPro." IT managers do not want their firing to be the butt of a vPro joke. Better to buy what is known good, and preserve your ERISA vesting. I did not pay attention to whatever Intel is cooking up with VISA, but somehow, between Intel, McAfee, VISA and the Bunnies, I expect Intel should be able to figure out something to say that will help market share in the consumer segment. Never underestimate the Bunnies, because when they get properly motivated, they are a nearly unstoppable force. Based on what I see in my local market, the best Samsung can come up with to compete is hot, nameless Asian chicks, and while it pains me to say so, in the consumer space they cannot stand up to the Bunnies. ARM itself has no branding in the consumer space.

    Intel's newfound enthusiasm for efficiency and low cost has not yet peaked. ARM peaked some time ago and is struggling to maintain its momentum with a slower learning curve. Now is a time when execution matters. While Intel's execution record is far from flawless, a case may be made that it has the best execution in the industry. Time will tell how it works out, but insofar as ARM's target market ambitions go beyond the uncontested garage door opener market, it is going to be an uphill battle every step of the way, even though they start with a dominant market share.
  • ET - Wednesday, December 26, 2012 - link

    I love my Nexus 7, and I think 7" tablets are a great form factor. However, I'd appreciate full Windows compatibility, which would make this a great tiny tablet. It looks like Intel's chip might have what it takes to provide a good solution for this form factor.
  • lchen66666 - Wednesday, December 26, 2012 - link


    Google did a good job with its Nexus 7, securing a large market share for Android to compete with the iPad. At the same time, the ecosystem for ARM tablets was well established: low-cost components are everywhere for the ARM platform, and apps for Android and iOS are everywhere. The game is completely different from AMD vs. Intel in 64-bit CPUs 10 years back, because those two were pretty much in the same ecosystem.

    In order to gain tablet market share for the Surface (x86 version), the devices have to be priced very competitively, or even lower than corresponding Android devices, because Windows is behind in tablet apps. Many web sites don't even have a corresponding version for Windows. This means a device with that chip has to be around $300 (definitely not $500) with at least full HD (1080p) resolution, not 1366x768-class resolution. This certainly cannot be done without Intel and Microsoft cutting their part prices.
    Intel's latest Atom chip has to be priced lower than $30 each, and Microsoft's Windows 8 tablet license has to be less than $10 each. Otherwise, forget about the consumer market; the mass of consumers won't switch to Windows. However, there is still a market for the Surface Pro (with an Intel i5); many corporate users may buy it for the sake of having a single device.

  • rburnham - Wednesday, December 26, 2012 - link

    After spending a few days with a new Acer W510 tablet, I see no reason to own an ARM tablet anymore. Well done, Intel.
  • GeorgeH - Wednesday, December 26, 2012 - link

    A very nice article overall, but the average power efficiency numbers aren't a very good measurement to make. Grab a P4 system off the shelf, run a few benchmarks, then turn the PC off. Grab a TI-89, make it do the same tasks, then turn it off. Compare power as in this article and you could end up with a statement like "the P4 only consumed 0.1mW; we can't wait to see it in calculators!"

    I understand why the measurements were made in the way that they were and I don't think the example above really applies here, but it's irksome to see "A is much better than B" statements using a methodology that could easily be used (intentionally or not) to reach ridiculous conclusions.

    However, given that both test systems were supplied by Intel, I do think it could be argued that Intel pulled the wool over your eyes a little bit to yield numbers that put Atom in the best possible light.
  • lancedal - Wednesday, December 26, 2012 - link

    Intel has enjoyed big margins on its CPUs. I don't know how they will compete in the sub-$30 SoC market. Their biggest goal is probably to prevent an invasion of ARM into the server market. But in that market, the priorities are probably performance, pricing, then power, in that order.

    As another poster mentioned, not only is ARM competing effectively against Intel on power, performance, and pricing, but its business model enables its customers to customize their SoCs to meet their targets. For example, Apple focuses on performance and power at the cost of silicon area, while Amazon focuses mainly on cost. Samsung is probably somewhere in between. ARM's business model makes that possible, as it allows customers to assemble their SoCs the way they want.

    As a consumer, I really like the idea of an x86-compatible device, with total compatibility between my mobile devices and my computing devices. However, that seems to be less and less important, as even MS is using the cloud to connect computing devices and mobile devices. I was really disappointed to realize, when I bought my first Windows Phone 8 device, that it has absolutely no direct-sync ability with my laptop. Not even Outlook.
  • Exophase - Thursday, December 27, 2012 - link

    Anand,

    It'd be great if you could provide graphs showing what the utilization is for each core, as well as what frequency the cores are being run at. I'm assuming MS has kept this monitoring capability in Windows 8 and Windows RT.

    This way we could see how well Tegra 3's synchronous cores are being managed, including how eager the kernel is to keep cores enabled (at the same frequency/voltage as all the others, only with clock gating) and how aggressive it is with frequency, and where, if anywhere, the companion core could have been viably used. It'd also highlight how much Atom is being scheduled to use HT instead of separate cores and how much of a role turbo boost is playing. And it'd clear up the question (or at least my question) of whether or not Clover Trail can asynchronously clock its two cores; I'd assume it can.

    Right now it's popular for people to assume that one Atom core beats two Cortex-A9 cores, because they assume the workloads are heavily threaded. It doesn't help when your articles claim that JS benchmarks like Kraken are well threaded when they're still strictly single threaded.
  • 68k - Saturday, December 29, 2012 - link

    The metrics you are asking for would be interesting to see, but I would say that it is far more "popular" for people to assume that x86 does not stand a chance of getting close to ARM in work done per joule than to assume that a single-core Atom is faster than a dual-core A9.

    And looking at the numbers, I would say that a single Atom core would beat a dual-core A9 when clocked as in this comparison. HT on Atom yields far more than HT on the "Core" CPUs, which is not too hard to understand, as Atom is in-order. So while HT gives 20-30% on "Core" CPUs, it usually results in >50% more throughput on Atom.

    On top of that, one can be far more naive when programming non-embarrassingly-parallel things on HT compared to doing the same thing across physical CPU cores, as both HTs share the L1 cache, which SIGNIFICANTLY lowers the cost of locks that would otherwise cause cache-line ping-pong even when uncontended. So the cost of acquiring a lock is at least an order of magnitude cheaper between HTs than between two cores sharing an L2$ but having separate L1$.

    But most importantly: how many programs used on a phone or tablet actually benefit from multiple CPU cores? Having the best and most power-efficient single-core performance is far more important than having the best performance across 4 cores.
  • raysmith1971 - Thursday, December 27, 2012 - link

    I shouldn't be bothered by this article, but it is pretty much exactly the same on the three tech sites that I read, namely Tom's Hardware, X-bit labs, and AnandTech. It seems that either Intel is suddenly amazing compared to ARM, or Intel is smooching with the tech sites and they in turn are fawning over Intel. Don't get me wrong, I like Intel and have their chips in my systems rather than AMD's, but this article, like those on the other sites, reads more like a product placement ad than an actual review, and because of this I may read these sites' 'reviews' with a little more scepticism from now on.
  • Braumin - Thursday, December 27, 2012 - link

    What's amazing is how people have just been ignoring Intel, holding to this untrue belief that ARM cores are inherently more power efficient than Intel's due to some magical pixie dust and RISC vs. CISC.

    Medfield was released almost a year ago, and even then a single-core Atom showed itself to be faster than current-gen ARM processors while offering competitive battery life.

    Why then do you think Intel is "suddenly amazing" compared to ARM?

    The reviews of all of the Clover Trail tablets have shown that they offer a significantly better experience than the ARM ones.
