Late last month, Intel dropped by my office with a power engineer for a rare demonstration of its competitive position versus NVIDIA's Tegra 3 when it came to power consumption. Like most companies in the mobile space, Intel doesn't rely solely on device-level power testing to determine battery life. To ensure that the CPU, GPU, memory controller and even NAND are all as power efficient as possible, engineers measure power consumption directly on a tablet or smartphone motherboard.

The process would be a piece of cake if you had measurement points already prepared on the board, but in most cases Intel (and its competitors) are taking apart a retail device and hunting for a way to measure CPU or GPU power. I described how it's done in the original article:

Measuring power at the battery gives you an idea of total platform power consumption including display, SoC, memory, network stack and everything else on the motherboard. This approach is useful for understanding how long a device will last on a single charge, but if you're a component vendor you typically care a little more about the specific power consumption of your competitors' components.

What follows is a good mixture of art and science. Intel's power engineers will take apart a competing device and probe whatever looks to be a power delivery or filtering circuit while running various workloads on the device itself. By correlating the type of workload to spikes in voltage in these circuits, you can figure out what components on a smartphone or tablet motherboard are likely responsible for delivering power to individual blocks of an SoC. Despite the high level of integration in modern mobile SoCs, the major players on the chip (e.g. CPU and GPU) tend to operate on their own independent voltage planes.


A basic LC filter

What usually happens is you'll find a standard LC filter (inductor + capacitor) supplying power to a block on the SoC. Once the right LC filter has been identified, all you need to do is lift the inductor, insert a very small resistor (2 - 20 mΩ) and measure the voltage drop across the resistor. With voltage and resistance values known, you can determine current and power. Using good external instruments (an NI USB-6289 in this case) you can plot power over time and get a good idea of the power consumption of individual IP blocks within an SoC.


Basic LC filter modified with an inline resistor
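The arithmetic behind the inline-resistor trick is just Ohm's law. A minimal sketch of the calculation (function names and sample values are illustrative, not Intel's actual tooling):

```python
# Per-rail power from an inline shunt resistor, per the method described above.

def rail_power(v_shunt: float, v_rail: float, r_shunt: float) -> tuple[float, float]:
    """Return (current in A, power in W) for one measurement sample.

    v_shunt -- voltage drop measured across the inserted resistor (V)
    v_rail  -- rail voltage on the SoC side of the shunt (V)
    r_shunt -- value of the inserted resistor (ohms), e.g. 0.002 - 0.020
    """
    current = v_shunt / r_shunt        # Ohm's law: I = V / R
    return current, current * v_rail   # P = I * V delivered to the block

# Example: a 2 mV drop across a 10 mΩ shunt feeding a 1.1 V CPU rail
i_a, p_w = rail_power(0.002, 1.1, 0.010)   # 0.2 A, 0.22 W

# Averaging a stream of DAQ samples approximates the power-over-time plot
powers = [rail_power(v, 1.1, 0.010)[1] for v in (0.0018, 0.0021, 0.0025)]
avg_w = sum(powers) / len(powers)
```

In practice the DAQ samples both sides of the shunt at high frequency, so the same per-sample math runs over thousands of points to produce the power curves shown later in the article.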

The previous article focused on an admittedly not too interesting comparison: Intel's Atom Z2760 (Clover Trail) versus NVIDIA's Tegra 3. After much pleading, Intel returned with two more tablets: a Dell XPS 10 using Qualcomm's APQ8060A SoC (dual-core 28nm Krait) and a Nexus 10 using Samsung's Exynos 5 Dual (dual-core 32nm Cortex A15). What was a walk in the park for Atom all of a sudden became much more challenging. Both of these SoCs are built on very modern, low power manufacturing processes and Intel no longer has a performance advantage compared to Exynos 5.

Just like last time, I calibrated all displays to our usual 200 nits setting and kept the software and configurations as close to equal as possible. Both tablets were purchased at retail by Intel, but I verified their performance against our own samples/data and noticed no meaningful deviation. Since I don't have a Dell XPS 10 of my own, I compared performance to the Samsung ATIV Tab and confirmed that things were at least performing as they should.

We'll start with the Qualcomm based Dell XPS 10...

Modifying a Krait Platform: More Complicated

140 Comments


  • kumar0us - Friday, January 04, 2013 - link

    My point was that for a CPU benchmark, say SunSpider, the code generated by x86 compilers would be better than that generated by ARM compilers.

    Could the better compilers available for the x86 platform be a (partial) reason for Intel's faster performance? Or are compilers for the ARM platform mature and fast enough that this angle can be discarded?
  • iwod - Friday, January 04, 2013 - link

    Yes, not just the compiler but general software optimization on x86, which gives Intel some advantage. However, with the recent surge of the ARM platform and the software running on it, my (wild) guess is that this accounts for less than 5% even in the best case, and only in individual cases like SunSpider not running fully well.
  • jwcalla - Friday, January 04, 2013 - link

    Yes. And it was a breath of fresh air to see Anand mention that in the article.

    Look at, e.g., the difference in SunSpider benchmarks between the iPad and Nexus 10. Completely different compilers and completely different software. As the SunSpider website indicates, the benchmark is designed to compare browsers on the same system, not across different systems.
  • monstercameron - Friday, January 04, 2013 - link

    It would be interesting to throw an AMD system into the benchmarking, maybe the current Z-01 or the upcoming Z-60...
  • silverblue - Friday, January 04, 2013 - link

    AMD has thrown a hefty GPU on die, which, coupled with the 40nm process, isn't going to help with power consumption whatsoever. The FCH is also separate as opposed to being on-die, and AMD tablets seem to be thicker than the competition.

    AMD really needs Jaguar and its derivatives, and now. A dual core model with a simple 40-shader GPU might be a competitive part, though I'm always hearing about the top-end models which really aren't aimed at this market. Perhaps AMD will use some common sense and go for small, volume parts over the larger, higher performance offerings, and actually get themselves into this market.
  • BenSkywalker - Friday, January 04, 2013 - link

    There is an AMD design in there: Qualcomm's part.

    A D R E N O
    R A D E O N

    Not a coincidence, Qualcomm bought AMD's ultra portable division off from them for $65 million a few years back.

    Anand- If this is supposed to be a CPU comparison, why go overboard with the terrible browser benchmarks? Based on numbers you have provided, Tegra 3 as a generic example is 100% faster under Android then WinRT depending on the bench you are running. If this was an article about how the OSs handle power tasks I would say that is reasonable, but given that you are presenting this as a processor architecture article I would think that you would want to use the OS that works best with each platform.
  • powerarmour - Friday, January 04, 2013 - link

    Agreed, those browser benchmarks seem a pretty poor way to test general CPU performance; browser benchmarks mainly test how optimized a particular browser is on a particular OS.

    In fact I can beat most of those results with a lowly dual-A9 Galaxy Nexus smartphone running Android 4.2.1!
  • Pino - Friday, January 04, 2013 - link

    I remember AMD having a dual core APU (Ontario) with a 9W TDP, on a 40nm process, back in 2010.

    They should invest in an SoC.
  • kyuu - Friday, January 04, 2013 - link

    That's what Temash is going to be. They just need to get it on the market and into products sooner rather than later.
  • jemima puddle-duck - Friday, January 04, 2013 - link

    Impressive though all this engineering is, in the real world what is the unique selling point? Normal people (not solipsistic geeks) don't care what's inside their phone, and the promise of their new phone being slightly faster than another phone is irrelevant. And for manufacturers, why ditch decades of ARM knowledge to lock yourself into one supplier? The only differentiator is cost, and I don't see Intel undercutting ARM any time soon.

    The only metric that matters is whether normal human beings get any value from it. This just seems like (indirect) marketing by Intel for a chip that has no raison d'etre. I'm hearing lots of "What" here, but no "Why". This is the analysis I'm interested in.

    All that said, great article :)
