Today AMD is taking the wraps off their upcoming mobile APUs, joining the already discussed desktop Kaveri. While Kaveri will also be coming to laptops at some point in the first half of 2014, the focus during the mobile APU briefing was squarely on the replacements for the current Temash and Kabini APUs, codenamed Mullins and Beema.

We looked at Kabini earlier this year, but while sales of laptops and tablets with the Kabini/Temash APUs have reportedly been quite good, we haven’t had the chance to test any retail laptops. With Intel’s Bay Trail set to give Atom a much-needed kick in the pants as far as performance is concerned, AMD hasn’t been standing still and their next generation of “small core” APUs looks ready to give Silvermont some stiff competition. Here’s what we know right now.

First and foremost, these are actually new cores as opposed to mere tweaks of existing designs. Temash and Kabini used “Jaguar” cores built on a 28nm process node; Mullins and Beema stay on 28nm but move to “Puma” cores. Along with design improvements to reduce power use, AMD is also incorporating an ARM Cortex-A5 core with TrustZone technology to help with security. Here’s a quick overview of the current lineup and the roadmap:

AMD hasn’t disclosed how much the underlying architecture has changed, and I would guess the Puma cores are actually quite similar to Jaguar cores, but the net result is a 2X improvement in performance per Watt according to AMD. They arrive at that number by dividing the performance in a few common benchmarks by the rated TDP of the APUs. Now that’s a bit contrived, as a 25W TDP APU may not actually be drawing 25W during the tests, but we’ll just ignore the marketing for now and focus on the important metrics. Update: It sounds like most of the performance gains come from frequency increases, while power improvements happen at the SoC level.

First, we have Beema replacing Kabini, and with the change we get the AMD Security Processor (ARM Cortex-A5) and a reduction in TDP on some parts, with 10W being the minimum. Mullins does the same for Temash, only AMD uses SDP (Scenario Design Power) rather than TDP (Thermal Design Power), and the new APUs are ~2W compared to 3-4W for Temash. Apparently the TDP for Temash is 8W and the TDP for Mullins is 4.5W, and that’s what AMD uses for their performance per watt calculations.

While a “2X increase” sounds good, there are many ways to get there. Simply dropping the power use by half but maintaining performance would be one way, or doubling performance at the same power use would yield the same 2X increase. Thankfully, AMD provided details of their performance testing for the old and new APUs as well, which I’ve summarized in the table below, and we can see that performance has increased quite a bit along with the drop in TDP.

AMD APU Performance Results
Benchmark      Temash A6-1450 (8W)  Mullins (4.5W)  % Increase  Kabini A6-5200 (25W)  Beema (15W)  % Increase
PCMark 8 Home  1343                 1809            35%         1861                  2312         24%
3DMark 11      468                  570             22%         685                   823          20%

Even if we completely ignore the TDP aspect, the performance improvements coming with the new Mullins and Beema APUs look to be quite good. The iGPU performance is up around 20% for both the low-power Beema and the ultra-low-power Mullins APUs, while the CPU/overall performance is a more substantial 35% increase with Mullins and 24% with Beema.
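To see how AMD gets from these scores to its “2X” claim, here is a minimal sketch of the performance-per-watt arithmetic as described above: benchmark score divided by rated TDP. The scores and TDPs are the ones AMD provided; the function and variable names are our own, and this is only our reading of AMD's method, not something AMD published.

```python
# Scores and rated TDPs (watts) from AMD's performance results.
# Names below are illustrative; the method (score / rated TDP) follows
# the description of how AMD arrives at its perf-per-watt figure.
RESULTS = {
    "PCMark 8 Home": {
        "Temash A6-1450": (1343, 8.0),
        "Mullins": (1809, 4.5),
        "Kabini A6-5200": (1861, 25.0),
        "Beema": (2312, 15.0),
    },
    "3DMark 11": {
        "Temash A6-1450": (468, 8.0),
        "Mullins": (570, 4.5),
        "Kabini A6-5200": (685, 25.0),
        "Beema": (823, 15.0),
    },
}

# Each pair is (current APU, its replacement).
PAIRS = [("Temash A6-1450", "Mullins"), ("Kabini A6-5200", "Beema")]


def perf_per_watt(score, tdp_watts):
    """Benchmark points per watt of rated TDP (not measured power draw)."""
    return score / tdp_watts


def efficiency_gain(benchmark, old, new):
    """Ratio of new perf-per-watt to old perf-per-watt for one benchmark."""
    old_score, old_tdp = RESULTS[benchmark][old]
    new_score, new_tdp = RESULTS[benchmark][new]
    return perf_per_watt(new_score, new_tdp) / perf_per_watt(old_score, old_tdp)


if __name__ == "__main__":
    for benchmark in RESULTS:
        for old, new in PAIRS:
            gain = efficiency_gain(benchmark, old, new)
            print(f"{benchmark}: {old} -> {new}: {gain:.2f}x perf/W")
```

By this measure all four comparisons land between roughly 2.0x and 2.4x, so the headline number is consistent with the table, with the caveat already noted that rated TDP is not the same as power actually drawn during the tests.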

AMD hasn’t disclosed clock speeds or anything else for the upcoming APUs, but given that the A6-1450 is clocked at 1000-1400MHz with the GPU core running at 300-400MHz, it is possible AMD arrived at the above performance increases simply with higher clock speeds. It is also possible that, similar to the Bobcat to Jaguar transition, AMD tweaked other elements of the Puma core (e.g. the scheduler could have more entries).
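As a rough back-of-the-envelope check on the “higher clocks alone” theory, the sketch below asks what Mullins clocks would be needed if benchmark scores scaled linearly with frequency. That linear scaling is purely our assumption (real scores also depend on memory, turbo behavior, and in PCMark's case more than just the CPU), and the resulting figures are hypothetical estimates, not anything AMD has disclosed.

```python
def implied_clock_mhz(old_clock_mhz, old_score, new_score):
    """Clock needed to reach new_score, ASSUMING score scales linearly with clock."""
    return old_clock_mhz * new_score / old_score


# A6-1450 peak clocks from the text: 1400MHz CPU, 400MHz GPU.
# Scores are the Temash and Mullins results from AMD's table.
cpu_mhz = implied_clock_mhz(1400, 1343, 1809)  # PCMark 8 Home as a CPU/overall proxy
gpu_mhz = implied_clock_mhz(400, 468, 570)     # 3DMark 11 as a GPU proxy

if __name__ == "__main__":
    print(f"Implied Mullins CPU clock: ~{cpu_mhz:.0f}MHz")
    print(f"Implied Mullins GPU clock: ~{gpu_mhz:.0f}MHz")
```

Under that assumption the numbers work out to a peak CPU clock a little under 1.9GHz and a GPU clock a little under 500MHz, which is at least in plausible territory for a clock-driven improvement.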

Core counts on the CPU side have remained the same: 2-4 cores. With the lower SDP of Mullins, AMD also notes that fanless quad-core tablets and laptops will now be possible, which definitely opens some additional doors. When we looked at Kabini performance (granted, at a 15W TDP), we found the CPU performance was typically well ahead of Atom at the time, and even Silvermont/Bay Trail is only moderately ahead (and in some cases still slower). How things shake out with Mullins in the 2W market will be something to watch.

While we don’t know if the iGPU has added any additional cores, it remains GCN based and very likely uses the same cores as before, only with higher clocks. Consider that AMD’s GCN architecture breaks things down into Compute Units (CUs) with 64 cores per CU. The existing Kabini/Temash APUs have two CUs and 128 cores, while Hawaii as an example includes a staggering 44 active CUs in the R9 290X; Kaveri goes for the middle ground with up to 8 CUs (512 cores). In order to increase the number of cores in Beema/Mullins, AMD would have to make the jump from 2 to 3 CUs, a 50% increase; given the ~20% performance increases above, it’s far more likely these come from the same number of cores/CUs running at higher clocks than more cores running at lower clocks.
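The CU arithmetic is easy to check. The sketch below uses the 64 cores per CU figure from above, plus one simplifying assumption of our own: that GPU performance scales with cores multiplied by clock. The function names are illustrative only.

```python
CORES_PER_CU = 64  # GCN: 64 cores per Compute Unit


def shader_cores(compute_units):
    """Total GCN cores for a given number of Compute Units."""
    return compute_units * CORES_PER_CU


def required_clock_ratio(perf_gain, old_cus, new_cus):
    """Clock ratio (new/old) needed to hit perf_gain.

    ASSUMES performance scales with cores x clock, which is a
    simplification and ignores memory bandwidth and other limits.
    """
    return (1.0 + perf_gain) * old_cus / new_cus


if __name__ == "__main__":
    for name, cus in [("Kabini/Temash", 2), ("Kaveri (max)", 8), ("R9 290X", 44)]:
        print(f"{name}: {cus} CUs = {shader_cores(cus)} cores")

    # A ~20% GPU gain with the same 2 CUs versus a jump to 3 CUs.
    print(f"2 -> 2 CUs: clocks x{required_clock_ratio(0.20, 2, 2):.2f}")
    print(f"2 -> 3 CUs: clocks x{required_clock_ratio(0.20, 2, 3):.2f}")
```

Under that assumption, staying at 2 CUs requires about a 20% clock bump, while moving to 3 CUs would mean running the GPU at roughly 80% of the old clocks to land on the same result. The first scenario is the simpler explanation for the numbers AMD showed.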

There are a few other items we wanted to quickly touch on. First is Kaveri for notebooks, which as noted above will be shipping in H1’14. Kaveri is a GCN 1.1 part, similar to Hawaii only with fewer cores, and it also supports HSA features and AMD’s new TrueAudio. Again, notice that neither of those elements is listed for Mullins/Beema, indicating they’re using the same basic GCN 1.0 GPU design as Temash/Kabini. Kaveri will also be making the transition to 28nm from Trinity/Richland’s 32nm, and we could see a fairly decent bump in performance, but AMD isn’t saying much on the subject of mobile Kaveri performance just yet.

The other items we wanted to quickly discuss (and you can see these and a few other pieces of information in the slide gallery below) are some of the other additions AMD is making with Mullins/Beema. There are three points to discuss: AMD DockPort, Microsoft InstantGo, and the Platform Security Processor.

While DockPort sounds interesting (a non-Intel alternative to Thunderbolt that basically combines DisplayPort 1.2 with USB 3 into a single cable), AMD said precious little about DockPort in their presentation. Someone asked about it, and AMD said it was “up to laptop manufacturers” and that was about it. There’s the above slide as well, showing how a single cable could drive three external displays along with a variety of peripheral devices, but we’ll have to wait and see how many companies are willing to jump on the DockPort bandwagon.

Microsoft InstantGo is another feature that AMD supports. Formerly called Connected Standby, InstantGo allows your laptop to wake up from sleep mode periodically to pull down network updates – email, live tiles, etc. It also allows devices to go from deep sleep to “on” in under 500 milliseconds, basically matching what we get with tablets and smartphones. Much of the implementation of InstantGo will again be left to the device manufacturers (i.e. the “up to 14 days in standby mode” will depend on the battery capacity and other power optimizations made by the OEMs).

Last up is the Security Processor, which consists of an ARM Cortex-A5 core with support for ARM TrustZone. We discussed this in more detail previously, but the short summary is that the technology is designed to provide a Trusted Execution Environment to help protect against malware and viruses, as well as providing new ways to deal with user authentication, payment processing, etc. How much use the Security Processor will see in the short term is difficult to say, but if ARM can get some traction with it in the smartphone/tablet space, its inclusion in AMD’s Mullins/Beema APUs could prove beneficial.

Wrapping things up, Mullins and Beema will be coming out in 2014, but AMD hasn’t given a precise time frame. We have a date for desktop Kaveri (January 14, 2014), but everything else is “first half of 2014”. Given the added pressure AMD is facing from Intel’s Bay Trail, hopefully the Mullins/Beema APUs will arrive sooner rather than later, but that may simply be wishful thinking on my part. As usual, the real challenge is in getting the APUs into a compelling product – one that offers the right features at the right price point. With tablets and Chromebooks taking over the sub-$300 market, creating something that clearly stands out from the crowd is becoming difficult.

Source: AMD Announcement

47 Comments

  • Mugur - Thursday, November 14, 2013

    Xbox One has DDR3.
  • SaberKOG91 - Thursday, November 14, 2013

    Thanks! I sort of remembered that one, but should have checked to be sure. DDR3 + eDRAM vs GDDR5 on the PS4. It would be interesting to see a performance comparison, but I tend to side with GDDR5 or HMC as better memory solutions.
  • fteoath64 - Saturday, November 16, 2013

    Using eDram will remove most of the ram latencies but it comes with great cost in die-space and power usage. You need x-MB of space and a controller logic to interface the bus. The Xbox 360 cpu had 10MB of eDram which solves its problems. I can predict that 12MB or 16MB of eDram for AMD can do so as well. The eDram of the Iris Pro-e takes 20 watts for 128MB. So 16MB will be less than 3 watts additional. That might be the way.
  • iwod - Wednesday, November 13, 2013

    Why do they not make a network processor SoC with PUMA? Something for NAS and other similar devices.
  • SaberKOG91 - Thursday, November 14, 2013

    This would be a good place for a G-series version of Mullins.
  • Lolimaster - Wednesday, November 13, 2013

    For the massive gains in performance/watt it seems better IPC and more gpu core at lower clocks.

    192SP on the new low power APU's.

    It remains to see if they finally added dual channel support which bottlenecks Kabini/Temash hard regarding 3D games.
  • aryonoco - Wednesday, November 13, 2013

    Me thinks that Mullins could be a perfect SoC for fanless Chromebook. Can AMD pick the phone up and call Google please?
  • milli - Thursday, November 14, 2013

    Is it possible that AMD switched to TSMC's 28HPM process? It would be the easiest way to increase speed without sacrificing power. Which process was AMD using for Temash/Kabini? 28HP or 28LP?
  • ddriver - Thursday, November 14, 2013

    Finally something that may be a reasonable choice for a tablet.
  • Hrel - Thursday, November 14, 2013

    dockport sounds cool. that's not AMD exclusive is it?

    thunderbolt has licensing fees that make it FAR too expensive to be practical. So I'd like to see EVERY laptop made have this dockport feature. As desktops fade away the need for a cheap, standardized docking technology is ever increasing.
