Integer & FP Execution

On the integer execution side, the units and pipelines look largely unchanged from Bobcat. The big performance addition here is the use of Llano’s hardware divider. Bobcat had a microcoded integer divider that resolved one bit per cycle, while Jaguar moves to a 2-bits-per-cycle divider. The hardware is all clock gated, so when it’s not in use there’s no power penalty.
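To make the 1-bit versus 2-bits-per-cycle difference concrete, here is a minimal sketch of a generic digit-recurrence (restoring) divider that retires a configurable number of quotient bits per iteration. It is an illustration only: AMD has not described the divider's algorithm here, the function name `divide_bits_per_step` and its parameters are hypothetical, and the iteration count is not a real latency figure because it ignores fixed overhead and any early-out logic.

```python
def divide_bits_per_step(dividend, divisor, width=32, bits_per_step=1):
    """Unsigned restoring division retiring `bits_per_step` quotient bits per iteration.

    Returns (quotient, remainder, iterations). Hypothetical model, not AMD's design.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    if divisor < 0 or not 0 <= dividend < (1 << width):
        raise ValueError("operands must be unsigned and dividend must fit in width")
    if width % bits_per_step != 0:
        raise ValueError("width must be a multiple of bits_per_step")

    radix = 1 << bits_per_step          # 2 for 1 bit/step, 4 for 2 bits/step
    mask = radix - 1
    quotient = 0
    remainder = 0
    iterations = 0

    # Walk the dividend from the most significant group of bits downward.
    for shift in range(width - bits_per_step, -1, -bits_per_step):
        # Bring down the next group of dividend bits.
        remainder = (remainder << bits_per_step) | ((dividend >> shift) & mask)
        # Select the largest quotient digit whose multiple still fits.
        digit = 0
        while digit + 1 < radix and (digit + 1) * divisor <= remainder:
            digit += 1
        remainder -= digit * divisor
        quotient = (quotient << bits_per_step) | digit
        iterations += 1

    return quotient, remainder, iterations


# Same result, half the iterations when two bits are retired per step.
print(divide_bits_per_step(1000, 7, width=32, bits_per_step=1))
print(divide_bits_per_step(1000, 7, width=32, bits_per_step=2))
```

In this model a 32-bit divide takes 32 iterations at one bit per step and 16 at two bits per step, which is the kind of halving the move to the faster divider implies for the iterative portion of the operation.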

The schedulers and re-order buffer are incrementally bigger in Jaguar. Some scheduling changes and other out-of-order resource increases are at work here as well.

Integer performance wasn’t a huge problem with Bobcat to begin with, but floating point performance was a different issue entirely. In our original Brazos review we found that heavily threaded FP workloads were barely faster on Bobcat than they were on Atom. A big part of that had to do with Atom’s support for Hyper-Threading. AMD addressed both issues, FP throughput and thread count, by beefing up FP execution and doubling the maximum number of CPU cores with Jaguar (more on this later).

Bobcat’s FP execution units were 64 bits wide. Any 128-bit FP operations had to be chunked up and worked on in stages. In Jaguar, AMD moved all of its units to 128 bits wide. 256-bit AVX operations complete as 2 x 128-bit operations, while all other 128-bit operations can execute without multiple passes through the pipeline. The increase in vector width is responsible for the gains in FP performance.
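The pass counts described above follow from simple arithmetic. As a rough sketch, assuming an operation is split into equal chunks the width of the execution unit (the helper name `passes_needed` is hypothetical, and real hardware latency depends on far more than this):

```python
import math


def passes_needed(operation_bits, unit_bits):
    """Number of passes a vector operation needs through a unit of the given width."""
    if operation_bits <= 0 or unit_bits <= 0:
        raise ValueError("widths must be positive")
    return math.ceil(operation_bits / unit_bits)


BOBCAT_FP_UNIT_BITS = 64    # per the article
JAGUAR_FP_UNIT_BITS = 128   # per the article

print(passes_needed(128, BOBCAT_FP_UNIT_BITS))  # 128-bit op on Bobcat
print(passes_needed(128, JAGUAR_FP_UNIT_BITS))  # 128-bit op on Jaguar
print(passes_needed(256, JAGUAR_FP_UNIT_BITS))  # 256-bit AVX op on Jaguar
```

Under this model a 128-bit operation drops from two passes on Bobcat to one on Jaguar, while a 256-bit AVX operation on Jaguar still takes two.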

The move to 128-bit vectors in the FPU forced AMD to add another pipeline stage here as well. The increase in FPU size meant that some signals needed a little extra time to get from one location to the next, hence the extra stage.

Load/Store

The out-of-order load/store unit in Bobcat was the first one AMD had ever done (Bobcat beat Bulldozer to market, so it gets the claim to fame there). As such there was plenty of room for improvement, which AMD capitalized on in Jaguar. The second-generation OoO load/store unit is responsible for a good share of the ~15% gains in IPC that AMD promises with Jaguar.

78 Comments

  • Krysto - Friday, May 24, 2013 - link

    Still don't see why OEMs would choose AMD's APUs in Android tablets over ARM, though.

    It's weaker CPU-wise, and most likely weaker GPU-wise, too. We'll see when they come out if their GPUs can stand up to Adreno 330, PowerVR Series 6 and Mali T628. Plus, it requires quite a bit of power.

    In my book no chip that can't be used in a smartphone (and I'm talking about the exact same model, not the "brand") should be called a "mobile chip".

    This idea about "tablet chips" is nonsense. Tablet chips is just another way of saying our chip is not efficient enough, so we're just going to compensate for that with a much larger battery, that adds more to weight, charging time, and of course price.
  • ReverendDC - Monday, May 27, 2013 - link

    Even an Atom chip is more powerful per cycle than ARM, and AMD's stuff is more powerful than Atom. I'm not exactly sure what you are using to state that ARM is more powerful, but AnandTech did a great comparison themselves.

    By the way, the comparisons in some cases are for quad core ARM vs. single core Atom at comparable speeds. Again, really not sure where your "facts" come from.
  • BernardBlack - Wednesday, May 29, 2013 - link

    ARM actually isn't all that...and it's quite the other way around.

    As I have seen it stated elsewhere, "Simply: x86 IPC eats ARM for lunch while actual performance and power usage will scale together. That is why ARM currently has no real business competing against x86"
  • BernardBlack - Wednesday, May 29, 2013 - link

    It's all about instructions per cycle and that is what AMD and Intel do best.
  • BernardBlack - Wednesday, May 29, 2013 - link

    Not to mention, the IPC in these processors follows suit with their server Opteron processors, which means they achieve even greater IPC. This is how you are able to have 1.6GHz CPUs that can compete with many common 3-4GHz desktop processors.
  • eanazag - Friday, May 31, 2013 - link

    They could use the exact same hardware design to sell Android and Windows tablets.
  • Wolfpup - Wednesday, June 12, 2013 - link

    Huh? AMD's Bobcat parts are more powerful than ARM's stuff. ARM's only just now managing to sort of compete with first gen Atom at best, and that with a CPU that's not actually used in much.

    And "tablet chips" is NOT nonsense. You have more power budget, and higher expectations for performance in a tablet. If it's "nonsense", why does Apple put bigger chips in their tablets? Why can tablets run Core i CPUs? Why do they typically get bigger chips first even with Android?
  • Wolfpup - Wednesday, June 12, 2013 - link

    To add to that, THIS article explicitly says "Jaguar is presently without competition...nothing from ARM is quick enough."

    So really, where the heck are you getting the idea ARM has more powerful chips?
  • kyuu - Thursday, May 23, 2013 - link

    I think the main point of this is getting into the mobile market. Temash looks like a great chip for a tablet. The only problem is OEMs not biting because they think they have to put an Intel sticker on the box for it to sell.

    Personally, I'm waiting on a good tablet with Temash to finally jump into the Win8 tablet club. Whatever OEM makes a good one first will be getting my money.
  • mikato - Friday, May 24, 2013 - link

    I think with tablets, OEMs are even less likely to think they need to put an Intel sticker on it. Joe Schmo knows Intel doesn't mean as much for tablets. The most well known tablets aren't Intel. This is an opening for AMD to be able to get in the game late if they want to.

    Actually, do they even put stickers on tablets?
