Sensible Scaling: OoO Atom Remains Dual-Issue

The architectural progression from Apple, ARM and Qualcomm has been towards wider, out-of-order cores, to varying degrees. With Swift and Krait, Apple and Qualcomm both went wider. From Cortex A8 to A9, ARM went OoO, and then from A9 to A15 it introduced a significantly wider architecture. Intel bucks the trend a bit by keeping the overall machine width unchanged with Silvermont. This is still a 2-wide architecture.

At the risk of oversimplifying the decision here, Intel had to weigh die area, power consumption, and the risk of making Atom too good when it decided to keep Silvermont’s design width the same as Bonnell’s. A wider front end would require a wider execution engine, and Intel believed it didn’t need to go that far (yet) in order to deliver really good performance.

Keeping in mind that Intel’s Bonnell core is already faster than ARM’s Cortex A9 and Qualcomm’s Krait 200, if Intel could get significant gains out of Silvermont without going wider - why not? And that’s exactly what’s happened here.

If I had to describe Intel’s design philosophy with Silvermont it would be sensible scaling. We’ve seen this from Apple with Swift, and from Qualcomm with the Krait 200 to Krait 300 transition. Remember the design rule put in place back with the original Atom: for every 2% increase in performance, the Atom architects could at most increase power by 1%. In other words, performance can go up, but performance per watt cannot go down. Silvermont maintains that design philosophy, and I think I have some idea of how.
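To make the 2:1 rule concrete, here is a minimal sketch of how a proposed design change could be checked against it. The function names and the sample percentages are hypothetical, chosen only for illustration; nothing here reflects Intel's actual evaluation process.

```python
def passes_power_rule(perf_gain_pct: float, power_gain_pct: float) -> bool:
    """Return True if power grows by at most half of the performance gain (the 2:1 rule)."""
    return power_gain_pct <= perf_gain_pct / 2.0


def perf_per_watt_change_pct(perf_gain_pct: float, power_gain_pct: float) -> float:
    """Percentage change in performance per watt after applying both gains."""
    new_ratio = (1 + perf_gain_pct / 100) / (1 + power_gain_pct / 100)
    return (new_ratio - 1) * 100


if __name__ == "__main__":
    # Hypothetical (performance gain %, power gain %) pairs for candidate features.
    for perf, power in [(2.0, 1.0), (10.0, 5.0), (10.0, 8.0)]:
        verdict = "ok" if passes_power_rule(perf, power) else "rejected"
        change = perf_per_watt_change_pct(perf, power)
        print(f"perf +{perf}%, power +{power}%: {verdict}, perf/W change {change:+.2f}%")
```

Note that the 2:1 rule is stricter than simply holding performance per watt flat: the last sample case fails the rule even though its performance per watt still rises slightly.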

Previous versions of Atom used Hyper Threading to get good utilization of execution resources. Hyper Threading had a power penalty associated with it, but the performance uplift was enough to justify it. At 22nm, Intel had enough die area (thanks to transistor scaling) to just add more cores rather than rely on HT for better threaded performance, so Hyper Threading was out. The power savings Intel got from getting rid of Hyper Threading were then allocated to making Silvermont an out-of-order design, which in turn helped drive up efficient use of the execution resources without HT. It turns out that at 22nm the die area Intel would’ve spent on enabling HT was roughly the same as Silvermont’s re-order buffer and OoO logic, so there wasn’t even an area penalty for the move.

The Original Atom microarchitecture

Calling Silvermont a 2-wide architecture is a bit misleading, as the combination of the x86 ISA and treating many x86 ops as single operations down the pipe makes Atom physically wider than its block diagram would otherwise lead you to believe. Remember that with the first version of Atom, Intel enabled the treatment of load-op-store and load-op-execute instructions as single operations post decode. Instead of these instruction combinations decoding into multiple micro-ops, they are handled like single operations throughout the entire pipeline. This continues to be true in Silvermont, so the advantage remains (it also helps explain why Intel’s 2-wide architecture can deliver comparable IPC to ARM’s 3-wide Cortex A15).
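A toy calculation shows why keeping these instructions whole matters on a narrow machine. The micro-op counts assumed for a design that splits memory-operand instructions (2 for load-op-execute, 3 for load-op-store) and the instruction mix are illustrative assumptions, not measured figures, and the model ignores dependencies, cache misses and everything else that limits real throughput.

```python
import math

# Assumed micro-op counts for a hypothetical design that splits memory-operand instructions.
SPLIT_UOPS = {"reg-op": 1, "load-op-execute": 2, "load-op-store": 3}


def issue_cycles(instrs, width, fused):
    """Minimum cycles to issue `instrs` on a `width`-wide machine.

    With fused=True every instruction occupies a single issue slot (the Atom approach);
    otherwise each instruction occupies one slot per micro-op.
    """
    slots = sum(1 if fused else SPLIT_UOPS[kind] for kind in instrs)
    return math.ceil(slots / width)


if __name__ == "__main__":
    mix = ["load-op-store", "reg-op", "load-op-execute", "reg-op"]  # hypothetical mix
    print("fused:", issue_cycles(mix, width=2, fused=True), "cycles")
    print("split:", issue_cycles(mix, width=2, fused=False), "cycles")
```

Under these assumptions the same four instructions need four issue slots when kept whole and seven when split, which is the sense in which a 2-wide Atom behaves like a wider machine on memory-heavy x86 code.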

While Silvermont still only has two x86 decoders at the front end of the pipeline, the decoders are more capable. While many x86 instructions will decode directly into a single micro-op, some more complex instructions require microcode assist and can’t go through the simple decode paths. With Silvermont, Intel beefed up the simple decoders to be able to handle more (not all) microcoded instructions.

Silvermont includes a loop stream buffer that can be used to clock gate fetch and decode logic in the event that the processor detects it’s executing the same instructions in a loop.
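The benefit of such a buffer can be sketched with a toy model: a small FIFO holds recently decoded instruction addresses, and any instruction found there is replayed without touching fetch or decode. Intel has not detailed Silvermont's detection mechanism or buffer size here, so the FIFO policy, the sizes and the traces below are all assumptions for illustration.

```python
def decode_activity(trace, buffer_size):
    """Return (decoded, replayed) instruction counts for a trace of instruction addresses.

    Toy model: a FIFO of recently decoded addresses stands in for the loop stream
    buffer. A hit is replayed from the buffer (fetch/decode could be clock gated);
    a miss goes through fetch and decode and is then stored in the buffer.
    """
    buffer = []
    decoded = replayed = 0
    for pc in trace:
        if pc in buffer:
            replayed += 1
        else:
            decoded += 1
            buffer.append(pc)
            if len(buffer) > buffer_size:
                buffer.pop(0)  # evict the oldest entry
    return decoded, replayed


if __name__ == "__main__":
    small_loop = [0, 1, 2, 3] * 10     # 4-instruction loop body, 10 iterations
    big_loop = [0, 1, 2, 3, 4] * 10    # 5-instruction loop body, 10 iterations
    print("fits in buffer:   ", decode_activity(small_loop, buffer_size=8))
    print("too big for buffer:", decode_activity(big_loop, buffer_size=4))
```

In this model a loop that fits in the buffer is decoded once and replayed on every later iteration, while a loop body even one instruction larger than the buffer gets no benefit at all.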

Execution

Silvermont’s execution core looks similar to Bonnell’s before it, but now the design supports out-of-order execution. Silvermont’s execution units have been redesigned for lower latency. Some FP operations and integer multiplies are now quicker.

Loads can execute out of order. Don’t be fooled by the block diagram: Silvermont can issue one load and one store in parallel.

 


174 Comments


  • Kevin G - Monday, May 6, 2013 - link

    Actually I've gotten the impression from Anandtech that Intel has been so tardy on providing chips for the mobile market that they may have lost the fight before even showing up. Intel may have good designs and the best foundries but that doesn't matter if ARM competitors arrive first with 'good enough' designs to gobble up all the market share. There is likely a bit of frustration here constantly hearing about good tech that never reaches its potential.

    There was the recent line in the news article here about Intel's CEO choice about how Intel is a foundry that makes x86 processors. That choice was likely selected due to Intel's future of becoming an open foundry to 3rd party designs. Intel has done this to a limited degree already. They recently signed a deal with Microsemi to manufacture FPGAs on Intel's 22 nm process. Presumably future Microsemi ARM based SoC + FPGA chips will be manufactured by Intel as well.
  • Kidster3001 - Tuesday, May 7, 2013 - link

    Intel has publicly stated that its foundry business will never make products for a competitor. That means no ARM SoCs in Intel fabs.
  • Kevin G - Tuesday, May 7, 2013 - link

    Intel isn't active in the FPGA area, well, other than manufacturing them for a handful of 3rd parties. The inclusion of an ARM core inside a SoC + FPGA design wouldn't be seen as a direct competitor. Indirectly it definitely would be a competitor but then again just the FPGA alone would be an indirect competitor.
  • name99 - Monday, May 6, 2013 - link

    Actually the REAL history is
    - Intel article appears. All the ARM fans whine about how unfair and awful it is, and how it refers to a chip that will only be released in six months.
    - ARM article appears. All the Intel fans whine about how unfair and awful it is, and how it refers to a chip that will only be released in six months.
    - Apple (CPU) article appears. Non-Apple ARM and Intel fans both whine about how unfair it is (because of tight OS integration or something, and Apple is closed so it doesn't count).

    Repeat every six months...
  • Bob Todd - Tuesday, May 7, 2013 - link

    Winner winner chicken dinner. I love how butt hurt people get about any article comparing CPU or GPU performance of two or more competitors (speculatively or not). I have devices with Krait, Swift, Tegra 3, Bobcat, Llano, Ivy Bridge, etc. They all made sense at the time for one reason or another or I wouldn't have them. I'm excited about Silvermont, just like I'm excited about Jaguar, and whatever Apple/Samsung/Qualcomm/Nvidia cook up next on the ARM side. It's an awesome time to be into mobile gadgets. Now I'll sit back and laugh at the e-peen waving misguided fanboyism...
  • axien86 - Monday, May 6, 2013 - link


    Acer is shipping new V5 ultraportables based on AMD's Jaguar high performance per watt technology in 30 days. AMD is 10 to 20 times smaller than Intel, but with design wins from Sony, Microsoft and now many other OEMs, they are delivering real performance for real value.

    By contrast Intel really has nothing to show but endless public relations to compensate for the history of a company that has been upstaged by smaller companies like AMD in forging real innovations in computing.
  • A5 - Monday, May 6, 2013 - link

    If by "high performance per watt" you mean "less performance in a higher TDP" then sure. Intel trounces AMD in notebooks for a reason.

    As for the Sony/MS stuff, I doubt Intel even bid for those contracts.
  • kyuu - Monday, May 6, 2013 - link

    I hope you're kidding. Bobcat-based designs have been superior to Atom for forever, and if you take graphics performance into account, then Atom has been nothing short of laughable. I wouldn't be surprised if Silvermont beats Jaguar in CPU performance, but it'll be a small delta, and Jaguar is coming out a full half-year ahead of Silvermont.

    It's also nice that Intel might get GPU performance around the level of the iPad 4's SoC by the end of the year, but I believe AMD's mobile graphics already handily surpass that and the ARM world will have moved on to solutions that handily surpass that by then as well. So, yet again, Intel will be well behind the GPU curve. It won't be laughably bad anymore, though, at least.

    And I really love that last line. "Intel didn't get some design wins? Well, psh, they totally didn't even want those anyway."
  • kyuu - Monday, May 6, 2013 - link

    Oh, and also not sure why you brought notebooks up when we're talking about architectures for very low-power devices like tablets, netbooks, and maybe some ultrathins. No one would claim that Trinity/Richland is at the same level of CPU performance as Ivy Bridge/Haswell. Personally, though, I'd still prefer an AMD solution for a notebook for the superior graphics, lower price, and more-than-adequate CPU performance.
  • xTRICKYxx - Tuesday, May 7, 2013 - link

    This is where I want AMD to come into play. Their low power CPUs are so much better than Atom ever was, and always had superior graphics.
