The IBM POWER8 Review: Challenging the Intel Xeon
by Johan De Gelas on November 6, 2015 8:00 AM EST
Energy and Pricing
Unfortunately, we were not able to compare system-level energy consumption between the S822L and other systems accurately and fairly, as there were quite a few differences in the hardware configuration. For example, the IBM S822L had two SAS controllers, and we had no idea how power hungry that chip under the copper heatsink was. Still, there is no doubt that the two CPUs are by far the most important power consumers when the server is under load. In the case of the IBM system, the Centaur chips will take their fair share too, but those chips are not optional. So we can only get a very rough idea of how the power consumption compares.
Xeon E5-2699 v3/POWER8 Comparison (System)
Feature | 2x Xeon E5-2699 v3 | 2x IBM POWER8 3.4 GHz 10c (IBM S822L) |
Idle | 110-120W | 360-380W |
Running NAMD (FP) | 540-560W | 700-740W |
Running 7-zip (Integer) | 300-350W | 780-800W |
The Haswell core was engineered for mobile use, and there is no denying that Intel's engineers are masters at saving power at low load.
The mighty POWER8 is cooled by a huge heatsink
IBM's POWER8 has fairly advanced power management: besides p-states, power gating of cores and the associated L3-cache should be possible. However, it seems that these features were not enabled out of the box for some reason, as idle power was quite high. To be fair, we spent much more time on getting our software ported and tuned than on finding the optimal power settings. In the limited time we had with the machine, producing some decent benchmarking numbers was our top priority.
Also, the Centaur chips consume about 16W per chip (Typical, 20W TDP) and as we had 8 of them inside our S822L, those chips could easily be responsible for consuming around 100W.
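As a rough check on that estimate, the arithmetic can be sketched in a few lines of Python. The per-chip figures are the ones quoted above; the variable names are ours, and treating all eight chips as drawing identical power at the same time is a simplifying assumption.

```python
# Rough estimate of the total Centaur power draw in our S822L.
# Per-chip figures come from the text; assuming all eight chips draw
# the same power simultaneously is a simplification for illustration.
CENTAUR_CHIPS = 8
TYPICAL_W_PER_CHIP = 16  # "typical" figure quoted in the text
TDP_W_PER_CHIP = 20      # thermal design power quoted in the text

typical_total_w = CENTAUR_CHIPS * TYPICAL_W_PER_CHIP
tdp_total_w = CENTAUR_CHIPS * TDP_W_PER_CHIP

print(f"Typical: {typical_total_w} W, TDP ceiling: {tdp_total_w} W")
```

At the typical figure the eight chips add up to 128W, and to 160W at TDP, so "around 100W" is if anything a cautious estimate.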
Interestingly, the IBM POWER8 consumes more energy processing integers than floating point numbers. That is the exact opposite of the Xeon, which consumes vastly more when crunching AVX/FP code.
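To put the measured ranges side by side, here is a minimal sketch that reduces each range to its midpoint and computes the POWER8-to-Xeon ratio. Using midpoints is our own choice for illustration, and because the two systems were configured differently, the ratios are only a rough indication.

```python
# Measured system power ranges in watts (low, high), taken from the
# power comparison table. The dictionary keys are illustrative names.
power_w = {
    "Idle": {"xeon": (110, 120), "power8": (360, 380)},
    "NAMD (FP)": {"xeon": (540, 560), "power8": (700, 740)},
    "7-zip (Integer)": {"xeon": (300, 350), "power8": (780, 800)},
}


def midpoint(low_high):
    """Return the midpoint of a (low, high) range."""
    low, high = low_high
    return (low + high) / 2


ratios = {}
for workload, systems in power_w.items():
    xeon_w = midpoint(systems["xeon"])
    power8_w = midpoint(systems["power8"])
    ratios[workload] = power8_w / xeon_w
    print(f"{workload}: Xeon ~{xeon_w:.0f} W, POWER8 ~{power8_w:.0f} W, "
          f"ratio {ratios[workload]:.2f}x")
```

By midpoints, the S822L drew roughly 3.2x the power of the Xeon system at idle, about 2.4x during 7-zip, and about 1.3x during NAMD, which is where the Xeon's AVX/FP power draw narrows the gap.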
Pricing
Though the cost of buying a system might be only "a drop in the bucket" in the total TCO picture in traditional IT departments running expensive ERP applications, it is an important factor for almost everybody else who buys Xeon systems. It is important to note that the list prices on IBM's website are well above what customers actually pay, a bad habit typical of tier-one OEMs.
Thankfully we managed to get some "real street prices", which are between 30% (one server) and 50% (many servers) lower. To that end we compared the price of the S822L with a discounted Dell R730 system. The list below is not complete, as we only show the cost of the most important components. The idea is to focus on the total system price and show which components contribute the most to the total system cost.
Xeon E5 v3/POWER8 Price Comparison
Feature | Dell R730 Type | Dell R730 Price | IBM S822L Type | IBM S822L Price |
Chassis | R730 | N/A | S822L | N/A |
Processor | 2x E5-2697 | $5000 | 2x POWER8 3.42 GHz | $3000 |
RAM | 8x 16GB DDR4 DIMM | $2150 | 8x 16GB CDIMM (DDR3) | $8000 |
PSU | 2x 1100W | $500 | 2x 1400W | $1000 |
Disks | SATA or SSD | Starting at $200 | SAS HD/SSD | +/- $450 |
Total system price (approx.) | | $10k | | $15k |
With more or less comparable specs, the S822L was about 50% more expensive. However, it was almost impossible to make an apples-to-apples comparison. The biggest "price issue" is the CDIMMs, which are almost 4 times as expensive as "normal" RDIMMs. CDIMMs offer more, as they include an L4-cache and some extra features (such as a redundant memory chip for every 9 chips). For most typical current Xeon E5 customers, the cost issue will be important. For a few, the extra redundancy and higher bandwidth will be interesting. Less important, but still significant, is the fact that IBM uses SAS disks, which increase the cost of the storage system, especially if you want lots of them.
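The component figures in the price table can be broken down with a short sketch. The dictionary keys and variable names are ours, and treating the "starting at" and "+/-" disk prices as single fixed values is an assumption made purely for illustration.

```python
# Component prices in USD from the price table (chassis prices were N/A).
# Disk prices are quoted as "starting at" / "+/-" figures; using them as
# fixed values is an assumption for the sake of the arithmetic.
dell_r730 = {"cpu": 5000, "ram": 2150, "psu": 500, "disk": 200}
ibm_s822l = {"cpu": 3000, "ram": 8000, "psu": 1000, "disk": 450}
RAM_GB = 8 * 16  # both systems were priced with 8x 16GB modules

# Sum of the listed components only (not the full system price).
dell_listed = sum(dell_r730.values())
ibm_listed = sum(ibm_s822l.values())

# How much more the CDIMMs cost than the DDR4 RDIMMs.
ram_ratio = ibm_s822l["ram"] / dell_r730["ram"]
dell_per_gb = dell_r730["ram"] / RAM_GB
ibm_per_gb = ibm_s822l["ram"] / RAM_GB

# Memory price gap versus the gap across all listed components.
ram_gap = ibm_s822l["ram"] - dell_r730["ram"]
listed_gap = ibm_listed - dell_listed

print(f"Listed components: Dell ${dell_listed}, IBM ${ibm_listed}")
print(f"CDIMM vs RDIMM: {ram_ratio:.2f}x "
      f"(${ibm_per_gb:.2f}/GB vs ${dell_per_gb:.2f}/GB)")
print(f"RAM gap ${ram_gap} vs listed-component gap ${listed_gap}")
```

The memory works out to about 3.7x the price, or $62.50 per GB against roughly $17 per GB. The $5850 memory gap is larger than the $4600 gap across all listed components, because the cheaper POWER8 processors claw back part of the difference.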
This cost issue will be much less important on most third party POWER8 systems. Tyan's "Habanero" system, for example, integrates the Centaur chips on the motherboard. That makes the motherboard more expensive, but you can use standard registered DDR3L RDIMMs, which are much cheaper. Meanwhile the POWER8 processor tends to be very reasonably priced, at around $1500. That is what Dell would charge for an Intel Xeon E5-2670 (12 cores at 2.3-2.6 GHz, 120W). So while Intel's Xeons are much more power efficient than the POWER8 chips, the latter tend to be quite a bit cheaper.
146 Comments
usernametaken76 - Thursday, November 12, 2015 - link
Technically this is not true. IBM had a working version of AIX running on PS/2 systems as late as the 1.3 release. Unfortunately support was withdrawn and future releases of AIX were not compiled for x86 compatible processors. One can still find a copy of this release if one knows where to look. It's completely useless to anyone but a museum or curious hobbyist, but it's out there.
Steven Perron - Monday, November 23, 2015 - link
Hello Johan,
I was reading this article, and I found it interesting. Since I am a developer for the IBM XL compiler, the comparisons between GCC and XL were particularly interesting. I tried to reproduce the results you are seeing for the LZMA benchmark. My results were similar, but not exactly the same.
When I compared GCC 4.9.1 (I know, a slightly different version than yours) to XL 13.1.2 (I assume this is the version you used), I saw XL consistently ahead of GCC, even when I used -O3 for both compilers.
I'm still interested in trying to reproduce your results, so I can see what XL can do better, so I have a couple questions on areas that could be different.
1) What version of the XL compiler did you use? I assumed 13.1.2, but it is worth double checking.
2) Which version of the 7-zip software did you use? I picked up p7zip 15.09.
3) Also, I noticed when the Power 8 machine was running at full capacity (for me that was 192 threads on a 24 core machine), the results would fluctuate a bit. How many runs did you do for each configuration? Were the results stable?
4) Did you try XL at the less aggressive and more stable options like "-O3" or "-O3 -qhot"?
Thanks for your time.
Toyevo - Wednesday, November 25, 2015 - link
Other than the ridiculous price of CDIMMs the power efficiency just doesn't look healthy. For data centers leasing their hardware like Amazon AWS, Google AppEngine, Azure, Rackspace, etc, clients who pay for hardware yet fail to use their allocation significantly help the bottom line of those companies by reduced overheads. For others high usage is a mandatory part of the ROI equation during its period as an operating asset, thus power consumption is a real cost. Even with our small cluster of 12 nodes the power efficiency is a real consideration, let alone companies standardizing toward IBM and utilising 100s or 1000s of nodes that are arguably less efficient.
Perhaps you could devise some sort of theoretical total cost of ownership breakdown for these articles. My biggest question after all of this is, which one gets the most work done with the lowest overheads. Don't get me wrong though, I commend you and AnandTech on the detail you already provide.
AstroGuardian - Tuesday, December 8, 2015 - link
It's good to have someone challenging Intel, since AMD crap their pants on a regular basis
dba - Monday, July 25, 2016 - link
Dear Johan:
Can you extrapolate how much faster the Sparc S7 will be in your Cluster Benchmarking if the 2 on-die InfiniBand ports are activated? 5, 10, 20%?
Thank You, dennis b.