How AMD's Istanbul might close the gap with Nehalem EP
by Johan De Gelas on February 25, 2009 12:00 AM EST - Posted in IT Computing general
The Istanbul cores are the same as those found in AMD's latest Shanghai CPU, but the "uncore" part of Istanbul is more interesting. By now, you have probably heard about AMD's "HT assist" technology, a probe or snoop filter. On the current Shanghai platform, every time a new cacheline is brought into the L3 cache of, for example, CPU 1, a broadcast message is sent to the L3 caches of all the other CPUs, and CPU 1 has to wait until those CPUs answer.
In the case of Istanbul, the CPU simply checks the snoop filter in its own L3 cache, and if none of the other CPUs holds that cacheline, it can go ahead. This lowers the latency of bringing in a new cacheline and raises the effective bandwidth.
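To make the difference concrete, here is a minimal Python sketch of the two probing strategies as described above. It is a toy model under stated assumptions, not AMD's implementation: the socket names, the DIRECTORY table and both function names are hypothetical, and the real filter is a hardware structure in the L3 cache, not a software dictionary.

```python
# Toy model of coherency probes on a four-socket system.
# All names here are hypothetical and chosen only for illustration.
SOCKETS = ["CPU1", "CPU2", "CPU3", "CPU4"]

# Hypothetical snoop-filter contents: cacheline address -> sockets that hold a copy.
DIRECTORY = {0x40: {"CPU2"}}


def probes_broadcast(requester, sockets):
    """Broadcast scheme: probe every other socket, whether or not it holds the line."""
    return [s for s in sockets if s != requester]


def probes_filtered(requester, sockets, line, directory):
    """Filtered scheme: look the line up locally and probe only the known holders."""
    holders = directory.get(line, set())
    return [s for s in sockets if s != requester and s in holders]


print(probes_broadcast("CPU1", SOCKETS))                    # ['CPU2', 'CPU3', 'CPU4']
print(probes_filtered("CPU1", SOCKETS, 0x80, DIRECTORY))    # []
print(probes_filtered("CPU1", SOCKETS, 0x40, DIRECTORY))    # ['CPU2']
```

In this model a line that nobody else holds costs three probes with broadcasting and none with the filter, which is the case the article describes as "it can go ahead".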
To better understand this, we combined our own Stream benchmarking with the results that AMD presented. All AMD systems use DDR2-800.
As each Stream thread works on its own data, there is no reason to send out coherency synchronization requests. These requests slow down the process of getting new cachelines into the L3 and hence lower the effective memory bandwidth. What is interesting is that the snoop filter will benefit not only applications that use the HT interconnects heavily for coherency traffic, but also applications like Stream that do not need the HT interconnects. Also notice that HT 3.0 does not improve memory bandwidth, as Stream tries to keep its thread data local. Our testing used SUSE SLES 10 SP2, while AMD used Windows Server 2008. Both operating systems are well optimized and NUMA aware.
This means that HPC applications in particular, with many threads all working on their own data, will benefit from the higher effective bandwidth. Besides HT assist, AMD has now confirmed to us that the memory controller has been tuned quite a bit. This higher bandwidth will allow the quad Istanbul to stay out of reach of the dual Nehalem EP Xeons in many HPC applications.
HT assist might also improve the SAP and OLTP scores quite a bit, but for a different reason. SAP and OLTP applications perform a lot of cache coherency synchronization requests, so the snoop filter will substantially lower the average latency of such requests, as in some cases:
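For readers unfamiliar with Stream, the following Python sketch shows the shape of its Triad kernel and how a bandwidth figure is derived from it. It is only an illustration: the function names are hypothetical, the real benchmark is written in C and Fortran, and pure Python timings would say nothing about actual memory bandwidth. The one fixed point is the accounting, where Triad counts two reads and one write of an 8-byte double per element.

```python
from array import array

BYTES_PER_DOUBLE = 8


def make_private_arrays(n):
    """Give one thread its own three arrays, so no cacheline is shared with other threads."""
    return array("d", [0.0] * n), array("d", [1.0] * n), array("d", [2.0] * n)


def triad(a, b, c, scalar):
    """Stream-style Triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n):
    """Bytes counted for one Triad pass: two reads and one write per element."""
    return 3 * BYTES_PER_DOUBLE * n


def bandwidth_gb_per_s(n, seconds):
    """Effective bandwidth in GB/s for one Triad pass over n elements."""
    return triad_bytes(n) / seconds / 1e9


a, b, c = make_private_arrays(4)
triad(a, b, c, 3.0)
print(list(a))  # [7.0, 7.0, 7.0, 7.0]
```

Because every thread only touches the arrays it allocated itself, any probe sent to another socket during such a run is pure overhead, which is why a snoop filter can raise the measured bandwidth even though the data never crosses the HT links.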
- the CPU will only wait on one other CPU (instead of waiting for all responses to come back)
- the CPU won't have to wait at all, as the other CPUs don't have this line.
Secondly, this will also lower memory latency, which is a bonus for almost every multi-threaded application.
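The two cases listed above can be sketched as a simple wait-time model in Python. The latency figures are invented for illustration and are not measurements of any AMD platform; the function names are hypothetical, and the cost of the local filter lookup is ignored for simplicity.

```python
# Hypothetical round-trip probe latencies in nanoseconds, as seen from CPU 1.
# These values are made up for the example.
PROBE_NS = {"CPU2": 100.0, "CPU3": 100.0, "CPU4": 150.0}


def broadcast_wait(probe_ns):
    """Without a filter, the requester waits for the slowest of all responses."""
    return max(probe_ns.values())


def filtered_wait(probe_ns, holders):
    """With a filter, the requester waits only on the sockets that hold the line."""
    if not holders:
        return 0.0  # no other CPU has the line, so there is nothing to wait for
    return max(probe_ns[h] for h in holders)


print(broadcast_wait(PROBE_NS))            # 150.0
print(filtered_wait(PROBE_NS, {"CPU2"}))   # 100.0
print(filtered_wait(PROBE_NS, set()))      # 0.0
```

Under these assumed numbers the filter helps most when the single holder is closer than the slowest socket, or when there is no holder at all.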
Lower memory latency, higher bandwidth, lower "cache coherency" latency and more interconnect bandwidth: the improved "uncore" of Istanbul will be vital to close the gap with Nehalem. Much will depend on how quickly Intel introduces its own hexacore 32 nm Xeons, but that probably won't happen before 2010. Istanbul is shaping up to be a really good alternative to Intel's quadcore Nehalem. We might see a good fight after all...
Don't forget to check it.anandtech.com (IT portal) often, as many of our blogposts (for example the VMworld 2009 coverage) are not published on the frontpage of Anandtech.com.
40 Comments
FanofunderdogAMD24 - Thursday, May 21, 2009
I agree that 90% of blogs, comment sections, etc. are 'Political' instead of 'Technical'. While I am a fan of AMD, it doesn't mean I'm going to go into any discussion and write biased reviews and comments on reviews. Anyone smart enough looks at the technology from both sides, and then drills down from there. When speaking of both companies, I'm glad they compete with one another, because in the end I, the consumer, win with a wide variety of products to choose from.
The only thing I find a little insulting is when Intel users call anyone who buys an AMD product a fool, or stupid, or anything similar. Sometimes we are informed professionals who understand the pros and cons of both sides but decided to go with an AMD product. Does this mean we're biased? No. Does this mean we only like AMD? No. It simply means the product AMD had to offer fit the project we're working on.
One personal opinion is that Intel moves too fast. That is the reason they allowed AMD to gain the crown for a short while in the past. They are still moving fast; however, I like the different direction they took to analyze what was missing in their CPUs and make that modification, thus creating a truly amazing architecture.
There is a post out there that explains AMD has learned some secrets behind the Core i7 line of processors and has identified several bottlenecks in their current platform and processor. This post also states they are currently working to correct the bottlenecks that have been identified, meaning that in the near future we might actually see something that scales well against the Core i7.
Notice I said 'scale' well against Intel. This means they are working on reducing the power leakage in the CPU that causes extra heat, and also on increasing the speed of their platform to see if they can come out with something that's worthy competition for Intel products.
That being said, I place AMD in second place. They are a great buy for those not extremely worried about power consumption, and also for those who don't particularly look a lot at performance but rather at the features on offer.
I place Intel first because with its on-die memory controller they improved performance way more than I expected. They are for those who are concerned about power consumption and heat. They compete well in terms of feature offerings (not talking about the performance here, but just the fact that they offer features). And they have the performance and headroom to take on just about any application and data center out there. A complete package overall.
Hopefully this can be a neutral post, as I am neutral between both companies. I am a fan of AMD, as I stated before; however, I'm in no position to talk badly about Intel, because they have truly amazed me.
Adun - Saturday, April 11, 2009
Let me get this straight - the difference between AMD's HT Assist and Intel's inclusive L3 cache is that HT Assist is able to determine if a cache line exists in other CPUs (different sockets), while the inclusive L3 cache of Intel's CPU Y is only aware of the cache lines of all the cores in CPU Y?
Thank you,
Adun.
winterspan - Friday, February 27, 2009
Johan,
Although I have been critical of some of your posts, I am above all interested in maintaining a civil, engaged debate about all of this fascinating technology. Despite all of the hostility that can be thrown your way from the legions of crazed Anandtech commenters, I think the vast majority of readers appreciate all of the painstaking work that you do.
-All the best-
JohanAnandtech - Saturday, February 28, 2009
Thanks man, it was good to read your post. It is really a pity that some people focus on the "political" side of things, while 95% of the post is technical. Let us debate the technical part, and feel free to continue to be very critical about that part.
I am an engineer and academic like many of you, not a specialist in "safe but totally woolly" communications also known as "Politician".
JohanAnandtech - Friday, February 27, 2009
You can deduce the numbers from the currently available Core i7, but the Nehalem numbers have been removed to avoid any problems with NDAs.
AMD HT3 numbers are based on AMD's own testing as described in the previous post. AMD HT1 numbers are based on our own testing.
tshen83 - Friday, February 27, 2009
You are worried about NDAs from Intel's side on retail Nehalem-EPs, and apparently have no trouble revealing benchmarks on Istanbul.
Without Nehalem-EP performance data, how is your post relevant? The correct response is to remove the entire post and say something along the lines of "full review to come when NDA expires."
BTW, your newest blog:
HP still in denial: "SSD not ready for the enterprise" is linked improperly since yesterday. And it is not March 1, 2009 yet.
I don't know how a person with your aptitude can be legally qualified to write articles for Anandtech.
JohanAnandtech - Friday, February 27, 2009
"and apparently have no trouble revealing benchmarks on Istanbul."
Those benchmarks have been disclosed by AMD. I clearly indicated this in this post, and I linked to them in the previous post. You should take the time to read the article.
"HP still in denial: "SSD not ready for the enterprise" is linked improperly since yesterday. And it is not March 1, 2009 yet. "
Where do you see this? This is an error of our blog engine.
wingless - Wednesday, February 25, 2009
Tshen83, your ego has made a fool out of you. You really should not accuse Anandtech of unreliable journalism. This only makes you sound more ignorant than you probably would like to come across as. You've made yourself come across as a troll.
melgross - Friday, February 27, 2009
Nevertheless, most of what he said is correct.
dastruch - Friday, February 27, 2009
indeed