Intel Woodcrest: the Birth of a New King
by Jason Clark & Ross Whitehead on July 13, 2006 12:05 AM EST
Posted in IT Computing
Architecture Summary
Woodcrest's home is a newer revision of the Bensley platform than the one Dempsey launched with, which means that it's a drop-in part for newer Bensley-based systems. If all goes to plan, Clovertown (Quad-Core Xeon) should be a drop-in upgrade as well (depending on the system vendor). As we discussed in our Dempsey article, the Bensley platform features FB-DIMM with a peak bandwidth of 21GB/sec, SAS/SATA support and a 1066/1333MHz FSB.
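As a rough sanity check on the 21GB/sec figure, the arithmetic can be sketched as below. The inputs are assumptions that the text above does not state: four FB-DIMM channels populated with DDR2-667 modules, 8 bytes moved per transfer, and the "1333MHz" FSB treated as 1333 million transfers per second on a 64-bit bus. These are theoretical peaks, not measured throughput.

```python
# Back-of-the-envelope peak bandwidth for a Bensley-class platform.
# ASSUMPTIONS (not from the article): 4 FB-DIMM channels, DDR2-667 modules,
# and 8 bytes per transfer on both the memory channels and each front side bus.

BYTES_PER_TRANSFER = 8


def peak_gb_per_sec(mega_transfers_per_sec, channels=1):
    """Theoretical peak bandwidth in decimal GB/s."""
    return mega_transfers_per_sec * 1e6 * BYTES_PER_TRANSFER * channels / 1e9


memory_peak = peak_gb_per_sec(667, channels=4)      # four FB-DIMM channels
fsb_peak_single = peak_gb_per_sec(1333)             # one 1333MHz FSB
fsb_peak_dual = peak_gb_per_sec(1333, channels=2)   # two independent buses

print(f"FB-DIMM peak:   {memory_peak:.1f} GB/s")
print(f"Single FSB:     {fsb_peak_single:.1f} GB/s")
print(f"Dual FSB total: {fsb_peak_dual:.1f} GB/s")
```

Under these assumptions the memory side and the two buses combined both land at roughly 21GB/sec, which is consistent with the quoted platform figure.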
Woodcrest Highlights:
Shared 4MB L2 "Smart Cache"
Dempsey-based processors had a separate 2MB L2 cache for each core, but Woodcrest has 4MB of L2 cache shared between both cores. Because the cores share a single cache, data does not have to be replicated as it is with separate L2 caches; this results in more efficient data sharing between cores. The shared cache also helps with mismatched loads: when one core is consistently using more cache than the other core, the CPU can allocate more L2 cache to that core. Both of these techniques are illustrated below.
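To make the capacity argument concrete, here is a toy model of the two layouts. It is only a sketch: the function names and the working-set sizes are hypothetical, and it ignores associativity, line size, coherency traffic and the replacement policy of the real hardware.

```python
# Toy capacity model: two private 2MB L2 caches vs. one shared 4MB L2.
# Hypothetical helper names and workloads; sizes are in MB.


def fits_private(core_a_mb, core_b_mb, shared_mb, per_core_cache_mb=2.0):
    """Private caches: each core holds its own data plus its own copy of shared data."""
    return (core_a_mb + shared_mb <= per_core_cache_mb
            and core_b_mb + shared_mb <= per_core_cache_mb)


def fits_shared(core_a_mb, core_b_mb, shared_mb, cache_mb=4.0):
    """Shared cache: shared data is stored once and capacity goes where it is needed."""
    return core_a_mb + core_b_mb + shared_mb <= cache_mb


# Case 1: both cores work on 1.5MB of common data plus 1MB each of their own.
sharing_private = fits_private(1.0, 1.0, 1.5)   # False: 2.5MB needed per 2MB cache
sharing_shared = fits_shared(1.0, 1.0, 1.5)     # True: 3.5MB total in 4MB

# Case 2: mismatched loads, one core needs 3MB and the other only 0.5MB.
mismatch_private = fits_private(3.0, 0.5, 0.0)  # False: 3MB exceeds a 2MB cache
mismatch_shared = fits_shared(3.0, 0.5, 0.0)    # True: 3.5MB total in 4MB

print(sharing_private, sharing_shared, mismatch_private, mismatch_shared)
```

In both cases the same total amount of silicon holds the working set only when it is organized as one shared cache.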
Wide Dynamic Execution Enhancements
With the Intel Core micro-architecture, every execution core is 33% wider than previous generations, allowing each core to fetch, dispatch, execute and retire up to four full instructions simultaneously. The Opteron - as well as all previous NetBurst Xeon processors - can only handle 3 at a time.
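The 33% figure follows directly from the issue widths quoted above. A minimal sketch of the arithmetic, with a hypothetical equal clock speed used purely for illustration (these are peak rates, which real code rarely sustains):

```python
# Peak instructions per cycle, as quoted in the text above.
CORE_WIDTH = 4       # Intel Core micro-architecture (Woodcrest)
PREVIOUS_WIDTH = 3   # Opteron and NetBurst Xeon

width_gain = (CORE_WIDTH - PREVIOUS_WIDTH) / PREVIOUS_WIDTH
print(f"Width increase: {width_gain:.0%}")


def peak_instructions_per_sec(width, clock_ghz):
    """Theoretical upper bound: every cycle issues the full width."""
    return width * clock_ghz * 1e9


# Hypothetical 3.0 GHz clock for both designs, chosen only for comparison.
core_peak = peak_instructions_per_sec(CORE_WIDTH, 3.0)
previous_peak = peak_instructions_per_sec(PREVIOUS_WIDTH, 3.0)
print(f"Peak ratio at equal clock: {core_peak / previous_peak:.2f}x")
```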
Macro Fusion
Macro-fusion combines certain common pairs of x86 instructions into a single instruction for execution. Without macro-fusion, four instructions at a time are fetched from the queue and each instruction gets decoded into separate micro-ops. With macro-fusion, five instructions can be fetched at a time, and if a fusible pair is present it can be sent to a single decoder. A single micro-op can then represent two regular x86 instructions.
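The effect can be sketched with a toy decoder. This is an illustration under stated assumptions, not Intel's implementation: it assumes the fusible pair is a compare (cmp or test) immediately followed by a conditional jump, it pretends every other instruction decodes to exactly one micro-op, and the function names and sample loop are hypothetical.

```python
# Toy decoder showing how macro-fusion reduces the micro-op count.
# ASSUMPTION: a fusible pair is cmp/test followed directly by a conditional jump.
# SIMPLIFICATION: every other instruction becomes exactly one micro-op.

COMPARE_OPS = {"cmp", "test"}


def mnemonic(instruction):
    """Return the opcode name, e.g. 'cmp' for 'cmp eax, ecx'."""
    return instruction.split()[0]


def is_conditional_jump(instruction):
    name = mnemonic(instruction)
    return name.startswith("j") and name != "jmp"


def decode(instructions, macro_fusion=True):
    """Return the list of micro-ops produced for a sequence of instructions."""
    micro_ops = []
    i = 0
    while i < len(instructions):
        current = instructions[i]
        following = instructions[i + 1] if i + 1 < len(instructions) else None
        if (macro_fusion and following is not None
                and mnemonic(current) in COMPARE_OPS
                and is_conditional_jump(following)):
            # Two x86 instructions are represented by one micro-op.
            micro_ops.append(f"{current} + {following}")
            i += 2
        else:
            micro_ops.append(current)
            i += 1
    return micro_ops


loop_body = ["mov eax, [esi]", "add eax, ebx", "cmp eax, ecx", "jne loop_top"]
unfused = decode(loop_body, macro_fusion=False)
fused = decode(loop_body)
print(len(unfused), "micro-ops without fusion,", len(fused), "with fusion")
```

In this sample the compare and branch at the bottom of the loop collapse into one micro-op, so four instructions occupy three micro-op slots.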
Beyond 2 Sockets, is Intel's FSB still an Achilles Heel?
As we've seen in past benchmarks, the front side bus has been a thorn in Intel's side, especially in quad-socket systems. Whether the architectural changes Intel has made with Woodcrest will relieve enough of that pressure to overcome Opteron's scalability in four-socket configurations is unknown at this point. Intel is quite confident that, with the shared cache and dual independent front side buses running at 1333MHz, bus bandwidth is not a concern; at some point, however, the bus bottleneck will be a problem. One of Intel's architects has stated that an integrated memory controller is possible, and Intel has already shown us a demo of one.
59 Comments
ashyanbhog - Tuesday, July 18, 2006 - link
Quite shocking to see Anand perform such a biased benchmark and get away so easily. Is it a coincidence that Dell has not sold AMD chips in their machines to date, and benchmarks from Dell show Intel chips perform better?
Can we say tuned or skewed?
photoguy99 - Thursday, July 13, 2006 - link
It's just killing fan boys like Kiijibari that Intel has the best 2-way server out there now and they have to craft these elaborate scenarios to somehow justify how AMD is still great.
Man, give it up. Is it not enough that almost every hardware site on the net has crowned Woodcrest the new 2-way champ over AMD? How much more evidence do you want?
As I've posted before I own an FX-60 now so I don't feel great that Intel will soon be selling at Wal-mart a chip that will kick ass on my carefully overclocked FX system.
But so what? It is what it is. Sure AMD are planning new things, and when and if they are benchmarked to be superior, then you can have your day again.
For now Intel *owns* AMD except a couple niche segments - get used to it.
duploxxx - Friday, July 14, 2006 - link
I think you mean Conroe... not Woodcrest (the server chip). You can count the reviews that were made on one hand... two of them were from Anand here, which are still in a large discussion about comparison etc... and this one is coming straight from Intel... nice to see the "king" is announced by Intel themselves.
vaystrem - Thursday, July 13, 2006 - link
That prevented Intel's Woodcrest computers from being considered for government bids?
US government unit throws Intel out over RAID problems: http://www.theinquirer.net/default.aspx?article=32...
or
Conroe shows dodgy RAID performance anomalies: http://theinquirer.net/default.aspx?article=32818
I know it's 'The Inq' but since this is a server test it would be nice to see some confirmation or exploration of this issue.
drwho9437 - Thursday, July 13, 2006 - link
The charts are almost totally inscrutable for the red-green color blind population, which is something like 5% of males. Learn to use a decent color scheme or incorporate symbol shapes as well as colors. Map makers know this...
forPPP - Thursday, July 13, 2006 - link
All of you ranting about comparing 3.0 GHz Woodcrest to 2.6 GHz Opteron, look here: http://www.behardware.com/articles/623-1/intel-cor... and see how much better the Core 2 architecture is. Core 2 Duo 2.13 GHz beats Athlon FX-62 2.8 GHz in most benchmarks. Of course architecture is not everything, especially in the enterprise market. Opteron has an advantage thanks to HyperTransport, and the advantage of Woodcrest is diminished because of the FSB. But the main battle will occur with desktops, and here Core 2 Duo shines. Let's hope AMD will show something interesting soon, not only price drops. All in all, we consumers will benefit from this battle.
duploxxx - Friday, July 14, 2006 - link
You are looking at desktop... don't compare desktop with server... desktop is more the mass and low profit... server and laptop are for the profit.
Locutus465 - Thursday, July 13, 2006 - link
AMD will manage to come out with a decent competitor in the next little while and we'll have real competition in the CPU space again. I'm sure this sucks for AMD right now, and if AMD is able to rebound and deliver a competitor in the relatively near-term future, it will suck for Intel too. But for consumers, competition is beautiful; already we can look towards dirt cheap A64s for low to mid-range computing needs.
FesterOZ - Thursday, July 13, 2006 - link
I tried skimming back through the article, but is Anand just measuring the CPU wattage or the overall wattage draw for the whole platform (i.e. CPU, northbridge, DIMMs)?
Jason Clark - Thursday, July 13, 2006 - link
Wall, folks. Sorry that wasn't more clear. We'll ensure we include power measurement information in future articles. We use the same procedure for power as in previous articles: an Extech device, and we log power throughout the test duration.
Cheers.