Original Link: https://www.anandtech.com/show/2800/upgrading-and-analyzing-apple-s-nehalem-mac-pro



In my line of work, I tend to get access to a lot of very fast hardware. Both our SSD and GPU testbeds use Intel’s new Core i7 processor. If you read my review of the i7 you may have left the review feeling slightly underwhelmed by the processor. Sure, it was fast, but it wasn’t that much faster than a speedy Core 2 Quad.

In the months since that review went live I’ve had the benefit of using the i7 a lot. And I might’ve grown a little attached. The processor itself isn’t overly expensive, it’s the motherboard that really puts it over the top; but if you have the means, I highly recommend picking one up.

This is my Mac Pro:

It may look modern, but this is actually the same Mac Pro I reviewed back in 2006. In it are the same two 3.0GHz dual-core Woodcrest based Xeons that I upgraded it with for part 3 of my Mac Pro coverage. Woodcrest was the server version of Conroe, the heart of the original Core 2 Duo.

You’ll remember that I was quite happy with Conroe when it launched in 2006, so by extension I was quite happy with my Mac Pro. That was then, this is now.

Apple released a newer Mac Pro with quad-core Clovertown parts (65nm Kentsfield equivalent), then once more with Harpertown (45nm Penryn equivalent). While you could stick Clovertown into the first generation Mac Pros, you couldn’t upgrade them to Harpertown without hardware modifications to the system (don’t ask me what they are).

I stayed away from the Harpertown upgrade simply because it was a lot of money for a moderate increase in performance. My desktop tests showed that Penryn generally yielded a 0 - 10% performance increase over Conroe and I wasn’t about to spend $3K for 10%. Steve didn’t need another Benz that badly.

I found myself waiting for Apple to do the right thing and release a Mac based on the Core i7. Surely Apple wouldn’t wait and make a Xeon version, after all why would you need two processors? A single Core i7 can work on eight threads at the same time - most users have a tough time stressing four. Then reality set in: Apple wouldn’t put a Core i7 in the Mac Pro because Dell can do the same in a system for under $900. In order to justify the price point of the Mac Pro, it must use Xeons.

The Nehalem Xeons can be pretty fun. At the high end there’s the Nehalem-EX, that’s 8 cores on a single die. Apple could put two of those on a motherboard and have a 16-core, 32-thread monster that would probably cost over $8,000.


The 8-core Nehalem EX

Getting back to reality, we have the Nehalem-EP processor: effectively a server version of Core i7. The major change between Nehalem-EP and Core i7 is that each Nehalem-EP processor has two QPI links instead of one. Nehalem-EP can thus be used in dual-socket motherboards.

Nehalem-EP even uses the same socket as Intel’s Core i7: LGA-1366, implying that Intel artificially restricts its desktop Core i7s to operate in single-socket mode only. Boo.

Of course Nehalem-EP is sold under the Xeon brand; the product names and specs are as follows:

CPU | Max Sockets | Clock Speed | Cores / Threads | QPI Speed | L3 Cache | Max Turbo (4C/3C/2C/1C) | TDP | Price
Intel Xeon W5580 | 2 | 3.20GHz | 4 / 8 | 6.4 GT/s | 8MB | 1/1/1/2 | 130W | $1600
Intel Xeon X5570 | 2 | 2.93GHz | 4 / 8 | 6.4 GT/s | 8MB | 2/2/3/3 | 95W | $1386
Intel Xeon X5560 | 2 | 2.80GHz | 4 / 8 | 6.4 GT/s | 8MB | 2/2/3/3 | 95W | $1172
Intel Xeon X5550 | 2 | 2.66GHz | 4 / 8 | 6.4 GT/s | 8MB | 2/2/3/3 | 95W | $958
Intel Xeon E5540 | 2 | 2.53GHz | 4 / 8 | 5.86 GT/s | 8MB | 1/1/2/2 | 80W | $744
Intel Xeon E5530 | 2 | 2.40GHz | 4 / 8 | 5.86 GT/s | 8MB | 1/1/2/2 | 80W | $530
Intel Xeon E5520 | 2 | 2.26GHz | 4 / 8 | 5.86 GT/s | 8MB | 1/1/2/2 | 80W | $373
Intel Xeon W3570 | 1 | 3.20GHz | 4 / 8 | 6.4 GT/s | 8MB | 1/1/1/2 | 130W | $999
Intel Xeon W3540 | 1 | 2.93GHz | 4 / 8 | 4.8 GT/s | 8MB | 1/1/1/2 | 130W | $562
Intel Xeon W3520 | 1 | 2.66GHz | 4 / 8 | 4.8 GT/s | 8MB | 1/1/1/2 | 130W | $284

 

While Nehalem was originally supposed to have a simultaneous desktop and server/workstation release, the Xeon parts got pushed back due to OEM validation delays from what I heard. Core i7 launched last November and it was now mid-March with no Nehalem based Macs.

I couldn’t wait any longer and I ended up building a Hackintosh based on Intel’s Core i7. Literally a day after I got it up and running, Apple announced the new Nehalem-EP based Mac Pro.



Two Models, Neither Perfect

The one common trend I’ve noticed from companies that are building products people want to buy despite the current economic climate: prices haven’t dropped. Nikon raised prices last year and Apple introduced two new Mac Pros, one at $2499 and one at $3299.

The $2499 model comes with a single quad-core Xeon running at 2.66GHz. This is the Xeon equivalent of the Core i7-920. You get 3GB of memory, a 640GB HDD, an 18x DVD-DL burner and a GeForce GT 120.

Another $800 will get you two quad-core Xeons running at a slower 2.26GHz. You get twice the memory and everything else stays the same.

I suspect that for most users the $2499 configuration is more than enough, but for this review I’m testing the $3299 system and will attempt to explain how the $2499 machine would perform.

There’s basically a $30 cost difference between 6GB of DDR3 memory and 3GB of DDR3. It’s silly for Apple to not offer the base configuration with 6GB. Anything less than 4GB in a workstation is ridiculous for a system being made and sold in 2009. If you’ve read our Nehalem articles you’ll know that each chip has three 64-bit wide memory controllers, thus you’ll want to install DIMMs in triplets. You can install four DIMMs, but accessing memory in the fourth module will be slower - something you’ll never notice if you’re wondering.


You'll find six DIMMs in the 8-core Mac Pro. Two LGA-1366 CPU sockets, 3 memory channels per CPU socket, 3 DIMMs per chip.
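The triplet rule is easy to sanity check with a few lines of Python. This is only a sketch: the round-robin slot-to-channel mapping and the function names are my own assumptions for illustration, not Apple's or Intel's documented slot layout, and the count is per CPU.

```python
from collections import Counter

CHANNELS_PER_CPU = 3  # from the article: three memory channels per Nehalem chip


def dimms_per_channel(dimm_count, channels=CHANNELS_PER_CPU):
    """Spread DIMMs over the channels round-robin (assumed mapping) and count them."""
    counts = Counter(slot % channels for slot in range(dimm_count))
    return [counts.get(ch, 0) for ch in range(channels)]


def is_balanced(dimm_count, channels=CHANNELS_PER_CPU):
    """True when every channel carries the same number of DIMMs."""
    return len(set(dimms_per_channel(dimm_count, channels))) == 1


for n in (3, 4, 6):
    state = "balanced" if is_balanced(n) else "unbalanced"
    print(n, "DIMMs per CPU:", dimms_per_channel(n), state)
```

Three DIMMs per chip land one on each channel, while a fourth doubles up on a single channel, which is the slower fourth module mentioned above.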

I won’t complain too much about the hard drive. A 640GB HDD is fine, not great, and I’ll soon show you how much better one of these machines is with an SSD, but no complaints there.

The video card could use some work. I’m not concerned about the GPU power; it’s the amount of memory that bothers me. If any of Apple’s users are likely to have a multi-monitor setup it is a Mac Pro owner, and 512MB isn’t enough to enable silky smooth Exposé across a 30” + 24” setup. And you can forget about smooth transitions on two 30” panels.

Even the upgraded video card, the Radeon HD 4870, only comes with 512MB of GDDR5 memory. Apple charges an extra $200 for that card, even though that’s how much the 4870 1GB cards cost at Newegg. I have no problems with Apple making money, but not even offering a single 1GB graphics card is silly, especially when more memory is actually useful.

If you want the best solution for multiple monitors in a Mac Pro, you’ll want to get two GeForce GT 120s it seems (although there is a 3rd party option).



The Downside to Innovation

I’ve always appreciated Apple as a company because it isn’t afraid to completely ditch backwards compatibility in favor of embracing a new technology. For years Apple’s notebooks shipped with DVI ports on them and no direct VGA output. I loved it because I had DVI monitors, but that wasn’t true for everyone. Today Apple’s display interface of choice is mini DisplayPort:


Mini DisplayPort, to the left of the DVI port

It’s a cute little connector that we first saw Apple use on its updated MacBook and MacBook Pro. The benefit of the mini-DP connector is that it can easily be adapted to VGA or single-link DVI; adapter cable sold separately of course.

Since most users only have a single display, the new Mac Pro’s video card ships with both a dual-link DVI and a mini-DisplayPort output.

The mini DisplayPort output is just pure awesomeness.


Cute.

The mini-DP plug is just so much more pleasant than DVI or VGA connectors. There are no annoying screws to worry about, just plug it in and the connector is secure. After using mini-DP on the Mac Pro I’m sold - I want one of these connectors on everything and I want monitors with mini-DP inputs.

It’s not all praise unfortunately. For starters, Apple doesn’t ship the Mac Pro with a mini-DP to DVI adapter. Given that there’s only one Apple display that uses mini-DP, it’s probably safe to say that next to no one has a mini-DP capable display. I’m all for early adoption of new technologies, but on a $3299 system just bundle the adapter ok?

The problems continue: natively this port will only drive a 1920 x 1200 panel, such as Apple’s 24” LED Cinema Display.

If you want to connect a single-link DVI display to the mini-DP port you need the adapter I showed in the picture above. Apple sells it for $30. It also comes in a VGA flavor.

If you want to connect a dual-link DVI monitor to the mini-DP output you need a different adapter:

This adapter draws power from the machine’s USB port. I’m guessing that there isn’t enough room/power to feed all of the DL-DVI pins from the mini DisplayPort connector so the adapter relies on USB to help out.
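To see why a 30” panel needs dual-link DVI in the first place, here is a rough pixel clock estimate. Treat it as a sketch under stated assumptions: the 165MHz single-link ceiling comes from the DVI specification rather than from anything Apple publishes, and the frame totals are reduced-blanking timing figures I am assuming for these two resolutions.

```python
SINGLE_LINK_DVI_MHZ = 165.0  # single-link DVI pixel clock ceiling (assumption from the DVI spec)

# (horizontal total, vertical total) including blanking; assumed reduced-blanking timings
TIMINGS = {
    "1920x1200 (24-inch)": (2080, 1235),
    "2560x1600 (30-inch)": (2720, 1646),
}


def pixel_clock_mhz(h_total, v_total, refresh_hz=60):
    """Pixel clock needed to scan out the full frame, blanking included."""
    return h_total * v_total * refresh_hz / 1e6


def links_needed(clock_mhz, per_link=SINGLE_LINK_DVI_MHZ):
    """Number of DVI links needed to carry the given pixel clock."""
    return 1 if clock_mhz <= per_link else 2


for name, (h_total, v_total) in TIMINGS.items():
    clock = pixel_clock_mhz(h_total, v_total)
    print(f"{name}: {clock:.1f} MHz -> {links_needed(clock)} link(s)")
```

Under these assumptions the 24” panel needs roughly 154MHz and fits on one link, while the 30” panel needs roughly 269MHz and spills onto a second.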

Reading through the customer reviews for this adapter it seems that many users are having compatibility issues with Apple’s mini-DP to DL-DVI converter with non-Apple displays. Not to mention that the adapter itself costs $99.

Between the high cost of the adapter and the high likelihood of problems, I’d suggest simply getting another video card if you want to have multiple 30” displays connected to your Mac Pro. Apple sells the GeForce GT 120 for $150 as an upgrade option, and at least with it each 30” display will be driven by its own frame buffer, which should make for smoother Exposé and Dashboard operation.



No 2.5” Drive Bays?

I expect innovation from Apple. It was the first to ship SSDs in its notebooks and the first to use DisplayPort, yet Apple was beaten to the punch by HP in bringing 2.5” drive bays to market in a desktop?

The overwhelming majority of SSDs these days are made for notebooks and thus they come in 2.5” form factors. The added space of a 3.5” chassis isn’t very useful with SSDs since the limiting factor for how much flash you can put on a drive is cost, not space.

The numbers I presented in The SSD Anthology apply to Macs as well; the fresh start test depicted in the graph below is just as applicable under OS X as it is under Vista:

While Apple won’t ship any of Intel’s X25-Ms in its machines today (for reasons that I unfortunately can’t talk about but have nothing to do with technology), there’s nothing stopping its end users from making the upgrade.

The problem is that the Mac Pro doesn’t really accommodate such an upgrade very well. There aren’t any cheap 2.5” to 3.5” adapters that will work with the Mac Pro’s drive sleds so you’re left having to construct such a thing on your own. The SATA and power interfaces are standard, so the drives will work just fine, you just need a way of securing them in the chassis.


A hanging SSD in the Mac Pro. Won't Apple please provide a 2.5" drive option?

I’d expect that Apple would be the one to do this but I guess taking care of its customers spending over $3000 on a machine isn’t really at the top of anyone’s list over there. Even the MacBook and MacBook Pro get SSD options; it’s not the fastest SSD in the world but the option is there.

Apple could have easily added a single 2.5” drive bay somewhere in the case for a SSD application/boot drive, or offer an optional 2.5” sled to replace one of the four 3.5” sleds in the system. It’s a simple change that wouldn’t take an engineer long to design; Apple could call it the iForgot.



Improvements: Limited but Important

To the disappointment of many, including yours truly, the new Mac Pro doesn’t actually look any different. It’s not that there’s anything wrong with the case, there’s just this expectation of improvement with every major Apple product release. In many ways Apple suffers from the same fate that Intel now does; after Conroe, we expect every major CPU generation to give us at least a 20% performance improvement with no nasty side effects.

So it doesn’t look any different, but there are some subtle changes to the outside of the case.


That's FireWire 800 for you

There are no longer any FireWire 400 ports - they are all now FireWire 800. I have two FireWire only devices: a Lexar UDMA Compact Flash card reader and an Apple iSight. The Lexar reader is FireWire 800 (woo!) and the iSight is FireWire 400; I can’t use the iSight on the new Mac Pro (not without a FireWire 400 to 800 adapter as many have pointed out). I’m guessing Apple will probably release an updated 30” display in the not too distant future with an integrated camera. I can’t have it both ways. I can’t have a company who assimilates every standard as quickly as possible yet provides backwards compatibility for every peripheral in my life. It’s the downside to innovation, but I simply can’t dock Apple any points here - it goes against one of the reasons I like Apple.

The inside of the new Mac Pro case is where all of the magic happened.

The drive bays in the first Mac Pro were great innovations at first sight; they just slid in and out and you didn’t really need any tools to use them (although a screwdriver was handy). Anyone who owned a Mac Pro knew that there wasn’t enough room between the front of the drive sled and the hard drive when installed. There was just enough room for you to slide the tips of your fingers in there, grab and pull the drive out of the machine. The limited finger room plus the initial tendency of the drive sled to stay in place made for some crushed fingers. It was a nice attempt by Apple, but one that was ultimately frustrating to live with. The new Mac Pro adds more room for your fingers to grab the drive sled, avoiding the crushed fingers syndrome of the old model.


Look ma, more finger room.

The processors are also easier to gain access to; they live on a separate board with the X58 I/O Hub and the DIMM slots. Two latches and a pull are all you need to slide this puppy out:

Of course these are the new Nehalem based Xeon processors, meaning the memory controllers are on-die. The DIMM slots on the board branch off directly from each CPU.


The only danger is bending one of the pins that go into the high density connector you see below

And yes, I’ll show you how to upgrade the CPUs in the new Mac Pro later on in this article.



The Crossroads of Simplicity and Sophistication

Choices. Choices. Apple doesn’t like to present the end user with many choices. Too many choices can confuse; if left unchecked they can become overwhelming. The overburdening of choices is something that most PC OEMs fall victim to. I recently spoke with ASUS and brought this up in a conversation about the Eee PC. Three and four digit model numbers are how you tell one Eee PC apart from another. Perhaps you have the Eee PC 901, or the Eee PC 1000HA or the S101. To an enthusiast who has time to research these things, the model numbers aren’t that hard to figure out - it’s easier than Calculus after all. To someone just looking to buy “one of those Eee things”, it’s overwhelming.

Try buying an Apple notebook and you’re faced with two models: the MacBook and the MacBook Pro. If you’re a consumer, buy a MacBook, if you’re a professional buy the Pro version. Then just select your screen size and you’re done. That’s how Apple wants it to work and for the most part, it does. Very well.

Apple’s simple approach works quite well for consumers, but once you start getting into the high end content creation world it’s not quite so easy. How do you simplify the decision between two very fast cores and four slower cores or eight even slower ones? It wouldn’t really fit within Apple’s well kept home to ask its customers whether they run predominantly single threaded, lightly threaded or heavily threaded applications. Much to my surprise, the two new Mac Pros do effectively that. They present the end user with an option to choose four faster cores or eight slower ones. And there’s much more to the numbers than what Apple publishes on its own website.

These are the CPUs Apple offers on the new Mac Pro:

Apple Mac Pro (2009) | Quad Core Model | Eight Core Model
Default CPU | 1 x Xeon W3520 (2.66GHz) | 2 x Xeon E5520 (2.26GHz)

 

The clock speed difference appears to only be 17% at first glance, but there’s much more to the story.

Four or Eight Cores and the Magic of Nehalem

There are effectively three classes of applications that we have to consider when wondering whether or not the new Mac Pro is indeed a good buy. On one end of the spectrum we have single-threaded applications and tasks.

These days CPU performance improvements happen along three vectors: ILP, clock speed and TLP. The first vector of performance improvement is ILP (Instruction Level Parallelism). These improvements are changes to the micro-architecture. They could be as simple as adding a larger/faster cache, or as complex as a faster/more capable SSE unit. These days there are minor improvements in ILP between microprocessor generations. The second vector, clock speed, is also fairly stagnant. The Nehalem based Xeons run at about the same clock speed as the Woodcrest, Clovertown and Harpertown based Xeons that the older Mac Pros used. The final vector, TLP (Thread Level Parallelism), is where we’ve seen some of the biggest gains this round. As the name implies, execute more threads in parallel and you can get more performance. You increase the number of threads you can execute by running multiple threads on a core (SMT or Hyper Threading) or by adding more cores to a chip. Quad-core is still the sweet spot configuration for Xeons, but the Nehalem architecture brings Hyper Threading back to the limelight and now each of those four cores can work on two threads of instructions at the same time.

Well let’s look at how ILP, clock speed and TLP compare from Harpertown to Nehalem (for more details on what makes Nehalem tick, err tock, be sure to read our architectural analysis):

Apple Mac Pro (2009) vs Apple Mac Pro (2006 - 2008) | Upgrade | Downgrade
Instruction Level Parallelism (ILP) | Faster memory access; minor microarchitectural updates | Smaller L2 caches
Clock Speed | Minor clock speed advantage in some cases | Minor clock speed disadvantage in others
Thread Level Parallelism (TLP) | Large L3 cache shared by all cores; 2x threads per core (Hyper Threading) | -
 

 

Looking at the table of improvements you should already know where to expect the Nehalem Mac Pro to excel. With each chip being able to execute twice as many threads as those used in the old Mac Pro, if you’re running a well threaded application then you’ll certainly see performance improvements on the new Mac Pro. What sorts of applications are “well threaded”? Generally things like 3D rendering and professional video encoding. The easiest way to find out is to fire up Activity Monitor and see how many of your cores are taxed while you’re using your system. If all of the bars are full of blue on a quad-core machine then you’d probably appreciate a Nehalem Mac Pro.

The clock speed improvements are minimal. In a non thermally constrained environment you can add 133MHz to whatever clock speed Apple puts on the box. So the 2.26GHz Mac Pro will most likely run at 2.40GHz and the 2.66GHz Mac Pro will spend most of its time at 2.80GHz, if you’re doing something CPU intensive that is. This is of course due to Intel’s Turbo mode.



Understanding Nehalem’s Turbo Mode

Modern day CPUs and GPUs are more power constrained than anything else. They could run faster, if they could get around pesky problems like power density. Intel and AMD have both figured out that the maximum power consumption for a single processor falls into one of the following ranges depending on the platform:

System | Processor TDP | Number of Cores
High End Desktop | 80 - 130W | 4
Mainstream Desktop | 65W | 2 - 4
Notebook | 20 - 45W | 2
Ultra Portable Notebook | 10 - 20W | 1 - 2
Netbook | 2 - 5W | 1

 

If we look at the bottom of the table we see that our limits to performance aren’t technology, but rather power; netbooks could be as fast as desktops if we could stick 130W processors in them.

Pay attention to the third column however. A high end desktop processor is designed to dissipate up to 130W of heat; you reach that value by running all four cores at full load. But what happens if you only have two active cores? The total power consumption and thermal dissipation of your processor is no longer 130W, it’s noticeably less.

I just finished saying that power was our fundamental limit to faster microprocessors, but if half of a 130W chip is idle - shouldn’t the working half be able to run faster? The answer is yes, but only with some clever technology.

The Nehalem CPU includes a fairly complex hardware monitoring microprocessor on-die. This processor is called the Power Control Unit (engineers r awesome). It monitors the temperature, current and power consumption of each core independently. The PCU is also the part of the chip that handles OS requests to drop the cores down to lower power states. Now get this; if there’s room in the power envelope, and the OS requests a high performance state, the PCU will actually increase the clock speed of the active cores beyond their shipping frequency.

It all boils down to the TDP of the chip, or its Thermal Design Point. The more TDP constrained a platform is, the more you stand to gain from Intel’s Turbo mode. Let me put it another way; in order to fit four cores into a 130W TDP, each core has to run at a lower clock speed than if we only had one core at that same TDP.

At higher TDPs, there’s usually enough thermal headroom to run the individual cores pretty high. At lower TDPs, CPU manufacturers have to make a tradeoff between the number of cores and their clock speeds - that’s where we can have some fun.

The Other Difference Between the Quad and Eight Core Models

Apple sells two versions of the new Mac Pro, a quad-core and an eight-core system. The motherboard is the same in both machines, but the processor board is different. The quad-core processor board has a single LGA-1366 socket and four DIMM slots, while the eight-core processor board has two sockets and eight DIMM slots. They also use significantly different CPUs, although Apple doesn’t tell you this.

Below you’ll find the standard and upgraded options for each system:

Apple Mac Pro (2009) | Quad Core Model | Eight Core Model
Default CPU | Xeon W3520 (2.66GHz) | Xeon E5520 (2.26GHz)
CPU Upgrade Options | Xeon W3540 (2.93GHz) | Xeon X5550 (2.66GHz), Xeon X5570 (2.93GHz)

 

Although Apple offers a 2.93GHz CPU in both systems, it’s actually a different chip that’s used in each model. The clock speeds, core counts and cache sizes are the same, the difference is in the TDP.

The quad-core Mac Pro uses 130W TDP Xeon uniprocessor workstation processors, the eight core Mac Pro however uses an 80W (2.26GHz) or 95W chip (2.66/2.93GHz). There are more CPUs in the eight-core model, so Intel offers chips with lower TDPs to keep total platform power under control. While the eight-core Mac Pro uses more power than the quad-core Mac Pro, each chip individually should use less power. And remember what we discussed earlier: lower TDPs mean higher turbo frequencies.

The table below shows the maximum turbo frequency available for each chip depending on the number of cores currently in use:

System (Processor) | Default Clock | Max Turbo w/ 4-cores active | Max Turbo w/ 3-cores active | Max Turbo w/ 2-cores active | Max Turbo w/ 1-core active
8-core Mac Pro (Xeon X5570) | 2.93GHz | 3.20GHz | 3.20GHz | 3.33GHz | 3.33GHz
8-core Mac Pro (Xeon X5550) | 2.66GHz | 2.93GHz | 2.93GHz | 3.06GHz | 3.06GHz
8-core Mac Pro (Xeon E5520) | 2.26GHz | 2.40GHz | 2.40GHz | 2.53GHz | 2.53GHz
4-core Mac Pro (Xeon W3540) | 2.93GHz | 3.06GHz | 3.06GHz | 3.06GHz | 3.20GHz
4-core Mac Pro (Xeon W3520) | 2.66GHz | 2.80GHz | 2.80GHz | 2.80GHz | 2.93GHz

 

What the table above tells us is that while the quad-core Mac Pro can turbo up by 133MHz if more than one core is active, and 266MHz if only one core is active, the processors in the eight-core Mac Pro can do better. The upgraded Xeons in the eight-core Mac Pro (X5550 and X5570) can turbo up by 266MHz or 400MHz, depending on the number of cores active. The 400MHz turbo mode is available even if two cores are active. The default E5520 manages 133MHz or 266MHz, with the 266MHz step also available when two cores are active.
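The turbo table can be rebuilt from the turbo bin counts (4C/3C/2C/1C) in the Xeon list earlier in the article, with each bin being one step of the base clock. The sketch below assumes an exact base clock of 133.33MHz (I round it to 133MHz in the text), infers each multiplier from the chip's nominal clock, and truncates to two decimals the way the clocks are listed; the function names are mine.

```python
# name: (base multiplier, turbo bins with 4/3/2/1 cores active)
# Multipliers are inferred from the nominal clocks; bins come from the Xeon table.
CHIPS = {
    "Xeon X5570": (22, (2, 2, 3, 3)),
    "Xeon X5550": (20, (2, 2, 3, 3)),
    "Xeon E5520": (17, (1, 1, 2, 2)),
    "Xeon W3540": (22, (1, 1, 1, 2)),
    "Xeon W3520": (20, (1, 1, 1, 2)),
}


def ghz(multiplier):
    """Clock in GHz for a multiplier, truncated to two decimals."""
    khz = multiplier * 400_000 // 3  # assumed 133.33MHz base clock, in integer kHz
    return khz // 10_000 / 100


def turbo_clocks_ghz(base_mult, bins):
    """Max turbo clock in GHz with 4, 3, 2 and 1 cores active."""
    return [ghz(base_mult + b) for b in bins]


for name, (mult, bins) in CHIPS.items():
    print(f"{name}: {ghz(mult):.2f}GHz base, turbo {turbo_clocks_ghz(mult, bins)}")
```

Three bins on top of the 2.93GHz X5570 works out to 3.33GHz, a 400MHz step, while the single bin on the W3520 gives the 2.80GHz figure in the table.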

Apple isn’t big on specs like these so we don’t see any mention of them in Apple’s Mac Pro sales literature, the only clue you get is in the form of the model numbers Apple lists on its spec sheets:

Although it’s a pricey upgrade, you do get better processors with the eight-core Mac Pro than you do with the quad-core version. If you don’t need more than four cores however, you’ll still be better off with a 2.66GHz quad-core Mac Pro than a 2.26GHz eight-core model.
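For a rough feel of where eight slower cores overtake four faster ones, here is a back-of-the-envelope model in the spirit of Amdahl's law. It is my own simplification and not how the benchmarks in this article were run: the serial part of a job runs on one core at the single-core turbo clock, the parallel part spreads perfectly across all cores at the all-core turbo clock, and Hyper Threading, cache and memory effects are ignored. The clocks come from the turbo table; the function names are hypothetical.

```python
def run_time(parallel_fraction, cores, one_core_ghz, all_core_ghz):
    """Relative time for one unit of work: serial part on one core, the rest on all cores."""
    serial = (1 - parallel_fraction) / one_core_ghz
    parallel = parallel_fraction / (cores * all_core_ghz)
    return serial + parallel


def quad(p):
    """Quad-core Mac Pro, Xeon W3520 turbo clocks."""
    return run_time(p, cores=4, one_core_ghz=2.93, all_core_ghz=2.80)


def eight(p):
    """Eight-core Mac Pro, two Xeon E5520s at their turbo clocks."""
    return run_time(p, cores=8, one_core_ghz=2.53, all_core_ghz=2.40)


def break_even():
    """Smallest parallel fraction (in 0.1% steps) where the eight-core model is no slower."""
    for i in range(1001):
        p = i / 1000
        if eight(p) <= quad(p):
            return p
    return None


print(f"eight cores pull ahead once about {break_even():.0%} of the work is parallel")
```

In this toy model the crossover lands at a bit under 60% parallel work; below that, the quad-core machine's higher clocks win.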



Performance

Understanding how the new Nehalem Mac Pro performs really isn't that difficult; it all boils down to the type of workload. On very threaded workloads, the new Mac Pro should be much faster than the old one, even at a lower clock speed. Single threaded applications will show us the opposite - the new Mac Pro will need equivalent clock speed to equal the older one. And for everything in between, the wins will vary.

Adobe Photoshop CS4 Performance

To measure performance under Photoshop CS4 we turn to the Retouch Artists’ Speed Test. The test does basic photo editing; there are a couple of color space conversions, many layer creations, color curve adjustment, image and canvas size adjustment, unsharp mask, and finally a gaussian blur performed on the entire image.

The whole process is timed and thanks to the use of Intel's X25-M SSD as our test bed hard drive, performance is far more predictable than back when we used to test on mechanical disks.

Time is reported in seconds and the lower numbers mean better performance. The test is multithreaded and can hit all four cores in a quad-core machine.

Adobe Photoshop CS4

This test isn't heavily threaded from start to finish, some actions only stress one or two cores while others will drive all sixteen virtual threads on the new Mac Pro. The speedup from Hyper Threading is enough however to give the new Mac Pro an advantage, even at a lower clock speed, over the older model. This makes sense given how well the Nehalem based Core i7s do in our CS4 benchmark in Bench.

Apple Aperture 2.1.2 Performance

While Photoshop lets us do a lot of photo processing, Aperture is useful in managing workflow before we get to the heavy processing stages of Photoshop. For this test I'm exporting one of the sample albums that comes with Aperture from RAW to JPEG format.

Apple Aperture 2.1.2 - Export to JPEG

The results here aren't uncommon. In a lightly threaded task you shouldn't expect the new Mac Pro to be faster than the old one, there's no replacement for clock speed. Other workloads will be hurt simply because L2 cache sizes are smaller with Nehalem than they were previously (only 256KB L2 per core with Nehalem).

Xcode Performance

Good compiler tests are hard to come by, but I've found that building the Adium source in Xcode is not only repeatable but a great test of platform performance. The build process is multithreaded and will use up to 16 threads, although not consistently over the course of the build.

Xcode - Build Adium Project

Hyper Threading has the potential to really help the new Mac Pro if you're running a heavily threaded workload. The Xcode test is the perfect example of a real world usage scenario that doesn't max out all cores, but still does very well on the new Mac Pro.

Adobe Premiere Pro CS4 Performance

Video encoding under Quicktime was a bust, but using a professional encoder like Premiere Pro shows the strength of Nehalem:

Adobe Premiere Pro CS4

At 2.26GHz we're faster than eight cores running at 3.0GHz with the first Mac Pro. Throw in a pair of 2.93GHz Xeons and you'll see another ~20% performance improvement on top of that.

I would guesstimate that the quad-core 2.66GHz Mac Pro should deliver performance similar to (if not slightly slower than) the older 8-core 3.0GHz Mac Pro.

Quicktime H.264 Encoding Performance

While video encoding can definitely benefit from the monster threading abilities of the new Mac Pro, lighter encoding workloads don't really benefit:

Quicktime H.264 Encoding - Default Settings

Quicktime is hardly the best application to do serious video encoding, here we see it barely scales beyond two cores - Nehalem has nothing to offer us in this sort of a situation.

iWork, iLife and General Use Performance

If you plan on using your Mac Pro for more than just rendering, encoding and computation, you'll find that it does work very well as a general use machine:

Pages - Export to Word Doc

But clock speed definitely matters. In these two iWork benchmarks the 2.26GHz eight-core Mac Pro is measurably slower than its 3GHz predecessors. You'll at least need the 2.93GHz upgrade to equal their performance here.

Keynote - Export to PowerPoint

Finder - Compress Archive

iPhoto 2009 - Import Photos



The Alternative: SSD in an Older Mac Pro?

I hate to sound like a broken record but I can’t stress enough the upside to having an SSD in any machine, especially the Mac Pro. I’ll give you my history with the Mac Pro before diving into some of the details on what a fast SSD will do for one of these systems.

One thing I always appreciated about OS X was that it seemed to keep things in memory in a more intelligent way than Windows ever did. I could leave most applications active and I was rarely bogged down by the inexplicable disk crunching that I got in Windows. Because of this I always outfitted my Macs with as much memory as possible. My Mac Pro started with 2GB, then 4GB then 8GB. For the most part the machine remained nice and snappy, but over time it lost that fresh-out-of-the-box feeling. Applications didn’t launch quite as snappily, not to mention how painful it was to launch anything immediately upon reaching the desktop.

Admittedly my Mac Pro lasted longer before I started to feel that it was slow than any PC I’d used up to that point, but it eventually got to where I was frustrated. That’s when I turned to an SSD to solve my problems.

You can read about the history behind SSDs in my Mac Pro here, but eventually I ended up with an Intel X25-M in the system.

Now Apple won’t ship an X25-M or any Intel SSD in its systems. The reasoning isn’t public, but it’s not exactly a technical limitation or performance issue. The why doesn’t really matter, because the drive works just fine in any Mac Pro, whether the original one from 2006 or the newest model from 2009. You have to come up with a clever way to mount the drive in the system, but assuming you’re good with metal (or rubber bands) you’ll find a way to get the drive in there.

The benefits of using the X25-M in a Mac Pro are just like that of any system: huge. Allow me to make my point.

One of my benchmarks for this review is a test that developers will appreciate. I use the latest version of Apple’s Xcode tools to compile the Adium project and I time the build. This particular test is quite CPU intensive, it will actually tax all 16 threads on a dual-socket Nehalem Mac Pro. The CPUs don’t stay at 100% for the entire time, but there are periods when they do.

The graph below shows you the build time on three systems, the original Mac Pro running at 3.0GHz (in both four and eight core varieties) and the new eight-core Nehalem Mac Pro running at 2.26GHz:

Xcode - Build Adium Project

Parallel processing to the rescue. Despite the significant reduction in clock speed, Hyper Threading gives the new Mac Pro an advantage in build time. The Nehalem system completed the test in 19% less time than the old 8-core Mac Pro.

Now both of these machines used the drive that comes with the new Mac Pro. It’s a 7200RPM 640GB Western Digital Caviar SE16 SATA hard drive. By no means a slouch. Now let’s look at what happens if we throw an Intel X25-M into the old Mac Pro:

Ah ha! Remember that I mentioned the Adium compile test isn’t entirely CPU bound. Well, when the benchmark isn’t taxing all cores it is bottlenecked by IO; it’s accessing the disk. Simply putting a SSD in the old Mac Pro makes it as fast as the new one with its stock hard drive. Now if you combine the new Mac Pro with a SSD, you get an even faster system - it’ll complete the same test in 87 seconds.

So adding a SSD to an older Mac Pro can breathe new life into it, and in some cases make it faster than a new Mac Pro with a standard hard drive. But let’s look at this another way. Is Apple doing the new Mac Pro a disservice by not putting a SSD in it as a boot/applications drive?

The table below shows the performance improvement from the old Mac Pro to the new Mac Pro using a HDD and using a SSD. I'm simply comparing how long it takes to build the Adium application using Xcode on my old Mac Pro vs the new one using a HDD and then using an Intel X25-M SSD:

| Xcode Adium Build Test | Stock HDD | Intel X25-M SSD |
| --- | --- | --- |
| 8-core Mac Pro 2006/2007 3.0GHz (Clovertown) | 139.5s | 113.0s |
| 8-core Mac Pro 2009 2.26GHz (Nehalem) | 112.7s | 87.0s |
| % Increase in Performance | 23.7% | 29.9% |

With a standard 7200RPM hard drive, the new Nehalem Mac Pro is nearly 24% faster than the original 8-core Mac Pro. However, swap in Intel’s X25-M and the new Mac Pro is almost 30% faster.

In other words, with a faster IO subsystem the Nehalem Mac Pro is able to outperform its predecessor by a wider margin. Or to answer my loaded question from above: yes, Apple is limiting the performance of its latest Mac Pro by not outfitting it with a high performance SSD.
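The percentages here can be confusing because "19% less time" and "nearly 24% faster" describe the same pair of HDD results. The short Python sketch below works through the arithmetic using the build times from the table. The function names are my own, and the results can differ from the table in the last digit depending on rounding.

```python
def pct_less_time(old_s, new_s):
    """Reduction in elapsed time, relative to the old time."""
    return (old_s - new_s) / old_s * 100


def pct_faster(old_s, new_s):
    """Increase in performance (work per unit time), relative to the old system."""
    return (old_s / new_s - 1) * 100


# Build times in seconds from the table above.
hdd_old, hdd_new = 139.5, 112.7
ssd_old, ssd_new = 113.0, 87.0

print(f"HDD: {pct_less_time(hdd_old, hdd_new):.1f}% less time, "
      f"{pct_faster(hdd_old, hdd_new):.1f}% faster")
print(f"SSD: {pct_less_time(ssd_old, ssd_new):.1f}% less time, "
      f"{pct_faster(ssd_old, ssd_new):.1f}% faster")
```

The gap between the two "faster" figures is the point: the new machine's lead over the old one grows once the disk stops holding it back.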

The explanation is simple. Nehalem is more data hungry than any previous generation Intel microprocessor. It can operate on twice as many threads as Penryn and Conroe and it has much deeper buffers internally. To fill them with instructions it needs fast access to memory, which it has. Unfortunately not everything you ask of it is already in memory, and that’s where the burden gets pushed down to the hard drive. Speed up the hard drive and you’ll help Nehalem shine.

What’s the practical recommendation? If you need more processing power, the new Mac Pro will give it to you. Here’s another test where switching to a SSD does absolutely nothing:

Not all applications are going to be as sensitive to random IO latency as building a large project in Xcode. But I will stress this, it’s ridiculous for any OEM today to be selling a machine costing over $3000 without outfitting it with an SSD.

The table below shows application launch times for the two Mac Pro configurations I’ve been using with and without an SSD:

| Application Launch Time | Mac Pro 2006 (3.0GHz) - HDD | Mac Pro 2006 (3.0GHz) - SSD | Mac Pro 2009 (2.26GHz) - HDD | Mac Pro 2009 (2.26GHz) - SSD |
| --- | --- | --- | --- | --- |
| Adobe Photoshop CS4 | 7.4s | 3.2s | 7.9s | 3.3s |
| Adobe Premiere CS4 | 28.1s | 15.7s | 28.7s | 17.0s |
| Microsoft Office 2008 (Word, Excel & PowerPoint) | 13.0s | 4.7s | 13.3s | 5.1s |

If you’ve never seen a table of what a good SSD can do for application launch times, the one above is just as good as any. And yes, the third test in the table is launching all three applications at the same time.

Let’s look at what’s happening here. Both my old eight-core Mac Pro and the new eight-core Nehalem Mac Pro launch these applications in about the same amount of time. The older system is slightly faster simply because of its higher clock speed. Launching an application is generally not very CPU intensive and definitely doesn’t consist of many high CPU use threads, so there’s no benefit from Hyper Threading here. Now if you launched 20 or 30 applications at the same time we’d be telling a different story, but firing up a single app is going to be mostly a product of ILP and clock speed. The combination of the two favors the older Mac Pro in this case thanks to its higher clock speed.

The launch times aren’t very impressive regardless of which system you look at. Premiere takes nearly 30 seconds to load. Blech. But now look at what the X25-M does for both systems. Basically cut the time it takes to launch an application in half and that’s what a good SSD will do for you.
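To put a number on "in half", the Python sketch below computes the time saved by the SSD for every entry in the launch-time table. The dictionary layout and the function name are just my choices for illustration; the times are the ones from the table.

```python
# Launch times in seconds from the table above, stored as (HDD, SSD).
LAUNCH_TIMES = {
    "Mac Pro 2006 (3.0GHz)": {
        "Photoshop CS4": (7.4, 3.2),
        "Premiere CS4": (28.1, 15.7),
        "Office 2008": (13.0, 4.7),
    },
    "Mac Pro 2009 (2.26GHz)": {
        "Photoshop CS4": (7.9, 3.3),
        "Premiere CS4": (28.7, 17.0),
        "Office 2008": (13.3, 5.1),
    },
}


def time_saved_pct(hdd_s, ssd_s):
    """Percentage of launch time eliminated by moving from HDD to SSD."""
    return (hdd_s - ssd_s) / hdd_s * 100


for system, apps in LAUNCH_TIMES.items():
    for app, (hdd_s, ssd_s) in apps.items():
        saved = time_saved_pct(hdd_s, ssd_s)
        print(f"{system:24s} {app:14s} {saved:5.1f}% less time")
```

The savings land between roughly 40% and 65% depending on the application, so "half" is a fair summary.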

Application launch time is one of those things that helps contribute to how snappy a system feels and if you want to make your system feel faster, you'll need an SSD.



Upgrading the CPUs in the Nehalem Mac Pro

Let’s say you get over the $3299 price tag of the 8-core Mac Pro but aren’t really happy with the paltry 2.26GHz clock speed of the quad-core Nehalems in the box. Apple offers two upgrades: a pair of 2.66GHz or 2.93GHz Nehalems, how nice of them. The 2.66GHz upgrade will set you back $1400, while the 2.93GHz upgrade will basically cost you another Mac Pro at $2600.

To Apple’s credit, these CPUs are expensive. Here is Intel’s pricing:

| CPU | Intel's Price for Two CPUs | What Apple Charges for Two (BTO Upgrade) |
| --- | --- | --- |
| Intel Xeon X5570 | $2772 | $2600 |
| Intel Xeon X5550 | $1916 | $1400 |
| Intel Xeon E5520 | $746 | |
| Intel Xeon W3540 | $1124 | $1000 |
| Intel Xeon W3520 | $568 | |

A single Xeon X5570 costs $1386, and Apple is charging you $2600 for two. But that’s on top of the base cost of the 8-core Mac Pro; you’re effectively paying for the two Xeon E5520 chips and the two X5570s, but only getting the latter.

The same applies to the single-chip Mac Pro. The only CPU upgrade offered there is the Xeon W3540; retail cost is $562, Apple’s benevolent self will only charge you $500.

I should also point out the sheer ridiculousness of Apple putting a pair of $373 CPUs in a $3300 machine. I get that Apple wants to commoditize everything that they don’t make, but that’s just ridiculous.

Do you smell motivation? Because I do.

If you don’t mind voiding your warranty, you are better off buying the base Mac Pro (4 or 8 core) configuration, upgrading the CPUs yourself and ebaying the originals.
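How well the do-it-yourself route pays off depends on what the original chips fetch when you resell them. The Python sketch below uses the prices from the table above to estimate the net cost of a DIY upgrade and the resale value at which it breaks even with Apple's BTO price. The resale fraction is purely hypothetical, street prices may differ from Intel's pricing, and the sketch ignores the risk of damaging anything along the way.

```python
# Prices in USD for a pair of CPUs, taken from the table above.
INTEL_PAIR_PRICE = {"X5570": 2772, "X5550": 1916, "E5520": 746}
APPLE_BTO_PRICE = {"X5570": 2600, "X5550": 1400}


def diy_net_cost(target, resale_fraction):
    """Cost of buying the target pair at Intel's price, minus what the
    stock E5520 pair brings in when resold.

    resale_fraction is a hypothetical share of the E5520 price you
    manage to recover; it is an assumption, not data from the article.
    """
    resale = INTEL_PAIR_PRICE["E5520"] * resale_fraction
    return INTEL_PAIR_PRICE[target] - resale


def break_even_resale_fraction(target):
    """Resale fraction at which DIY costs the same as Apple's upgrade."""
    gap = INTEL_PAIR_PRICE[target] - APPLE_BTO_PRICE[target]
    return gap / INTEL_PAIR_PRICE["E5520"]


for cpu in APPLE_BTO_PRICE:
    print(f"{cpu}: Apple ${APPLE_BTO_PRICE[cpu]}, "
          f"DIY ${diy_net_cost(cpu, 0.5):.0f} at 50% resale, "
          f"break-even at {break_even_resale_fraction(cpu):.0%} resale")
```

Under these assumptions the X5570 swap comes out ahead as long as the old chips recover a bit under a quarter of their price, while the X5550 swap needs them to hold most of their value.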

The even more sensible option would be to wait a while and upgrade the Xeons once these ones fall in price.

Regardless of when or why you want to do it, I figured we should give it a try. They built sockets for a reason after all.

Voiding the Warranty

Getting inside the new Mac Pro is much easier than the old one. Remove the side panel then unlatch and remove the processor tray and you’ve got this:

Two towering heatsinks, eight DIMM slots and an X58 chipset are the main attractions. Remember all of the hoopla a few pages ago about Turbo Mode? That’s the reason for these beefy heatsinks; there’s actually a fan inside each heatsink, as well as two large fans moving air across the entire board in the case.


The heatsinks have an integrated fan and thermal sensor (black cable)

There are four screw holes at the top of the heatsink. Apple actually made removing the heatsinks very easy, all you need is a long 3mm hex key - about 3” long (plus a handle) should suffice.

Stick the hex key in any one of the holes, move it around until it grabs, and then unscrew. Rinse and repeat. The screws are attached to the heatsink and spring loaded; you don’t have to physically remove any, just wait until they pop up.

With all four screws removed you can just lift the heatsink straight up. The CPU sockets don’t have clamps, so the chip will most likely lift out of the LGA socket attached to the heatsink.

Carefully twist and pull the CPU until it comes off of the heatsink and you’ll be greeted with the first surprise from Apple: the 8-core Mac Pro ships with lidless Nehalems.

Normally a Core i7 or Xeon processor will look like this:

But the 8-core Mac Pro uses specially sourced parts from Intel that have no integrated heat spreader (IHS):


No lid, all Nehalem.

This is obviously a cooling play. The IHS is useful in preventing cracked cores from improperly installed heatsinks, but it does make cooling more difficult. With the heatsink flush against the bottom of the Nehalem die, it can remove heat faster from the chip. More efficient cooling results in lower CPU temperatures and lower fan speeds.

The lidless Nehalems are only used in the 8-core version as far as I can tell. The standard quad-core Nehalem Mac Pro uses regular Xeons with an integrated heat spreader.

The lidless Nehalems do provide a challenge: you can’t buy them. You have to buy a standard, lidded Nehalem Xeon and either remove the IHS or leave it intact and hope it works well. The first option isn’t a very good one; while removing a heat spreader isn’t impossible, you do run the risk of destroying your $1400 CPU. The second option, if it works, is the safest route.

We got a pair of Xeon X5570s and tried to install them, with heat spreaders and all, in our 8-core Mac Pro. Simply pull one chip out, replace it, apply thermal grease, remount the heatsink and screw it back in...or so we thought.


The lidless Nehalems were out, the new lidded processors should work - they should just be a bit more difficult to cool



The two heatsinks on the Mac Pro aren’t interchangeable, so keep track of which one came from where and don’t try to force their installation if they don’t fit properly.

After we completed the swap, we powered the machine on and were met with the worst sound: fans that didn’t spin down. The Mac Pro refused to POST and forcing it to turn off revealed a CPU A Temperature Overheat warning LED on the motherboard.

Removing the CPU_A heatsink revealed the problem:


uh-oh

CPU_B was fine, but CPU_A was far from it. While there are no pins to bend/break on these LGA CPUs, if anything goes wrong the socket is toast. In this case, both the socket and CPU were beyond saving.

To date I’m not sure what went wrong, but I have two theories. Unlike desktop Nehalem motherboards, there is no clamp that holds the CPU in place. There’s a chance that during the heatsink installation the CPU moved slightly and shorted as soon as it got power.

The other theory is that I somehow overtightened the heatsink on CPU_A. Remember, the chips I used had heat spreaders, the ones they were replacing did not. The added thickness of the heat spreader could have helped push the CPU too hard against the pins in the socket, causing some of them to move out of place.

Regardless of the how, what remained was that I now had a dead Xeon and a dead Mac Pro processor board on my hands. The CPU I could always replace from my stash, but I don’t keep many Nehalem Mac Pro processor boards in my parts closet. This would require a trip to the Apple store.

And if you’ve ever walked into an Apple store holding a Nehalem Mac Pro processor board, you’ll get some looks.

Thankfully, the folks at the Crabtree Valley Mall Apple Store in Raleigh, NC are AnandTech readers and quickly understood what had happened. They ordered the replacement part and I waited. If you’re curious, it’ll cost a bit under $400 to replace the processor board in an 8-core Mac Pro provided you allow Apple to keep your dead board.


The new board and its accessories

It’s been years since I’ve killed a processor and this would be my first LGA socket death, so I was admittedly nervous once I had the new board in hand. I went in and tried to replace the chips again, this time undertightening all of the hex screws on the heatsinks just to be safe.

I mounted and dismounted each heatsink three times before committing to the install. Nervously, I hit the power button.

I had good news and bad news.

The bad news, the fans were spinning at full speed. The good news? The upgrade worked. Initially I didn’t tighten the heatsinks enough so some of the memory channels weren’t working, but a couple more spins of the hex key and I was in business.

The fans spinning at full speed were caused by the thermal sensor on one of the heatsinks being dead. It looks like it died when the first CPU shorted out. The folks at the Apple store had to order me a replacement heatsink as well.

After the smoke cleared and my embarrassment subsided, I had an upgraded Nehalem Mac Pro.

I started with a pair of 2.26GHz Xeon E5520s and ended up with a pair of 2.93GHz Xeon X5570s.

Gaining access to the processors is far easier than the previous generation Mac Pro, but you have to exhibit more care in physically replacing them. The fact that the CPUs I used had integrated heat spreaders hasn’t caused any problems either. Temperatures are well within reasonable limits, the fans are just as quiet as before and Turbo mode still works.



Upgraded Mac Pro Performance

With a pair of 2.93GHz Xeons in the new Mac Pro, I ran a few benchmarks to confirm that things did indeed get faster. Update: I've removed the Cinebench results while I try to figure out why the 2.26GHz configuration was performing lower than expected in the single-threaded tests.

Adobe Photoshop CS4

Apple Aperture 2.1.2 - Export to JPEG

Xcode - Build Adium Project

Pages - Export to Word Doc

With the exception of the Pages benchmark, the improvement was significant. With a pair of 2.93GHz Xeons, the new Mac Pro is easily much faster than the old 3.0GHz eight-core system. The Harpertown model that preceded the current Mac Pro was available at up to 3.2GHz, but even it shouldn't be able to best the new Mac Pro in workloads where Nehalem has a clear advantage. At worst, the two may be equal with a slight preference towards the larger L2 caches of Harpertown over Nehalem.



Final Words

With this I hope I can retire from writing epic-ly long Apple articles for a while, but I'm not done yet - I must first conclude.

The new Mac Pro is fast and expensive. As I casually mentioned on the performance page, if you're upgrading from a PowerMac G5 then even the cheapest iMac (or even the Mac mini) will have more processing power; an upgrade to the Nehalem Mac Pro will absolutely rock your world.

If you have one of the original Mac Pros from 2006, the new Mac Pro should be an upgrade provided that you're at all CPU bound in your tasks. Clock speed is important, however: going from a pair of 3.0GHz Woodcrest based Xeons in the first Mac Pro to a 2.26GHz Nehalem based Xeon won't always give you better performance. The entry level 2.26GHz Xeons for the 8-core Mac Pro are ridiculous given the price point of the system, but so are the upgrade prices for the 2.66GHz and 2.93GHz processors. If you do have a highly threaded workload you can always get the entry level 8-core and then upgrade the CPUs on your own down the line if you're careful.

Now if you’re running applications that stress all eight cores in the $3299 Mac Pro then the clock speed difference won’t matter. But if all you’re doing is stressing four cores then the $2499 machine will perform noticeably better (and save you some money). Apple effectively offers a machine optimized for users of heavily threaded workloads and one for everyone else, they just don’t advertise it as such.

Ultimately, it’s all about snappiness and response time. The new Mac Pro makes tasks that generally take several minutes to hours run in considerably less time, but still on the same order of magnitude of performance. Compiling Adium took nearly 140 seconds on my old eight-core Mac Pro with its hard drive and less than 90 on my new one with an SSD. That’s a noticeable performance improvement. Unfortunately some aspects of the Mac Pro just haven’t improved that much at all. Application launch time and general use performance are still very I/O bottlenecked; these things need SSDs and with a price tag of over $3,000 there's absolutely no excuse for Apple not including one.

If you have a Mac Pro from last year and aren't doing a lot of heavily threaded work, stick a SSD in your machine, it'll feel better than new.

 
