The Search for Universal Binaries

All of the applications that we've looked at thus far are Universal binaries, meaning that they are compiled to work on both PowerPC and x86 architectures.  And although the number of Universal applications is quite large, there are still quite a few that are missing.  The entire Microsoft Office suite, all of Adobe's products, and even Apple's professional application line have yet to be made available as Universal binaries.  All have been scheduled and committed, but we're still at least another month away from seeing their debut. 

Apple has done a tremendous job of making sure that non-Universal binaries do run on its new Intel-based Macs, thanks to a binary translation program called Rosetta.  Apple has been extremely quiet about the specifics of Rosetta.  In my opinion, that is because it is a temporary solution that doesn't perform very well, and Apple would rather that everyone forget it exists and port to Universal binaries immediately than rely on it as a crutch.  The basic gist of Rosetta is this: when a non-Universal application runs, its PowerPC code is handed off to Rosetta, which translates it into another form, optimizes it and then generates its own x86 code.  The translated code is also cached along the way so that frequently used code blocks run more quickly, since they don't have to be re-translated.

If you're familiar with compilers, this sounds very much like a real-time (just-in-time) compiler, except that you are going from one low-level instruction set to another instead of from a high-level programming language to machine code.  And although the process works, its primary purpose is to ensure functionality, and it wreaks havoc on performance.
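The translate-once, reuse-afterwards idea described above can be illustrated with a toy sketch. Apple has not published Rosetta's internals, so everything here is an assumption made for illustration: the `BlockTranslator` class, the `toy_translate` function and the block address are all hypothetical, and the "translation" merely relabels strings.

```python
# Hypothetical sketch of a translation cache. None of these names come from
# Rosetta; they only illustrate caching translated blocks by address.

class BlockTranslator:
    def __init__(self, translate):
        self._translate = translate  # function: source block -> translated block
        self._cache = {}             # source address -> translated block
        self.translations = 0        # how many times the slow path ran

    def run_block(self, address, source_block):
        translated = self._cache.get(address)
        if translated is None:
            # Slow path: translate the block and remember the result.
            translated = self._translate(source_block)
            self._cache[address] = translated
            self.translations += 1
        # Fast path: a block seen before is returned without re-translation.
        return translated


def toy_translate(source_block):
    # Stand-in for real instruction translation: relabel each "instruction".
    return tuple("x86:" + insn for insn in source_block)


translator = BlockTranslator(toy_translate)
hot_loop = ("lwz", "add", "stw")
for _ in range(1000):
    translator.run_block(0x1000, hot_loop)

print(translator.translations)
```

Even in this toy, the cache keeps a translated copy of every block next to the original, so a scheme like this trades memory for speed.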


You can tell whether an application is Universal by looking at its info; the Kind: field will be listed as "Application (Universal)".
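The same check can be approximated programmatically. A Universal binary is a multi-architecture ("fat") Mach-O file whose header lists one entry per architecture. The sketch below parses that header from raw bytes. The function names are hypothetical, and it assumes you pass in the first bytes of the executable inside the application bundle rather than the bundle directory itself.

```python
import struct

FAT_MAGIC = 0xCAFEBABE               # big-endian magic of a fat Mach-O file
CPU_NAMES = {7: "i386", 18: "ppc"}   # CPU_TYPE_I386 and CPU_TYPE_POWERPC


def fat_architectures(data):
    """Return the architecture names listed in a fat header, or [] if not fat."""
    if len(data) < 8:
        return []
    magic, count = struct.unpack(">II", data[:8])
    if magic != FAT_MAGIC:
        # Note: Java class files share this magic, so a match alone is not
        # proof; this sketch does not try to tell the two apart.
        return []
    names = []
    for i in range(count):
        start = 8 + i * 20               # each fat_arch record is five 32-bit fields
        record = data[start:start + 20]
        if len(record) < 20:
            break                        # truncated input; stop reading
        cputype = struct.unpack(">I", record[:4])[0]
        names.append(CPU_NAMES.get(cputype, "cpu_%d" % cputype))
    return names


def is_universal(data):
    archs = fat_architectures(data)
    return "ppc" in archs and "i386" in archs


# Synthetic header with a PowerPC slice and an x86 slice (offsets are made up).
sample = struct.pack(">II", FAT_MAGIC, 2)
sample += struct.pack(">5I", 18, 0, 4096, 1000, 12)
sample += struct.pack(">5I", 7, 3, 8192, 1000, 12)

print(fat_architectures(sample), is_universal(sample))
```

For a real application you would read the first few kilobytes of the executable in binary mode and pass them to `is_universal`.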

The other side effect to a Rosetta-style binary translation is that the amount of memory that you need goes up, sometimes significantly.  Let's look at an example. First, let's take a Universal application, in this case iPhoto, and look at its memory footprint:

                               iMac G5 1.9GHz   iMac Core Duo 1.83GHz
iPhoto Memory Size (at Open)   20.48MB          21.24MB

On the iMac G5, the application, with no photos in its database, takes up 20.48MB of memory.  On the Intel based iMac, iPhoto '06 occupies 21.24MB.  We've already seen that the Intel based Macs do take up a little more memory when running the same applications as the PowerPC based Macs, so the differences here are expected. 

Now, let's take a look at a non-Universal application, Microsoft Word, that has to be executed using Rosetta:

Rosetta Performance Comparison   iMac G5 1.9GHz   iMac Core Duo 1.83GHz
MS Word Memory Size (at Open)    40.03MB          64.43MB
MS Word # of Threads (at Open)   2                3

At startup, with no open documents, Rosetta increases the memory footprint of MS Word from 40MB to just over 64MB - an increase of over 60%! 

Remember the PDF that we generated in our Pages benchmark earlier?  Instead of outputting it to a PDF, I exported it to MS Word and opened it on the two machines.  I timed how long it took to open the 116-page document, but first, let's look at how much memory MS Word is occupying:

Rosetta Performance Comparison                   iMac G5 1.9GHz   iMac Core Duo 1.83GHz
MS Word Memory Size (116 page Document Open)     75.75MB          218.79MB
MS Word # of Threads (116 page Document Open)    4                5

The memory footprint of MS Word has gone absolutely insane, growing from "only" 60% greater than the native application on the G5 to just under 3x the size.  With the document open, the Intel-based iMac had 218.79MB of its 512MB of memory being used by MS Word and Rosetta, compared to 75.75MB on the G5.  I stressed earlier that 512MB isn't enough once you start to seriously use iLife/iWork applications. Now it's worth amending that to include anything that requires Rosetta to run.
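For readers who want to check the arithmetic behind the "60%" and "just under 3x" figures, here is a short sketch that recomputes them from the memory numbers in the tables above. The `percent_increase` helper is just an illustrative name.

```python
def percent_increase(baseline, value):
    """Percentage by which value exceeds baseline."""
    return (value - baseline) / baseline * 100.0


# (iMac G5, iMac Core Duo) memory footprints in MB, taken from the tables above.
footprints = {
    "iPhoto at open": (20.48, 21.24),
    "MS Word at open": (40.03, 64.43),
    "MS Word, 116-page document": (75.75, 218.79),
}

for label, (g5, intel) in footprints.items():
    print("%-28s %+6.1f%%  (%.2fx)" % (label, percent_increase(g5, intel), intel / g5))
```

The native iPhoto case comes out at roughly 4% more memory on the Intel machine, while Word under Rosetta comes out at roughly 61% more at launch and about 2.9x with the document open.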

So, how long did it take to open the Word document?  Approximately 69% longer, thanks to the necessary binary translation during the process.  Since Rosetta operates in its own thread, I checked to see if having a dual core processor sped things up at all. Unfortunately, the gain is negligible: the Core Duo opened the document only about two seconds faster than the Core Solo.

Rosetta Performance Comparison      iMac G5 1.9GHz   iMac Core Duo 1.83GHz   iMac Core Solo 1.83GHz
MS Word Document Open               27.1 seconds     45.7 seconds            47.9 seconds
MS Word Document Convert to HTML    36 seconds       114 seconds*            -

* Word crashed at this point without producing the HTML file.

I eventually mustered up the courage to do the unthinkable: convert the open Word document to HTML.  On the iMac G5, this process took a healthy 36 seconds; on the Core Duo based iMac running Rosetta, the process took 114 seconds and then crashed, leaving me without my HTML file.  No matter what I did, I could not get the process to complete without crashing. 

I turned to one more test of performance under Rosetta; this time, to see how Rosetta impacted graphics operations as well as some more CPU bound tasks.  Cinebench 2003 has yet to be made Universal, and as a scripted benchmark, it's very easy to generate and compare data that it produces.  I used the unoptimized (non-G5) version for both platforms to keep things as equal as possible. The results are below (scores are in Cinebench 2003's own units, higher numbers are better):

Cinebench 2003 Rosetta Performance Comparison   iMac G5 1.9GHz   iMac Core Duo 1.83GHz   iMac Core Solo 1.83GHz
Rendering-1CPU                                  203              76                      75
Rendering-2CPU                                  -                143                     -
C4D Shading                                     246              103                     101
OpenGL Software Lighting                        634              182                     185
OpenGL Hardware Lighting                        1336             506                     504

Once again, the inclusion of a second core doesn't really seem to speed up the translation process at all, with the Core Duo and Core Solo posting very similar scores.  The actual performance is absolutely abysmal, not to mention the serious performance hit when looking at the OpenGL tests.  In the OpenGL Software Lighting test, the Core Duo running the benchmark using Rosetta can't even perform at 1/3 the level of the G5. 
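To put the Cinebench numbers on a common scale, the sketch below expresses each Rosetta score as a fraction of the G5's native score. It uses only the figures from the table above, and the `relative_to_g5` helper is an illustrative name. Rendering-2CPU is left out because only the Core Duo has a score for it.

```python
# (iMac G5, iMac Core Duo, iMac Core Solo) scores from the Cinebench 2003 table.
scores = {
    "Rendering-1CPU": (203, 76, 75),
    "C4D Shading": (246, 103, 101),
    "OpenGL Software Lighting": (634, 182, 185),
    "OpenGL Hardware Lighting": (1336, 506, 504),
}


def relative_to_g5(table):
    """Map each test to (Core Duo / G5, Core Solo / G5) score ratios."""
    return {test: (duo / g5, solo / g5) for test, (g5, duo, solo) in table.items()}


relative = relative_to_g5(scores)
for test, (duo, solo) in relative.items():
    print("%-26s Core Duo %.0f%% of G5, Core Solo %.0f%% of G5" % (test, duo * 100, solo * 100))
```

Every Rosetta result lands below half of the G5's score, and the Core Duo and Core Solo stay within a few percent of each other in each test.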


35 Comments

  • Anand Lal Shimpi - Tuesday, January 31, 2006

    Turning off one core leaves the full 2MB of cache for the other core to use since it is a shared L2.

    Take care,
    Anand
  • Eug - Tuesday, January 31, 2006

    quote:

    Turning off one core leaves the full 2MB of cache for the other core to use since it is a shared L2.

    Take care,
    Anand

    Cool thanks.

    P.S. I have read elsewhere that the new iMac Core Duo uses less than half of the CPU's processing power to play back H.264 Hi-Def 1920x1080 video at a full 24 fps. If true, that's great, because my iMac 2.0 chokes on that. It plays back relatively smoothly, but only at about 12-15 fps.

    That bodes well for a future single-core Yonah Mac mini.

    Then again, probably not, considering that I suspect the iMac Core Duo does so well on H.264 playback because of its Radeon X1600. I'd doubt the Mac mini would get anything close to that any time soon.
  • Anand Lal Shimpi - Tuesday, January 31, 2006

    Max CPU utilization (across both CPUs) when playing a 1080p stream scaled to fit the screen is about 60%, but it usually hovers below 50%. I am not sure whether or not the X1600's H.264 decode acceleration is taken advantage of (I doubt it); I'm trying to find out now. Also remember that on the PC side, the X1600 will only accelerate up to 720p.

    Take care,
    Anand

  • Anand Lal Shimpi - Tuesday, January 31, 2006

    I just confirmed with ATI, the X1600's H.264 decode acceleration is currently not supported under OS X. ATI is working with Apple on trying to get the support built in, but currently it isn't there.

    Take care,
    Anand
  • Eug - Tuesday, January 31, 2006

    quote:

    I just confirmed with ATI, the X1600's H.264 decode acceleration is currently not supported under OS X. ATI is working with Apple on trying to get the support built in, but currently it isn't there.

    Thanks again for the info. That's actually good news in a way. Things are looking up for that single-core Yonah Mac mini HTPC.
  • andrep74 - Tuesday, January 31, 2006

    Isn't performance/Watt a function of the CPU, not the platform?
  • Kyteland - Tuesday, January 31, 2006

    That picture of Jobs doesn't say "PC vs Intel" it says "PowerPC vs Intel". Jobs is just standing in the way. He's comparing the old mac to the new mac.
  • Calin - Tuesday, January 31, 2006

    You could think about it that way - but in the end, the buyer is interested in the total energy consumption/heat production (as this is what he pays for, and what he must get rid of).
    Have you heard of the Toyota D4D engine? It has a record of 2.4 liters (less than a gallon) of diesel fuel used per hundred kilometers (60 miles). However, the same engine on a Land Cruiser 4x4 with all options will get you much less (four times less maybe).
    It isn't worth talking about performance per watt at the processor level; it is better to look at the platform level.
  • BUBKA - Tuesday, January 31, 2006

    Were these benches done with a USB 2.0 device plugged in?
  • Furen - Tuesday, January 31, 2006

    I was under the impression that Intel was blaming Microsoft for that, so that would not apply to OSX, though if the driver works perfectly for every platform except Napa I'd guess it's a hardware problem that MS will fix in software (which is well enough as long as it works). The power consumption difference is probably less than 10W anyway. It matters on a notebook but hardly matters with a desktop.

