Earlier today we posted our review of Intel’s Ivy Bridge ULV Ultrabook. In it, we found that while the maximum iGPU clock on the ULV HD 4000 is 1150MHz (just 8% lower than the 1250MHz maximum on the quad-core HD 4000), performance in most games is significantly lower. We postulated that the reduced thermal headroom created by the 17W TDP was at least partially to blame, but we didn’t have time to investigate. Several readers asked us to dig deeper and suggested using GPU-Z logging to capture the real-time iGPU clocks. With those logs now in hand, we’re ready to present additional information that explains why we’re seeing larger performance discrepancies despite the relatively minor <10% iGPU clock speed difference.

We didn’t run tests on our entire gaming suite, but we did investigate iGPU clocks on six of the eight games. The two games we skipped are Portal 2 and Skyrim, mostly because the pattern we’ll discuss is already quite clear. For each game, with the exception of Diablo III, we ran through our benchmark sequence while logging the iGPU clocks using GPU-Z. To make the charts presentable and easier to compare, we’ve trimmed out small amounts of data so that each game has the same number of samples on both laptops. Note that we checked average frame rates with and without the trimmed data and made sure that removing a few seconds of logging didn’t make more than a 1% difference; typically the difference was less than 0.1%.
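To illustrate the trimming check described above, here is a minimal Python sketch. It assumes the samples are trimmed from the end of the longer log and uses made-up frame rate values; the function names are hypothetical and this is not the exact procedure we used on the GPU-Z logs.

```python
from statistics import mean


def trim_to_common_length(trace_a, trace_b):
    """Cut both traces to the length of the shorter one (keeps the start)."""
    n = min(len(trace_a), len(trace_b))
    return trace_a[:n], trace_b[:n]


def relative_change(full, trimmed):
    """Fractional change in the average caused by trimming."""
    return abs(mean(trimmed) - mean(full)) / mean(full)


# Hypothetical per-second frame rates, NOT measured data.
ulv_fps = [30.0, 31.0, 29.5, 30.5, 30.2, 30.1]
quad_fps = [41.0, 40.5, 42.0, 41.2]

ulv_trimmed, quad_trimmed = trim_to_common_length(ulv_fps, quad_fps)
change = relative_change(ulv_fps, ulv_trimmed)
print(f"samples kept: {len(ulv_trimmed)}, change in average: {change:.2%}")
```

If the reported change stays under 1%, the trimmed trace can stand in for the full one when comparing the two laptops.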

What about Diablo III? Since there’s not a great way to do a consistent benchmark sequence—the game world is randomly generated within certain bounds each time you play, at least for areas with combat—we simply played the game for around 15 minutes while collecting iGPU clock data. This includes killing monsters (with several larger mobs on each system), visiting town (in Act 2), checking out inventory, etc. As with the other games, the pattern of iGPU clock speed is pretty clear.

While we’re talking iGPU clock speeds, it also helps to keep average frame rates in mind. Let’s start with a recap of gaming performance for our two test laptops (leaving out Diablo III since benchmark results are from slightly different areas).

Average HD 4000 Frame Rates
* i7-3720QM tested with 2696 driver; i5-3427U with 2725 driver.
Portal 2 performance is significantly improved with 2725.

As noted in our Ultrabook review, quad-core Ivy Bridge performs substantially better in games than ULV dual-core Ivy Bridge. This isn’t too surprising, as we’ve seen the same thing with HD 3000 in Ultrabooks compared to dual- and quad-core notebooks. Our assumption has always been that the lowered iGPU clocks in Ultrabooks were the primary culprit, but with Ivy Bridge the maximum clock speeds aren’t all that different: i7-3720QM can hit a maximum iGPU clock that’s just 9% higher than i5-3427U (1250MHz vs. 1150MHz). When it comes to actual gaming performance, however, quad-core IVB ends up 35% faster on average, and that’s not even accounting for the substantially improved Portal 2 performance thanks to a new driver on the Ultrabook.

The question then is whether we’re hitting thermal constraints on IVB ULV, and whether we’re GPU limited, CPU limited, or maybe even both. Before we get to the detailed graphs, here’s the quick summary of the average HD 4000 clocks on the Ivy Bridge Ultrabook (i5-3427U) and the ASUS N56VM (i7-3720QM).

Average HD 4000 Clock Speed

So, that makes things quite a bit easier to understand, doesn’t it? During actual game play (i.e. when the iGPU isn’t idle and clocked at its minimum 350MHz), the quad-core IVB laptop is able to run at or very close to its maximum graphics turbo clock of 1250MHz, and as we’ll see in a moment it never dropped below 1200MHz. By comparison, in most games the ULV IVB laptop averages clock speeds that are 50-150MHz lower than its maximum 1150MHz turbo clock. With that overview out of the way, here are the detailed graphs of iGPU clocks for the tested games.

The GPU clocks over time tell the story even better. The larger i7-3720QM notebook basically runs at top speed in every game we threw at it, other than a few minor fluctuations. With the smaller ULV i5-3427U, we see periodic spurts up to the maximum 1150MHz clock, but we also see dips down as low as 900MHz. (We’re not counting the bigger dips to 350MHz that happen in Batman during the benchmark scene transitions, though it is interesting that those don’t show up on 4C IVB at all.)

After performing all of the above tests, we end up with an average iGPU clock on the i5-3427U of around 1050MHz compared to 1250MHz on the larger notebook. That represents a nearly 20% clock speed advantage for the quad-core iGPU, but in the actual games we’re still seeing a much greater advantage for the quad-core chip. Given that most games don’t actually use much more than two CPU cores (if that), even in 2012, the extra cores are unlikely to be the reason; the most likely explanation is the higher CPU clock speeds of the quad-core processor. From what we’ve seen of the ULV iGPU clocks under load, it seems likely that not only is the quad-core chip clocked higher to begin with, but it’s also far more likely to be hitting its higher CPU Turbo Boost clock speeds.
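The arithmetic behind that comparison can be sketched in a few lines of Python. The clock and frame rate figures come from the article; the assumption that frame rates scale linearly with iGPU clock is ours and is an optimistic upper bound, so treat the leftover figure as a rough estimate only.

```python
# Figures quoted in the article.
QUAD_AVG_MHZ = 1250      # average iGPU clock, i7-3720QM
ULV_AVG_MHZ = 1050       # average iGPU clock, i5-3427U (approximate)
OBSERVED_SPEEDUP = 1.35  # quad-core is about 35% faster on average

# Assumption: performance scales linearly with iGPU clock.
clock_ratio = QUAD_AVG_MHZ / ULV_AVG_MHZ

# Whatever is left after dividing out the clock ratio is the part of the
# gap that iGPU clocks alone cannot account for.
unexplained = OBSERVED_SPEEDUP / clock_ratio

print(f"iGPU clock advantage: {clock_ratio - 1:.1%}")
print(f"gap not explained by iGPU clocks: {unexplained - 1:.1%}")
```

Even with perfect clock scaling, a double-digit percentage of the gap remains, which is why the CPU side of the chip is the next place to look.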

It’s at this point that I have to admit I decided to wrap things up without doing a ton more testing. Having already run a bunch of tests with GPU-Z logging iGPU clocks in the background, I now wanted to find a way to log the CPU core clocks in a similar fashion. CPU-Z doesn’t support logging, and TMonitor doesn’t work with Ivy Bridge, so I needed to find a different utility. I eventually found HWiNFO64, which did exactly what I wanted (and more)—though I can’t say the UI is as user friendly as I’d like.

The short story is that if I had started with HWiNFO64, I could have gathered both CPU and GPU clocks simultaneously, which makes the charts more informative. Since we’re dealing with dual-core and quad-core processors, I have two CPU lines in each chart: Max CPU is the highest clocked CPU core at each measurement point, while Avg CPU is the average clock speed across all cores (two or four). There’s not always a significant difference between the two values, but at least on the quad-core IVB we’ll see plenty of times where the average CPU clock is a lot lower than the maximum core clock. Besides the CPU clocks, we again have the GPU clock reported.
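For reference, the two CPU series can be derived from per-core readings along these lines. This is a small Python sketch with hypothetical clock values; it does not parse an actual HWiNFO64 log, and the function name is our own.

```python
from statistics import mean


def summarize_sample(core_clocks_mhz):
    """Return (Max CPU, Avg CPU) for one measurement point."""
    return max(core_clocks_mhz), mean(core_clocks_mhz)


# Hypothetical per-core clocks (MHz) at three measurement points
# on a quad-core chip; NOT measured data.
samples = [
    [3600, 1200, 1200, 1200],  # one core in turbo, the rest idling
    [3400, 3400, 3400, 3400],  # all cores loaded
    [3500, 3500, 1200, 1200],  # two cores busy
]

for clocks in samples:
    max_cpu, avg_cpu = summarize_sample(clocks)
    print(f"Max CPU: {max_cpu}MHz, Avg CPU: {avg_cpu:.0f}MHz")
```

The first sample shows why the two lines can diverge so much on a quad-core part: a single core in turbo with three cores idling gives a high Max CPU but a much lower Avg CPU.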

As mentioned just a minute ago, I got tired of running these tests at this point and figured I had enough information, so I just reran the DiRT 3 and Diablo III tests. DiRT 3 is one of the worst results for IVB ULV compared to IVB 4C, while Diablo III is more in line with the difference in iGPU clocks (i.e. the quad-core notebook is around 20-25% faster). So here are the final two charts, this time showing CPU and GPU clocks.

As before, the quad-core notebook runs HD 4000 at 1250MHz pretty much the entire time. Not only does the iGPU hit maximum Turbo Boost, but the CPU is likewise running at higher Turbo modes throughout the test. i7-3720QM can turbo up to a maximum clock speed of 3.6GHz on just one core, and the average of the “Max CPU” clock ends up being 3442MHz in DiRT 3 and 3425MHz in Diablo III.

i5-3427U also has a decent maximum CPU Turbo Boost clock of 2.8GHz, but it rarely gets that high. Diablo III peaks early on (when the game is still loading, actually) and then quickly settles down to a steady 1.8GHz—the “guaranteed CPU clock”, so no turbo is in effect. The overall average “Max CPU” clock in Diablo III is 1920MHz, but most of the higher clocks come at the beginning and end of the test results when we’re in the menu or exiting the game. DiRT 3 has higher CPU clocks than Diablo III on average—2066MHz for the “Max CPU”—but the average iGPU clock is slightly lower.

Interestingly, HWiNFO also provides measurements of CPU Package Power (the entire chip), IA Cores Power (just the CPU), and GT Cores Power (just the iGPU). During our test runs, Diablo III on the ULV Ultrabook shows an average package power of 16.65W and a maximum package power of 18.75W (exceeding the TDP for short periods), with the CPU drawing an average of 3.9W (6.22W max) and the iGPU drawing 9.14W on average (and 10.89W max)—the rest of the power use presumably goes to things like the memory controller and cache. The DiRT 3 results are similar, but with a bit more of the load shifted to the CPU: 15.73W package, 4.39W CPU, and 8.3W iGPU (again with maximum package power hitting 18.51W briefly). For the N56VM/i7-3720QM, the results for Diablo III are: 30.27W package, 12.28W CPU, 13.43W iGPU (and maximum package power of 34.27W). DiRT 3 gives results of 32.06W package, 13.2W CPU, and 14.8W iGPU (max power of 38.4W).
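To put the "rest of the power use" into numbers, the following Python sketch subtracts the CPU and iGPU averages from the package average for each run. The wattages are the averages quoted above; treating the remainder as a single "other" bucket is our simplification, and the names are hypothetical.

```python
# Average power in watts: (package, IA cores, GT cores), from the article.
runs = {
    ("i5-3427U", "Diablo III"): (16.65, 3.90, 9.14),
    ("i5-3427U", "DiRT 3"): (15.73, 4.39, 8.30),
    ("i7-3720QM", "Diablo III"): (30.27, 12.28, 13.43),
    ("i7-3720QM", "DiRT 3"): (32.06, 13.20, 14.80),
}


def breakdown(package_w, cpu_w, igpu_w):
    """Return (other watts, iGPU share of package power)."""
    other_w = package_w - cpu_w - igpu_w
    return other_w, igpu_w / package_w


for (chip, game), (package_w, cpu_w, igpu_w) in runs.items():
    other_w, igpu_share = breakdown(package_w, cpu_w, igpu_w)
    print(f"{chip} {game}: other {other_w:.2f}W, iGPU share {igpu_share:.0%}")
```

On the Ultrabook the iGPU takes more than half of the package power in both games, while on the quad-core laptop it stays under half, leaving far more room for the CPU cores.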

Wrap-Up

Most of this shouldn’t come as a surprise. In a thermally constrained environment (17W for the entire package), it’s going to be difficult to get higher performance from a chip. If you start from square one trying to build a chip for a low power environment (e.g. for a tablet or smartphone) and scale up, you can usually get better efficiency than if you start out with a higher power part and scale down—the typical range of scaling is around an order of magnitude—but if you need more performance you might fall short. The reverse also holds: starting at the top and scaling down on power and performance, you might eventually come up short if you need to use less power.

As far as Ivy Bridge goes, HD 4000 can offer relatively competitive performance, but it looks like it needs 10-15W just for the iGPU to get there. On a 45W TDP part, that’s no problem, but with ULV it looks like Ivy Bridge ends up in an area where it can’t quite deliver maximum CPU and iGPU performance at the same time. This generally means iGPU clocks will be closer to 1000MHz than 1150MHz, but it also means that the CPU portion of the chip will be closer to the rated clock speed rather than the maximum Turbo Boost speed. One final item to keep in mind is just how much performance we’re getting out of a chip that uses a maximum of 17W. ULV IVB isn’t going to offer gaming performance comparable to an entry level graphics solution, but then even the low-end discrete mobile GPUs often use 25W or more. Cut the wattage in half, and as you’d expect the performance suffers.

So how much faster can we get with ULV chips, particularly with regard to gaming? Intel has a new GPU architecture with Ivy Bridge that represents a significant update from the HD 3000 iGPU, but they’re still trailing AMD and NVIDIA in the graphics market. Their next architecture, Haswell, looks to put even more emphasis on the iGPU, so at least on higher TDP chips we could very well see as much as triple the performance of HD 4000 (if rumors are to be believed). How will that fit into ULV? Even if ULV Haswell graphics are only half as fast as the full voltage chips, that would still work out to roughly 1.5x the current full voltage HD 4000 performance, which would be a pretty good result. Too bad we’ll have to wait another year or so to see it!

As for AMD, they appear to be in much the same situation, only they’ve got better GPU performance in Trinity with less CPU performance. The problem is again trying to get decent performance out of a lower power solution, and while the 25W A10-4655M Trinity part looks quite attractive, the 17W A6-4455M part has to make do with half the CPU and GPU cores. ULV Ivy Bridge is only able to deliver about 70% of the graphics performance of full voltage (45W) Ivy Bridge, but I don’t expect ULV Trinity to fare much better. Hopefully we’ll be able to get some hardware in hand for testing to find out in the near future. Unfortunately, at least judging by GPU-Z and HWiNFO results on an A10-4600M, we still don’t have a good way of getting real-time iGPU clocks from Trinity or Llano.


  • JarredWalton - Friday, June 01, 2012

    Configurable TDP is available on all IVB chips I believe, but it's up to the laptop OEMs to enable it. Some might only do TDP Down (I think that's what ASUS has done on the UX21A at least), others will do up and down. Most likely, TDP Up will only be available on larger laptops and/or laptops with docking stations/cooling stations. It's pretty obvious (based on the CPU/GPU Turbo modes) that the prototype Ultrabook is already out of room as far as cooling goes, and it would be difficult given the position of the intake/exhaust ports to increase the cooling without redesigning the chassis. Other Ultrabooks and laptops could change the design and thus improve cooling and performance.
  • stanwood - Friday, June 01, 2012

    The way that the Quad-Core iGPU traces are pegged and the ULV is not makes me wonder if there are some driver issues keeping the ULV chip from running at a fixed frequency for a long time. By the looks of it, that frequency would still be lower than the quad core iGPU.
  • JarredWalton - Friday, June 01, 2012

    The ULV chip is supposed to max out lower than the quad-core (1150 vs. 1250). The difference is in the TDP limit, as well as temperatures. The Ultrabook is hitting a pretty constant 80C during gaming, compared to 68-70C on the ASUS N56VM (using only the IGP). Everything is working as designed, and if I were to put the i7-3720QM under a torture test (e.g. block off the exhaust so it heats up more, or run a heavily threaded CPU test concurrent with gaming), I expect we'd see similar variance in GPU clocks. In fact, I'm going to test that right now....
  • JarredWalton - Friday, June 01, 2012

    And the result: nope, I can't get the N56VM to throttle down on the iGPU. It stays at 1250MHz, but with a heavy CPU load running (6-threaded x264 encode, leaving one core/two threads for DiRT 3) the CPU clocks drop to 2700-3000MHz instead of 3400-3600MHz. Package temperatures are hitting up to 91C though.
  • seapeople - Sunday, June 03, 2012

    If you were to take the ULV laptop and run your benchmark while holding it over an air conditioning vent (or something similar) would you approach the 35W results, or are we purely power limited here (since we're hitting about 17W already)?
  • JarredWalton - Sunday, June 03, 2012

    I did do a test at night next to an open window (probably mid-50s or lower outside) and the results didn't change that I could tell. Temperatures actually don't look too bad on the CPU/package (80C or so), so I think it's mostly the TDP limit. If the laptop supported configurable TDP with the option to set a 20W TDP, I bet performance would improve at least 10% in several games.
  • cwcwfpfp - Monday, June 04, 2012

    Awesome article. Looking at the HWiNFO64 results, it appears that any downward spikes in GPU clock are accompanied by upward spikes in CPU clock. Wouldn't this seem to be consistent with the thought that the limit is TDP and not thermal?

    I'm also curious if you have any insight on how the ULV would perform with non-graphics applications (such as say crunching numbers in MATLAB). Presumably the majority of the wattage would then be available to the CPU and we'd see more sustained turbo CPU speeds? Any thoughts/speculation?
  • MrSpadge - Friday, June 01, 2012

    - if you're looking for steady state performance (sustained longer than for a few seconds) and not for burst-mode performance (less than a few seconds), feel free to skip the higher clocked 17 W CPUs - they won't reach their higher turbo modes anyway (in games)
    - 25 W TDP seems the sweet spot for dual core IVB. Look for laptops slightly larger than Ultrabooks if you're looking for performance
    - lowering CPU & iGPU voltage might increase performance (question: how can it be done? software?)
  • JarredWalton - Friday, June 01, 2012

    You won't hit the higher Turbo Boost modes on an i7, but you should hit the guaranteed base clock (e.g. 2.0GHz on the i7-3667U vs. 1.8GHz on the i5-3427U). In some cases, that would make the i7 faster, but if you're comparing 11.6" i7 to 13.3" i5, e.g. the ASUS UX21A, all bets are off.

    As for the last item, AFAIK there are very few utilities that even try to work with laptops for over/under volting/clocking. The OEMs that make the laptops might have the knowledge to release such a utility, but they don't want to do so as it will more likely than not cause problems.
  • ssiu - Friday, June 01, 2012

    For mobile Llano, "k10stat" is commonly used for over/under volting/clocking of the CPU. (I think there is another alternate utility but I forgot the name.) It is surprising that there is no equivalent in Intel land (I guess stock mobile Intel CPUs still beat overclocked mobile Llano).

    Not sure if "k10stat" works with Trinity. I can't check it but someone can :-)
