Original Link: http://www.anandtech.com/show/6994/nvidia-geforce-gtx-770-review
NVIDIA GeForce GTX 770 Review: The $400 Fight
by Ryan Smith on May 30, 2013 9:00 AM EST
As spring gets ready to roll over to summer, last week we saw the first phase of NVIDIA’s annual desktop product line refresh, with the launch of the GeForce GTX 780. Based on a cut-down GK110 GPU, the GTX 780 was by most metrics a Titan Mini, offering a significant performance boost for a mid-generation part, albeit a part that forwent the usual $500 price tier in the process. With the launch of GTX 780 the stage has been set for the rest of the GeForce 700 series refresh, and NVIDIA is wasting no time on getting to the next part in their lineup. So what’s up next? GeForce GTX 770, of course.
In our closing thoughts on the GTX 780, we ended on the subject of what NVIDIA would do for a GTX 770. Without a new mid/high-end GPU on the horizon, NVIDIA has instead gone to incremental adjustments for their 2013 refreshes, GTX 780 being a prime example through its use of a cut-down GK110, something that has always been the most logical choice for the company. But any potential GTX 770 is far more nebulous, as both a 3rd tier GK110 part and a top-tier GK104 part could conceivably fill the role just as well. With the launch of the GTX 770 now upon us we finally have the answer to that question, and the answer is that NVIDIA has taken the GK104 option.
What is GTX 770 then? GTX 770 is essentially GTX 680 on steroids. Higher core clockspeeds and memory clockspeeds give it performance exceeding GTX 680, while higher voltages and a higher TDP allow it to clock higher and for those clocks to matter. As a result GTX 770 is still very much a product cut from the same cloth as GTX 680, but as the fastest GK104 card yet it is a potent successor to the outgoing GTX 670.
| ||GTX 770||GTX 680||GTX 670||GTX 570|
|Memory Clock||7GHz GDDR5||6GHz GDDR5||6GHz GDDR5||3.8GHz GDDR5|
|Memory Bus Width||256-bit||256-bit||256-bit||320-bit|
|FP64||1/24 FP32||1/24 FP32||1/24 FP32||1/8 FP32|
|Manufacturing Process||TSMC 28nm||TSMC 28nm||TSMC 28nm||TSMC 40nm|
With GTX 780 based on GK110, GTX 770 gets to be the flagship GK104 based video card for this generation. At the same time to further differentiate it from the outgoing GTX 680, NVIDIA has essentially given GK104 their own version of the GHz Edition treatment. With higher clockspeeds, a new turbo boost mechanism (GPU Boost 2.0), and a higher power limit, GTX 770 is GK104 pushed to its limit.
The end result is that we’re looking at a fully enabled GK104 part – all 32 ROPs and 8 SMXes are present – clocked at some very high clockspeeds. GTX 770’s base clock is set at 1046MHz and its boost clock is at 1085MHz, a 40MHz (4%) and 27MHz (3%) increase respectively over GTX 680. This alone doesn’t amount to much, but GTX 770 is also the first desktop GK104 part to implement GPU Boost 2.0, which further min-maxes NVIDIA’s clockspeeds. The result is that GTX 770 reaches its highest clocks more often, making the effective clockspeed increase greater than 4%.
But the more breathtaking change will be found in GTX 770’s memory configuration. With GTX 680 already shipping at 6GHz there’s only one way for NVIDIA to go – up – so that’s where they’ve gone. GTX 770 ships with 7GHz GDDR5, making this the very first product to do so. This gives GTX 770 nearly 17% more memory bandwidth than GTX 680, an important increase for the card as the 256bit memory bus means that NVIDIA has no memory bandwidth to spare for GTX 770’s higher GPU throughput.
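The bandwidth figure above follows directly from the bus width and the memory data rate. The short sketch below works through it, reading the "6GHz" and "7GHz" figures as effective per-pin data rates in Gbps (an interpretive assumption); the function name is illustrative only.

```python
def gddr5_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


gtx680_bw = gddr5_bandwidth_gbs(256, 6.0)
gtx770_bw = gddr5_bandwidth_gbs(256, 7.0)
gain_pct = (gtx770_bw / gtx680_bw - 1) * 100

print(f"GTX 680: {gtx680_bw:.0f} GB/s")
print(f"GTX 770: {gtx770_bw:.0f} GB/s")
print(f"Increase: {gain_pct:.1f}%")
```

Because the bus width is unchanged, the gain reduces to the ratio of the data rates (7/6), which is where the "nearly 17%" comes from.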
We’ve talked at length about GDDR5 memory controllers before, noting that 7GHz has always been the planned limit for GDDR5. Good GDDR5 memory can hit it easily enough, but GPU memory controllers and memory buses are another matter. After faltering with the Fermi generation NVIDIA was able to hit 6GHz on their first shot with GK104, and now with their second shot and a new PCB NVIDIA is ready to certify GK104 as 7GHz capable. Given all the teething GDDR5 has gone through on both sides of the aisle, this is a small but impressive achievement for NVIDIA.
Moving on, between the higher GPU clockspeeds, higher memory clockspeeds, and the introduction of GPU Boost 2.0, NVIDIA is also giving GTX 770 a hearty increase in TDP, for both the benefits and drawbacks that brings. GTX 770’s TDP is 230W versus GTX 680’s 195W, and due to GPU Boost 2.0 the old 170W “power target” concept is going away entirely, so in some cases the difference in effective power consumption is going to be closer to 60W. Like GTX 780, this higher TDP is a natural consequence of pushing out a faster part based on the same manufacturing process and architecture, and we expect this to be the same story across the board for all of the GeForce 700 series parts. At the same time however we’d point out that the 230W TDP is higher than usual for a sub-300mm2 GPU, reflecting the fact that NVIDIA really is pushing GK104 to its limit here.
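To make the two power gaps explicit, here is a small sketch using only the wattages quoted above. The variable names are illustrative.

```python
GTX770_TDP_W = 230
GTX680_TDP_W = 195
GTX680_POWER_TARGET_W = 170  # the old GPU Boost 1.0 "power target"

# Difference on paper, TDP against TDP.
tdp_gap_w = GTX770_TDP_W - GTX680_TDP_W
# Difference when GTX 680 was actually held to its lower power target.
effective_gap_w = GTX770_TDP_W - GTX680_POWER_TARGET_W

print(f"TDP vs TDP:          {tdp_gap_w}W")
print(f"TDP vs power target: {effective_gap_w}W")
```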
Along with differentiating the GTX 770 from the GTX 680, these small improvements also serve to further separate the GTX 770 from the GTX 670. Because both cards are based on the same GPU, this separation is to some extent required to provide the performance gains needed to justify the mid-generation refresh. As GTX 670 was a lower clocked part with only 7 of 8 SMXes enabled, the performance difference between it and the GTX 770 ends up being due to a combination of those two factors. With a clockspeed difference of 131MHz (14%), the theoretical performance difference between the two cards stands at about 30% for shading/texturing, 14% for ROP throughput, and of course 17% for memory bandwidth. This won’t be nearly enough to justify replacing a GTX 670 with a GTX 770, but it makes it a respectable increase as a mid-generation part, and very enticing for those GTX 470 and GTX 570 owners on 2-3 year upgrade cycles.
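The theoretical gaps over the GTX 670 can be reproduced with some simple ratios. In this sketch the GTX 670 base clock is back-derived from the quoted 131MHz gap, and ROP throughput is assumed to scale with clockspeed alone (that is, equal ROP counts on both cards, which the 14% figure implies).

```python
# Figures quoted in the text.
GTX770_SMX, GTX670_SMX = 8, 7
GTX770_BASE_MHZ = 1046
GTX670_BASE_MHZ = GTX770_BASE_MHZ - 131  # 915MHz, derived from the quoted gap
GTX770_MEM_GBPS, GTX670_MEM_GBPS = 7.0, 6.0

clock_ratio = GTX770_BASE_MHZ / GTX670_BASE_MHZ

# Shading and texturing scale with both SMX count and clockspeed.
shader_gain = (GTX770_SMX / GTX670_SMX) * clock_ratio - 1
# ROP throughput scales with clockspeed only (assumes equal ROP counts).
rop_gain = clock_ratio - 1
# Memory bandwidth scales with the data rate on the same 256-bit bus.
mem_gain = GTX770_MEM_GBPS / GTX670_MEM_GBPS - 1

for name, gain in [("Shading/texturing", shader_gain),
                   ("ROP throughput", rop_gain),
                   ("Memory bandwidth", mem_gain)]:
    print(f"{name}: +{gain * 100:.1f}%")
```

The shading figure is the product of the two factors (8/7 SMXes times the clock ratio), which is why it lands at roughly double the clockspeed gain.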
Moving on to the launch and pricing, unlike the GTX 780 last week, NVIDIA is being far more aggressive on pricing with the GTX 770, catching even us by surprise. From a performance standpoint the GTX 770 already makes the GTX 680 redundant, and if the performance doesn’t do it then the launch price of $399 will. $399 also happens to be the same price the GTX 670 launched at, so this is a fairly straightforward spec-bump in that respect.
At the same time NVIDIA is going to be phasing out the GTX 680 and GTX 670, so while these parts may see some sales to clear out inventory there won’t be any kind of official price cut. As such other than their lower TDPs these parts are essentially redundant at the moment.
For this reason NVIDIA’s real competition will be from AMD, with the $399 price tag putting the GTX 770 somewhere between AMD’s Radeon HD 7970 and Radeon HD 7970 GHz Edition. The price of the GTX 770 is going to be closer to the former while the performance is going to be closer to the latter, which will put AMD in a tight spot. AMD’s saving throw here will be their game bundles; NVIDIA isn’t bundling anything with the GTX 770, while the 7970 cards will come with AMD’s huge 4 game Level Up with Never Settle Reloaded bundle.
Finally, today’s launch is going to be a hard launch just like GTX 780 last week. Furthermore NVIDIA’s partners will be shipping semi-custom cards right at launch, and in fact we aren’t expecting to see any reference cards for sale in North America. This means there will be a great variety among cards, but not necessarily much in the way of consistency.
|May 2013 GPU Pricing Comparison|
|AMD||Price||NVIDIA|
|Radeon HD 7990||$1000||GeForce GTX Titan/GTX 690|
| ||$650||GeForce GTX 780|
|Radeon HD 7970 GHz Edition||$440||GeForce GTX 680|
| ||$400||GeForce GTX 770|
|Radeon HD 7970||$380|| |
| ||$350||GeForce GTX 670|
|Radeon HD 7950||$300|| |
Meet The GeForce GTX 770
It’s unfortunate that none of NVIDIA’s North American partners will be selling cards based on NVIDIA’s reference design, since NVIDIA is once again using GTX Titan as their template for their design, making for a very high quality card. At the same time it’s unfortunate the reference design will not be available because it means that not everything we have to say about GTX 770 will be applicable to retail cards. We’re essentially reviewing a card with a unique cooler you can’t buy, which has been something of a recurring problem for us with these virtual launches.
In lieu of the reference design, NVIDIA’s partners will be going semi-custom right from the start. A lot of what we’re going to see are various 2 and 3 fan open air coolers, however at least a couple of partners will also be selling blowers, albeit plastic in place of the Titan-derived metal cooler. Still, blowers may be a bit hard to come by with GTX 770, which is something of an odd outcome given how prevalent blowers have been at this performance tier in the past.
In any case, we have a few different semi-custom GTX 770 cards that just arrived in-house (all of the overclocked variety) which we’ll be looking at next week. In the meantime let’s dive in to NVIDIA’s reference GTX 770.
Whereas GTX 780 was truly a Titan Mini, GTX 770 has a few more accommodations to account for the differences between the products, but the end product is still remarkably Titan-like. In short, GTX 770 is still a 10.5” long card composed of a cast aluminum housing, a nickel-tipped heatsink, an aluminum baseplate, and a vapor chamber providing heat transfer between the GPU and the heatsink. The end result is that NVIDIA maintains Titan’s excellent cooling performance while also maintaining Titan’s solid feel and eye-catching design.
The story is much the same on the PCB and component selection. The PCB itself is Titan’s PCB retrofitted for use with GK104 instead of GK110, which amounts to a handful of differences. Besides a new memory layout suitable for a 256bit bus operating at 7GHz, the other big change here is that NVIDIA has scaled down the power circuitry slightly, from a 6+2 phase design for their GK110 cards to a 5+1 phase design for GTX 770, in reflection of GTX 770’s lower 230W TDP.
On that note, for those of you looking for clean pictures of the PCB and GPU, unfortunately you will be out of luck as NVIDIA used the same silk-screened Shin-Etsu thermal compound as they did for GTX Titan. This compound is great for transferring heat and a great thing for GTX 770 buyers, but its composition and application means that we can’t take apart these cards without irrevocably damaging their cooling capabilities, and at the same time NVIDIA didn’t take pictures of their own.
Anyhow, with all of the similarities between GTX 770 and GTX 780/Titan, we are otherwise looking at a card that could be mistaken for Titan if not for the “GTX 770” stamped into the card’s shroud. This means that the I/O options are also identical, with a set of 8pin + 6pin power sockets providing the necessary extra power, a pair of SLI connectors allowing for up to 3-way SLI, and the NVIDIA standard display output configuration of 2x DL-DVI, 1x HDMI, 1x DisplayPort 1.2.
Like GTX 780, we expect to see some interesting designs come out of NVIDIA’s partners. The Titan cooler sets an extremely high bar here given the fact that it was designed for a higher 250W TDP, meaning it’s slightly overpowered for GTX 770. Meanwhile NVIDIA’s Greenlight approval program means that their partners’ semi-custom and custom designs need to maintain roughly the same level of quality, hence the common use of open-air coolers.
|GeForce Clockspeed Bins|
|Clockspeed||GTX 770||GTX 680|
Moving on to overclocking, as this is a GPU Boost 2.0 part, overclocking will also operate in the same way it did on GTX 780, and yes, this includes overvolting. GTX 770’s maximum power target is 106% (244W), and a very mild overvoltage of +0.012v is available, unlocking one higher boost bin. This also means that GTX 770 follows the usual TDP and temperature throttling conditions, with a standard temperature throttle of 80C. In practice (at least on our reference card) GTX 770 typically reaches its highest clockspeeds before it reaches the TDP or temperature throttles, so these are mostly of use in concert with overvolting and the use of offset clocks.
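As a rough illustration of how these limits interact, the sketch below converts the 106% power target into watts and then applies a deliberately simplified throttle check. The 230W TDP, 106% target, 80C temperature throttle, 1046MHz base clock, and 1136MHz top boost bin come from this article; the function names, the 13MHz bin step, and the one-bin-at-a-time policy are assumptions for illustration and should not be read as NVIDIA’s actual GPU Boost 2.0 algorithm.

```python
TDP_W = 230
MAX_POWER_TARGET_PCT = 106
TEMP_THROTTLE_C = 80
BASE_MHZ = 1046
MAX_BOOST_MHZ = 1136
BIN_STEP_MHZ = 13  # assumed bin size, for illustration only


def max_power_limit_w(tdp_w: float, target_pct: float) -> float:
    """Power limit in watts at a given power target percentage."""
    return tdp_w * target_pct / 100.0


def next_clock_mhz(current_mhz: int, power_w: float, temp_c: float,
                   power_limit_w: float) -> int:
    """Step down one bin if either limit is exceeded, otherwise step up one bin.

    The result is clamped between the base clock and the top boost bin,
    which is a simplification made for this sketch.
    """
    if power_w > power_limit_w or temp_c > TEMP_THROTTLE_C:
        candidate = current_mhz - BIN_STEP_MHZ
    else:
        candidate = current_mhz + BIN_STEP_MHZ
    return max(BASE_MHZ, min(MAX_BOOST_MHZ, candidate))


limit = max_power_limit_w(TDP_W, MAX_POWER_TARGET_PCT)
print(f"Maximum power limit: {limit:.0f}W")
# Under both limits: the card stays at its top bin.
print(next_clock_mhz(1136, power_w=200, temp_c=78, power_limit_w=limit))
# Over the temperature throttle: the card drops one bin.
print(next_clock_mhz(1136, power_w=200, temp_c=83, power_limit_w=limit))
```

In this toy model a card that stays under both limits simply sits at its top bin, which matches the behavior described for the reference card.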
Finally, GTX 770 also includes the incremental fan speed improvements first introduced last week with GTX 780. So like GTX 780, GTX 770’s default fan controller programming is biased to react more slowly to temperature changes in order to minimize sudden shifts in fan speed.
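NVIDIA has not published the fan controller’s logic, so the following is only a generic sketch of how a controller can be biased to react slowly: an exponential smoothing filter that moves the fan a fraction of the way toward its target on each update. The function name, the `alpha` value, and the percentages are all hypothetical.

```python
def smooth_fan_speed(current_pct: float, target_pct: float, alpha: float = 0.1) -> float:
    """Move the fan speed a fraction `alpha` of the way toward the target."""
    return current_pct + alpha * (target_pct - current_pct)


speed = 30.0    # hypothetical starting fan speed, in percent
history = []
for _ in range(5):
    # The target jumps to 50%, but the fan only creeps toward it.
    speed = smooth_fan_speed(speed, 50.0)
    history.append(round(speed, 2))

print(history)
```

A smaller `alpha` gives a slower, steadier ramp at the cost of letting temperatures overshoot for longer, which is the trade-off any slow-reacting fan curve has to make.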
The 2GB Question & The Test
Before diving into our test results, I wanted to spend a moment mulling over NVIDIA’s choice for the default memory configuration on GTX 770. Due to the use of a 256bit bus on GK104, NVIDIA limits their practical memory choices to either 2GB of RAM or 4GB. A year ago this was fine even if it wasn’t as large as AMD’s 3GB memory pool, but that was after all a year ago.
Not unlike where we are with 1GB/2GB on mainstream ($150+) cards, we’re at a similar precipice with these enthusiast class cards. Having 2GB of RAM doesn’t impose any real problems today, but I’m left to wonder for how much longer that’s going to be true. The wildcard in all of this will be the next-generation consoles, each of which packs 8GB of RAM, which is quite a lot of RAM for video operations even after everything else is accounted for. With most PC games being ports of console games, there’s a decent risk of 2GB cards being undersized when used with high resolutions and the highest quality art assets. The worst case scenario is only that these highest quality assets may not be usable at playable performance, but considering the high performance of every other aspect of GTX 770 that would be a distinct and unfortunate bottleneck.
The solution for better or worse is doubling the GTX 770 to 4GB. GTX 770 is capable of housing 4GB, and NVIDIA’s partners will be selling 4GB cards in the near future, so 4GB cards will at least be an option. The price premium for 4GB of RAM looks to be around $20-$30, and I expect that will come down some as 4Gb chips start to replace 2Gb chips. 4GB would certainly make the GTX 770 future-proof in that respect, and I suspect it’s a good idea for anyone on a long upgrade cycle, but as always this is a bit of a gamble.
Though I can’t help but feel NVIDIA could have simply sidestepped the whole issue by making 4GB the default, rather than an optional upgrade. As it stands 2GB feels shortsighted, and for a $400 card, a bit small. Given the low cost of additional RAM, a 4GB baseline likely would have been bearable.
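The 2GB-or-4GB constraint falls out of how GDDR5 chips attach to the bus. The sketch below assumes each GDDR5 chip uses a 32-bit interface, or 16 bits when two chips share a channel in clamshell mode; those details are not stated in this article, and the function name is illustrative.

```python
GDDR5_CHIP_IO_BITS = 32  # assumption: interface width of one GDDR5 chip


def capacity_gb(bus_width_bits: int, chip_density_gbit: int, clamshell: bool = False) -> float:
    """Total capacity in GB for a bus fully populated with identical chips.

    In clamshell mode two chips share each 32-bit channel (16 bits each),
    doubling the chip count without widening the bus.
    """
    chips = bus_width_bits // GDDR5_CHIP_IO_BITS
    if clamshell:
        chips *= 2
    return chips * chip_density_gbit / 8  # 8 Gbit per GB


print(capacity_gb(256, 2))                  # 8 x 2Gb chips
print(capacity_gb(256, 2, clamshell=True))  # 16 x 2Gb chips
print(capacity_gb(256, 4))                  # 8 x 4Gb chips
```

Under the same assumptions a 384-bit bus with 2Gb chips works out to 3GB, which is how AMD’s 7970 cards arrive at their larger memory pool.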
For today’s launch article we’re using NVIDIA’s 320.18 drivers for the GTX 780 and GTX 770, and AMD’s Catalyst 13.5b2 drivers for all AMD cards.
|CPU:||Intel Core i7-3960X @ 4.3GHz|
|Motherboard:||EVGA X79 SLI|
|Power Supply:||Antec True Power Quattro 1200|
|Hard Disk:||Samsung 470 (256GB)|
|Memory:||G.Skill Ripjaws DDR3-1867 4 x 4GB (8-10-9-26)|
|Case:||Thermaltake Spedo Advance|
AMD Radeon HD 7970 GHz Edition
AMD Radeon HD 7990
NVIDIA GeForce GTX 580
NVIDIA GeForce GTX 680
NVIDIA GeForce GTX 690
NVIDIA GeForce GTX 780
NVIDIA GeForce GTX Titan
NVIDIA ForceWare 320.14
NVIDIA ForceWare 320.18
AMD Catalyst 13.5 Beta 2
|OS:||Windows 8 Pro|
As always, starting off our benchmark collection is our racing benchmark, DiRT: Showdown. DiRT: Showdown is based on the latest iteration of Codemasters’ EGO engine, which has continually evolved over the years to add more advanced rendering features. It was one of the first games to implement tessellation, and also one of the first games to implement a DirectCompute based forward-rendering compatible lighting system. At the same time, as Codemasters is by far the most prevalent PC racing developer, it’s also a good proxy for some of the other racing games on the market like F1 and GRID.
Despite the fact that it’s a $400 card, GTX 770 straddles the line between being a card best suited for 2560x1440, and a card best suited for 1920x1080. With GTX 780 and above we could get away with 2560 on the highest settings in most games, but with GTX 770 there will at times be compromises, either in quality/resolution, or dropping below 60fps.
In any case, DiRT: Showdown remains a troublesome title for NVIDIA. With its advanced lighting system on, GTX 770 trails the 7970 – let alone the 7970GE – at every resolution. For GTX 780 this wasn’t a problem, but for GTX 770 this means dropping below 60fps at 2560.
Total War: Shogun 2
Our next benchmark is Shogun 2, which is a continuing favorite in our benchmark suite. Total War: Shogun 2 is the latest installment of the long-running Total War series of turn based strategy games, and alongside Civilization V is notable for just how many units it can put on a screen at once. Even 2 years after its release it’s still a very punishing game at its highest settings due to the amount of shading and memory those units require.
With Shogun 2 we see an immediate advantage for NVIDIA and the GTX 770 in particular. Where the 7970GE and GTX 680 were generally tied here, the GTX 770 pulls ahead by several percent. It won’t get GTX 770 to 50fps at 2560, but it at least gets it into the 40s.
Altogether this gives the GTX 770 a 19% advantage over the 7970GE. Meanwhile against past generation NVIDIA cards the gains range from 14% over the GTX 680 to very nearly double the performance of the GTX 570. The GTX 680 comparison ends up being particularly interesting, as this is well ahead of where clockspeed alone should have pushed the GTX 770. In this case Shogun 2 seems to especially benefit from that extra memory bandwidth.
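One way to see why memory bandwidth is the likely explanation is to compare the observed gain against the two theoretical gains quoted earlier in this review. The sketch below uses only those figures (the 4% base clock increase, the 6GHz to 7GHz memory change, and the 14% Shogun 2 result); it does not attempt to account for GPU Boost 2.0 holding higher clocks more often.

```python
clock_gain = 0.04           # base clock increase over GTX 680
bandwidth_gain = 7 / 6 - 1  # 6GHz -> 7GHz GDDR5 on the same bus
observed_gain = 0.14        # Shogun 2 gain over GTX 680

low, high = sorted((clock_gain, bandwidth_gain))

print(f"Clock-only expectation:     +{clock_gain:.0%}")
print(f"Bandwidth-only expectation: +{bandwidth_gain:.0%}")
print(f"Observed:                   +{observed_gain:.0%}")
print("Observed gain sits between the two:", low < observed_gain < high)
```

A result much closer to the bandwidth figure than to the clock figure is what a largely bandwidth-bound workload would be expected to look like.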
The third game in our lineup is Hitman: Absolution. The latest game in Square Enix’s stealth-action series, Hitman: Absolution is a DirectX 11 based title that though a bit heavy on the CPU, can give most GPUs a run for their money. Furthermore it has a built-in benchmark, which gives it a level of standardization that fewer and fewer benchmarks possess.
Under Hitman the GTX 770 once more goes back to trailing AMD’s 7970 cards. The GTX 770 and 7970 vanilla are neck-and-neck at times, and meanwhile it trails the faster 7970GE by about 13% at 2560, and 17% at 1920. Compared to the NVIDIA stable, meanwhile, the GTX 770 is just 2% faster than the GTX 680, and even the gains over the GTX 570 are only 54%. Dropping to 1920 with 4x MSAA however opens a much wider gap between the GTX 770 and GTX 680, once again playing off the former’s significant memory bandwidth advantage.
Minimum framerates are essentially the same story as the average framerates, with the GTX 770 trailing AMD’s cards. Though for anything with an absolute minimum over 60fps we have to look at 1920 with high settings regardless.
Another Square Enix game, Sleeping Dogs is one of the few open world games to be released with any kind of benchmark, giving us a unique opportunity to benchmark an open world game. Like most console ports, Sleeping Dogs’ base assets are not extremely demanding, but it makes up for it with its interesting anti-aliasing implementation, a mix of FXAA and SSAA that at its highest settings does an impeccable job of removing jaggies. However by effectively rendering the game world multiple times over, it can also require a very powerful video card to drive these high AA modes.
Sleeping Dogs is another title AMD generally does well in, but in this case not so well that it shuts out the GTX 770. At 2560 it’s fast enough to pull ahead of the 7970, while still falling behind the 7970GE by a few FPS. Meanwhile the GTX 770’s 10% gain over the GTX 680 is solid and once again memory bandwidth likely plays a factor, while against the GTX 570 it’s ahead by 76%.
The GTX 770 doesn’t fare quite as well when it comes to minimum framerates. It’s not a poor showing, but it loses more ground than AMD’s cards do here, leaving it a few FPS behind the 7970.
Up next is our legacy title for 2013, Crysis: Warhead. The stand-alone expansion to 2007’s Crysis, at over 4 years old Crysis: Warhead can still beat most systems down. Crysis was intended to be future-looking as far as performance and visual quality goes, and it has clearly achieved that. We’ve only finally reached the point where single-GPU cards have come out that can hit 60fps at 1920 with 4xAA.
Crysis: Warhead is another title that generally favors AMD cards, to the GTX 770’s detriment. Not that anyone does particularly well at 2560, while at 1920 with Enthusiast quality we see the GTX 770 trailing the 7970 by a couple of frames per second, and the 7970GE by several more (13%). The extra memory bandwidth is helping the GTX 770 to some extent here, pushing it above the GTX 680 by 7%, but it’s not a title GK104 excels at, with the GTX 770 only surpassing the GTX 570 by 57%.
Minimum framerates are generally a repeat of our average framerates here, leading to the GTX 770 falling behind both AMD cards. Even the gains over the GTX 570 aren’t very good, with just a 39% improvement at 1920.
Far Cry 3
The next game in our benchmark suite is Far Cry 3, Ubisoft’s island-jungle action game. A lot like our other jungle game Crysis, Far Cry 3 can be quite tough on GPUs, especially with MSAA and improved alpha-to-coverage checking thrown into the mix. On the other hand it’s still a bit of a pig on the CPU side, and seemingly inexplicably we’ve found that it doesn’t play well with HyperThreading on our testbed, making this the only game we’ve ever had to disable HT for to maximize our framerates.
With Far Cry 3 we shift to a set of games that historically favor NVIDIA’s cards, and as it turns out benefit the GTX 770 to a pretty big degree. At 2560 we’re looking at a 13% advantage over the 7970GE, rising to 17% at 1920 without MSAA. The GTX 680 and GTX 570 are also summarily put in their place, with the GTX 770 gaining on them by 11% and 85% respectively.
Our final multiplayer action game of our benchmark suite is Battlefield 3, DICE’s 2011 multiplayer military shooter. Its ability to pose a significant challenge to GPUs has been dulled some by time and drivers, but it’s still a challenge if you want to hit the highest settings at the highest resolutions at the highest anti-aliasing levels. Furthermore while we can crack 60fps in single player mode, our rule of thumb here is that multiplayer framerates will dip to half our single player framerates, so hitting high framerates here may not be high enough.
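The rule of thumb above is easy to turn into a quick check. The helper names below are illustrative, and the halving factor is the rule of thumb itself rather than a measured constant.

```python
def estimated_multiplayer_fps(single_player_fps: float) -> float:
    """Rule of thumb: multiplayer framerates dip to half the single player rate."""
    return single_player_fps / 2


def single_player_fps_needed(multiplayer_floor_fps: float) -> float:
    """Single player average needed to keep multiplayer dips above a floor."""
    return multiplayer_floor_fps * 2


# A 60fps single player average implies dips to about 30fps in multiplayer.
print(estimated_multiplayer_fps(60))
# Holding 60fps in multiplayer would take about 120fps in single player.
print(single_player_fps_needed(60))
```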
GTX 770 is going to struggle with BF3 at 2560 with everything turned up, but dropping down to 1920 is enough to get the average framerate well above 60fps, and consequently the minimum framerates should be well above 30fps too. In this case this is another game NVIDIA traditionally excels at, leading to a 15% performance advantage over the 7970GE. The gains against the GTX 680 are far more muted at just 6% - indicating that we’re not seeing much of a benefit from more memory bandwidth – while it’s another very big step up from the GTX 570 at 83%.
Our other strategy game, Civilization V, gives us an interesting look at things that other RTSes cannot match, with a much weaker focus on shading in the game world and a much greater focus on creating the geometry needed to bring such a world to life. In doing so it uses a slew of DirectX 11 technologies, including tessellation for said geometry, driver command lists for reducing CPU overhead, and compute shaders for on-the-fly texture decompression.
Civ V is another game that traditionally goes to NVIDIA, though in this case the lead is only 8% at 2560. The GTX 770 gains more on the GTX 680 than we would have expected however, with a similar 8%.
Bioshock Infinite is Irrational Games’ latest entry in the Bioshock franchise. Though it’s based on Unreal Engine 3 – making it our obligatory UE3 game – Irrational has added a number of effects that make the game rather GPU-intensive on its highest settings. As an added bonus it includes a built-in benchmark composed of several scenes, a rarity for UE3 engine games, so we can easily get a good representation of what Bioshock’s performance is like.
Bioshock is another title where we can’t quite hit 60fps at 2560 with everything turned up, forcing us to look at 1920 if we want to hit 60fps. This gives the GTX 770 another relatively sizable lead over the 7970GE at 16%, and almost no lead over the GTX 680 at just 3%. Against GTX 570 however it’s another very strong showing, besting it by nothing shy of 83%.
Our final benchmark in our suite needs no introduction. With Crysis 3, Crytek has gone back to trying to kill computers, taking back the “most punishing game” title in our benchmark suite. Only in a handful of setups can we even run Crysis 3 at its highest (Very High) settings, and that’s still without AA. Crysis 1 was an excellent template for the kind of performance required to drive games for the next few years, and Crysis 3 looks to be much the same for 2013.
Unsurprisingly, Crysis 3 is another game where 2560 isn’t really on the table. In fact we have to go all the way to 1920 at High settings to get a framerate above 60fps. By this point the GTX 770 leads over the 7970GE by 16%, a smaller 7% over the GTX 680, and 76% over the GTX 570.
As always we’ll also take a quick look at synthetic performance, though as GTX 770 is just another GK104 card, there shouldn't be any surprises here. These tests are mostly for comparing cards from within a manufacturer, as opposed to directly comparing AMD and NVIDIA cards. We’ll start with 3DMark Vantage’s Pixel Fill test.
Thanks to the combination of core clock and memory clock increases, GTX 770 doesn’t just pass GTX 680, but even GTX 780. This is a bit paradoxical at first, but it’s worth keeping in mind that this is a pixel throughput test, and GTX 770’s ROPs are clocked a good 200MHz+ higher than GTX 780’s. In real games GTX 780 is still going to have the edge, but in edge cases having such a clockspeed gap can lead to some unusual outcomes.
Moving on, we have our 3DMark Vantage texture fillrate test, which does for texels and texture mapping units what the previous test does for ROPs.
Texel throughput on the other hand is all about the core clocks; the improvement over the GTX 680 is barely at 3%, and GTX 770 is well behind GTX 780.
Finally we’ll take a quick look at tessellation performance with TessMark.
GTX 770 pulls ahead in TessMark by more than we would expect given the fact that this test should be largely GPU limited. This may be a case of GTX 770 squeezing out a bit more performance due to GPU Boost 2.0.
Jumping into compute, we aren’t expecting too much here. Outside of DirectCompute GK104 is generally a poor compute GPU, and other than the clockspeed boost GTX 770 doesn’t have much going for it.
As always we'll start with our DirectCompute game example, Civilization V, which uses DirectCompute to decompress textures on the fly. Civ V includes a sub-benchmark that exclusively tests the speed of their texture decompression algorithm by repeatedly decompressing the textures required for one of the game’s leader scenes. While DirectCompute is used in many games, this is one of the only games with a benchmark that can isolate the use of DirectCompute and its resulting performance.
Civilization V at least shows that NVIDIA’s DirectCompute performance is up to snuff in this case. Though as is the case with GTX 780, we’re reaching the limits of what this benchmark can do, due to just how fast modern cards have become.
Our next benchmark is LuxMark2.0, the official benchmark of SmallLuxGPU 2.0. SmallLuxGPU is an OpenCL accelerated ray tracer that is part of the larger LuxRender suite. Ray tracing has become a stronghold for GPUs in recent years as ray tracing maps well to GPU pipelines, allowing artists to render scenes much more quickly than with CPUs alone.
Moving on to a more general compute task, we get a reminder of how poor GK104 is here. GTX 770 can beat the slower GK104 products, and that’s it. Even GTX 570 is faster, never mind the massive lead that 7970GE holds.
Our 3rd benchmark set comes from CLBenchmark 1.1. CLBenchmark contains a number of subtests; we’re focusing on the most practical of them, the computer vision test and the fluid simulation test. The former being a useful proxy for computer imaging tasks where systems are required to parse images and identify features (e.g. humans), while fluid simulations are common in professional graphics work and games alike.
CLBenchmark paints GTX 770 in a better light than LuxMark, but not by a great deal. The gains over the GTX 680 are minuscule since these benchmarks aren’t memory bandwidth limited, and the gap between it and the 7970GE is nothing short of enormous.
Moving on, our 4th compute benchmark is FAHBench, the official Folding @ Home benchmark. Folding @ Home is the popular Stanford-backed research and distributed computing initiative that has work distributed to millions of volunteer computers over the internet, each of which is responsible for a tiny slice of a protein folding simulation. FAHBench can test both single precision and double precision floating point performance, with single precision being the most useful metric for most consumer cards due to their low double precision performance. Each precision has two modes, explicit and implicit, the difference being whether water atoms are included in the simulation, which adds quite a bit of work and overhead. This is another OpenCL test, as Folding @ Home has moved exclusively to OpenCL this year with FAHCore 17.
Recent core improvements in Folding @ Home continue to pay off for NVIDIA. In single precision the GTX 770 is just fast enough to hang with the 7970 vanilla, though the 7970GE is still over 10% faster. Double precision on the other hand is entirely in AMD’s favor thanks to GK104’s very poor FP64 performance.
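To put the FP64 deficit in numbers, the sketch below estimates theoretical throughput at the base clock. The 8 SMXes, 1046MHz base clock, and 1/24 FP64 rate come from this article; the 192 CUDA cores per SMX and 2 FLOPs per core per clock (one fused multiply-add) are assumptions about Kepler that are not stated here.

```python
SMX_COUNT = 8
CORES_PER_SMX = 192      # assumption: Kepler SMX width, not stated in the article
FLOPS_PER_CORE_CLK = 2   # assumption: one fused multiply-add per clock
BASE_CLOCK_GHZ = 1.046
FP64_RATIO = 1 / 24      # from the specification table

# Theoretical single precision throughput at the base clock.
fp32_gflops = SMX_COUNT * CORES_PER_SMX * FLOPS_PER_CORE_CLK * BASE_CLOCK_GHZ
# Double precision runs at a fixed fraction of the single precision rate.
fp64_gflops = fp32_gflops * FP64_RATIO

print(f"FP32: {fp32_gflops:.0f} GFLOPS")
print(f"FP64: {fp64_gflops:.0f} GFLOPS")
```

These are peak figures at the base clock; real workloads will land below them, and the boost clock will move both numbers up proportionally.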
Wrapping things up, our final compute benchmark is an in-house project developed by our very own Dr. Ian Cutress. SystemCompute is our first C++ AMP benchmark, utilizing Microsoft’s simple C++ extensions to allow the easy use of GPU computing in C++ programs. SystemCompute in turn is a collection of benchmarks for several different fundamental compute algorithms, as described in this previous article, with the final score represented in points. DirectCompute is the compute backend for C++ AMP on Windows, so this forms our other DirectCompute test.
Unlike our other compute benchmarks, SystemCompute is at least a little bit memory bandwidth sensitive, so GTX 770 pulls ahead of GTX 680 by 11%. Otherwise like every other compute benchmark, AMD’s cards fare far better here.
Power, Temperature, & Noise
As always, last but not least is our look at power, temperature, and noise. Next to price and performance of course, these are some of the most important aspects of a GPU, due in large part to the impact of noise. All things considered, a loud card is undesirable unless there’s a sufficiently good reason – or sufficiently good performance – to ignore the noise.
GTX 770 ends up being an interesting case study in all 3 factors due to the fact that NVIDIA is pushing the GK104 GPU so hard. Though the old version of GPU Boost muddles things some, there’s no denying that higher clockspeeds coupled with the higher voltages needed to reach those clockspeeds have a notable impact on power consumption. This makes it very hard for NVIDIA to stick to their efficiency curve, since adding voltages and clockspeeds offers diminishing returns for the increase in power consumption.
|GeForce GTX 770 Voltages|
|GTX 770 Max Boost||GTX 680 Max Boost||GTX 770 Idle|
As we can see, NVIDIA has pushed up their voltage from 1.175v on GTX 680 to 1.2v on GTX 770. This buys them the increased clockspeeds they need, but it will drive up power consumption. At the same time GPU Boost 2.0 helps to counter this some, as it will keep leakage from being overwhelming by keeping GPU temperatures at or below 80C.
|GeForce GTX 770 Average Clockspeeds|
|Max Boost Clock||1136MHz|
|Far Cry 3||
Speaking of clockspeeds, we also took the average clockspeeds for GTX 770 in our games. In short, GTX 770 is almost always at its maximum boost bin of 1136MHz; the oversized Titan cooler keeps temperatures just below the thermal throttle, and there’s enough TDP headroom left that the card doesn’t need to pull back to avoid that. This is one of the reasons why GTX 770’s performance advantage over GTX 680 is greater than the clockspeed increases alone.
We don’t normally publish this data, but GTX 770 has an extra interesting attribute about it: its idle clockspeed is lower than other Kepler parts. GTX 680 and GTX 780 both idle at 324MHz, but GTX 770 idles at 135MHz. Even 324MHz has proven low enough to keep Kepler’s idle power in check in the past, so it’s not entirely clear just what NVIDIA is expecting here. We’re seeing 1W less at the wall, but by this point the rest of our testbed is drowning out the video card.
Moving on to BF3 power consumption, we can see the power cost of GTX 770’s performance. 374W at the wall is only 18W more than GTX 680, thanks in part to the fact that GTX 770 isn’t hitting its TDP limit here. At the same time compared to the outgoing GTX 670, this is a 44W difference. This makes it very clear that GTX 770 is not a drop-in replacement for GTX 670 as far as power and cooling go. On the other hand GTX 770 and GTX 570 are very close, even if GTX 770’s TDP is technically a bit higher than GTX 570’s.
Despite this runup, GTX 770 still stays a hair under 7970GE, even with the slightly higher CPU power consumption that comes from GTX 770’s higher performance in this benchmark. It’s only 6W at the wall, but it showcases that NVIDIA didn’t have to completely blow their efficiency curve to get a GK104 card back up to 7970GE performance levels.
In our TDP constrained scenario we can see the gaps between our cards grow. 78W separates the GTX 770 from GTX 670, and even GTX 680 draws 41W less, almost exactly what we’d expect from their published TDPs. On the flip side of the coin 383W is still less than both 7970 cards, reflecting the fact that GTX 770 is geared for 230W while AMD’s best is geared for 250W.
This is also a reminder however that for a mid-generation product extra performance does not come for free. With the same process and the same architecture, performance increases require power increases. This won’t significantly change until we see 20nm cards next year.
Moving on to temperatures, these are going to be a walk in the park for the reference GTX 770 due to the Titan cooler. At idle we see it hit 31C, which is actually 1C warmer than GTX 780, but this really just comes down to uncontrollable variations in our tests.
As a GPU Boost 2.0 card, GTX 770 will see its temperatures top out at 80C in games, and that’s exactly what happens here. Interestingly, GTX 770 is just hitting 80C, as evidenced by our clockspeeds earlier. If it were running hotter, it would have needed to drop to lower clockspeeds.
Of course it doesn’t hold a candle here to 7970GE, but that’s the difference between a blower and an open air cooler in action. The blower based 7970 is much closer, as we’d expect.
Under FurMark the temperature situation is largely the same. The GTX 770 comes up to 82C here (favoring TDP throttling over temperature throttling), but the relative rankings are consistent.
With Titan’s cooler in tow, idle noise looks very good on GTX 770.
Our noise results under Battlefield 3 are a big part of the reason we’ve been calling the Titan cooler oversized for GTX 770. When was the last time we saw a blower on a 230W card that only hit 46.7dB? The short answer is never. GTX 770’s fan simply doesn’t have to rev up very much to handle the lesser heat output. In fact it’s damn near competitive with the open air cooled 7970GE; there’s still a difference, but it’s under 2dB. More importantly however, despite being a more powerful and more power-hungry card than the GTX 680, the GTX 770 is over 5dB quieter, and this is despite the fact that the GTX 680 is already a solid card on its own. Titan’s cooler is certainly expensive, but it gets results.
Of course this is why it’s all the more a shame that none of NVIDIA’s partners are releasing retail cards with this cooler. There are some blowers in the pipeline, so it will be interesting to see if they can maintain Titan’s performance while giving up the metal.
With FurMark pushing our GTX 770 at full TDP, our noise results are still good, but not as impressive as they were under BF3. 50.3dB is still over a dB quieter than GTX 680, though obviously much closer than before. On the other hand the GTX 770 ever so slightly edges out the 7970GE and its open air cooler. Part of this comes down to the TDP difference of course, but beating an open air cooler like that is still quite the feat.
Wrapping things up here, it will be interesting to see where NVIDIA’s partners go with their custom designs. GTX 770, despite being a higher TDP part than both GTX 670 and GTX 680, ends up looking very impressive when it comes to noise, and it would be great to see NVIDIA’s partners match that. At the same time the increased power consumption and heat generation relative to the GeForce 600 series is unfortunate, but not unexpected. But for buyers coming from the GeForce 400 and GeForce 500 series, GTX 770 is in-line with what those previous generation cards were already pulling.
Although GTX 770 is already a very high clocked part for GK104, we still wanted to put it through its paces when it comes to overclocking. Of particular interest here is actually memory overclocking, as this is the first video card shipping with 7GHz GDDR5 standard. This will let us poke at things to see just how far both the RAM itself and NVIDIA’s memory controller can go.
Meanwhile the switch to GPU Boost 2.0 for GTX 770 is going to change the overclocking process somewhat compared to GTX 680 and GTX 670. Overvolting introduces marginally higher voltages and boost bins to play with, while on the other hand the removal of power targets in favor of TDP means that we only get 106% – an extra 14W – to play with in TDP limited scenarios. Thankfully as we’ve seen we’re generally not TDP limited on GTX 770 at stock, which means our effective headroom should be greater than that.
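To make the 106% figure concrete, here is a minimal sketch of the headroom arithmetic, assuming the 230W TDP quoted elsewhere in this review. The function name is our own and purely illustrative.

```python
def tdp_headroom_watts(tdp_watts, limit_percent):
    """Extra board power allowed when the TDP limit is raised above 100%."""
    return tdp_watts * (limit_percent - 100) / 100


extra = tdp_headroom_watts(230, 106)
print(f"Extra power at 106%: {extra:.1f}W (new ceiling {230 + extra:.1f}W)")
# prints "Extra power at 106%: 13.8W (new ceiling 243.8W)"
```

The 13.8W result is what rounds to the 14W quoted above.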
|GeForce GTX 770 Overclocking|
|Max Boost Clock||1136MHz||1241MHz|
We’re actually a bit surprised we were able to get another 100MHz out of the GPU itself. Even without the extra overvoltage boost bin, we’re still pushing 1200MHz+ on 1.2v, which is doing rather well for GK104. Of course this is only a 9% increase in the GPU clockspeed, which is going to pale in comparison to parts like GTX 670 and GTX 780, each of which can do 20%+ due to their lower clockspeeds. So there’s some overclocking headroom in GTX 770, but as is to be expected it's not a lot.
More interesting however is the memory overclock. We’ve been able to put another 1GHz on 6GHz GTX 680 cards in the past, and with the 7GHz base GTX 770 we’ve been able to pull off a similar overclock, pushing our GTX 770 to an 8GHz memory clock. The fact that NVIDIA’s memory controller can pull this off is nothing short of impressive; we had expected there to be some headroom, but another 14% is beyond our expectations. At this clockspeed the GTX 770 has a full 256GB/sec of memory bandwidth, 33% more than both a stock GTX 680 and the 384-bit GTX 580. Of course we’ll see if GTX 770 can put that bandwidth to good use.
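The bandwidth figures above follow from the usual relationship of effective data rate times bus width, divided by 8 bits per byte. The short sketch below works through it for the 256-bit GK104 cards; the helper name is hypothetical, and the clocks are the effective GDDR5 data rates quoted in this review.

```python
def memory_bandwidth_gb_s(effective_clock_ghz, bus_width_bits):
    """Peak memory bandwidth in GB/sec from effective data rate and bus width."""
    return effective_clock_ghz * bus_width_bits / 8


gtx680_stock = memory_bandwidth_gb_s(6, 256)   # 192.0
gtx770_stock = memory_bandwidth_gb_s(7, 256)   # 224.0
gtx770_oc = memory_bandwidth_gb_s(8, 256)      # 256.0

print(f"GTX 770 overclocked vs. stock GTX 680: "
      f"{100 * (gtx770_oc / gtx680_stock - 1):.0f}% more bandwidth")
# prints "GTX 770 overclocked vs. stock GTX 680: 33% more bandwidth"
```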
The end result of our overclocking efforts nets a very consistent 9%-12% increase in performance across our games. 9% is the upper bound for improvements due to GPU overclocking, so anything past that means we’re also benefitting from the extra memory bandwidth. We aren’t picking up a ton of performance from memory bandwidth as far as we can tell, but it does pay off and is worth pursuing, even with the GTX 770’s base memory clock of 7GHz.
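The reasoning here can be sketched in a few lines, under the assumption that performance scales at most linearly with core clockspeed. The function names are ours, and the 12% and 9% inputs are simply the two ends of the range quoted above rather than results from any specific game.

```python
def percent_gain(stock, overclocked):
    """Percentage increase from a stock value to an overclocked value."""
    return 100 * (overclocked / stock - 1)


def bandwidth_contributed(measured_gain_percent, core_gain_percent):
    """True if a measured gain exceeds what the core overclock alone allows.

    Assumption: performance scales at most linearly with core clockspeed.
    """
    return measured_gain_percent > core_gain_percent


core_gain = percent_gain(1136, 1241)   # about 9.2%
memory_gain = percent_gain(7, 8)       # about 14.3%

print(f"Core: {core_gain:.1f}%, memory: {memory_gain:.1f}%")
print(bandwidth_contributed(12, core_gain))  # True: memory bandwidth helped
print(bandwidth_contributed(9, core_gain))   # False: core overclock explains it
```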
Overall overclocking can help close the gap between the GTX 770 and 7970GE in some games, and extend it in others. But 10% won’t completely close the gap on the GTX 780; at best it can halve it. GTX 780’s stock performance is simply not attainable without the much more powerful GK110 GPU.
Moving on to power consumption, we can see that the 106% TDP limit keeps power usage from jumping up by too much. In Battlefield 3 this is a further 12W at the wall, and 21W at the wall with FurMark. In games this means our power usage at the wall is still below GTX 780, though we’ve equaled it under FurMark.
The fan curve for GTX 770 appears to be identical to that of GTX 780, which is to say the fan significantly ramps up around 84C, keeping temperatures in the low-to-mid 80s even though GPU Boost 2.0 is allowed to go up to 95C.
Finally for fan noise, we see a small increase under Battlefield 3, and no change under FurMark. 1.5dB louder under Battlefield 3 puts noise levels on par with the GTX 780, sacrificing some of GTX 770’s abnormally quiet acoustics, but still keeping noise below the 50dB level. Or to put this another way, the performance gains for overclocking aren’t particularly high, but then again neither is the cost of overclocking in terms of noise.
This current generation of video cards has been something of a rollercoaster ride in both performance and leadership. In the last 18 months we’ve seen AMD take the lead with Radeon HD 7970, unexpectedly lose it to GeForce GTX 680, gain it again with Radeon HD 7970 GE and greatly improved drivers, and then break even in the end with GTX 770. GTX 780 and GTX Titan make all of this moot with their much greater single-GPU performance, but priced as they are they’re also nowhere near being in the same market segment as the GTX 770 and 7970GE.
In any case, more than anything else it strikes us as particularly funny that we’re once again looking at a tie. That’s right: on average GTX 770 and 7970GE are tied. GTX 770 delivers 102% of the performance of 7970GE at both our high quality 2560x1440 and high quality 1920x1080 settings. Of course as with some of the past battles between AMD and NVIDIA in this segment, these cards may be tied in our benchmarks but they’re anything but equal.
After all is said and done, the GTX 770 ends up beating the 7970GE at 6 games, while the 7970GE takes the other 4. Meanwhile within those individual games we’ll see anything from a near-tie to a very significant 20% advantage for either side, depending on the game in question. This is very much a repeat of what we saw with the GTX 680 versus the 7970GE, and GTX 670 versus the 7970.
Our advice then for prospective buyers is to first look at benchmarks for the games they intend to play. If you’re going to be focused on only a couple of games for the near future then there’s a very good chance one card or the other is going to be the best fit. Otherwise for gamers facing a wide selection of games or looking at future games where their performance is unknown, then the GTX 770 and 7970GE are in fact tied, and from a performance perspective you couldn’t go wrong with either one.
With that said, there are a couple of wildcard factors in play here that can tilt things in either side’s favor. At $399 the GTX 770 is cheaper than the 7970GE by $20 to $50, depending on the model and whether there’s a sale going on (the 7970 is actually priced closer, but we’d consider the 7970GE the better value for AMD cards). Consumers at virtually every level are still very price-conscious, so that’s going to put AMD in a pinch as they need 7970GE, not 7970 vanilla, to match GTX 770.
At the same time however, given the fact that we’re looking at a performance tie, AMD is making a very serious effort to offer more value than NVIDIA through their Level Up with Never Settle Reloaded gaming bundle. These bundles are non-tangible items – the value of which is solely in the eye of the beholder – but for a buyer interested in those games it’s going to be a very convincing argument. And then there’s compute performance and the amount of included RAM, both of which continue to favor AMD, though admittedly this is nothing new.
Meanwhile on a side note, it’s interesting to note that, as evidenced by this launch, AMD has pushed NVIDIA to the point where NVIDIA has generally sacrificed their efficiency advantage to reach performance parity at a $400 price point. At the launch of the 7970GE NVIDIA at least tied the 250W 7970GE with a 195W GTX 680, giving NVIDIA an efficiency advantage. But now with the launch of the GTX 770 NVIDIA needs a 230W card to match that very same 250W 7970GE, a testament to AMD’s driver improvements and a reflection of the fact that just like AMD, NVIDIA needed to push a GPU to its limits to get here. There are still some edge cases here worth considering – you can’t get 7970GE on a blower for example – but under gaming workloads AMD and NVIDIA’s power consumption and heat generation have been equalized, making these cards more tied than ever before.
Ultimately a tie is a wonderful thing and a frustrating thing at the same time, and that’s definitely the case here with the launch of the GTX 770. The wonderful aspect of it is that NVIDIA and AMD are once again locked in vicious, brutal combat around the $400 price point. It has brought performance up and prices down in the middle of a generation, improving the options for all customers. The frustrating aspect on the other hand is that having a clear winner makes customers feel better as it removes any question about whether they’ve made the right choice. After all it’s much easier to make a choice when there’s really no choice to be made.
Moving on to some other comparisons, though we’ve focused mostly on the immediate competition, for those buyers on an upgrade cycle things have panned out pretty much as expected. The GTX 770 delivers an average performance improvement of 75% over the two-and-a-half year old GTX 570, which is roughly what we’d expect for jumping from one mid-generation card to another, and at $399 it is reasonably priced as an upgrade. The performance improvement from the GTX 670 is much smaller at just 20%, but GTX 770 is clearly not targeted at GTX 670 owners as an upgrade. At the same time it’s interesting to note that between the higher core clockspeed, higher memory clockspeed, and higher TDP plus GPU Boost 2.0 found on GTX 770, NVIDIA has improved their performance over GTX 680 by just 7% on average. This isn’t a lot in and of itself, but we’re talking about replacing a $450 video card with a $400 video card that’s faster across the board, so it’s a nice way to raise the bar on performance while bringing prices down.
Wrapping things up, this should set the stage for the enthusiast/high-end market for the rest of the year. According to AMD’s last schedule they won’t have a new high-end part to replace Tahiti until the end of the year, and NVIDIA won’t have Maxwell until 2014; all of this being complicated by the fact that TSMC’s 20nm process is still so far out. NVIDIA still has the rest of the GeForce 700 lineup to roll out through the next few months, but for the GTX 770 and the 7970/7970GE, the rest of the year will be a battle of prices and bundles.