AMD's Radeon HD 6990: The New Single Card King
by Ryan Smith on March 8, 2011 12:01 AM EST
Posted in: AMD, Radeon HD 6990, GPUs
The AMD Radeon HD 6990, otherwise known as Antilles, is a card we have been expecting for some time now. In what’s become a normal AMD fashion, when they first introduced the Radeon HD 6800 series back in October, they also provided a rough timeline for the rest of the high-end members of the family. Barts would be followed by Cayman (6950/6970), which would be followed by the dual-GPU Antilles (6990).
AMD’s original launch schedule at the time was to have the whole stack out the door by the end of 2010, with Antilles as the last product, likely timed to catch Christmas before it was too late. What ended up happening, however, is that Cayman didn’t make it out until the middle of December, which put those original plans on ice. So we closed out the year with the 6800 series and the single-GPU members of the 6900 series, but without a replacement for AMD’s flagship dual-GPU card. That left AMD’s product stack in an odd place: the top card was a 5000 series part, while the 6000 series occupied everything below it.
So while we’ve had to wait longer than we anticipated for Antilles/6990, the wait has finally come to an end. Today AMD is launching their new flagship card, retiring the now venerable 5970 and replacing it with a new dual-GPU monster powered by AMD’s recently introduced VLIW4 design. Because Cayman is manufactured on the same 40nm process as the GPUs in the 5970, AMD has had to go to some interesting lengths to improve performance here. And as we’ll see, it’s going to be a doozy in more ways than one.
Specification | AMD Radeon HD 6990 | AMD Radeon HD 6970 | AMD Radeon HD 6950 | AMD Radeon HD 5970 |
Stream Processors | 2x1536 | 1536 | 1408 | 2x1600 |
Texture Units | 2x96 | 96 | 88 | 2x80 |
ROPs | 2x32 | 32 | 32 | 2x32 |
Core Clock | 830MHz | 880MHz | 800MHz | 725MHz |
Memory Clock | 1.25GHz (5.0GHz data rate) GDDR5 | 1.375GHz (5.5GHz data rate) GDDR5 | 1.25GHz (5.0GHz data rate) GDDR5 | 1GHz (4GHz data rate) GDDR5 |
Memory Bus Width | 2x256-bit | 256-bit | 256-bit | 2x256-bit |
Frame Buffer | 2x2GB | 2GB | 2GB | 2x1GB |
FP64 | 1/4 | 1/4 | 1/4 | 1/5 |
Transistor Count | 2x2.64B | 2.64B | 2.64B | 2x2.15B |
Manufacturing Process | TSMC 40nm | TSMC 40nm | TSMC 40nm | TSMC 40nm |
Price Point | $699 | $349 | $259 | N/A |
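For readers who want to turn the spec table into rough throughput figures, the following is a minimal sketch rather than anything AMD publishes in this form. It assumes each stream processor retires one multiply-add (2 FLOPs) per clock, an assumption that is not stated in the table, and it simply sums both GPUs on the dual-GPU cards, which real applications will not always achieve. The `Card` class and its method names are hypothetical helpers for illustration.

```python
from dataclasses import dataclass

# Assumption (not from the table): one multiply-add, i.e. 2 FLOPs, per SP per clock.
FLOPS_PER_SP_PER_CLOCK = 2


@dataclass(frozen=True)
class Card:
    """One column of the spec table. Per-GPU figures plus a GPU count."""
    name: str
    gpus: int
    sps_per_gpu: int
    core_mhz: float
    mem_data_rate_ghz: float   # effective GDDR5 data rate
    bus_bits_per_gpu: int
    fp64_ratio: float          # FP64 rate as a fraction of FP32

    def peak_fp32_gflops(self) -> float:
        # Theoretical aggregate across all GPUs on the card.
        return (self.gpus * self.sps_per_gpu
                * FLOPS_PER_SP_PER_CLOCK * self.core_mhz / 1000)

    def peak_fp64_gflops(self) -> float:
        return self.peak_fp32_gflops() * self.fp64_ratio

    def mem_bandwidth_gbs(self) -> float:
        # bytes per transfer (bus width / 8) times transfers per second
        return self.gpus * (self.bus_bits_per_gpu / 8) * self.mem_data_rate_ghz


# Values copied from the spec table above.
CARDS = [
    Card("Radeon HD 6990", 2, 1536, 830, 5.0, 256, 1 / 4),
    Card("Radeon HD 6970", 1, 1536, 880, 5.5, 256, 1 / 4),
    Card("Radeon HD 6950", 1, 1408, 800, 5.0, 256, 1 / 4),
    Card("Radeon HD 5970", 2, 1600, 725, 4.0, 256, 1 / 5),
]

for card in CARDS:
    print(f"{card.name}: "
          f"{card.peak_fp32_gflops():.0f} GFLOPS FP32, "
          f"{card.peak_fp64_gflops():.0f} GFLOPS FP64, "
          f"{card.mem_bandwidth_gbs():.0f} GB/s")
```

These are theoretical peaks only; they say nothing about how well a given game or compute kernel keeps the VLIW4 units fed.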
For the Radeon HD 5970, AMD found themselves in an interesting position: with the 5000 series launching roughly 6 months ahead of NVIDIA’s 400 series of GPUs, they already had a lead in getting products out the door. Furthermore, NVIDIA never completely responded to the 5970, forgoing dual-GPU entirely with the 400 series. The 5970 was the undisputed king of video cards; no single card was more powerful. Given that lack of direct competition, how AMD follows up on the 5970 is a matter of great interest.
But before we get too far ahead of ourselves, let’s start with the basics. The Radeon HD 6990 is AMD’s new flagship card, based on a pair of Cayman (VLIW4) GPUs mounted on a single PCB. AMD has clocked the GPUs at 830MHz and the GDDR5 memory at 1250MHz (5GHz data rate). The card comes with 4GB of RAM, but because of the card’s internal CrossFire setup each GPU works out of its own 2GB, so the effective RAM capacity is 2GB, the same as AMD’s existing 6900 cards.
Starting with the 5970, TDP limits and the laws of physics began limiting what AMD could do with a dual-GPU card; unlike the 4870X2, the 5970 wasn’t clocked quite high enough to match a pair of 5870s. The delta between the 5970 and the 5870 came down to the 5970 being 125MHz slower on the core and 200MHz (800MHz data rate) slower for its RAM. In practice this reduced 5970 performance to near-5850CF levels. For the 6990 this gap still exists, but it’s much smaller this time. At 830MHz the 6990’s core is only 50MHz (5.7%) slower than the 6970’s, while the memory takes a bigger hit: at a 5GHz data rate it’s 500MHz (9%) slower than the 6970’s. As a result, at stock settings the 6990 is closer to being a dual-GPU 6970 than the 5970 was to being a dual-GPU 5870, with one exception that we will see later. Meanwhile the 6990’s GPUs are fully enabled, so all 1536 SPs and 32 ROPs per GPU are available, making clockspeeds the only difference between the 6990 and the 6970.
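The clock gaps described above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the 5870 figures are not quoted directly in the text, so they are derived here from the stated deltas (725MHz + 125MHz on the core, 4.0GHz + 0.8GHz on the memory data rate), and `percent_slower` is a hypothetical helper name.

```python
def percent_slower(slower: float, faster: float) -> float:
    """How much slower `slower` is than `faster`, as a percentage of `faster`."""
    return (faster - slower) / faster * 100.0


# (dual-GPU card value, single-GPU card value)
GAPS = {
    "6990 vs 6970 core clock (MHz)": (830, 880),
    "6990 vs 6970 memory data rate (GHz)": (5.0, 5.5),
    # 5870 values are derived from the deltas given in the text.
    "5970 vs 5870 core clock (MHz)": (725, 725 + 125),
    "5970 vs 5870 memory data rate (GHz)": (4.0, 4.0 + 0.8),
}

for label, (dual, single) in GAPS.items():
    print(f"{label}: {percent_slower(dual, single):.1f}% slower")
```

On these numbers the 6990 gives up roughly 5.7% on the core and 9.1% on memory, against roughly 14.7% and 16.7% for the 5970, which is the sense in which the gap has narrowed.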
Compared to the 5970, the official idle TDP is down somewhat thanks to Cayman’s better power management, leading to an idle TDP of 37W. Meanwhile under load we find our first doozy: the card’s TDP at default clocks is 375W (this is not a typo), and as with the 5970, AMD has built it to take even more. Whereas the 5970 stayed within PCI-Express specifications at default clocks, the 6990 makes no attempt to do so, and at 375W it is the most power hungry card to date.
AMD will be launching the 6990 at $699. Officially this is $100 more expensive than the 5970 at its launch; however, the 5970 was virtually never available at that price until very late in the card’s lifetime. $700 ends up being much closer to both the 5970’s historical price and its price relative to AMD’s top single-GPU part at the time (5870), which were $700 and approximately twice the cost respectively. With a more stable supply of GPUs and stronger pressure from NVIDIA we’d expect prices to stick closer to their MSRP this time around, but at the top there’s not a lot of pressure to keep prices from rising. Meanwhile AMD has not provided any hard numbers for availability, but $700 cards are not high volume products. We’d expect availability to be a non-issue.
With the launch of the 6990, AMD’s high-end product stack is fully fleshed out. At the top will be the 6990, followed by the 6970, the 6950 2GB, and the 6950 1GB. The astute among you will notice that the average price of the 6970 is less than half that of the 6990, and as a result a 6970 CrossFire setup is cheaper than the 6990. At the lowest price we’ve seen for the 6970, we could pick up two of them for $640, which puts the 6990 in the interesting predicament of being a bit more expensive and a bit slower than a pair of 6970s in CrossFire.
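The pricing comparison is simple enough to sketch directly. The snippet below uses only prices quoted in this article ($699 for the 6990, $349 launch price for the 6970, and the $640 pair price, which implies $320 per card); `pair_savings` is a hypothetical helper name, and street prices will of course move.

```python
PRICE_6990 = 699  # launch price quoted in the article


def pair_savings(price_6970: int) -> int:
    """Dollars saved by buying two 6970s instead of one 6990 (negative if dearer)."""
    return PRICE_6990 - 2 * price_6970


SCENARIOS = (
    ("6970 at launch price", 349),
    ("6970 at lowest price seen ($640 per pair)", 320),
)

for label, price in SCENARIOS:
    print(f"{label}: pair costs ${2 * price}, "
          f"${pair_savings(price)} less than the 6990")
```

At launch prices the two options are within a dollar of each other; it is the discounted 6970 that opens up a meaningful gap.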
March 2011 Video Card MSRPs
NVIDIA | Price | AMD |
 | $700 | Radeon HD 6990 |
 | $480 | |
 | $350 | |
 | $320-$340 | Radeon HD 6970 |
 | $249-$269 | Radeon HD 6950 2GB |
 | $230-$250 | Radeon HD 6950 1GB |
GeForce GTX 560 Ti | $249 | |
 | $219 | Radeon HD 6870 |
 | $160-$170 | Radeon HD 6850 |
130 Comments
smookyolo - Tuesday, March 8, 2011
My 470 still beats this at compute tasks. Hehehe.
And damn, this card is noisy.
RussianSensation - Tuesday, March 8, 2011
Not even close, unless you are talking about outdated distributed computing projects like Folding@Home code. Try any of the modern DC projects like Collatz Conjecture, MilkyWay@home, etc. and a single HD4850 will smoke a GTX580. This is because Fermi cards are limited to 1/8th of their single-precision performance in double precision.
In other words, an HD6990 which has 5,100 Gflops of single-precision performance will have 1,275 Gflops of double-precision performance (since AMD allows for 1/4th of its SP). In comparison, the GTX470 has 1,089 Gflops of SP performance, which only translates into 136 Gflops in DP. Therefore, a single HD6990 is 9.4x faster in modern computational GPGPU tasks.
palladium - Tuesday, March 8, 2011
Those are just theoretical performance numbers. Not all programs *even newer ones* can effectively extract ILP from AMD's VLIW4 architecture. Those that can will no doubt be faster; others that can't would be slower. As far as I'm aware lots of programs still prefer nV's scalar arch but that might change with time.
MrSpadge - Tuesday, March 8, 2011
Well.. if you can only use 1 of 4 VLIW units in DP then you don't need any ILP. Just keep the threads in flight and it's almost like nVidia's scalar architecture, just with everything else being different ;)
MrS
IanCutress - Tuesday, March 8, 2011
It all depends on the driver and compiler implementation, and the guy/gal coding it. If you code the same but the compilers are generations apart, then the compiler with the higher generation wins out. If you've had more experience with CUDA based OpenCL, then your NVIDIA OpenCL implementation will outperform your ATI Stream implementation. Pick your card for its purpose. My homebrew stuff works great on NVIDIA, but I only code for NVIDIA - same thing for big league compute directions.
m.amitava - Tuesday, March 8, 2011
".....Cayman’s better power management, leading to a TDP of 37W" - is it honestly THAT good? :P
m.amitava - Tuesday, March 8, 2011
oops...re-read...that was idle TDP !!
MamiyaOtaru - Tuesday, March 8, 2011
my old 7900gt used 48 at load
D:
Don't like the direction this is going. In GPUs it's hard to see any performance advances that don't come with equivalent increases in power usage, unlike what Core 2 was compared to Pentium4.
Shadowmaster625 - Tuesday, March 8, 2011
Are you kidding? I have a 7900GTX I don't even use, because it fried my only spare large power supply. A 5670 is twice as fast and consumes next to nothing.