We’re here at NVIDIA’s GPU Technology Conference (GTC) 2012, NVIDIA’s semi-annual professional developers conference. A great deal has been announced that will take a few days to go over completely, but for now we wanted to start on the product side with NVIDIA’s major product announcements. With the launch of GK104 back in March, NVIDIA is now ready to start rolling out some of their professional products, and while the next generation of Quadro is not yet ready, Tesla is another matter. This brings us to the first part of our GTC coverage: the next generation of Tesla cards, Tesla K10 and Tesla K20.

                      | Tesla K20* | Tesla K10      | Tesla M2090
Stream Processors     | <=2880     | 2 x 1536       | 512
Texture Units         | <=240      | 2 x 128        | 64
ROPs                  | <=48       | 2 x 32         | 48
Core Clock            | ?          | 745MHz         | 650MHz
Shader Clock          | N/A        | N/A            | 1300MHz
Memory Clock          | ?          | 5GHz GDDR5     | 3.7GHz GDDR5
Memory Bus Width      | 384-bit    | 2 x 256-bit    | 384-bit
L2 Cache              | <=1.5MB    | 2 x 512KB      | 768KB
VRAM                  | ?          | 2 x 4GB        | 6GB
ECC                   | Full       | Partial (DRAM) | Full
FP64                  | 1/3 FP32   | 1/24 FP32      | 1/2 FP32
TDP                   | ?          | 225W           | 225W
Transistor Count      | 7.1B       | 2 x 3.5B       | 3B
Manufacturing Process | TSMC 28nm  | TSMC 28nm      | TSMC 40nm

The first of the new Teslas, and the only model slated to be available in the near future, is the Tesla K10. In an interesting turn of events, Tesla K10 will be based on GK104. Specifically, it’s a dual-GPU card based on NVIDIA’s recently launched GTX 690, modified to fit the needs of the GPU compute market. Previous generation Tesla cards have always been based on NVIDIA’s top-tier GPUs – GT200 and GF100/GF110 respectively – so this is the first time NVIDIA has ever split the Tesla market in this way by using a lower-tier GPU.

The fact of the matter is that with GK104 first launching in GeForce products, NVIDIA downplayed GK104’s compute capabilities. And our own benchmarking has established that GTX 680’s compute performance ranges anywhere from slightly ahead of the Fermi-based GTX 580 to well behind it. Being a descendant of GF114, GK104 had a fair bit of its compute capabilities stripped out relative to GF110, not the least of which are double-precision floating point performance, ECC cache protection, and about half of the registers per CUDA core relative to Fermi.

The questionable compute performance of GK104 makes NVIDIA’s decision to launch a Tesla part based on it quite unexpected. Still, this is not to say that GK104 can’t perform well in the right situations, and this is exactly what NVIDIA is designing K10 around. The fact that we’ve found GK104 cards to be slow at compute workloads at times is not lost on NVIDIA; they know better than anyone else what GK104 really can and can’t do and have planned accordingly. For that reason NVIDIA is breaking from what little tradition there is with Tesla as a broad market product and pitching K10 at a very specific market.

NVIDIA’s market strategy here is actually summed up rather well in their K10 press release: “NVIDIA Tesla K10 GPU Accelerates Search for Oil and Gas Reserves, Signal and Image Processing for Defense Industry.” GK104 lacks the ECC and compute flexibility of the Fermi Tesla cards, but what it doesn’t lack is single-precision compute performance and memory bandwidth; and with a dual-GPU card in particular it has both of those in spades. Accordingly, NVIDIA’s goal for K10 is to go after the specific market segments that don’t need ECC and don’t need flexibility, but do need all the raw compute performance they can get. This as it turns out is something gamers are already familiar with: image processing. Image processing doesn’t need the incredible levels of precision that pure computational work does and for that matter it’s rather tolerant of the errant error, so NVIDIA believes there’s a suitably large market there that can be served by GK104 rather than GK110.

With that said, I must admit that if GK110 had come first I don’t know if we’d be having this conversation. Even if a dual-GK104 card is faster, splitting their market like this is not an easy move to make. But with GK110 not due in retail for another 5-6 months, it’s obviously NVIDIA’s only choice if they want to get new Tesla cards out on the market before the end of the year.

In any case we’ll know more about the full performance of K10 soon enough. Based on GTX 680 I think we already have a good idea of GK104’s basic strengths and weaknesses, but I also have to consider the possibility that NVIDIA has been sandbagging the GTX 600 series’ compute performance. NVIDIA has handicapped GeForce performance in a few different ways for quite a number of years in order to create distinct market segments, first for Quadro and more recently for Tesla.  With GTX 580 this was done by handicapping both double-precision and geometry performance, but because GK104 is inherently weak at double-precision NVIDIA would need to handicap the GTX 600 series in some other manner if they wanted to maintain this kind of market segmentation.  So perhaps GK104 is actually faster at compute than what we’ve seen so far?

Wrapping things up, while NVIDIA hasn’t posted every last spec for K10, they have posted enough for us to work with. Like GTX 690, K10 is using fully enabled GK104 GPUs, so based on NVIDIA’s theoretical performance data of 4.58TFLOPs with 320GB/sec of bandwidth it’s almost certainly clocked at around 745MHz core and 5GHz memory. Meanwhile for memory the card has 8GB of GDDR5, which breaks down to 16 2Gb GDDR5 modules per GPU for a total of 32 on the card. TDP is said to be identical to M2090, which would make it a 225W part. Finally, as far as availability and pricing are concerned, officially K10 is available “now”, though in practice partners won’t be shipping cards and systems until closer to the end of the month. Pricing is expected to be close to that of the M2090 it replaces, which would mean we’re looking at $2500 and higher.
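For readers who want to retrace the clock speed estimate, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope check, not anything NVIDIA has published: it assumes each CUDA core retires one fused multiply-add (2 FLOPs) per clock, and that the "5GHz" memory clock is the effective per-pin data rate. The function and constant names are ours, chosen purely for illustration.

```python
# Back-of-the-envelope check of the Tesla K10 figures quoted above.
# Assumptions (ours, not NVIDIA's): one fused multiply-add (2 FLOPs) per
# CUDA core per clock, and "5GHz" memory meaning 5 Gbit/s per data pin.

GPUS_PER_CARD = 2
CUDA_CORES_PER_GPU = 1536        # fully enabled GK104
FLOPS_PER_CORE_PER_CLOCK = 2     # assumed: one FMA per clock
BUS_WIDTH_BITS = 256             # per GPU
MODULES_PER_GPU = 16
MODULE_DENSITY_GBIT = 2          # 2Gb GDDR5 modules


def core_clock_mhz(card_tflops: float) -> float:
    """Core clock implied by the card's total single-precision TFLOPS."""
    flops_per_clock = GPUS_PER_CARD * CUDA_CORES_PER_GPU * FLOPS_PER_CORE_PER_CLOCK
    return card_tflops * 1e12 / flops_per_clock / 1e6


def memory_data_rate_gbps(card_bandwidth_gb_s: float) -> float:
    """Per-pin data rate implied by the card's total memory bandwidth."""
    bytes_per_transfer = GPUS_PER_CARD * BUS_WIDTH_BITS / 8
    return card_bandwidth_gb_s / bytes_per_transfer


def card_memory_gb() -> float:
    """Total memory from module count and density (8 Gbit per GB)."""
    return GPUS_PER_CARD * MODULES_PER_GPU * MODULE_DENSITY_GBIT / 8


if __name__ == "__main__":
    print(f"core clock    ~ {core_clock_mhz(4.58):.0f} MHz")
    print(f"memory rate   ~ {memory_data_rate_gbps(320):.1f} Gbit/s per pin")
    print(f"card memory   = {card_memory_gb():.0f} GB")
```

Under those assumptions the numbers land at roughly 745MHz, 5Gbit/s per pin, and 8GB, which is consistent with the figures above.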

Tesla K20 - The First GK110 Product
50 Comments


  • PsiAmp - Saturday, May 19, 2012

    K10 and GTX 680 share the same chip. So DP in K10 is terrible.
  • belmare - Friday, August 03, 2012

    Mmmm, I thought they said that you'd be getting 3x the power efficiency of Fermi. M2090 had 660GFLOPS so it should follow that it gets somewhere around 1.9TFLOPS.
    Also, it wouldn't be much competition for the 7970GE which takes 1TFLOPS of DP, especially for a die this large. 550 square mm is enormous and the K20 should pay it back in huge performance gains.
    We might be able to get a 780 next year at close to the same FLOPS-age, just so it's able to compete with the next year GCN.
    Kepler might be the first to not cap SP or DP. However, I think this is just part of the plan to drop Maxwell tortuously and rip GPGPU at the seams.
  • wiyosaya - Thursday, May 17, 2012

    IMHO, I think this might be a losing bet for NVIDIA as Kepler is taking DP performance away from everyone, and, HPC computing power away from enthusiasts on a budget. IMHO, they are walking the Compaq, HP, and Apple line now with overpriced products in the compute area. As DP is important to me, I just bought a 580 rather than a 680. I'll wait for benchmarks on this card, however, as an enthusiast on a budget looking for the best value for the money, I'll be passing on this card.

    Perhaps NVIDIA is trying to counter the last gen situations where Tesla cards performed about as well, and sometimes not even as well, as the equivalent gaming card, and the Tesla 'performance' came at an extreme premium. The gamer cards were a far better value than the Tesla cards for compute performance in general.

    I wish NVIDIA luck in this venture. There are not many public distributed computing projects out there that need DP support, however, for those projects, NVIDIA may be driving business to AMD - which presently has far superior DP performance. I think this is an instance where NVIDIA is definitely making a business decision to move in this direction; I hope it works out for them, or if it fails, I hope that they come to their senses. $2,500 is an expensive card no matter how you look at it, and the fact that they are courting oil and gas exploration is an indication that they are after the $$$.
  • Parhel - Thursday, May 17, 2012

    There are plenty of products on the market that are for the professional rather than the "enthusiast on a budget."

    I have a professional photographer friend who recently showed me a lens that he paid over $5,000 for. Now, I like cameras a lot, and have since I was a kid. And, I'm willing to spend more on them than most people do. But I'm an enthusiast on a budget when it comes to cameras. As much as I'd love to own that lens, that product clearly wasn't designed for me.

    Good for nVidia if they can design a card that oil and gas companies are willing to pay crazy money for. In the end, nVidia being "after the $$$" is the best thing for us mere enthusiasts.
  • SamuelFigueroa - Thursday, May 17, 2012

    Unlike gamer cards, Tesla cards have to work reliably 24x7 for months on end, and companies that buy them need to be able to trust the results of their computations without resorting to running their weeks-long computation again just in case.

    By the way, did you notice that K10 does not have an integrated fan? So even if you had the money to buy one, it wouldn't work in your enthusiast computer case.
  • Dribble - Thursday, May 17, 2012

    Last gen was exactly the same - the GF104 had weak floating point performance too. The only differences so far are that nvidia haven't released the high end GK110 as a gaming chip yet, and due to the GK104 being so fast in comparison to ati's cards they called it the 680, not the 660 (which they would have if ati had been faster).

    I'm sure they will release a GK110 to gamers in the end - probably call it the 780, and rebrand the 680 as a 760.
  • chizow - Thursday, May 17, 2012

    Exactly, the poor DP performance of GK104 just further drives the point home it was never meant to be the flagship ASIC for this generation of Kepler GPUs. It's obvious now GK110 was meant to be the true successor to GF100/110 and the rest of Nvidia's lineage of true flagship ASICs (G80, GT200, GT200b etc.).
  • CeriseCogburn - Saturday, May 19, 2012

    What is obvious and drives the very point not made home without question is amd cards suck so badly nVidia has a whole other top tier they could have added on long ago now, right ?
    ROFL - you've said so, many times, nVidia is a massive king on performance right now, and since before the 680 release, so much extra gaming performance up their sleeve already, amd should be ashamed of itself.

    This is what you've delivered, while you little attacking know it alls pretend amd is absolutely coreless and cardless in the near future, hence you can blame the release level on nVidia, while ignoring the failure company amd, right ?
    As amd "silently guards" any chip it may have designed or has in test production right now, all your energies can be focused on attacking nVidia, while all of you carelessly pretend amd gaming chips for tomorrow don't exist at all.
    I find that very interesting.
    I find the glaring omission that is now standard practice quite telling.
  • Impulses - Thursday, May 17, 2012

    I'll happily buy a 680/760 for $300 or less when/if that happens... :p Upgrading my dual 6950s in CF (which cost me like $450 for both) to anything but an $800+ SLI/CF setup right now would just be a sidegrade at best, so I'll probably just skip this gen.
  • CeriseCogburn - Wednesday, May 23, 2012

    All that yet you state you just bought an nVidia GTX 580.
    ROFL
    BTW, amd is losing its proprietary OpenCL Winzip compute benchmarks to Ivy Bridge cpu's.

    Kepler GK110 is now displaying true innovation and the kind of engineering amd can only dream of - with 32 cpu calls available instead of 1 - and it also sends commands to other Keplers to keep them working.
    Obviously amd just lost their entire top end in compute - it's over.
