Where AFR Is Mediocre, and How Hydra Can Be Better
Perhaps it’s best that we start with a discussion of how modern multi-GPU linking is handled by NVIDIA and AMD. After some earlier experimentation, both have settled on a method called Alternate Frame Rendering (AFR), which as the name implies has each card render a different frame.
The advantage of AFR is that it’s relatively easy to implement – each card doesn’t need to know what the other card is doing beyond simple frame synchronization. The driver in turn needs to do some work managing things in order to keep each GPU fed and timed correctly (not to mention coaxing another frame out of the CPU for rendering).
However, even as simple as AFR is, it isn’t foolproof and it isn’t flawless. Making it work at its peak level of performance requires some understanding of the game being run, which is why for even such a “dumb” method we still have game profiles. Furthermore, it comes with a few inherent drawbacks:
- Each GPU needs to get a frame done in the same amount of time as the other GPUs.
- Because of the timing requirement, the GPUs can’t differ in processing capabilities. AFR works best when they are perfectly alike.
- Dealing with games where the next frame is dependent on the previous one is hard.
- Even with matching GPUs, if your driver gets the timing wrong, it can render frames at an uneven pace. Frames need to be spaced apart equally – when this fails to happen you get microstuttering.
- AFR has multiple GPUs working on different frames, not the same frame. This means that frame throughput increases, but the latency of any individual frame does not improve. So if a single card gets 30fps and takes 33.3ms to render a frame, a pair of cards in AFR get 60fps but still take 33.3ms to render each frame.
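The throughput-versus-latency point in the last bullet can be checked with a little arithmetic. The sketch below assumes idealized AFR (perfect scaling and perfect pacing, which real drivers only approach); the function name `afr_stats` is made up for illustration.

```python
def afr_stats(single_gpu_fps: float, num_gpus: int) -> tuple[float, float]:
    """Return (throughput_fps, per_frame_latency_ms) under idealized AFR.

    Assumption: every GPU renders whole frames at the same speed and the
    driver spaces them perfectly, so throughput scales linearly.
    """
    # Time one GPU spends rendering one complete frame.
    frame_time_ms = 1000.0 / single_gpu_fps
    # Frames are completed num_gpus times as often...
    throughput_fps = single_gpu_fps * num_gpus
    # ...but each individual frame still takes just as long to render.
    return throughput_fps, frame_time_ms


fps, latency_ms = afr_stats(single_gpu_fps=30.0, num_gpus=2)
print(f"{fps:.0f} fps, {latency_ms:.1f} ms per frame")  # 60 fps, 33.3 ms per frame
```

A frame-splitting scheme would instead aim to shrink `frame_time_ms` itself, which is where any latency advantage would come from.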
Despite those drawbacks, for the most part AFR works. Particularly if you’re not highly sensitive to lag or microstuttering, it can get very close to doubling the framerate in a 2-card configuration (though scaling becomes less efficient with more cards).
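To make microstuttering concrete, here is a small sketch comparing two sets of frame presentation times that have the same average framerate but very different pacing. The timestamps are invented for illustration, not measured data.

```python
def frame_intervals(timestamps_ms):
    """Gaps between consecutive frame presentation times, in ms."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def pacing_report(timestamps_ms):
    """Summarize pacing: average fps plus shortest and longest gap."""
    intervals = frame_intervals(timestamps_ms)
    avg_ms = sum(intervals) / len(intervals)
    return {
        "avg_fps": 1000.0 / avg_ms,
        "shortest_ms": min(intervals),
        "longest_ms": max(intervals),
    }


# Hypothetical timestamps: both runs show 4 new frames over 80 ms.
even = [0, 20, 40, 60, 80]      # a frame every 20 ms
uneven = [0, 5, 40, 45, 80]     # frames arrive in bunched pairs

print(pacing_report(even))    # avg 50 fps, gaps all 20 ms
print(pacing_report(uneven))  # avg 50 fps, gaps swing between 5 and 35 ms
```

Both runs report 50fps, but the second one feels closer to the framerate implied by its 35ms gaps, which is why an average framerate alone can hide microstutter.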
Lucid believes they can do better, particularly when it comes to matching cards. AFR needs matching cards for timing reasons, because it can’t actually split up a single frame. With Hydra, Lucid is splitting up frames, which gives them two big advantages over AFR: rendering can be done by dissimilar GPUs, and rendering latency is reduced.
Right now, the ability to use dissimilar GPUs is the primary marketing focus behind the Hydra technology. Lucid and MSI will both be focusing almost exclusively on that ability when it comes to pushing the Hydra and the Fuzion. What you won’t see them focusing on is the performance versus AFR, the difference in latency, or game compatibility for that matter. The ability to use dissimilar GPUs is the big selling point for the Hydra & Fuzion right now.
So how does the Hydra work? We covered this last year when Lucid first announced the Hydra, so we’re not going to cover this completely in depth again. However here’s a quick refresher for you.
As the Hydra technology is based upon splitting up the job of rendering the objects in a frame, the first task is to intercept all Direct3D or OpenGL calls, and to make some determinations about what is going to be rendered. This is the job of Lucid’s driver, and this is where most of the “magic” is in the Hydra technology. The driver needs to determine roughly how much work will be needed for each object, look at inter-frame dependencies, and finally look at the relative power of each GPU.
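Lucid has not published how its driver makes these decisions, so the following is only a minimal sketch of the general idea: given an estimated cost per object and a relative speed per GPU, hand each object to whichever GPU would finish it soonest. The greedy heuristic, the cost units, and all names here are assumptions for illustration, and inter-frame dependencies are ignored entirely.

```python
def split_objects(object_costs, gpu_speeds):
    """Greedily assign objects to GPUs of differing speed.

    object_costs: dict of object name -> estimated work units (hypothetical)
    gpu_speeds:   dict of GPU name -> relative speed in work units per ms
    Returns (assignment, busy_ms), where busy_ms is each GPU's total time.
    """
    assignment = {gpu: [] for gpu in gpu_speeds}
    busy_ms = {gpu: 0.0 for gpu in gpu_speeds}
    # Place the most expensive objects first so small ones can fill gaps.
    ordered = sorted(object_costs.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in ordered:
        # Pick the GPU that would finish this object at the earliest time.
        gpu = min(gpu_speeds, key=lambda g: busy_ms[g] + cost / gpu_speeds[g])
        assignment[gpu].append(name)
        busy_ms[gpu] += cost / gpu_speeds[gpu]
    return assignment, busy_ms


# Hypothetical frame: one GPU is twice as fast as the other.
costs = {"terrain": 8, "characters": 6, "sky": 2, "particles": 4}
speeds = {"fast": 2.0, "slow": 1.0}
assignment, busy_ms = split_objects(costs, speeds)
print(assignment)              # fast: terrain, particles, sky; slow: characters
print(max(busy_ms.values()))   # 7.0 ms, set by the busier GPU
```

The frame is only done when the slowest GPU finishes, so a bad cost estimate for even one object shows up directly as a longer frame time.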
Once the driver has determined how to best split up the frame, it then interfaces with the video card’s driver and hands it a partial frame composed of only the bits it needs to render. This is followed by the Hydra then reading back the partial frames, and compositing them into one whole frame. Finally the complete frame is sent out to the primary GPU (the GPU the monitor is plugged into) to be displayed.
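The article does not say how the Hydra merges partial frames. One common way to composite frames that were split by object is a per-pixel depth comparison, keeping whichever GPU's fragment is closest to the camera. The sketch below illustrates that general technique under the assumption that each GPU returns a color and a depth for every pixel; it is not a description of Lucid's hardware.

```python
FAR = float("inf")  # depth value for pixels a GPU did not draw


def composite(partial_frames):
    """Merge partial frames by keeping the nearest fragment per pixel.

    partial_frames: list of frames, each a list of (color, depth) tuples,
    one tuple per pixel. All frames must have the same number of pixels.
    """
    num_pixels = len(partial_frames[0])
    final = []
    for i in range(num_pixels):
        # The smallest depth is closest to the camera, so it wins.
        color, _depth = min(
            (frame[i] for frame in partial_frames),
            key=lambda pixel: pixel[1],
        )
        final.append(color)
    return final


# Hypothetical 4-pixel scanline: GPU A drew a tree, GPU B drew a nearer rock.
BG = ("bg", FAR)
gpu_a = [("tree", 5.0), ("tree", 5.0), BG, BG]
gpu_b = [BG, ("rock", 2.0), ("rock", 2.0), BG]

print(composite([gpu_a, gpu_b]))  # ['tree', 'rock', 'rock', 'bg']
```

Whatever the actual merge rule is, every partial frame has to be read back and combined before display, which is the extra work the text above describes.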
All of this analysis and compositing is quite difficult to do (which is in part why AMD and NVIDIA moved away from frame-splitting schemes), and that is what makes Hydra’s method the “hard” method. Compared to AFR, it takes a great deal more work to split up a frame by objects and to render them on different GPUs.
As with AFR, this method has some drawbacks:
- You can still microstutter if you get the object allocation wrong. Some frames may put too much work on the weaker GPU.
- Since you can use mismatched cards, you can’t always use “special” features like Coverage Sampling Anti-Aliasing unless both cards have the feature.
- Synchronization still matters.
- Individual GPUs need to be addressable. This technology doesn’t work with multi-GPU cards like the Radeon 5970 or the GeForce GTX 295.
This is also a good time to quickly mention the hardware component of the Hydra. The Hydra 200 is a combination PCIe bridge chip, RISC processor, and compositing engine. Lucid won’t tell us too much about it, but we know the RISC processor contained in it runs at 300MHz, and is based on Tensilica’s Diamond architecture. The version of the Hydra being used in the Fuzion is their highest-end part, the LT24102, which features 48 PCIe 2.0 lanes (16 up, 32 down). This chip is 23mm² and consumes 5.5W. We do not have any pictures of the die or know the transistor count, but you can count on it using relatively few transistors (perhaps 100M?).
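For a rough sense of what those lane counts mean, the sketch below works out theoretical peak bandwidth from the PCIe 2.0 signaling rate (5 GT/s per lane with 8b/10b encoding, or 500MB/s per lane per direction). These are ceiling figures that ignore protocol overhead; they are not measurements of the Hydra.

```python
PCIE2_MEGATRANSFERS_PER_S = 5000   # 5 GT/s per lane, per direction
ENCODED_BITS = 10                  # 8b/10b: 10 bits on the wire...
PAYLOAD_BITS = 8                   # ...carry 8 bits of data


def lane_bandwidth_mb_s() -> int:
    """Theoretical payload bandwidth of one PCIe 2.0 lane, one direction."""
    payload_mbit_s = PCIE2_MEGATRANSFERS_PER_S * PAYLOAD_BITS // ENCODED_BITS
    return payload_mbit_s // 8  # bits to bytes


def link_bandwidth_mb_s(lanes: int) -> int:
    """Theoretical peak for a link of the given width, one direction."""
    return lanes * lane_bandwidth_mb_s()


print(lane_bandwidth_mb_s())      # 500 MB/s per lane
print(link_bandwidth_mb_s(16))    # 8000 MB/s for the 16 upstream lanes
print(link_bandwidth_mb_s(32))    # 16000 MB/s across the 32 downstream lanes
```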
Ultimately in a perfect world, the Hydra method is superior – it can be just as good as AFR with matching cards, and you can use dissimilar cards. In a practical world, the devil’s in the details.
47 Comments
liveonc - Tuesday, March 23, 2010 - link
Hydra is still pretty raw, but can it be the One Chip to rule them all, One Chip to find them, One Chip to bring them all and in the darkness bind them In the Land of Mordor where the Shadows lie? CPU, GPU, GPGPU wars coming to a standstill, where it doesn't matter if you use an Intel, AMD, Nvidia or Ati.
Focher - Wednesday, January 13, 2010 - link
I think people should really give Lucid their due in regards to proving the underlying concept - that it is feasible to deliver mixed frame rendering in real time. Granted, the technology still seems immature but one has to remember that AMD and NVIDIA have both rejected the approach at this point.
I'm still prepared to wait and see how the technology - and not just the current approach from Lucid - evolves. For example, perhaps AMD and NVIDIA will put some RnD efforts into multi-GPU cards that are better equipped at mixed frame rendering. Having it all on the same board could alleviate some of the bottlenecks.
Baron Fel - Sunday, January 10, 2010 - link
Crysis has a 91 at Metacritic and sold millions.
Just wanted to point that out.
x86 64 - Saturday, January 9, 2010 - link
I thought the Hydra didn't do SLI\CF through software? I thought that was one of the main benefits of Hydra, no software profiles were needed. The preliminary results you guys posted are less than impressive. Not to sound like a pessimist but I figured it was too good to be true.
Focher - Wednesday, January 13, 2010 - link
I think the term "profiles" isn't appropriate, as the review suggests it's more of a whitelist than any type of profile with customized settings for the specific game.
prophet001 - Friday, January 8, 2010 - link
The implications of this technology are tremendous. I'm rather surprised at people brushing it off. It is fledgling and will obviously need some work but they will be able to do some really neat things once this matures. I'm thinking GPU farm via external PCI-E.
hyvonen - Friday, January 8, 2010 - link
"We’ll start with Call of Juarez, which is one of the Hydra’s better titles. With our 5850s in Crossfire, we get 94fps, which is just less than double the performance of a single 5850 (49.5)."
Don't you mean "... we get 94fps, which is more than double the performance of a single 5850 (49.5)."
Or, in other words, WTF happened - how do you get more than double the performance with CF?!?!? Something got messed up in your test here, bro.
Veerappan - Friday, January 8, 2010 - link
Read it again... He's saying that the 94fps that they got is just slightly LESS THAN double 49.5 fps. So if a single 5850 gets 49.5 fps, double that is 99 fps.
They got 94 fps, which is just a little bit less than 99.
jmurbank - Friday, January 8, 2010 - link
To me Lucid got something going but they should have done it differently. If they created the Hydra chip to be an on-board graphics chip and dispatcher, things will be different. Right now all they have is a dispatcher chip that uses a discrete graphics card to output video which makes it have multiple bottlenecks. It will be better if the Hydra output the graphics on its own through its own display port while all the processing is done by the discrete graphics cards using stream processing technology like CAL (ATI) and CUDA (nVidia).
Of course bad drivers screw up everything. Have a look at ATI's history. ATI still makes poor software, but people do not mind. It seems people care more about performance than reliable and stable drivers. I care more about reliable and stable drivers, so it screws up my day if my computer crashes because of a driver.
beginner99 - Friday, January 8, 2010 - link
The worst thing you can do is promise stuff you can't deliver. It's sad. After these first benchmarks, the tech will just have a bad reputation even if it gets better over time. Intel's way would have been better. Just don't release it at all if it's an underperformer.
I do see that it must be extremely complex to get this running at all. So it's actually quite an achievement, but it's similar to cars. Combustion engines have been optimized during the last 100 years. No wonder no new technology can compete.
Maybe in 1-2 years this will be usable. If Lucid is still alive then...
Don't believe many will buy this board.
I also was rather surprised about CF. Used to be quite bad too as I remember? Probably due to a driver update? And who knows what nvidia or ATI is doing in their drivers. I assume they could put in stuff to cripple Hydra on purpose.