Original Link: https://www.anandtech.com/show/291
It's easy to make a decision on what video card you want to buy in comparison to the difficult time the manufacturers have creating the products you want, while making a profit and remaining competitive. There is a choice, as a manufacturer, that you must make before beginning production on any new video solution: whether you're going to strive to be the absolute best, or be the best in a certain price range. If you pick the former, you're in for a rough ride, as the competitive nature of manufacturers in the video industry will keep you on your toes.
This is the path companies such as 3dfx and NVIDIA have stuck to, as you can tell by the intense competition that exists between their products (as well as between their users). If you choose the latter option, things are a bit less fast paced and competitive; however, the challenge is equally great, if not greater. Basically, as a manufacturer looking to produce a low-cost, high-performance video solution, you are held to the same expectations as the big boys, but you're now limited by cost.
Case in point is the formerly dominant force in the video industry, S3. On top of the game before the 3D revolution in the mid-90s, S3 controlled a large portion of the market simply because they could produce competitive 2D accelerators that performed well and could be molded to fit virtually any desktop situation. With the dawn of the 3D gaming era, S3, like a number of other former heavyweights such as Matrox and Number Nine, seemingly dropped off the face of the earth. Last year saw the return of the former heavyweights to the market, but comparatively speaking, they did not achieve the status they held prior to the transition towards standardizing 3D accelerators.
S3's Savage3D, on paper, was a high-end solution capable of winning the elusive title of "Voodoo2-killer," the most sought after championship of 1998. Unfortunately, in practice, the Savage3D turned out to be one of the biggest failures of the industry. While the Savage3D did manage to find its very own niche market with Super7 users as well as some lower end Slot-1 users, the product itself (as well as its drivers) was quite immature and never really attained its full potential. Most manufacturers, such as Diamond Multimedia, dropped their Savage3D based products, leaving relatively few companies, mostly Taiwanese OEMs, to buy out the remaining Savage3D parts and make low-cost 2D/3D AGP accelerators out of them.
The only big name to boast a truly reliable Savage3D solution was Hercules, a company whose devotion to quality led them to a discriminating process of hand picking Savage3D chips for their accelerators, which made Hercules' Savage3D boards undoubtedly the best in the industry. But in the grand scheme of things, it would take much more than Hercules, a company much smaller than the Diamonds and Creatives out there, to save S3's rear with the Savage3D. What S3 needed was another chance. Should the market give them one and, more importantly, is it worth re-visiting S3 after what they pulled with the Savage3D? Let's find out as AnandTech takes an in-depth look at S3's latest shot at a low-cost, high-performance video solution.
Enter the Savage4
This time around, 3dfx isn't the competition, NVIDIA isn't the one to topple, and Matrox isn't the one S3 needs to be guarding their customers from. Instead, S3 is looking to be the best they can be, not the best in the market. With the Savage3D, there was much confusion as to what the target audience of the product would be. As a high-performing solution, the Savage3D was supposed to offer the most bang for your buck; however, its frame rates ended up being quite disappointing, which took it out of the running for the title of 3D king last year.
The Savage4 starts off by boasting a core very similar to the original Savage3D. In fact, the 0.25-micron core of the Savage3D (one of the first to be implemented in a video accelerator) bears a striking physical resemblance to the Savage4. The core is clocked at 125MHz, meager by today's newly set standards, with a memory clock of either 125MHz or 143MHz, a choice left to the manufacturer's discretion.
The Savage4 builds on the stack of the Savage3D's original features, adding such important features as single-pass multi-texturing and bringing back the old strengths of the Savage3D such as the texture compression that filled so many screen shots last year. As with most other 3D accelerators that are just now hitting the streets, the Savage4 provides compatibility with both AGP 2X and 4X specifications, with the 4X versions of the parts hitting the streets towards the end of the year with Intel's release of the upcoming AGP 4X compliant 820 chipset (aka Camino).
The Savage4 allows for a number of different memory configurations, letting manufacturers tailor their products to the price and performance needs of their target market. At first, only 16MB and 32MB boards will be seen on the market; however, there is always the option of pursuing a cheaper 8MB alternative for OEMs and other cases where price is a very delicate issue. After discussing the topic with a couple of manufacturers, it seems like 32MB Savage4 based products won't be too big of a rarity in the low-cost aisles at your local computer hardware store.
Mimicking the latest from 3dfx and NVIDIA (and soon Matrox), the Savage4 also supports the "oh-so-marketable" digital interface for flat panel LCD displays, a feature which will slowly gain popularity and eventually become the desired output port on video cards once digital LCD monitors drop to a more reasonable price point.
The Specs
Here are the full specs for the S3 Savage4 and the Savage4 Pro. The two are distinguished essentially by memory clock, with the Savage4 Pro operating at a memory clock of 143MHz.
High-Performance 2D/3D/Video Accelerator
- Floating point triangle setup engine
- Single cycle 3D architecture
- 8M triangles/second setup engine
- 128-bit rendering pipeline
- 140M pixels/second trilinear fill rate
- Full AGP 4X/2X, including sideband addressing and execute mode
- S3 DX6 texture compression (S3TC)
- Flat panel desktop monitor support
- High quality DVD video playback
3D Rendering Features
- Single-pass multiple textures
- Hardware bump mapping
- Full scene anti-aliasing
- Anisotropic filtering
- 8-bit stencil buffer
- Single cycle trilinear filtering
- 32-bit true color rendering
- Specular lighting and diffuse shading
- Alpha blending modes
- MPEG-2 video textures
- Vertex and table fog
- 16- or 24-bit Z-buffering
- Sprite anti-aliasing, reflection mapping, texture morphing, shadows, procedural textures and atmospheric effects
Motion Video Architecture
- High quality up/down scaler
- Planar to packed format conversion
- Motion compensation for full speed DVD playback
- Hardware subpicture blending and highlights
- Multiple video windows for video conferencing
- Contrast, hue, saturation, brightness and gamma controls
- 60MHz VIP video port allows HDTV resolutions
- Digital port for NTSC/PAL TV encoders
High Speed Memory Bus
- 125/143 MHz memory interface
- 2 to 32 MB frame buffer
- 1Mx16 or 2Mx32 or 4Mx16 SDRAMs
- 256Kx32 or 512Kx32 or 1Mx32 SGRAMs
- SO-DIMM memory upgrade
- Block write support
2D Acceleration Features
- Highly optimized 128-bit graphics engine
- Full featured 2D engine for acceleration of BitBLT, rectangle fill, line draw, polygon fill, panning/scrolling and hardware cursor
- 8, 16, and 32 bpp mode acceleration
Flat Panel Desktop Monitor Support
- 24-bit digital interface for flat panel encoders
- Auto-expansion and centering for VGA text and graphics modes
- Support for all resolutions up to 1280x1024
Full Software Support
- Drivers for major operating systems and APIs: Windows 9x, Windows NT 4.0/5.0, Windows 3.x and OS/2 2.1/3.0 (Warp), Direct3D, DirectDraw and DirectShow, OpenGL ICD for Windows 9x and NT
- Comprehensive SDK, utilities and ISV tools
- ISV and bundling programs
Additional Features
- 300MHz RAMDAC with gamma correction
- I2C serial bus and flash ROM support
- ACPI and PCI power management
- Hardware and BIOS support for VESA timings and DDC monitor communications
- PCI 2.2 bus support including bus mastering
- 27x27mm PBGA with 336 balls
- 2.5V core with 3.3V/5V tolerant I/O
As you can tell by the specs, the Savage4 already features a fully functional OpenGL ICD under Windows 9x and NT, a definite plus. The 300MHz integrated RAMDAC is seemingly standard among the big four manufacturers (3dfx, NVIDIA, Matrox and S3), and it provides a noticeably sharper 2D picture in comparison to the old Savage3D.
The Need for Texture Compression
For starters: What is AGP?
There isn't a computer gamer out there that hasn't heard the term AGP, or Accelerated Graphics Port, used in a requirements listing on the back of the latest game they've been eyeing. But what is AGP? It isn't rare that our favorite three letter acronyms quickly become lost in their household connotations and we eventually forget what it is they are actually providing us with, if anything at all (the letters MMX come to mind). So what is AGP? (If you're already familiar with the basics of AGP you'll want to skip past this part)
AGP is, according to Intel, a "high performance, component level interconnect targeted at 3D graphical display applications and is based on a set of performance extensions or enhancements to PCI." Basically it is a specialized bus whose sole purpose is to provide "housing" for 2D/3D accelerators in your AGP compatible system.
When operating in its currently standard 2X transfer mode, the AGP bus allows for peak transfer rates of up to 528MB/s, up from the 132MB/s of the PCI bus. Using these faster transfer rates, your AGP compliant computer can store the extremely large textures that games may use in system memory and retrieve them quickly, instead of having to use your graphics card's limited local memory for texture storage and retrieval. What happens when a non-AGP graphics card receives a texture that is larger than its local memory? The same thing that happens when your system runs out of memory: it begins swapping. In this case, instead of swapping to your hard disk, your graphics card performs texture swapping with your system memory which, in the absence of the high speed AGP bus, slows performance down tremendously. This concept of AGP texturing is made possible by the hardware behind it, commonly referred to by the acronym GART, or Graphics Address Remapping Table. The function of the GART is basically to allow the hardware to access large texture maps as single data objects in system memory, permitting your AGP card and software using the AGP bus to access the same memory addresses.
The second major feature of AGP, which we will touch on only briefly, is the ability of AGP compliant graphics adapters to manipulate the textures they store in system memory directly, as opposed to retrieving them and processing all the manipulations locally. This process, known as Direct Memory Execution, or DIME, allows memory intensive texture-mapping procedures to be performed within the spacious system memory rather than in the graphics card's restrictive local memory.
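The peak figures quoted for PCI and AGP 2X follow from bus width, clock, and the number of transfers per clock. Here is a minimal sketch of that arithmetic, assuming the nominal 32-bit bus width and the rounded 33MHz/66MHz clocks that yield the 132MB/s and 528MB/s figures. The function name is illustrative, and the 4X value is simply the same formula extended, not a number taken from this article.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_bytes, transfers_per_clock):
    """Theoretical peak transfer rate in MB/s (1 MB = 10**6 bytes)."""
    return clock_mhz * bus_bytes * transfers_per_clock

# Assumed nominal parameters: 32-bit (4-byte) bus, rounded clock speeds.
pci = peak_bandwidth_mb_s(33, 4, 1)      # PCI: one transfer per clock
agp_1x = peak_bandwidth_mb_s(66, 4, 1)   # AGP 1X: double the PCI clock
agp_2x = peak_bandwidth_mb_s(66, 4, 2)   # AGP 2X: two transfers per clock
agp_4x = peak_bandwidth_mb_s(66, 4, 4)   # AGP 4X: same formula, extrapolated

for name, rate in (("PCI", pci), ("AGP 1X", agp_1x),
                   ("AGP 2X", agp_2x), ("AGP 4X", agp_4x)):
    print(f"{name}: {rate} MB/s peak")
```

These are theoretical ceilings only; sustained texture transfers over a real bus will come in below them.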
AGP a Reality?
If you can think back to the release of Intel's first AGP chipset, the i440LX, one of the most commonly voiced criticisms of the move to AGP graphics accelerators was that no games would take advantage of AGP, or AGP texturing for that matter. Although the design was wonderful in theory, in application it boasted a relatively small performance improvement in the real world tests of the time. Even today, playing a game of Quake 2 doesn't really stress the amount of memory on most current generation 3D accelerators. It is because of this assumption, that the current wave and the next generation of 3D games won't truly require support for AGP texturing (the ability to use the AGP bus as a means of transferring textures to/from system memory), that companies like 3dfx have stuck to either 100% PCI solutions (e.g. Voodoo2) or poorly implemented AGP specifications (e.g. Banshee/Voodoo3) that don't allow for AGP texturing, in the interest of keeping costs down. Is that how things truly are?
Here's a reality check: they aren't. It's very easy to understand. More detailed textures simply look better and, unfortunately, more detailed textures take up more space than those of lesser detail. This brings up the whole image quality vs performance debate, but that's something we won't get into for now. Instead, we'll leave the discussion at this comparison: regardless of how beautiful a sunset is, if you're blind to the point where all you can distinguish is general color changes, the sunset has virtually no effect on you. That may be an extreme example, but it's a quick end to an argument that could take pages upon pages to even come close to resolving. After playing a game that has extremely detailed textures on a single 12MB Voodoo2 at 800 x 600, then playing the same game at 800 x 600 on an AGP NVIDIA Riva TNT based graphics card, you can truly begin to notice the difference. Remember when you heard all of the industry analysts talking about how AGP would eventually be put to good use in the gaming industry? That time is almost upon us, and it's time for companies to realize that AGP texturing and taking full advantage of the specification's capabilities is something all of their video products should do. The technology is there, and it's about time that we start using it. Luckily, companies like NVIDIA, Matrox, and S3 have already taken the first steps in that direction, but it seems as if there is more that could be done.
Step two to better Texture Quality
So we have these larger textures being stored in system memory after being transferred over the AGP bus, and these textures are put to work using the DIME specification, but where does S3 fit into all this? Instead of relying solely on the benefits of AGP to allow game developers to toss incredible amounts of textures at the graphics subsystem, S3 realized that although AGP allows a large portion of your system memory to be used for texture storage and manipulation, there is still a limit to the effectiveness of the technology: primarily, the amount of system memory available for AGP to make use of. The next logical step in this progression is to compress the textures so that even larger, higher quality textures can be stored in the same amount of system memory. That gives us step two to better texture quality: S3's Texture Compression.
Thanks to its full support in Microsoft's DirectX 6.0, the 6:1 compression ratio provided by S3 Texture Compression (S3TC) can be integrated seamlessly, with no noticeable visual penalty. S3TC isn't something S3 cooked up on the side; it is a fully licensable and readily available compression algorithm that is supported in DirectX 6.0. It's about time a company realized that proprietary isn't the way to go to make your presence known in the market. Just to get a vague idea of the capabilities of S3TC: while the Voodoo2 is limited to texture sizes of 256 x 256 pixels, S3TC textures can be as large as 2048 x 2048 pixels.
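To make the 6:1 figure concrete: in the DXT1 form of S3TC exposed through DirectX, each 4 x 4 block of texels is stored in 8 bytes (two 16-bit RGB565 endpoint colors plus sixteen 2-bit indices), against 48 bytes for the same block as uncompressed 24-bit RGB. The sketch below works through that arithmetic and decodes a single block. The block layout comes from the published format rather than from this article, the function names are illustrative, and the integer rounding of the interpolated colors is a simplification of what real hardware does.

```python
import struct

BLOCK_DIM = 4          # S3TC operates on 4 x 4 texel blocks
DXT1_BLOCK_BYTES = 8   # 2 x RGB565 endpoints + 16 x 2-bit indices

def rgb_bytes(width, height, bytes_per_texel=3):
    """Size of an uncompressed texture (24-bit RGB by default)."""
    return width * height * bytes_per_texel

def dxt1_bytes(width, height):
    """Size of the same texture stored as DXT1 blocks."""
    blocks_x = (width + BLOCK_DIM - 1) // BLOCK_DIM
    blocks_y = (height + BLOCK_DIM - 1) // BLOCK_DIM
    return blocks_x * blocks_y * DXT1_BLOCK_BYTES

def rgb565_to_rgb888(c):
    """Expand a packed 5:6:5 color to 8 bits per channel."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def decode_dxt1_block(block):
    """Decode one 8-byte DXT1 block into 16 (r, g, b) tuples, row-major."""
    c0, c1, indices = struct.unpack("<HHI", block)
    p0, p1 = rgb565_to_rgb888(c0), rgb565_to_rgb888(c1)
    if c0 > c1:
        # Four-color mode: two extra colors interpolated between endpoints.
        p2 = tuple((2 * a + b) // 3 for a, b in zip(p0, p1))
        p3 = tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))
    else:
        # Three-color mode: midpoint plus a transparent entry (black here).
        p2 = tuple((a + b) // 2 for a, b in zip(p0, p1))
        p3 = (0, 0, 0)
    palette = (p0, p1, p2, p3)
    # Texel i uses bits 2i and 2i+1 of the little-endian index word.
    return [palette[(indices >> (2 * i)) & 0x3] for i in range(16)]

raw = rgb_bytes(2048, 2048)
packed = dxt1_bytes(2048, 2048)
print(f"2048 x 2048: {raw} bytes raw, {packed} bytes as DXT1, "
      f"ratio {raw / packed:.1f}:1")
```

For a 2048 x 2048 texture this works out to 12MB uncompressed against 2MB compressed, which is where the 6:1 ratio comes from. Measured against 16-bit source textures the ratio would be 4:1.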
2D/3D Image Quality & Drivers
The 2D image quality on the two Savage4 sample boards AnandTech used during the tests was an incredible improvement over the TNT quality 2D output on the original Savage3D. This is primarily due to the 300MHz integrated RAMDAC that is now the minimum requirement for a graphics accelerator, a specification that has only been exceeded by 3dfx's 350MHz RAMDAC on the higher-end Voodoo3 cards (3000+ models). While resolutions of 1600 x 1200 and above are still iffy on the Savage4, dependent almost entirely upon the individual manufacturers themselves, most resolutions at 1280 x 1024 and below will give you an average 2D image quality, definitely not as poor as the cards from last year.
The 3D image quality of the Savage4, if you take the topic of texture compression out of consideration, is fairly standard for a 3D chipset being released today. Although there is no reason to scream out in joy about the standard 3D image quality of the Savage4, there's no big reason to complain either. With 3dfx and NVIDIA already demonstrating image quality superior to that of the previous generation of accelerators, we can call the Savage4's 3D image quality average by today's standards, and above average by the standards of just a few months ago.
Adding the Savage4's unique support for texture compression into the equation changes that average rating into something more along the lines of the best that ever lived, since the textures capable of being displayed using a technology like S3TC are simply beautiful. The problem here is that although S3 claims a great deal of upcoming titles, such as id Software's Quake 3 Arena and Epic's Unreal Tournament, will support S3TC, that's still not an indication of which games will actually ship with S3TC support from the start.
The quality of the Savage4's current drivers (S3 provided AnandTech with the latest engineering sample drivers as of April 27, 1999) is not bad; however, they are not nearly as solid and stable as 3dfx's currently shipping Voodoo3 drivers (which aren't something to go boasting about either). This translates into a warning for potential buyers: the manufacturers currently shipping Savage4 based products are probably doing so a tad prematurely. You'll end up with a product whose drivers suffer from either poor performance or horrid stability, so if you're considering the Savage4, you might as well wait another week or two.
S3 is shipping the Savage4 with a fully functional OpenGL ICD right out of the box, and fortunately the Savage4 drivers don't feature the same problems the old Savage3D's drivers did, although they aren't perfect. AnandTech's test system did experience a few crashes during normal operation, however the performance numbers were consistent, and the Savage4 silicon is final, so you can expect performance and stability to do nothing other than improve between now and the widespread release of the Savage4.
The Test
AnandTech received a pre-release version of Gainward's Cardexpert SG4 and the Diamond Multimedia Stealth III S540 Savage4 based products. Both cards featured 32MB of on-board SDRAM. While the Cardexpert SG4 operated at the base setting of 125MHz core and 125MHz memory clock, the Stealth III's default operation was at 125MHz core and 143MHz memory (Savage4 Pro). AnandTech's Slot-1/Socket-370 test configuration was as follows:
- Intel Pentium III 500, Intel Pentium II 400, Intel Pentium II 266, Intel Celeron 333, Intel Celeron 266 (0KB L2) on an ABIT BX6 Revision 2.0 or an ABIT ZM6 for the Socket-370 Celeron 333 tests.
- 64MB of Memman/Mushkin SEC Original SDRAM was used in each test system
- Western Digital 5.1GB Ultra ATA/33 HDD
- Microsoft Windows 98
The benchmark software used was as follows:
- id Software's Quake 2 Version 3.20 using demo1.dm2 and 3Finger's crusher.dm2
- Monolith's Shogo using 3Finger's RevDemo
- Interplay's Descent3 Demo2 using AnandTech's Descent3 Torture Demo (results will be featured in an upcoming 3D accelerator comparison)
- Ziff Davis' Winbench 99 at 1600 x 1200 x 32-bit color for 2D performance tests
Each benchmark was run a total of three times and the average frame rates taken. Vsync was disabled. All cards were run in 16-bit mode unless otherwise indicated.
OpenGL Performance - Quake 2 demo1.dm2
[Charts: Quake 2 demo1.dm2 frame rates on the P3/500, P2/400, P2/266, C333A and C266]
OpenGL Performance - Quake 2 crusher.dm2
[Charts: Quake 2 crusher.dm2 frame rates on the P3/500, P2/400, P2/266, C333A and C266; OpenGL CPU scaling performance]
Quake2 Performance Conclusions
There are three big revelations illustrated by the Savage4's Quake 2 scores: 1) the Savage4's 32-bit color rendering comes at less than a 10% drop in performance, 2) the Savage4 is the worst performer when it comes to processor scalability, and 3) the Savage4 experiences a tremendous drop off in frame rate at higher resolutions. Let's discuss the three individually.
32-bit color with < 10% performance drop
Here's the biggest benefit the Savage4 offers outside of S3TC: for the first time ever, low-end users can enjoy the benefits of 32-bit color rendering without having to experience a performance drop that decreases the playability of the game. The Savage4 is by far the best solution for 32-bit color rendering, even outpacing the TNT2 in terms of performance under Quake 2 at 32-bit rendering. Quake 3 Arena should be an interesting title to play with on the Savage4, as it is supposed to look significantly better in 32-bit color, a setting where the Savage4 starts to compete with the TNT2 quite nicely.
If S3 and id deliver as expected, the combination of a powerful rendering pipeline capable of rendering in 32-bit color with a very small performance drop and S3TC should make Quake 3 Arena extremely playable on lower end systems that end up going with the Savage4. For once, having a lower-end system doesn't mean not being able to play the latest games in the manner they were meant to be played, in the highest quality settings possible.
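The "less than 10%" figure is a relative drop between averaged frame rates. As a sketch of how the three-run averaging described in the test setup and that comparison fit together: the helper names are illustrative, and the sample frame rates are placeholders chosen for easy arithmetic, not measurements from this review.

```python
from statistics import mean

def average_fps(runs):
    """Average the frame rates from repeated runs of one benchmark."""
    if not runs:
        raise ValueError("at least one run is required")
    return mean(runs)

def percent_drop(baseline_fps, comparison_fps):
    """Relative performance loss, as a percentage of the baseline."""
    return (baseline_fps - comparison_fps) / baseline_fps * 100.0

# Placeholder values only; substitute real results from three runs each.
runs_16bit = [60.0, 61.0, 59.0]
runs_32bit = [55.0, 56.0, 54.0]

fps_16 = average_fps(runs_16bit)
fps_32 = average_fps(runs_32bit)
print(f"16-bit: {fps_16:.1f} fps, 32-bit: {fps_32:.1f} fps, "
      f"drop: {percent_drop(fps_16, fps_32):.1f}%")
```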
Virtually No Processor Scalability
The Savage4's downfall, and what will keep the solution from making its way up to the hands of the high-end users running Pentium III 500's and soon to be the Pentium III 550's, is its extremely poor scalability. The low clock speed of the Savage4 ends up being the limiting factor after you hit the Pentium II 350/400MHz mark, so unless you plan on overclocking the Savage4, you won't see a huge increase in performance as you start proceeding past the Pentium II 350MHz mark.
Fortunately, the Savage4 Pro core seems to run quite nicely above 125MHz. In fact, both the Diamond Stealth III S540 and the Gainward Cardexpert SG4 worked fine at a 143MHz core clock; however, anything above 143MHz started requiring added cooling (not bad considering this is with only the standard heatsink on both cards).
Unless S3 can release an "Ultra" version of the Savage4 Pro, most high-end system users will be putting their CPUs to waste on the Savage4. If you're on the low end to mid-range of the computing performance spectrum, then the Savage4 suddenly becomes the competitor no one imagined would exist. If anything, the Savage4 is a feel good solution for owners of slower computers (by today's standards). Let's see a TNT2 run Quake 2 in 32-bit color with only a 10% drop in performance.
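One way to picture this scaling ceiling is a simple bottleneck model: each frame costs some CPU time and some graphics time, and the frame rate is set by whichever is slower. The sketch below is a deliberately simplified model, not a measurement. The function names and per-frame costs are made-up placeholders, and it assumes graphics time scales inversely with core clock, which real hardware only approximates.

```python
def frame_rate(cpu_ms, gpu_ms):
    """Frames per second when the slower of the two stages sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)

def gpu_ms_at_clock(gpu_ms_base, base_mhz, new_mhz):
    """Assumed model: graphics time shrinks in proportion to core clock."""
    return gpu_ms_base * base_mhz / new_mhz

STOCK_MHZ = 125        # Savage4 core clock
OVERCLOCK_MHZ = 143    # highest clock that ran without added cooling
GPU_MS = 20.0          # placeholder per-frame graphics cost at stock clock

# Placeholder CPU costs, from a slow processor (40 ms) to a fast one (10 ms).
for cpu_ms in (40.0, 25.0, 20.0, 15.0, 10.0):
    stock = frame_rate(cpu_ms, GPU_MS)
    oc = frame_rate(cpu_ms, gpu_ms_at_clock(GPU_MS, STOCK_MHZ, OVERCLOCK_MHZ))
    print(f"CPU {cpu_ms:4.1f} ms/frame: {stock:5.1f} fps stock, "
          f"{oc:5.1f} fps at {OVERCLOCK_MHZ}MHz")
```

In this model the frame rate stops climbing once the CPU cost falls below the graphics cost, and raising the core clock from 125MHz to 143MHz lifts that ceiling by at most 14.4%.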
Poor Performance at Higher Resolutions
At up to 800 x 600, the Savage4 is definitely in there in terms of competition, however once you hit 1024 x 768 there is a significant drop in performance. The drop is so significant that the performance of the Savage4 drops to levels you wouldn't even expect from the original TNT or the ATI Rage 128. So if you're planning on running at 1024 x 768, don't expect the Savage4 to be the answer to your prayers in terms of high performance at that resolution.
Although the Savage4 does allow for resolutions up to 1600 x 1200, performance at resolutions above 1024 x 768 grows unplayable very quickly. Playing at 1600 x 1200 in Quake 2, or any OpenGL game with Quake 2's complexity, won't be fun at all on the Savage4. It must be mentioned that no current accelerator is capable of running at 1600 x 1200 at above 40 fps (under Q2) in its best case scenario on today's CPUs; we'll have to wait until the end of the year for that level of performance.
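A rough way to see why resolution hurts so much is to divide a fill rate by the number of pixels drawn per frame. The sketch below uses the 140M pixels/second trilinear figure from the spec list as a theoretical peak. The overdraw factor (how many times each screen pixel is drawn per frame) is an assumed placeholder, and real frame rates will be lower because this ignores memory bandwidth, geometry, and CPU limits.

```python
PEAK_FILL_RATE = 140_000_000  # pixels/second, trilinear, from the spec list

def fill_limited_fps(width, height, overdraw=3.0, fill_rate=PEAK_FILL_RATE):
    """Upper bound on frame rate if fill rate were the only limit.

    overdraw is an assumed average number of times each pixel is drawn.
    """
    return fill_rate / (width * height * overdraw)

for w, h in ((640, 480), (800, 600), (1024, 768), (1600, 1200)):
    print(f"{w} x {h}: {fill_limited_fps(w, h):.1f} fps ceiling")
```

Whatever overdraw value is assumed, moving from 800 x 600 to 1024 x 768 raises the per-frame pixel count by about 64%, and 1600 x 1200 has four times the pixels of 800 x 600, so the ceiling falls in the same proportion.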
Direct3D Performance - Shogo RevShogo
[Charts: Shogo frame rates on the P3/500, P2/400, P2/266, C333A and C266; Shogo CPU scaling performance]
Shogo Performance Conclusions
Since Shogo has yet to support 32-bit color rendering properly, the only two items that could be duplicated in the Direct3D arena were the poor CPU scalability of the Savage4 and its poor performance at higher resolutions. Unfortunately, those two factors being downsides don't really help the argument for the Savage4 under Direct3D.
Direct3D is one of the Savage4's weak points, which is ironic since, in its day, it was one of the Savage3D's strong points. The Savage4 performed well below the rest of the competition in Direct3D, an issue which may or may not be corrected over time with updated drivers. Since the Savage4's silicon is in its final form, the only hope for performance increases now lies with the drivers.
The poor performance of the Savage4 under Direct3D is an unfortunate consequence of the Savage4's low-cost design, however there is hope if your favorite games happen to support S3TC.
Final Words
For those of you expecting the Savage4 to come through as being the world's fastest 2D/3D accelerator, you're out of luck. However, the Savage4 does have its strengths as discussed above. The Savage4's ability to render at 32-bit color depths with a very small loss in performance will make the Savage4 a strong performer in Quake 3 Arena which is supposed to truly illustrate a difference between 16-bit and 32-bit rendering. If it does in fact illustrate such a difference, Savage4 owners will be pleased to know that the drop in performance they'll experience for the added image quality is next to nothing compared to the TNT2 and other competing solutions capable of 32-bit rendering.
S3's use of the texture compression algorithm supported by DirectX 6.1 is extremely promising and, if used properly, can improve image quality by an extremely noticeable amount. Unfortunately, the question remaining is when S3TC enabled games will actually hit the hands of Savage4 owners. Interplay will be shipping S3TC enhanced versions of Descent 3, and that particular title will probably make its way into a number of software bundles shipping with Savage4 cards; however, the big titles such as Quake 3 Arena have yet to illustrate any benefits from S3TC (not a single Q3A S3TC screenshot has been posted). Although we've all seen screenshots from Unreal Tournament, not everyone plays Unreal. For those that don't, implementation of S3TC won't be popular until much later in the year (Q4), when other manufacturers may possibly include support for texture compression.
Should you buy the Savage4? The first thing you need to do is weigh the pros and cons the Savage4 offers you, and then consider the type of system you'll be using. For high-end systems (anything faster than a P2-350/400), the Savage4 shouldn't be your top pick, in spite of its support for texture compression. The target market for the Savage4 is the lower-end CPUs, so if you happen to have a CPU that won't benefit from a TNT2, then the Savage4 may offer the performance you're looking for, even in 32-bit color.
The best way to sum up the Savage4 is this: everything the original Savage3D should have been. Better late than never.