CPUs, GPUs, Motherboards, and Memory

For an article like this, getting a range of CPUs that includes the most common and popular is very important. I have been at AnandTech for just over two years now, and in that time we have had Sandy Bridge, Llano, Bulldozer, Sandy Bridge-E, Ivy Bridge, Trinity and Vishera, and I tend to be supplied the top end processor of each generation for testing. (As a motherboard reviewer, it is important to make the motherboard the limiting factor.) A lot of users have jumped to one of these platforms, although a large number are still on Wolfdale (Core2), Nehalem, Westmere, Phenom II (Thuban/Zosma/Deneb) or Athlon II.

I have attempted to pool my AnandTech resources, contacts, and personal hardware to get a good spread of the current ecosystem, with more focus on the modern end of the spectrum. It is worth noting that a multi-GPU user is more likely to have a top-line Ivy Bridge, Vishera or Sandy Bridge-E CPU and a top-range motherboard rather than an old Wolfdale; nevertheless, we will see how they all perform. There are a few obvious CPU omissions that I could not obtain for this first review, which will hopefully be remedied in a future update.

The CPUs

My criteria for obtaining CPUs were to have at least one from each of the most recent architectures, as well as a range of cores/modules/threads/speeds. The basic list as it stands is:

AMD

| Name | Platform / Architecture | Socket | Cores / Modules (Threads) | Speed (MHz) | Turbo (MHz) | L2 / L3 Cache |
|------|-------------------------|--------|---------------------------|-------------|-------------|---------------|
| A6-3650 | Llano | FM1 | 4 (4) | 2600 | N/A | 4 MB / None |
| A8-3850 | Llano | FM1 | 4 (4) | 2900 | N/A | 4 MB / None |
| A8-5600K | Trinity | FM2 | 2 (4) | 3600 | 3900 | 4 MB / None |
| A10-5800K | Trinity | FM2 | 2 (4) | 3800 | 4200 | 4 MB / None |
| Phenom II X2-555 BE | Callisto (K10) | AM3 | 2 (2) | 3200 | N/A | 1 MB / 6 MB |
| Phenom II X4-960T | Zosma (K10) | AM3 | 4 (4) | 3200 | N/A | 2 MB / 6 MB |
| Phenom II X6-1100T | Thuban (K10) | AM3 | 6 (6) | 3300 | 3700 | 3 MB / 6 MB |
| FX-8150 | Bulldozer | AM3+ | 4 (8) | 3600 | 4200 | 8 MB / 8 MB |
| FX-8350 | Piledriver | AM3+ | 4 (8) | 4000 | 4200 | 8 MB / 8 MB |

Intel

| Name | Architecture | Socket | Cores (Threads) | Speed (MHz) | Turbo (MHz) | L2 / L3 Cache |
|------|--------------|--------|-----------------|-------------|-------------|---------------|
| E6400 | Conroe | 775 | 2 (2) | 2133 | N/A | 2 MB / None |
| E6700 | Conroe | 775 | 2 (2) | 2667 | N/A | 4 MB / None |
| Celeron G465 | Sandy Bridge | 1155 | 1 (2) | 1900 | N/A | 0.25 MB / 1.5 MB |
| Core i5-2500K | Sandy Bridge | 1155 | 4 (4) | 3300 | 3700 | 1 MB / 6 MB |
| Core i7-2600K | Sandy Bridge | 1155 | 4 (8) | 3400 | 3800 | 1 MB / 8 MB |
| Core i3-3225 | Ivy Bridge | 1155 | 2 (4) | 3300 | N/A | 0.5 MB / 3 MB |
| Core i7-3770K | Ivy Bridge | 1155 | 4 (8) | 3500 | 3900 | 1 MB / 8 MB |
| Core i7-3930K | Sandy Bridge-E | 2011 | 6 (12) | 3200 | 3800 | 1.5 MB / 12 MB |
| Core i7-3960X | Sandy Bridge-E | 2011 | 6 (12) | 3300 | 3900 | 1.5 MB / 15 MB |
| Xeon X5690 | Westmere | 1366 | 6 (12) | 3467 | 3733 | 1.5 MB / 12 MB |

A small selection

The omissions are clear to see, such as the i5-3570K, a dual core Llano/Trinity, a dual/tri-module Bulldozer/Piledriver, the i7-920, the i7-3820, or anything Nehalem. These will hopefully be covered in a future review.

The GPUs

My first and foremost thanks go to both ASUS and ECS for supplying me with these GPUs for my test beds. They have been in and out of 60+ motherboards without issue, and will hopefully continue to do so. My usual scenario for updating GPUs is to flip between AMD and NVIDIA every couple of generations – last time it was HD 5850 to HD 7970, and as such we will move to a 7-series NVIDIA card or a set of Titans (which might outlive a generation or two) in the future.

ASUS HD 7970 (HD7970-3GD5)

The ASUS HD 7970 is the reference model from the 7970 launch, using the GCN architecture with 2048 SPs at 925 MHz and 3GB of 4.6GHz GDDR5 memory. We have four cards, used in 1x, 2x, 3x and 4x configurations where possible, and at PCIe 3.0 where the platform enables it by default.

ECS GTX 580 (NGTX580-1536PI-F)

ECS is both a motherboard manufacturer and an NVIDIA card manufacturer, and while most of their VGA models are sold outside of the US, some do make it onto etailers like Newegg. This GTX 580 is also a reference model, with 512 CUDA cores at 772 MHz and 1.5GB of 4GHz GDDR5 memory. We have two cards to be used in 1x and 2x configurations at PCIe 2.0.

The Motherboards

The CPU is not always the main part of the picture for this sort of review – the motherboard is equally important, as it dictates how the CPU and the GPUs communicate with each other and what the lane allocation will be. As mentioned on the previous page, there are 20+ PCIe configurations for Z77 alone when you consider that some boards are native, some use a PLX 8747 chip, others use two PLX 8747 chips, and about half of the Z77 motherboards on the market enable four PCIe 2.0 lanes from the chipset for CrossFireX use (at high latency).
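
For readers who want to check what link their own cards have actually negotiated, here is a quick sketch of the sort of check we mean (assuming a Linux box with lspci from pciutils; the parsing is deliberately simple and lspci output varies between versions, so treat it as illustrative only):

```python
import re
import subprocess

def gpu_link_status():
    """Return (device, speed, width) for each VGA-class PCIe device via lspci.

    Assumes a Linux host with pciutils installed; 'lspci -vv' may need root to
    show the LnkSta line, so run with sudo if fields come back as 'unknown'.
    PCIe 3.0 negotiates 8GT/s per lane, PCIe 2.0 negotiates 5GT/s.
    """
    out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True).stdout
    results = []
    for block in out.split("\n\n"):                      # one block per device
        if "VGA compatible controller" not in block:
            continue
        device = block.splitlines()[0]
        # Negotiated link, e.g. "LnkSta: Speed 8GT/s, Width x16, ..."
        m = re.search(r"LnkSta:\s*Speed\s*([\d.]+\s*GT/s)[^,]*,\s*Width\s*(x\d+)", block)
        speed, width = m.groups() if m else ("unknown", "unknown")
        results.append((device, speed, width))
    return results

if __name__ == "__main__":
    for device, speed, width in gpu_link_status():
        print(f"{device}\n  negotiated link: {speed} {width}")
```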

We have tried to be fair and use motherboards that may carry a small premium but are equipped for the job. As a result, some motherboards also use MultiCore Turbo (MCT), which, as we have detailed in the past, applies the CPU's top turbo speed regardless of loading.

As a result of this lane allocation business, each result in our review is attributed to a CPU, whether it uses MCT, and a lane allocation. Something such as i7-3770K+ (3 - x16/x8/x8) would therefore represent an i7-3770K with MCT in a PCIe 3.0 tri-GPU configuration at x16/x8/x8. More on this below.
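
As a rough illustration of how those labels are put together (a hypothetical helper, not part of our benchmark scripts, reading the '+' as the MCT marker):

```python
def config_label(cpu, mct, pcie_gen, lanes):
    """Build a result label such as 'i7-3770K+ (3 - x16/x8/x8)'.

    cpu      -- CPU name, e.g. "i7-3770K"
    mct      -- True if the board applies MultiCore Turbo (shown here as '+')
    pcie_gen -- PCIe generation the GPU slots run at (2 or 3)
    lanes    -- lane allocation per card, e.g. [16, 8, 8]
    Hypothetical helper for illustration only.
    """
    allocation = "/".join(f"x{n}" for n in lanes)
    return f"{cpu}{'+' if mct else ''} ({pcie_gen} - {allocation})"

# An i7-3770K with MCT, three GPUs on PCIe 3.0 at x16/x8/x8:
print(config_label("i7-3770K", True, 3, [16, 8, 8]))   # i7-3770K+ (3 - x16/x8/x8)
```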

For Sandy Bridge and Ivy Bridge: ASUS Maximus V Formula, Gigabyte Z77X-UP7 and Gigabyte G1.Sniper M3.

The ASUS Maximus V Formula has a three-way lane allocation of x8/x4/x4 with Ivy Bridge (x8/x8 with Sandy Bridge, which cannot split its lanes three ways), and enables MCT.

The Gigabyte Z77X-UP7 offers lane allocations of x16/x16, x16/x8/x8 and x8/x8/x8/x8 for up to four-way setups, all via a PLX 8747 chip. It also has a single x16 slot that bypasses the PLX chip and is thus native, and all configurations enable MCT.

The Gigabyte G1.Sniper M3 is a little different, offering x16, x8/x8, or if you accidentally put the cards in the wrong slots, x16 + x4 from the chipset. This additional configuration is seen on a number of cheaper Z77 ATX motherboards, as well as a few mATX models. The G1.Sniper M3 also implements MCT as standard.

For Sandy Bridge-E: ASRock X79 Professional and ASUS Rampage IV Extreme

The ASRock X79 Professional is a PCIe 2.0 enabled board offering x16/x16, x16/x16/x8 and x16/x8/x8/x8.

The ASUS Rampage IV Extreme is a PCIe 3.0 enabled board offering the same PCIe layout as the ASRock, except it enables MCT by default.

For Westmere Xeons: The EVGA SR-2

Due to the timing of the first roundup, I was able to use an EVGA SR-2 with a pair of Xeons on loan from Gigabyte for our server testing. The SR-2 forms the basis of our beast machine below, and uses two Westmere-EP Xeons to give PCIe 2.0 x16/x16/x16/x16 via NF200 chips.

For Core 2 Duo: The MSI i975X Platinum PowerUp and ASUS Commando (P965)

The MSI is the motherboard I used for our quick Core 2 Duo comparison pipeline post a few months ago – I still have it sitting on my desk, and it seemed apt to include it in this test. The MSI i975X Platinum PowerUp offers two PCIe 1.1 slots, capable of Crossfire up to x8/x8. I also rummaged through my pile of old motherboards and found the ASUS Commando with a CPU installed, and as it offered x16+x4, this was tested also.

For Llano: The Gigabyte A75-UD4H and ASRock A75 Extreme6

Llano throws a bit of an oddball into the mix, being a true quad core unlike Trinity. The A75-UD4H from Gigabyte was the first board to hand, and offers two PCIe slots at x8/x8. As with the Core 2 Duo setup, SLI is not an option here.

After finding an A8-3850 CPU as another comparison point for the A6-3650, I pulled out the A75 Extreme6, which offers three-way CFX as x8/x8 + x4 from the chipset as well as the configurations offered by the A75-UD4H.

For Trinity: The Gigabyte F2A85X-UP4

Technically, A85X motherboards for Trinity support up to x8/x8 in Crossfire, but the F2A85X-UP4, like other high end A85X motherboards, adds four lanes from the chipset for three-way AMD configurations. Our initial look at three-way via those chipset lanes was not that great, and this review should help quantify it.

For AM3: The ASUS Crosshair V Formula

As the 990FX chipset covers a lot of processor families, the safest place to sit is on one of the top motherboards available. Technically the Crosshair V Formula-Z is newer and supports Vishera more readily, but we have not had the Formula-Z in to test, and the original Formula was still able to run an FX-8350 as long as we kept the VRMs as cool as a cucumber. The CVF offers up to three-way CFX and SLI testing (x16/x8/x8).

The Memory

Our good friends at G.Skill have put their best foot forward in supplying us with high end kits to test. The bigger issue with memory is what each motherboard will support – in order to keep testing consistent, no overclocks were performed, meaning that boards and BIOSes limited to a certain DRAM multiplier were set at the highest multiplier available. To keep things fairer overall, the modules were then adjusted for tighter timings. All of this is noted in our final setup lists.
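
As a quick worked example of what that multiplier cap means in practice (the ratios below are common Intel DRAM dividers at a 100 MHz base clock; exact steps vary by platform and BIOS, so treat the numbers as illustrative):

```python
def ddr3_rating(bclk_mhz, dram_ratio):
    """Effective DDR3 data rate in MT/s: memory clock = BCLK * ratio, doubled for DDR."""
    return 2 * bclk_mhz * dram_ratio

# With a 100 MHz base clock, a DDR3-2400 kit needs a 12x memory ratio;
# a BIOS capped at 9.33x will run that same kit at DDR3-1866 at stock.
for ratio in (8.0, 9.33, 12.0):
    print(f"ratio {ratio:>5} -> DDR3-{ddr3_rating(100, ratio):.0f}")
# ratio   8.0 -> DDR3-1600
# ratio  9.33 -> DDR3-1866
# ratio  12.0 -> DDR3-2400
```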

Our main memory testing kit is our trusty G.Skill 4x4GB DDR3-2400 RipjawsX kit which has been part of our motherboard testing for over twelve months. For times when we had two systems being tested side by side, a G.Skill 4x4GB DDR3-2400 Trident X kit was also used.

For The Beast, which is one of the systems affected by the higher memory divider issue, we pulled in a pair of tri-channel kits from X58 testing. These are high-end kits as well, since discontinued as they tended to stop working when fed too much voltage. We have sets of 3x2GB OCZ Blade DDR3-2133 8-9-8 and 3x1GB Dominator GT DDR3-2000 7-8-7 for this purpose, which we ran at 1333 6-7-6 due to motherboard limitations at stock settings.

Finally, our Core 2 Duo CPUs get their own DDR2 memory for completeness. This is a 2x2GB kit of OCZ DDR2-1033 at 5-6-6.

Comments

  • TheQweaker - Friday, May 10, 2013 - link

    Just in case, here is a pointer to the nVidia GPU AI Path finding in the developer zone:

    https://developer.nvidia.com/gpu-ai-path-finding

    And here is the title of a 2011 GPU AI Planning paper (research; not yet in a game): "Exploiting the Computational Power of the Graphics Card: Optimal State Space Planning on the GPU". You should be able to find the PDF on the web.

    My 2 cents is that it's a good topic for a final paper.

    -- The Qweaker.
  • yougotkicked - Friday, May 10, 2013 - link

    Thanks again, I think I will be doing GPU AI as my final paper, probably try to implement the A* family as massively parallel, or maybe a local beam search using hundreds of hill-climbing threads.
  • TheQweaker - Saturday, May 11, 2013 - link

    Nice project.

    2 more cents.

    Keep it simple is the best advice. It's better to have a running algorithm than none, even if it's slow.

    Also, ask your advisor whether he'd want you to compare with a CPU implementation of yours in order to evaluate the pros and cons between your sequential implementation and your // implementation. I did NOT write "evaluate gains from seq to //" as GPU programming is currently not fully understood, probably even not by nVidia engineers.

    Finally, here is book title: "CUDA Programming: A Developer's Guide to Parallel Computing with GPUs". But there are many others these days.

    OK. That w
  • TheQweaker - Saturday, May 11, 2013 - link

    as my last post.

    -- The Qweaker.
    (sorry for the cut, I wrongly clicked on submit)
  • yougotkicked - Monday, May 13, 2013 - link

    thanks a lot for all your input, I intend to evaluate not only the advantages of GPU computing, but its weak points as well, so I'll be sure to demonstrate the differences between a sequential algorithm, a parallel CPU algorithm, and a massively parallel GPU algorithm.
  • Azusis - Wednesday, May 8, 2013 - link

    Could you test the Q6600 and i7-920 in your next roundup? I have many PC gaming friends, and we all seem to have a Q6600, i7-920, or 2500k in our rigs. Thanks! Great job on the article.
  • IanCutress - Wednesday, May 8, 2013 - link

    I have a Q9400 coming in soon from family - Getting one of the Nehalem/Westmere range is definitely on my to-do list for the next update :)
  • sonofgodfrey - Thursday, May 9, 2013 - link

    I too have a Q6600, but it would be interesting to see the high end (non-extreme edition) Core 2s as well: E8600 & Q9650. Just for yucks, perhaps a socket 775 Pentium 4 could also make an appearance? :)
  • gonks - Wednesday, May 8, 2013 - link

    i knew it from some time ago, but this proves once again that it's time to upgrade my good old c2d (conroe) E6600 @ 3.2Ghz
  • Quizzical - Wednesday, May 8, 2013 - link

    You've got a lot of data there. And it's good data if your main purpose is to compare a Radeon HD 7970 to a GeForce GTX 580. Unfortunately, most of it is worthless if you're trying to isolate CPU performance, which is the ostensible purpose of the article. You've gone far out of your way to try to make games GPU-limited so that you wouldn't be able to tell what the various CPUs can do when they're the main limiting factors.

    Loosely, the CPU has to do any work to run a game that isn't done by the GPU. The contents of this can vary wildly from game to game. Unless you're using DirectX 11 multithreaded rendering, only one thread can communicate with the video card at a time. But that one rendering thread mostly consists of passing data to the video card, so you don't do much in the way of real computations there. You do sort some things so that you don't have to switch programs, textures, and so forth more often than necessary, though you can have a separate sorting thread if you're (probably unreasonably) worried that this is going to mean too much work for the rendering thread.

    Actually determining what data needs to be passed to the video card can comprise the bulk of the CPU work that a game needs to do. But this portion is mostly trivial to scale to as many threads as you care to--at least within reason. It's a completely straightforward producer-consumer queue with however many "producer" threads you want and the rendering thread as the single "consumer" thread that takes the data set up by other threads and passes it along to the video card.
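
    As a minimal sketch of that producer-consumer arrangement (Python threads standing in for engine worker threads, and a print() as a placeholder for the actual draw submission; names are illustrative only):

    ```python
    import queue
    import threading

    draw_queue = queue.Queue()   # thread-safe handoff between producers and the render thread
    DONE = object()              # sentinel: a producer has finished its slice of the scene

    def producer(object_ids):
        """Producer: prepare draw data for part of the scene (run as many as you like)."""
        for obj in object_ids:
            # In a real engine this would be transforms, material/texture IDs, etc.
            draw_queue.put({"object": obj, "transform": f"matrix_{obj}"})
        draw_queue.put(DONE)

    def render_thread(num_producers):
        """Single consumer: the only thread that 'talks to the video card'."""
        remaining = num_producers
        while remaining:
            item = draw_queue.get()
            if item is DONE:
                remaining -= 1
                continue
            print("submit", item["object"], item["transform"])  # placeholder for the draw call

    if __name__ == "__main__":
        chunks = [list(range(i, 12, 3)) for i in range(3)]       # split the scene three ways
        consumer = threading.Thread(target=render_thread, args=(len(chunks),))
        consumer.start()
        workers = [threading.Thread(target=producer, args=(c,)) for c in chunks]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        consumer.join()
    ```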

    Not quite all of the work of setting up data for the GPU is trivial to break into as many threads as necessary, though. At the start of a new frame, you have to figure out exactly where the camera is going to go in that frame. This is likely going to be very fast (e.g., tens or hundreds of microseconds), but it does need to be done before you go compute where everything else is relative to the camera.

    While I haven't programmed AI, I'd expect that you could likewise break it up into as many threads as you cared to, as you could "save" the state of the game at some instant in time and have separate threads compute what all AI has to do based on the state of the game at that moment, without needing to know anything about what other game characters were choosing at the same time. Some games are heavy on AI computations, while online games may do essentially no AI computations client-side, so this varies wildly from game to game.

    A game engine may do a lot of other things besides these, such as processing inputs, loading data off of the hard drive, sending data over the Internet, or whatever. Some such things can't be readily scaled to many CPU cores, but if you count by CPU work necessary, few games will have all that much stuff to do other than setting up data for the GPU and computing AI.

    But most of the work that a CPU has to do doesn't care what graphical settings you're using. Anything that isn't part of the graphics engine certainly doesn't care. The only parts of the CPU side of a game engine that care what monitor resolution you're using are likely to be a handful of lines to set the resolution when you change it and a few lines to check whether an object is off the camera and therefore doesn't need to be processed in that particular frame--and culling such objects is likely done mostly to save on the GPU load. Any settings that can be adjusted in video drivers (e.g., anti-aliasing or anisotropic filtering) are done almost entirely on the video card and carry a negligible CPU load.

    Thus, if you're trying to isolate CPU performance, you turn down or off settings that don't affect the CPU load. In particular, you want a very low monitor resolution, no anti-aliasing, no anisotropic filtering, and no post-processing effects of any sort. Otherwise, you're just making the game mostly GPU bound, and you end up with data that looks like most of what you've collected.

    Furthermore, even if you do the measurements properly, there's also the question of whether the games you've chosen are representative of what most people will play. If you grab the games that you usually benchmark for video cards reviews, then you're going out of your way to pick games that are unrepresentative. Tech sites like this that review hardware tend to gravitate toward badly-coded games that aren't representative of most of the games that people will play. If this video card gets 200 frames per second at max settings in one game and that video card gets 300, what's the difference in real-world game experience? If you want to differentiate between different video cards, you need games that are more demanding, and simply being really inefficient is one way to do that.

    Of course, if you were trying to see how different CPUs affect performance in a mostly GPU-limited game, that can be interesting in an esoteric sense. It would probably tend to favor high single-threaded performance because the only difference you'd be able to pick out are due to things that happen between frames, which is the time that the video card is most likely to be forced to wait on the CPU briefly.

    But if you were trying to do that, why not just use a Radeon HD 5450? The question answers itself.

    If you would like to get some data that will be more representative of how games handle CPUs, then you'll need to do some things very differently. For starters, use just a single powerful GPU, to avoid any CrossFire or SLI weirdness. A GeForce GTX Titan is ideal, but a Radeon HD 7970 or GeForce GTX 680 would be fine. For that matter, if you're not stupid about picking graphical settings, something weaker like a Radeon HD 7870 or GeForce GTX 660 would probably work just fine. But you need to choose the graphical settings intelligently, by turning down or off any graphical settings that don't affect CPU load. In particular, anti-aliasing, anisotropic filtering, and all post-processing effects should be completely off. Use a fairly low monitor resolution; certainly no higher than 1920x1080, and you could make a good case for 1366x768.

    And then don't pick your usual set of games that you use to do video card reviews. You chose those games precisely because they're outliers that won't give a good gauge of CPU performance, so they'll sabotage your measurements if you're trying to isolate CPU performance. Rather, pick games that you rejected from doing video card reviews because they were unable to distinguish between video cards very well. If the results are that in a typical game, this processor can deliver 200 frames per second and that one can do 300, then so be it. If a Core i5-3570K and an FX-6300 can deliver hundreds of frames per second in most games (as is likely if the game runs well on, say, a 2 GHz Core 2 Duo), then you shouldn't shy away from that conclusion.
