L2 Cache: What it does

We often take for granted that a system with an L2 cache runs faster than it would without one, but what does that L2 cache actually do?

L2 cache, just like any other cache, acts as a sort of middleman between two levels of storage: in this case, your CPU’s L1 cache and your system memory (as well as other storage media).  When the CPU needs a piece of data, it first searches its L1 cache; if the data is there, the result is what is known as a cache hit, and the CPU retrieves it from the extremely fast, low-latency L1 cache.

If the data isn’t in the L1 cache, the CPU then goes to the L2 cache, where it attempts to do the same: obtain a cache “hit.”  In the event of a miss, the CPU must go all the way to system memory to retrieve the data it needs.  Because the L2 cache of today’s CPUs operates at a much higher frequency and much lower latency than system memory, we would see considerably lower performance from our systems if the L2 cache weren’t there or if the cache mapping technique weren’t as effective.
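The lookup order described above can be sketched in a few lines of Python. This is only an illustration: the dictionaries standing in for each level, the function names `find` and `average_latency`, and every number used are assumptions made for the example, and a real CPU does all of this in hardware.

```python
def find(address, l1, l2, memory):
    """Return (data, level) by searching L1, then L2, then system memory."""
    if address in l1:                  # L1 hit: the fastest case
        return l1[address], "L1"
    if address in l2:                  # L1 miss, L2 hit
        return l2[address], "L2"
    return memory[address], "memory"   # missed both caches


def average_latency(l1_time, l1_miss_rate, l2_time, l2_miss_rate, memory_time):
    """Average cost of one access, in the same unit as the *_time arguments."""
    return l1_time + l1_miss_rate * (l2_time + l2_miss_rate * memory_time)


# Hypothetical contents, chosen only to exercise each path.
memory = {0x10: "a", 0x20: "b", 0x30: "c"}
l2 = {0x10: "a", 0x20: "b"}
l1 = {0x10: "a"}

print(find(0x10, l1, l2, memory))  # ('a', 'L1')
print(find(0x20, l1, l2, memory))  # ('b', 'L2')
print(find(0x30, l1, l2, memory))  # ('c', 'memory')

# Made-up latencies (in cycles) and miss rates, not measurements of any CPU.
with_l2 = average_latency(3, 0.10, 10, 0.20, 100)
# Model "no L2" as an L2 that costs nothing to check and always misses.
without_l2 = average_latency(3, 0.10, 0, 1.0, 100)
print(with_l2 < without_l2)  # True
```

Under these invented numbers the average access costs roughly half as much with the L2 in place, which is the effect the paragraph above describes.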

 

Cache Mapping Techniques

We just established that the function of the L2 cache is to provide fast access to commonly used data in system RAM.  It does so by mapping each cache line of the L2 cache to multiple addresses in system memory (the number of which is defined by the cacheable memory area of the L2 cache).

There are a number of methods that can be used to dictate how this mapping should occur.  On one end of the spectrum, we have a direct mapped cache, which divides the system memory into a number of equal sections, each one being mapped to a single cache line in the L2 cache.

The beauty of a direct mapped cache is that it can be searched relatively quickly and efficiently: everything is organized into sections of equal size, so only one cache line ever has to be checked for a given address.  The sacrifice is hit rate, because the technique does not allow for any bias toward more frequently used sections of data.
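A minimal sketch of a direct mapped lookup, assuming the common modulo scheme in which the line index is taken from the block number (the article does not say which scheme any particular chip uses). The class name and the tiny sizes are hypothetical, chosen so the behavior is easy to follow by hand.

```python
class DirectMappedCache:
    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # one stored tag per cache line

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then refilled)."""
        block = address // self.line_size  # which line-sized block of memory
        index = block % self.num_lines     # the only cache line it may use
        tag = block // self.num_lines      # tells blocks sharing a line apart
        hit = self.tags[index] == tag
        self.tags[index] = tag             # on a miss, the old block is evicted
        return hit


# Addresses 0 and 256 share cache line 0, so they keep evicting each other.
cache = DirectMappedCache(num_lines=4, line_size=64)
print([cache.access(a) for a in (0, 256, 0, 256)])  # [False, False, False, False]

# Addresses 0 and 64 use different lines, so the repeats hit.
cache = DirectMappedCache(num_lines=4, line_size=64)
print([cache.access(a) for a in (0, 64, 0, 64)])    # [False, False, True, True]
```

The first trace shows the hit rate penalty: three of the four cache lines sit empty, yet two frequently used addresses still cannot be cached at the same time.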

On the other end of the spectrum, we have a fully associative cache, which is the exact opposite of a direct mapped cache.  Instead of equally dividing up the memory into sections mapped to individual cache lines, a fully associative cache acts as more of a dynamic entity that allows any cache line to be mapped to any section of system memory.

This flexibility allows for a much greater hit rate since allowances can be made for the most frequently used data.  However, since there is no organized structure to the mapping technique, a given piece of data could be in any cache line, so searching through a fully associative cache is much slower than searching through a direct mapped cache.
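For contrast, here is a minimal sketch of a fully associative cache. The replacement policy is an assumption: the article does not name one, so this example evicts the least recently used block. The class name and sizes are likewise hypothetical.

```python
from collections import OrderedDict


class FullyAssociativeCache:
    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # resident blocks, least recently used first

    def access(self, address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        block = address // self.line_size
        # Python finds the key with one dictionary lookup; hardware would
        # have to compare the address against the tag of every cache line.
        if block in self.lines:
            self.lines.move_to_end(block)   # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used
        self.lines[block] = True            # any free line will do
        return False


cache = FullyAssociativeCache(num_lines=4, line_size=64)
print([cache.access(a) for a in (0, 256, 0, 256)])  # [False, False, True, True]
```

Addresses 0 and 256 can now be resident together because either one may occupy any line. Note that the dictionary hides the search cost that makes this design expensive in practice.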

Establishing a mid-point between these two cache mapping techniques, we have a set associative cache, which is what the current crop of processors uses. 

A set associative cache divides the cache into various sections, referred to as sets, with each set containing a number of cache lines.  With an 8-way set associative L2 cache, each set contains 8 cache lines, and in a 16-way set associative L2 cache, each set contains 16 cache lines. 

The beauty of this is that the cache acts as if it were a direct mapped cache except that, instead of being limited to 1 cache line per memory section, we get x cache lines per section of memory addresses.

This helps to sustain a balance between the pros and the cons of a direct mapped and a fully associative cache.
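The middle ground can be sketched the same way: the address picks a set as in a direct mapped cache, and the block may then sit in any line of that set. As before, least recently used replacement, the modulo indexing, the class name, and the tiny sizes are assumptions made for illustration.

```python
from collections import OrderedDict


class SetAssociativeCache:
    def __init__(self, num_sets, ways, line_size):
        self.num_sets = num_sets
        self.ways = ways            # cache lines per set
        self.line_size = line_size
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, address):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        block = address // self.line_size
        index = block % self.num_sets     # direct mapped step: pick the set
        tag = block // self.num_sets
        lines = self.sets[index]          # associative step: search this set
        if tag in lines:
            lines.move_to_end(tag)        # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)     # evict the set's least recently used
        lines[tag] = True
        return False


def hits(cache, addresses):
    """Count how many accesses in the trace hit."""
    return sum(cache.access(a) for a in addresses)


trace = [0, 256, 0, 256, 0, 256]
# Both caches hold 4 lines in total; only the organization differs.
print(hits(SetAssociativeCache(num_sets=4, ways=1, line_size=64), trace))  # 0
print(hits(SetAssociativeCache(num_sets=2, ways=2, line_size=64), trace))  # 4
```

With `ways=1` this behaves as a direct mapped cache, and with `num_sets=1` it behaves as a fully associative one, which is why set associativity can be seen as a dial between the two extremes.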

In the case of the Thunderbird and the Pentium III Coppermine, the Thunderbird’s 16-way set associative L2 cache allows for a higher hit rate than the Coppermine’s 8-way set associative L2 cache.  In comparison, the older Athlons featured a 2-way set associative L2 cache.
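To put the three associativities side by side, the arithmetic below shows how the number of ways trades off against the number of sets for a fixed amount of cache. The 256 KB capacity and 64-byte line size are illustrative assumptions, not specifications taken from this page, and `num_sets` is a hypothetical helper.

```python
def num_sets(cache_bytes, line_bytes, ways):
    """Number of sets in a cache of the given size, line size, and associativity."""
    lines = cache_bytes // line_bytes
    if lines % ways:
        raise ValueError("line count must be a multiple of the number of ways")
    return lines // ways


# Assumed geometry for illustration only: 256 KB of cache, 64-byte lines.
for ways in (2, 8, 16):
    print(ways, "way:", num_sets(256 * 1024, 64, ways), "sets")
# 2 way: 2048 sets
# 8 way: 512 sets
# 16 way: 256 sets
```

More ways means fewer, larger sets: each memory section has more cache lines to choose from, at the cost of more lines to check on every lookup.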
