caching memory computer-science cpu-architecture cpu-cache

Separated tag array versus combined with data array

I am wondering where tags are normally stored. I have seen combined tag-data caches, where the tags and data are stored together and only the tag portion is accessed first, with the data portion accessed only when a tag matches the memory address. On the other hand, I have also seen completely separate tag and data arrays, with separate valid bits and other status bits.

I am wondering which of these approaches is more commonly used, and whether there is any difference in performance or energy efficiency between the two structures.

Thanks in advance.

Solution

First-level caches are generally implemented as virtually indexed, physically tagged (VIPT). This means you take the virtual address, extract the index bits, and start locating the cache entry those bits select. At the same time, you issue the virtual address to the virtual-to-physical translation unit (e.g. the TLB) to get the physical address.

If the valid bit is not set, you send the physical address down to the next-level cache. If the valid bit is set, you compare the stored tag against the tag bits of the physical address, which the TLB will already have supplied by this time. A tag mismatch is also a miss and goes to the next level.
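The lookup flow described above can be sketched in Python. This is only an illustration: the geometry (64-byte lines, 64 sets, 4 KiB pages, direct-mapped), the dict-based `tlb`, and the helper names `index_of`, `translate`, `tag_of` and `lookup` are all assumptions made for the example, not a description of any particular CPU. The index and offset bits are chosen to fit inside the page offset, which is what lets the index come from the virtual address.

```python
# Hypothetical geometry: 64-byte lines, 64 sets, 4 KiB pages, direct-mapped.
LINE_BITS = 6    # byte offset within a cache line
INDEX_BITS = 6   # set index; LINE_BITS + INDEX_BITS == PAGE_BITS here
PAGE_BITS = 12   # page offset bits are identical in virtual and physical addresses

def index_of(vaddr):
    """Index bits come straight from the virtual address (no translation needed)."""
    return (vaddr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)

def translate(tlb, vaddr):
    """Toy TLB: a dict mapping virtual page number -> physical page number."""
    vpn = vaddr >> PAGE_BITS
    offset = vaddr & ((1 << PAGE_BITS) - 1)
    return (tlb[vpn] << PAGE_BITS) | offset

def tag_of(paddr):
    """Physical tag: every address bit above the index and offset bits."""
    return paddr >> (LINE_BITS + INDEX_BITS)

def lookup(cache, tlb, vaddr):
    """Return ('hit', data) or ('miss', physical address for the next level)."""
    valid, tag, data = cache[index_of(vaddr)]  # in hardware this overlaps with...
    paddr = translate(tlb, vaddr)              # ...the TLB lookup
    if valid and tag == tag_of(paddr):
        return ("hit", data)
    return ("miss", paddr)

# Example setup: one translated page and one filled cache line.
tlb = {0x12: 0x7A}
cache = [(False, 0, None)] * (1 << INDEX_BITS)
cache[index_of(0x12345)] = (True, tag_of(translate(tlb, 0x12345)), b"line")
```

With this setup, `lookup(cache, tlb, 0x12345)` hits, while an address that maps to a different, still-invalid set (for example `0x12385`) misses and returns the translated physical address that would be sent to the next level.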

Think of the cache line + tag implementation as a list of tuples (e.g. in Python) that can be indexed by the index bits. Once you've located the entry in the list, check element 0 of the selected tuple for the tag and element 1 for the data.
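To make the analogy concrete, here is a minimal sketch of both organisations from the question: a combined array of `(tag, data)` tuples, and separate tag and data arrays that share the same index. The size `NUM_SETS`, the function names, and the use of a `None` tag to stand in for a cleared valid bit are all assumptions made for illustration.

```python
NUM_SETS = 8  # hypothetical number of entries

# Combined organisation: one list, each entry is a (tag, data) tuple.
combined = [(None, None)] * NUM_SETS

# Separate organisation: two lists addressed by the same index.
tag_array = [None] * NUM_SETS
data_array = [None] * NUM_SETS

def fill(index, tag, data):
    """Install a line in both models so they can be compared."""
    combined[index] = (tag, data)
    tag_array[index] = tag
    data_array[index] = data

def read_combined(index, tag):
    """Fetch the whole tuple, then check element 0 (tag) and return element 1 (data)."""
    entry = combined[index]
    return entry[1] if entry[0] == tag else None

def read_separate(index, tag):
    """Check the tag array first; the data array is only touched on a hit."""
    if tag_array[index] != tag:
        return None
    return data_array[index]

fill(3, 0x7A, "line")
```

Both readers return the same result for the same inputs. The difference is in what gets accessed: `read_combined` always fetches the tag and data together, whereas `read_separate` never reads `data_array` on a miss.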

Now to the real question: do we keep the tag and data physically adjacent in the cache, and what benefits do we get by doing so? I think it comes down to the application domain the cache will be used in.

Now think of the tag and data portions as two separate units that can each be powered up or down (not switched off, just scaled with DVFS).

If your application has a high ratio of cache accesses to instructions executed, it makes sense to keep both powered up and linked: a cache request is likely at any moment, and when the tag indicates a hit, the data is live and ready to go.

If your application has a low ratio of cache accesses to instructions executed, there is no point wasting power keeping both powered up. You could keep just the tag portion powered up and, on a tag hit, power up the data portion and respond.
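The trade-off between the two cases can be illustrated with a toy energy model. Everything here is an assumption for the sake of the sketch: the function names, the idea of charging a fixed wake-up cost per hit, the use of the instruction count as a stand-in for elapsed time, and all the numbers, which are arbitrary units rather than measurements of any real cache.

```python
def energy_both_on(instructions, accesses, hits,
                   leak_tag=1, leak_data=4, e_tag=2, e_data=10):
    """Tag and data arrays both kept at full power (hypothetical units)."""
    static = instructions * (leak_tag + leak_data)
    dynamic = accesses * e_tag + hits * e_data
    return static + dynamic

def energy_tag_only_on(instructions, accesses, hits,
                       leak_tag=1, leak_data_low=1, e_tag=2, e_data=10, e_wake=20):
    """Data array idles in a low-power state and pays e_wake on every hit."""
    static = instructions * (leak_tag + leak_data_low)
    dynamic = accesses * e_tag + hits * (e_data + e_wake)
    return static + dynamic

# High access ratio: 500 accesses per 1000 instructions, 450 of them hits.
busy = (energy_both_on(1000, 500, 450), energy_tag_only_on(1000, 500, 450))

# Low access ratio: 50 accesses per 1000 instructions, 45 of them hits.
idle = (energy_both_on(1000, 50, 45), energy_tag_only_on(1000, 50, 45))
```

With these made-up numbers, keeping both arrays powered is cheaper in the busy case, and keeping only the tag array powered is cheaper in the idle case. In this model the break-even point is where the leakage saved over the run equals the total wake-up cost paid on hits. The model ignores the extra hit latency of waking the data array, which is the performance side of the same trade-off.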