siteat.blogg.se - MESI cache coherence

The instruction cache (L1I) always uses the SI protocol because a line is never modified between the point of fill and the point of invalidation, so an entry can only be in the S or I state. The M and E states are only relevant if the cache supports modifying an existing line.

Some microarchitectures have caches that only support the write-through write-hit policy. For example, the L1D in AMD Bulldozer is a write-through cache. The M state doesn't make sense in a write-through cache, which means that this L1D uses either SI or ESI. SI is more likely because it requires only a single bit of state per entry.

Intel processors almost always support the write-back policy in all data and unified caches. Old Intel processors (1990s and early 2000s) with two levels of cache use MESI for the L1D and L2. Intel processors with three levels of cache also use MESI for the L1D and L2. The fact that four states are available doesn't necessarily mean that all of them are used. A cache line whose physical address falls within a region with the write-through (WT) memory type doesn't use the M state, so the effective protocol for a WT line is ESI or SI. (It's possible that the memory type changed from WB to WT, so the first WT access could hit a line in M.)

The L3 caches in Intel processors starting with Nehalem-EX use the MESIF protocol with an inclusive directory (used on a hit) for the entire NUMA node.
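The reasoning above about which states a cache can actually use can be summarized in a small sketch. This is a toy model, not a description of any real hardware interface: the function names `usable_states`, `protocol_name`, and `state_bits`, and the `tracks_exclusive` flag, are assumptions made up for illustration.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def usable_states(writable, write_back, tracks_exclusive=False):
    """Return the states a line can occupy in this toy model (hypothetical helper)."""
    states = {State.S, State.I}
    if not writable:
        # Read-only cache such as an L1I: lines are never modified, so SI.
        return states
    if write_back:
        # Write-back cache: dirty lines exist, so the full MESI set applies.
        return states | {State.E, State.M}
    # Write-through cache: no dirty lines, so M is never needed.
    # Whether E is tracked is an assumption controlled by the caller.
    if tracks_exclusive:
        states.add(State.E)
    return states


def protocol_name(states):
    """Spell the protocol by listing the usable states in M, E, S, I order."""
    return "".join(s.name for s in State if s in states)


def state_bits(states):
    """Minimum number of bits needed to encode one of the given states."""
    return (len(states) - 1).bit_length()


if __name__ == "__main__":
    for label, args in [
        ("instruction cache", (False, False)),
        ("write-through, no E", (True, False)),
        ("write-through, with E", (True, False, True)),
        ("write-back", (True, True)),
    ]:
        s = usable_states(*args)
        print(label, protocol_name(s), state_bits(s), "bit(s)")
```

In this model SI needs one state bit per entry, while ESI and MESI each need two, which is consistent with the argument that SI is the cheaper choice for a write-through cache.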

The number of cache levels, how each level is organized with respect to other processors or cores in the system, and the coherence protocol implemented in each cache are defined by the core microarchitecture, the uncore microarchitecture, and, in some cases, relevant boot-time configuration options. These design aspects vary by vendor, by processor generation, and between models within the same generation. There are a lot of different designs even if you just consider the processors released in the past few years. The organization of the cache hierarchy is always clearly documented by Intel and AMD. However, the coherence protocols are not always clearly documented. You won't find a section in any official document that directly tells you all the protocols that the caches use. Some hardware performance event names allude to which coherence protocol is used in the cache to which the events apply.