MC-01h - coherence introduction
- reference system (fig. 2.1)
- Cache Coherence Interface / Coherence Protocol
- Defining Consistency-agnostic Coherence via Invariants
- Some remarks
Unless otherwise noted, all contents are copied verbatim or derived from Chapter 2 of
A Primer on Memory Consistency and Cache Coherence (2nd Edition), Synthesis
Lectures on Computer Architecture, Vijay Nagarajan, Daniel J. Sorin, Mark D. Hill
and David A. Wood
https://pages.cs.wisc.edu/~markhill/papers/primer2020_2nd_edition.pdf
this page is under CC-BY-SA-4.0. The above notice must be preserved.
§ reference system (fig. 2.1)
+---------------------------------------------------------------+
|                                                               |
|    +----+                  +----+                             |
|    |Core|                  |Core|                             |
|    +----+                  +----+                             |
|      | ^                     | ^                              |
|      V |                     V |                              |
|   +--+--+     +-+         +-----+     +-+                     |
|   |Cache| <-> |P| ``````` |Cache| <-> |P|  (L1 and L2)        |
|   |Ctrlr|     |C|         |Ctrlr|     |C|                     |
|   +-----+     +-+         +-----+     +-+                     |
|      | ^                     | ^                              |
|      | |                     | |                              |
|      V |                     V |                              |
|  ------------------------------------------------             |
|              Interconnection Network                          |
|  ------------------------------------------------             |
|            | ^                                                |
|            | |                                                |
|            V |                                                |
|      +-------------+     +-----+  (L3 cache)                  |
|      | LLC/Memory  | <-> | LLC |                              |
|      | Controller  |     +-----+                              |
|      +-------------+                           Multicore      |
|            | ^                                 Processor      |
|            | |                                 Chip           |
+------------|-|------------------------------------------------+
             | |
             V |
     +-------+------+
     | Main Memory  |
     +--------------+
The multicore processor chip consists of multiple single-threaded cores.
- Each core has its own private data cache (PC).
- Last-level cache (LLC) is shared by all cores (commonly the L3 cache).
- LLC is logically a "memory-side cache", and thus does not introduce another level of coherence issues.
- LLC also serves as the on-chip memory controller.
Omitted from baseline diagram:
- instruction caches, multiple-level caches, caches shared among cores (not
LLC), virtually addressed caches
- TLBs
- DMA
§ Cache Coherence Interface / Coherence Protocol
The processor cores interact with the coherence protocol through a coherence interface that provides two methods:
- read-request :
read(addr) -> value
- write-request :
write(addr, value) -> ack
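The two methods can be written down as a small abstract interface. This is only a sketch for orientation: the class names CoherenceInterface and AtomicMemory are made up here, the write is assumed to carry the value to store, the ack is modelled simply as the call returning, and unwritten locations are assumed to read as 0.

```python
from abc import ABC, abstractmethod


class CoherenceInterface(ABC):
    """Hypothetical rendering of the two-method interface seen by a core."""

    @abstractmethod
    def read(self, addr: int) -> int:
        """Read-request: return the value stored at addr."""

    @abstractmethod
    def write(self, addr: int, value: int) -> None:
        """Write-request: store value at addr; returning models the ack."""


class AtomicMemory(CoherenceInterface):
    """Simplest possible implementation: one shared dict and no caches."""

    def __init__(self) -> None:
        self._mem = {}

    def read(self, addr: int) -> int:
        # Assumption: a location that was never written reads as 0.
        return self._mem.get(addr, 0)

    def write(self, addr: int, value: int) -> None:
        self._mem[addr] = value
```

AtomicMemory is the behaviour that a core is meant to believe it is talking to when coherence is consistency-agnostic; a real protocol implements the same interface on top of caches.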
Coherence protocols can be classified into two categories based on the nature of their coherence interfaces: whether there is a clean separation of coherence from the consistency model, or whether the two are indivisible.
Consistency-agnostic coherence
a write is made visible to all other cores
before it returns; writes are propagated synchronously. The cache coherence
protocol abstracts away the caches completely and presents an illusion of
atomic memory (i.e. as if no caches were present). The orderings mandated by the
consistency model specification are enforced by the core pipeline. In other
words, cache coherence is entirely separated from the consistency model.
Consistency-directed coherence
a write can return before it has been made visible to all processors (writes
are propagated asynchronously), allowing stale values (in real time) to be
observed. This class of coherence protocol must also ensure that the order in
which writes eventually become visible follows the ordering rules of the
consistency model (i.e. the memory ordering).
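The difference between the two classes can be illustrated with a toy model. Everything in it is an assumption made for illustration: the class ToySystem and its drain method are invented names, every core is given a private copy of every location, and writes are propagated by pushing updated values to the other cores (real protocols often invalidate copies instead).

```python
from collections import deque


class ToySystem:
    """Toy model: every core holds a private copy of every location."""

    def __init__(self, n_cores: int, synchronous: bool) -> None:
        self.copies = [dict() for _ in range(n_cores)]
        self.synchronous = synchronous
        # Updates (target core, addr, value) that have not been applied yet.
        self.pending = deque()

    def read(self, core: int, addr: int) -> int:
        # Assumption: unwritten locations read as 0.
        return self.copies[core].get(addr, 0)

    def write(self, core: int, addr: int, value: int) -> None:
        self.copies[core][addr] = value
        for other in range(len(self.copies)):
            if other != core:
                self.pending.append((other, addr, value))
        if self.synchronous:
            # Consistency-agnostic style: visible everywhere before the ack.
            self.drain()

    def drain(self) -> None:
        """Apply all outstanding updates in the order they were issued."""
        while self.pending:
            core, addr, value = self.pending.popleft()
            self.copies[core][addr] = value
```

With synchronous=True, a read by core 1 immediately after a write by core 0 sees the new value. With synchronous=False, the same read returns the stale value until drain() runs, which is the situation a consistency-directed protocol has to keep within the rules of the consistency model.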
§ Defining Consistency-agnostic Coherence via Invariants
There are multiple ways of defining this coherence. One is through a set of two invariants.
- Single-Writer, Multiple-Reader (SWMR) Invariant. For any memory location A, at any given time, there exists only a single core that may write to A (and can also read it), OR some number of cores that may only read A. (This is much like the classical readers-writers synchronization problem.)
- Data-Value Invariant. The value of the memory location at the start of an epoch is the same as the value of the memory location at the end of its last read-write epoch.
Notation: the lifetime of each memory location is divided up into epochs.
- SWMR: in each epoch, either a single core has read-write access or some number of cores (possibly zero) have read-only access.
- In the following example, SWMR alone does not ensure that, e.g., the value written by core 3 is observed by core 1's read. To ensure that the caches do not break memory consistency, an additional invariant (the Data-Value Invariant) is required.
TIME
|---------------+---------------+---------------+------------------->
| Read-only     | Read-write    | Read-write    | Read-only
| Cores 2, 5    | Core 3        | Core 1        | Cores 1, 2 and 3
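The two invariants can be checked mechanically over such a timeline. The sketch below is one possible encoding, not something taken from the Primer: the Epoch record, the function names, the initial value 0 and the concrete values 7 and 9 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epoch:
    mode: str          # "RO" (read-only) or "RW" (read-write)
    cores: frozenset   # cores holding access during this epoch
    start_value: int   # value of the location when the epoch begins
    end_value: int     # value of the location when the epoch ends


def satisfies_swmr(epochs) -> bool:
    """Every epoch is read-only, or read-write with exactly one core."""
    return all(
        e.mode == "RO" or (e.mode == "RW" and len(e.cores) == 1)
        for e in epochs
    )


def satisfies_data_value(epochs, initial_value: int = 0) -> bool:
    """Every epoch starts with the value left by the last read-write epoch."""
    last_written = initial_value
    for e in epochs:
        if e.start_value != last_written:
            return False
        if e.mode == "RO" and e.end_value != e.start_value:
            return False  # a read-only epoch cannot change the value
        if e.mode == "RW":
            last_written = e.end_value
    return True


# The timeline from the figure, with made-up values.
timeline = [
    Epoch("RO", frozenset({2, 5}), 0, 0),
    Epoch("RW", frozenset({3}), 0, 7),
    Epoch("RW", frozenset({1}), 7, 9),
    Epoch("RO", frozenset({1, 2, 3}), 9, 9),
]

# Same epochs, but core 1 starts from the stale value 0 instead of 7.
stale = [timeline[0], timeline[1],
         Epoch("RW", frozenset({1}), 0, 9), timeline[3]]
```

Under these assumptions, timeline satisfies both checks, while stale still satisfies SWMR but fails the data-value check: core 1 never observes what core 3 wrote.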
There are other (equivalent) definitions of the same coherence (see the sidebar in sec. 2.4).
When is coherence relevant?
- Coherence pertains to all storage structures that hold blocks from the shared address space, including the L1 data cache, L2 cache, LLC and main memory, as well as the L1 instruction cache and TLBs.
- Coherence is not directly visible to the programmer. Processor pipeline and coherence protocol jointly enforce the consistency model. Only the consistency model is visible to the programmer.
§ Some remarks
- In practice, coherence is usually maintained at the granularity of cache blocks.
- A coherence protocol must ensure that writes are made visible to all processors.
- Safety invariant: bad things must not happen. Liveness invariant: good things must eventually happen.
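As a small illustration of the block-granularity remark: when coherence is tracked per cache block, two byte addresses fall under the same SWMR bookkeeping whenever they map to the same block. The block size of 64 bytes and the helper names below are assumptions for this sketch only.

```python
BLOCK_SIZE = 64  # bytes; an assumed block size, not taken from the notes


def block_address(addr: int) -> int:
    """Address of the first byte of the block that contains addr."""
    return addr - (addr % BLOCK_SIZE)


def same_block(a: int, b: int) -> bool:
    """True if both byte addresses belong to the same coherence unit."""
    return block_address(a) == block_address(b)
```

For example, 0x1200 and 0x123F share a block under this assumption, so two cores cannot hold write permission for those bytes at the same time even though the bytes differ.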