Difference between revisions of "Cache Coherence Protocols"
Nilayvaish (talk | contribs) (→MOESI_CMP_directory: Moved the protocol to separate page.) |
Nilayvaish (talk | contribs) (→MOESI_CMP_token: moved moesi cmp token to a separate page.) |
Revision as of 00:23, 9 July 2013
Common Notations and Data Structures
Coherence Messages
These are described in the <protocol-name>-msg.sm file for each protocol.
Message | Description |
---|---|
ACK/NACK | positive/negative acknowledgement for requests that wait for the direction of resolution before deciding on the next action. Examples are writeback requests, exclusive requests. |
GETS | request for shared permissions to satisfy a CPU's load or IFetch. |
GETX | request for exclusive access. |
INV | invalidation request. This can be triggered by the coherence protocol itself, or by the next cache level/directory to enforce inclusion or to trigger a writeback for a DMA access so that the latest copy of data is obtained. |
PUTX | request for writeback of cache block. Some protocols (e.g. MOESI_CMP_directory) may use this only for writeback requests of exclusive data. |
PUTS | request for writeback of cache block in shared state. |
PUTO | request for writeback of cache block in owned state. |
PUTO_Sharers | request for writeback of cache block in owned state but other sharers of the block exist. |
UNBLOCK | message to unblock next cache level/directory for blocking protocols. |
AccessPermissions
These are associated with each cache block and determine what operations are permitted on that block. They are closely correlated with coherence protocol states.
Permissions | Description |
---|---|
Invalid | The cache block is invalid. The block must first be obtained (from elsewhere in the memory hierarchy) before loads/stores can be performed. No action on invalidates (except maybe sending an ACK). No action on replacements. The associated coherence protocol states are I or NP and are stable states in every protocol. |
Busy | TODO |
Read_Only | Only operations permitted are loads, writebacks, invalidates. Stores cannot be performed before transitioning to some other state. |
Read_Write | Loads, stores, writebacks, invalidations are allowed. Usually indicates that the block is dirty. |
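The table above can be read as a lookup from permission to allowed operations. The following is a minimal sketch of that reading, not code from Ruby: the enum, the ALLOWED map and is_permitted() are hypothetical names, and Busy is left empty because the table still marks it as TODO.

```python
from enum import Enum


class AccessPermission(Enum):
    INVALID = "Invalid"
    BUSY = "Busy"
    READ_ONLY = "Read_Only"
    READ_WRITE = "Read_Write"


# Operations a block can service directly under each permission,
# transcribed from the table. Invalid blocks must be fetched first,
# so nothing is serviced; Busy is undocumented (TODO) and left empty.
ALLOWED = {
    AccessPermission.INVALID: set(),
    AccessPermission.BUSY: set(),
    AccessPermission.READ_ONLY: {"load", "writeback", "invalidate"},
    AccessPermission.READ_WRITE: {"load", "store", "writeback", "invalidate"},
}


def is_permitted(permission, operation):
    """Return True if the operation can be performed without a state change."""
    return operation in ALLOWED[permission]
```

For example, is_permitted(AccessPermission.READ_ONLY, "store") is False, which matches the rule that a store requires a transition to some other state first.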
Data Structures
- Message Buffers: TODO
- TBE Table: TODO
- Timer Table: This maintains a map of address-based timers. For each target address, a timeout value can be associated and added to the Timer Table. This data structure is used, for example, by the L1 cache controller implementation of the MOESI_CMP_directory protocol to trigger separate timeouts for cache blocks. Internally, the Timer Table uses the event queue to schedule the timeouts. The TimerTable class supports a polling-based interface: isReady() checks whether a timeout has occurred. Timeouts on addresses can be set using the set() method and removed using the unset() method.
- Related Files:
- src/mem/ruby/system/TimerTable.hh: Declares the TimerTable class
- src/mem/ruby/system/TimerTable.cc: Implementation of the methods of the TimerTable class, which deal with setting addresses and timeouts and with scheduling events using the event queue.
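The set/unset/isReady interface of the Timer Table can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: it is not the TimerTable class from src/mem/ruby/system, the method signatures are hypothetical, the current time is passed in explicitly, and the event-queue scheduling described above is omitted.

```python
class TimerTableSketch:
    """Toy model of an address-based timer map (not the gem5 class)."""

    def __init__(self):
        # address -> absolute time at which the timeout fires
        self._expiry = {}

    def set(self, address, now, timeout):
        """Associate a timeout (relative to `now`) with an address."""
        self._expiry[address] = now + timeout

    def unset(self, address):
        """Remove the timer for an address, if one exists."""
        self._expiry.pop(address, None)

    def is_ready(self, now):
        """Polling check: has any timer expired at time `now`?"""
        return any(t <= now for t in self._expiry.values())

    def ready_address(self, now):
        """Return the address with the earliest expired timer, or None."""
        expired = [(t, a) for a, t in self._expiry.items() if t <= now]
        return min(expired)[1] if expired else None
```

A controller would poll is_ready() on each wakeup, handle the address returned by ready_address(), and then call unset() for it.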
Coherence controller FSM Diagrams
- The Finite State Machines show only the stable states
- Transitions are annotated using the notation "Event list" or "Event list : Action list" or "Event list : Action list : Event list". For example, Store : GETX indicates that on a Store event, a GETX message was sent whereas GETX : Mem Read indicates that on receiving a GETX message, a memory read request was sent. Only the main triggers and actions are listed.
- Optional actions (e.g. writebacks depending on whether or not the block is dirty) are enclosed within [ ]
- In the diagrams, the transition labels are associated with the arc that cuts across the transition label or the closest arc.
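The label notation above can be split mechanically into its three fields. The sketch below is illustrative only: parse_transition_label() is a hypothetical helper, the comma separator inside a list is an assumption (the notation does not define one), and bracketed optional actions are kept as plain text rather than interpreted.

```python
def parse_transition_label(label):
    """Split 'Event list : Action list : Event list' into its fields."""
    fields = [f.strip() for f in label.split(":")]
    if not 1 <= len(fields) <= 3 or not fields[0]:
        raise ValueError(f"unrecognized transition label: {label!r}")
    # Pad missing trailing fields so all three are always present.
    fields += [""] * (3 - len(fields))

    def items(text):
        # Assumption: entries inside one list are comma-separated.
        return [item.strip() for item in text.split(",") if item.strip()]

    events, actions, later_events = (items(f) for f in fields)
    return {"events": events, "actions": actions, "later_events": later_events}
```

With this helper, "Store : GETX" yields the event Store and the action GETX, and "GETX : Mem Read" yields the event GETX and the action Mem Read.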
MOESI_hammer
This is an implementation of AMD's Hammer protocol, which is used in AMD's Hammer chip (also known as the Opteron or Athlon 64). The protocol implements both the original HyperTransport protocol and the more recent ProbeFilter protocol. The protocol also includes a full-bit directory mode.
Related Files
- src/mem/protocols
- MOESI_hammer-cache.sm: cache controller specification
- MOESI_hammer-dir.sm: directory controller specification
- MOESI_hammer-dma.sm: dma controller specification
- MOESI_hammer-msg.sm: message type specification
- MOESI_hammer.slicc: container file
Cache Hierarchy
This protocol implements a 2-level private cache hierarchy. It assigns separate Instruction and Data L1 caches, and a unified L2 cache, to each core. These caches are private to each core and are controlled by one shared cache controller. This protocol enforces exclusion between the L1 and L2 caches.
Stable States and Invariants
States | Invariants |
---|---|
MM | The cache block is held exclusively by this node and is potentially locally modified (similar to conventional "M" state). |
O | The cache block is owned by this node. It has not been modified by this node. No other node holds this block in exclusive mode, but sharers potentially exist. |
M | The cache block is held in exclusive mode, but not written to (similar to conventional "E" state). No other node holds a copy of this block. Stores are not allowed in this state. |
S | The cache line holds the most recent, correct copy of the data. Other processors in the system may hold copies of the data in the shared state, as well. The cache line can be read, but not written in this state. |
I | The cache line is invalid and does not hold a valid copy of the data. |
Cache controller
The notation used in the controller FSM diagrams is described here.
MOESI_hammer supports cache flushing. To flush a cache line, the cache controller first issues a GETF request to the directory to block the line until the flushing is completed. It then issues a PUTF and writes back the cache line.
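The ordering of the two flush messages can be sketched with a toy directory. This is an assumption-laden illustration, not the SLICC implementation: DirectoryStub and flush_line() are hypothetical names, the cache is assumed to proceed without waiting for any directory response, and the directory is assumed to unblock the line when the PUTF arrives.

```python
class DirectoryStub:
    """Toy directory that blocks a line between GETF and PUTF."""

    def __init__(self):
        self.blocked = set()   # addresses currently being flushed
        self.memory = {}       # address -> written-back data
        self.log = []          # message types in arrival order

    def handle(self, msg_type, address, data=None):
        self.log.append(msg_type)
        if msg_type == "GETF":
            self.blocked.add(address)
        elif msg_type == "PUTF":
            if address not in self.blocked:
                raise RuntimeError("PUTF without a preceding GETF")
            self.memory[address] = data       # writeback of the line
            self.blocked.discard(address)     # assumption: flush done
        else:
            raise ValueError(f"unexpected message {msg_type!r}")


def flush_line(directory, address, data):
    """Issue GETF first, then PUTF carrying the line's data."""
    directory.handle("GETF", address)
    directory.handle("PUTF", address, data)
```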
Directory controller
The MOESI_hammer memory module, unlike a typical directory protocol, does not contain any directory state and instead broadcasts requests to all the processors in the system. In parallel, it fetches the data from the DRAM and forwards the response to the requester.
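The broadcast behaviour can be pictured as a function that produces one probe per processor plus a memory response. This is a rough sketch only: handle_request() and the tuple layout are hypothetical, the two steps run sequentially here although the protocol performs them in parallel, and whether the requester itself is probed is not stated in the text, so the sketch probes every processor it is given.

```python
def handle_request(requester, request_type, address, processors, dram):
    """Model a directory without state: broadcast, then answer from DRAM."""
    # No sharer or owner information is consulted; every processor
    # in the list receives the forwarded request.
    probes = [(proc, request_type, address) for proc in processors]
    # The DRAM read is issued regardless of what the probes will find.
    memory_response = (requester, address, dram[address])
    return probes, memory_response
```

A probe filter, where enabled, would narrow the probes list; that part is still marked TODO in this section and is not modelled.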
probe filter: TODO
- Stable States and Invariants
States | Invariants |
---|---|
NX | Not Owner, probe filter entry exists, block in O at Owner. |
NO | Not Owner, probe filter entry exists, block in E/M at Owner. |
S | Data clean, probe filter entry exists pointing to the current owner. |
O | Data clean, probe filter entry exists. |
E | Exclusive Owner, no probe filter entry. |
- Controller
The notation used in the controller FSM diagrams is described here.
Network_test
This is a dummy cache coherence protocol that is used to operate the ruby network tester. The details about running the network tester can be found here.
Related Files
- src/mem/protocols
- Network_test-cache.sm: cache controller specification
- Network_test-dir.sm: directory controller specification
- Network_test-msg.sm: message type specification
- Network_test.slicc: container file
Cache Hierarchy
This protocol assumes a 1-level cache hierarchy. The role of the cache is simply to send messages from the CPU to the appropriate directory (based on the address), in the appropriate virtual network (based on the message type). It does not track any state. In fact, unlike in other protocols, no CacheMemory is created. The directory receives the messages from the caches, but does not send any back. The goal of this protocol is to enable simulation/testing of just the interconnection network.
Stable States and Invariants
States | Invariants |
---|---|
I | Default state of all cache blocks |
Cache controller
- Requests, Responses, Triggers:
- Load, Instruction fetch, Store from the core.
The network tester (in src/cpu/testers/networktest/networktest.cc) generates packets of the type ReadReq, INST_FETCH, and WriteReq, which are converted into RubyRequestType:LD, RubyRequestType:IFETCH, and RubyRequestType:ST, respectively, by the RubyPort (in src/mem/ruby/system/RubyPort.hh/cc). These messages reach the cache controller via the Sequencer. The destination for these messages is determined by the traffic type, and embedded in the address. More details can be found here.
- Main Operation:
- The goal of the cache is only to act as a source node in the underlying interconnection network. It does not track any states.
- On a LD from the core:
- it returns a hit, and
- maps the address to a directory, and issues a message for it of type MSG, and size Control (8 bytes) in the request vnet (0).
- Note: vnet 0 could also be made to broadcast, instead of sending a directed message to a particular directory, by uncommenting the appropriate line in the a_issueRequest action in Network_test-cache.sm
- On an IFETCH from the core:
- it returns a hit, and
- maps the address to a directory, and issues a message for it of type MSG, and size Control (8 bytes) in the forward vnet (1).
- On a ST from the core:
- it returns a hit, and
- maps the address to a directory, and issues a message for it of type MSG, and size Data (72 bytes) in the response vnet (2).
- Note: request, forward and response are just used to differentiate the vnets, but do not have any physical significance in this protocol.
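The cache behaviour listed above reduces to a fixed table from request type to vnet and message size. The sketch below restates that table in code; issue_message(), the dictionary layout, the 64-byte block size and the modulo mapping from address to directory are assumptions made for illustration, not details taken from Network_test-cache.sm.

```python
# (vnet, message size in bytes) per request type, as listed above.
ROUTES = {
    "LD": (0, 8),       # request vnet, Control message
    "IFETCH": (1, 8),   # forward vnet, Control message
    "ST": (2, 72),      # response vnet, Data message
}


def issue_message(request_type, address, num_directories, block_bytes=64):
    """Return the MSG a cache would inject for one core request."""
    vnet, size_bytes = ROUTES[request_type]
    # Assumption: directories are interleaved by block address.
    directory = (address // block_bytes) % num_directories
    return {
        "type": "MSG",
        "dest_directory": directory,
        "vnet": vnet,
        "size_bytes": size_bytes,
        "hit": True,  # every access is reported as a hit to the core
    }
```

For instance, issue_message("ST", 128, 4) targets directory 2 on vnet 2 with a 72-byte message under these assumptions.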
Directory controller
- Requests, Responses, Triggers:
- MSG from the cores
- Main Operation:
- The goal of the directory is only to act as a destination node in the underlying interconnection network. It does not track any states.
- The directory simply pops its incoming queue upon receiving the message.
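The directory's role as a pure sink can be sketched in a few lines. SinkDirectory is a hypothetical name, and the received counter is added only so the example has something observable; the protocol description itself says nothing about counting.

```python
from collections import deque


class SinkDirectory:
    """Destination node that discards every message it receives."""

    def __init__(self):
        self.incoming = deque()
        self.received = 0  # illustration only, not part of the protocol

    def enqueue(self, message):
        self.incoming.append(message)

    def wakeup(self):
        # Pop each pending message; no state is kept and no reply is sent.
        while self.incoming:
            self.incoming.popleft()
            self.received += 1
```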
Other features
- This protocol assumes only 3 vnets.
- It should only be used when running the ruby network test.