Ruby

From gem5
Revision as of 02:54, 11 March 2011 by Rsen (talk | contribs) (Interconnection Network)
Jump to: navigation, search

Ruby

High level components of Ruby

  Need to talk about the overview of Ruby and what are the major components.

Rathijit will do it

SLICC + Coherence protocols:

   Need to say what is SLICC and whats its purpose. 
   Talk about high level strcture of a typical coherence protocol file, that SLICC uses to generate code. 
   A simple example structure from protocol like MI_example can help here.

Nilay will do it

Protocol independent memory components

  1. Cache Memory
  2. Replacement Policies
  3. Memory Controller

Arka will do it

Interconnection Network

The interconnection network connects the various components of the memory hierarchy (cache, memory, dma controllers) together. There are 3 key parts to this:

  1. Topology specification: These are specified with python files that describe the topology (mesh/crossbar/ etc.), link latencies and link bandwidth. The network topology is thus configurable.
  2. Cycle-accurate network model: Garnet is a cycle accurate, pipelined, network model that builds and simulates the specified topology. It simulates the router pipeline and movement of flits across the network subject to the routing algorithm, latency and bandwidth constraints.
  3. Network Power model: The Orion power model is used to keep track of router and link activity in the network. It calculates both router static power and link and router dynamic power as flits move through the network.

More details about the network model implementation are described #Implementation of Ruby:Interconnection Network here

Implementation of Ruby

Directory Structure

  • src/mem/
    • protocols: SLICC specification for coherence protocols
    • slicc: implementation for SLICC parser and code generator
    • ruby
      • buffers: implementation for message buffers that are used for exchanging information between the cache, directory, memory controllers and the interconnect
      • common: frequently used data structures, e.g. Address (with bit-manipulation methods), histogram, data block, basic types (int32, uint64, etc.)
      • eventqueue: Ruby’s event queue mechanism for scheduling events
      • filters: various Bloom filters
      • network: Interconnect implementation, sample topology specification, network power calculations
      • profiler: Profiling for cache events, memory controller events
      • recorder: Cache warmup and access trace recording
      • slicc_interface: Message data structure, various mappings (e.g. address to directory node), utility functions (e.g. conversion between address & int, convert address to cache line address)
      • system: Protocol independent memory components – CacheMemory, DirectoryMemory, Sequencer, RubyPort

SLICC

   Explain functionality/ capability of SLICC
   Talk about
   AST, Symbols, Parser and code generation in some details but NO need to cover every file and/or functions. 
   Few examples should suffice.

Nilay will do it

Protocols

   Need to talk about each protocol being shipped. Need to talk about protocol specific configuration parameters.
   NO need to explain every action or every state/events, but need to give overall idea and how it works
   and assumptions (if any).
MI example

This is a simple cache coherence protocol that is used to illustrate protocol specification using SLICC.

Related Files

MI_example-cache.sm (cache controller specification)
MI_example-dir.sm (directory controller specification)
MI_example-dma.sm (dma controller specification)
MI_example-msg.sm (message type specification)
MI_example.slicc (container file)

Cache Hierarchy

This protocol assumes a 1-level cache hierarchy. The cache is private to each node. The caches are kept coherent by a directory controller. Since the hierarchy is only 1-level, there is no inclusion/exclusion requirement.

Stable States and Invariants
  • M : The cache block has been accessed (read/written) by this node. No other node holds a copy of the cache block
  • I : The cache block at this node is invalid
Cache controller
  • Requests, Responses, Triggers:
    • Load, Instruction fetch, Store from the core
    • Replacement from self
    • Data from the directory controller
    • Forwarded request (intervention) from the directory controller
    • Writeback acknowledgement from the directory controller
    • Invalidations from directory controller (on dma activity)
  • Main Operation:
    • On a load/Instruction fetch/Store request from the core:
      • it checks whether the corresponding block is present in the M state. If so, it returns a hit
      • otherwise, if in I state, it initiates a GETX request from the directory controller
    • Note: This protocol does not differentiate between load and store requests
    • On a replacement trigger from self:
      • it evicts the block, issues a writeback request to the directory controller
      • it waits for acknowledgement from the directory controller
    • Note: writebacks are acknowledged to prevent races
    • On a forwarded request from the directory controller:
      • This means that the block was in M state at this node when the request was generated by some other node
      • It sends the block directly to the requesting node (cache-to-cache transfer)
      • It evicts the block from this node
    • Invalidations are similar to replacements
Directory controller
  • Requests, Responses, Triggers:
    • GETX from the cores, Forwarded GETX to the cores
    • Data from memory, Data to the cores
    • Writeback requests from the cores, Writeback acknowledgements to the cores
    • DMA read, write requests from the DMA controllers
  • Main Operation:
    • The directory maintains track of which core has a block in the M state. It designates this core as owner of the block.
    • On a GETX request from a core:
      • If the block is not present, a memory fetch request is initiated
      • If the block is already present, then it means the request is generated from some other core
        • In this case, a forwarded request is sent to the original owner
        • Ownership of the block is transferred to the requestor
    • On a writeback request from a core:
      • If the core is owner, the data is written to memory and acknowledgement is sent back to the core
      • If the core is not owner, a NACK is sent back
        • This can happen in a race condition
        • The core evicted the block while a forwarded request some other core was on the way and the directory has already changed ownership for the core
        • The evicting core holds the data till the forwarded request arrives
    • On DMA accesses (read/write)
      • Invalidation is sent to the owner node (if any). Otherwise data is fetched from memory.
      • This ensures that the most recent data is available.
Other features
    • MI protocols don't support LL/SC semantics.
    • This protocol has no timeout mechanisms.
MOESI_hammer

Somayeh will do it

MOESI_CMP_token

Shoaib will do it

MOESI_CMP_directory

In contrast with the MESI protocol, the MOESI protocol introduces an additional Owned state. This enables sharing of a block after modification without needing to write it back to memory first. However, in that case, only 1 node is the owner while the others are sharers. The owner node has the responsibility to write the block back to memory on eviction. Sharers may evict the block without writeback. An overview of the protocol can be found here.

Related Files

MOESI_CMP_directory-L1cache.sm (L1 cache controller specification)
MOESI_CMP_directory-L2cache.sm (L1 cache controller specification)
MOESI_CMP_directory-dir.sm (directory controller specification)
MOESI_CMP_directory-dma.sm (dma controller specification)
MOESI_CMP_directory-msg.sm (message type specification)
MOESI_CMP_directory.slicc (container file)

Cache Hierarchy
Stable States and Invariants
L1 Cache controller
L2 Cache controller
Directory controller
Other features

Rathijit will do it

MESI_CMP_directory

Arka will do it

Protocol Independent Memory components

System

Arka will do it

Sequencer

Arka will do it

CacheMemory

Arka will do it

DMASequencer

Rathijit will do it

Memory Controller

Rathijit will do it

Interconnection Network

Topology specification

Python files specify connections. Shortest path graph traversals program the routing tables.

Network implementation
  1. SimpleNetwork
  2. Garnet

Life of a memory request in Ruby

Cpu model generates a packet -> RubyPort converts it to a ruby request -> L1 cache controller converts it to a protocol specific message ...etc.

Arka will do it