Difference between revisions of "Register Indexing"

From gem5
 
A "nickel tour" of flattening and register indexing in the CPU models.
  
First, an instruction identifies which register it needs, as determined by its encoding (or by the fact that it always uses a certain register, or ...). For the sake of argument, let's say we're talking about SPARC, the register is %g1, and the second bank of globals is active. From the instruction's point of view, the unflattened register is %g1, which is most likely represented simply by the index 1.
  
Next, we need to map from the instruction's view of the register file(s) down to actual storage locations. Think of this like virtual memory. The instruction is working within an index space which is like a virtual address space, and it needs to be mapped down to the flattened space which is like physical memory. Here, the index 1 is likely mapped to, say, 9, where 0-7 is the first bank of globals and 8-15 is the second.
  
This is the point where the CPU gets involved. The index 9 refers to an actual register the instruction expects to access, and it's the CPU's job to make that happen. Before this point, all the work was done by the ISA with no insight available to the CPU, and beyond this point all the work is done by the CPU with no insight available to the ISA.
  
The CPU is free to provide the register directly, as the simple CPU does, by keeping an array and reading or writing element 9 on behalf of the instruction. Alternatively, the CPU could do something more complicated, such as renaming, which maps the flattened index further onto a physical register, as O3 does.
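
The two options above can be contrasted with a minimal sketch. It is not gem5 code: the class names `SimpleRegFile` and `RenamingRegFile`, their methods, and the register counts are hypothetical, and the renaming variant omits freeing old physical registers at commit.

```python
class SimpleRegFile:
    """Direct storage: one slot per flattened register index."""

    def __init__(self, num_flat_regs):
        self.regs = [0] * num_flat_regs

    def read(self, flat_idx):
        return self.regs[flat_idx]

    def write(self, flat_idx, value):
        self.regs[flat_idx] = value


class RenamingRegFile:
    """Each write to a flattened index allocates a fresh physical register."""

    def __init__(self, num_flat_regs, num_phys_regs):
        # Initially flattened register i lives in physical register i.
        self.rename_map = list(range(num_flat_regs))
        # The remaining physical registers are free for renaming.
        self.free_list = list(range(num_flat_regs, num_phys_regs))
        self.phys = [0] * num_phys_regs

    def read(self, flat_idx):
        return self.phys[self.rename_map[flat_idx]]

    def write(self, flat_idx, value):
        # Take the next free physical register and point the map at it.
        # (Returning the old register to the free list is omitted here.)
        new_phys = self.free_list.pop(0)
        self.rename_map[flat_idx] = new_phys
        self.phys[new_phys] = value


simple = SimpleRegFile(16)
simple.write(9, 42)

renamed = RenamingRegFile(16, 32)
renamed.write(9, 42)
```

In both cases the instruction only ever names flattened index 9; where the value actually lives is the CPU model's business.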
  
One important property of all this, which makes sense if you think about the virtual memory analogy, is that the size of the index space before flattening has nothing to do with the size after. The virtual memory space could be very large (presumably with gaps) and map to a smaller physical space, or it could be small and map to a larger physical space where the extra is for, say, other virtual spaces used at other times. You need to make sure you're using the right size (post flattening) to size your tables because that's the space of possible options.
  
One other tricky part comes from the fact that we add offsets to the indices to distinguish ints from floats from miscs. Those offsets might be one thing in the pre-flattening world, but then need to be something else in the post-flattening world to keep things from landing on top of each other without leaving gaps. It's easy to make a mistake here, and it's one of the reasons I don't like this offset idea as a way to keep the different types separate. I'd rather see a two-dimensional index where the second coordinate is a register type. But in the world as it exists today, this is something you have to keep track of.
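
To make the offset problem concrete, here is a small sketch with made-up sizes. All names and numbers are assumptions for illustration (an ISA whose 32 encodable integer registers flatten to 80 storage locations, with 32 FP registers that need no flattening); none of them come from the gem5 source.

```python
# Hypothetical sizes for an ISA with windowed integer registers.
NUM_INT_ARCH = 32  # integer indices an instruction can encode
NUM_INT_FLAT = 80  # integer storage locations after flattening
NUM_FP = 32        # FP registers (assumed not to need flattening)

# The FP range starts right after the integer range in each space,
# so its offset differs before and after flattening.
FP_OFFSET_UNFLAT = NUM_INT_ARCH
FP_OFFSET_FLAT = NUM_INT_FLAT


def flatten_fp(unflat_idx):
    """Move an FP index from the pre- to the post-flattening space."""
    rel_idx = unflat_idx - FP_OFFSET_UNFLAT
    return FP_OFFSET_FLAT + rel_idx


# Tables indexed by flattened registers must use the post-flattening size.
FLAT_TABLE_SIZE = NUM_INT_FLAT + NUM_FP
scoreboard = [False] * FLAT_TABLE_SIZE
```

Reusing `FP_OFFSET_UNFLAT` in the flattened space would place the first FP register at index 32, on top of an integer storage location, which is exactly the kind of mistake described above.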

Revision as of 20:03, 30 July 2013

CPU register indexing in gem5 is complicated by the need to support multiple ISAs with sometimes very different register semantics (register windows, condition codes, mode-based alternate register sets, etc.). In addition, this support has evolved gradually as new ISAs have been added, so older code may not take advantage of newer features or terminology.

There are three types of register indices used internally in the CPU models: relative, unified, and flattened.

Relative

A relative register index is the index that is encoded in a machine instruction. There is a separate index space for each type of register (integer, floating point, etc.), starting at 0. The register type is implied by the opcode. Thus a value of "1" in a source register field may mean integer register 1 (e.g., "%r1") or floating point register 1 (e.g., "%f1") depending on the type of the instruction.

Unified

While relative register indices are good for keeping instruction encodings compact, they are ambiguous, and thus not convenient for things like managing dependencies. To avoid this ambiguity, the decoder maps the relative register indices into a unified register space by adding type-specific offsets to relocate each relative index range into a unique position. Integer registers are unmodified, and continue to start at zero. Floating-point register indices are offset by (at least) the number of integer registers, so that the first FP register (e.g., "%f0") gets a unified index that is greater than that of the last integer register. Similarly, miscellaneous (a.k.a. control) registers are mapped past the end of the FP register index space.
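
The offset scheme can be sketched as follows. This is a minimal illustration rather than gem5's actual code: the register counts and the names `RegType`, `to_unified`, and `from_unified` are hypothetical, chosen only to show how type-specific offsets keep the index ranges apart.

```python
from enum import Enum

# Hypothetical register counts; real values are ISA-specific.
NUM_INT_REGS = 32
NUM_FP_REGS = 32


class RegType(Enum):
    INT = 0
    FP = 1
    MISC = 2


# Offsets that relocate each relative index range into the unified space.
UNIFIED_OFFSET = {
    RegType.INT: 0,
    RegType.FP: NUM_INT_REGS,
    RegType.MISC: NUM_INT_REGS + NUM_FP_REGS,
}


def to_unified(reg_type, rel_idx):
    """Map a (register type, relative index) pair to a unified index."""
    return UNIFIED_OFFSET[reg_type] + rel_idx


def from_unified(unified_idx):
    """Recover the (register type, relative index) pair."""
    if unified_idx < UNIFIED_OFFSET[RegType.FP]:
        return RegType.INT, unified_idx
    if unified_idx < UNIFIED_OFFSET[RegType.MISC]:
        return RegType.FP, unified_idx - UNIFIED_OFFSET[RegType.FP]
    return RegType.MISC, unified_idx - UNIFIED_OFFSET[RegType.MISC]
```

With these assumed counts, relative index 1 becomes unified index 1 for an integer operand and 33 for a floating-point operand, so dependency tracking can compare unified indices directly without consulting the opcode.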

Flattened

Unified register indices provide an unambiguous description of all the registers that are accessible as instruction operands at a given point in the execution. Unfortunately, due to the complex features of some ISAs, they do not always unambiguously identify the actual state that the instruction is referencing. For example, in ISAs with register windows (notably SPARC), a particular register identifier such as "%o0" will refer to a different register after a "save" or "restore" operation than it did previously. Several ISAs have registers that are hidden in normal operation, but get mapped on top of ordinary registers when an interrupt occurs (e.g., ARM's mode-specific registers), or under explicit supervisor control (e.g., SPARC's "alternate globals").

We solve this problem by maintaining a flattened register space which provides a distinct index for every unique register storage location. For example, the integer portion of the SPARC flattened register space has distinct indices for the globals and the alternate globals, as well as for each of the available register windows. The "flattening" process of translating from a unified or relative register index to a flattened register index varies by ISA. On some ISAs, the mapping is trivial, while others use table lookups to do the translation.

A key distinction between the generation of unified and flattened register indices is that the former can always be done statically while the latter often depends on dynamic processor state. That is, the translation from relative to unified indices depends only on the context provided by the instruction itself (which is convenient as the translation is done in the decoder). In contrast, the mapping to a flattened register index may depend on processor state such as the interrupt level or the current window pointer on SPARC.
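
As a rough sketch of why flattening needs dynamic state, the toy function below flattens an integer register index for a SPARC-like machine. The layout is an assumption made for illustration and is not gem5's: `NUM_WINDOWS`, the ordering of the banks, and the function name `flatten_int` are all hypothetical. It relies only on the standard SPARC numbering of globals (0-7), outs (8-15), locals (16-23), and ins (24-31), with the ins of one window aliasing the outs of the previous one.

```python
NUM_GLOBALS = 8
NUM_WINDOWS = 4        # hypothetical number of register windows
REGS_PER_WINDOW = 16   # 8 outs + 8 locals stored per window
WINDOWED_BASE = 2 * NUM_GLOBALS  # after globals and alternate globals


def flatten_int(idx, cwp, use_alt_globals):
    """Flatten an integer register index (0-31).

    cwp is the current window pointer and use_alt_globals selects the
    alternate global bank; both are dynamic processor state.
    """
    if idx < 8:
        # Globals: pick the bank that is currently active.
        bank = 1 if use_alt_globals else 0
        return bank * NUM_GLOBALS + idx
    if idx < 24:
        # Outs and locals belong to the current window.
        window = cwp
        offset = idx - 8
    else:
        # Ins alias the outs of the previous window.
        window = (cwp - 1) % NUM_WINDOWS
        offset = idx - 24
    return WINDOWED_BASE + window * REGS_PER_WINDOW + offset
```

The same index yields different storage locations as the state changes: `flatten_int(8, 0, False)` and `flatten_int(8, 1, False)` differ, while `flatten_int(24, 1, False)` lands on the same location as `flatten_int(8, 0, False)`, modeling an out register becoming an in register after the window pointer advances.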

Caveats

  • The description above is intended to illustrate the typical usage of these index types. There may be exceptions that don't precisely follow this description, but I got tired of writing "typically" in every sentence.
  • The terms 'relative' and 'unified' were invented for use in this documentation, so you are unlikely to see them in the code until the code starts catching up with this page.