X86 Instruction decoding
Contents
Overview
X86 instruction encodings have several unique characteristics which make them harder to deal with than the encodings for the other ISAs M5 supports. Despite that, x86 is decoded using the same basic mechanisms as the other ISAs.
At the lowest level, instructions can take any number of bytes (up to a maximum) and can have any alignment. That means that when a CPU brings in bytes of memory to decode, it may contain several instructions, the end of one and the start of the next, the middle of an instruction, etc. It's the predecoder's job to take the incoming stream of bytes and to turn them into a stream of objects, called ExtMachInsts, which represent the encoding of a single instruction from beginning to end in a unique and uniform way. The predecoder also makes the guarantee that if any two instruction representations are the same, the instruction object that implements it is also the same.
Next, the instruction representations need to be translated into an instruction object which the CPU can execute. This is the job of the x86 decoder which is implemented as ISA description files and automatically generated. In addition to the instruction objects and the decoder itself, a ROM object is defined which holds microops which can be executed independent of any containing instruction.
Finally, the instruction object is handed back to the CPU. If the object is a regular instruction it can be executed directly. If the object is a macroop, aka a microcoded instruction, the CPU's decode section pulls microops out of it one at a time and actually executes those. Control flow within the microops may move execution out of the current macroop and into the ROM. If that happens, instructions are pulled from the ROM until the next instruction starts.
Predecoder
The diagram above shows the high level state machine that runs the x86 predecoder. The CPU feeds the predcoder bytes in certain sized chunks which, in other ISAs, would correspond with instructions. In x86 these are just bytes and are processed one at a time to move the predecoder through the state machine. The predecoder may not have to consume the entire chunk, or it may go through several before it has an entire instruction. Once it's finished, there should be a completed ExtMachInst representing one instruction ready for the CPU to use.
Reset State
Resets internal state to prepare for the next instruction.
Prefix State
Recognizes and records the various prefixes allowed for an instruction. When a byte comes in that is not recognized as a prefix, the machine goes to the Opcode state.
Opcode State
Gathers up all the bytes that represent the instructions opcode. Instructions can have 1, 2 or even 3 byte opcodes. This state also figures out how many bytes of immediate value, if any will be part of this instruction, and what the default address size, operand size, and stack size are. If this instruction uses a modRM byte, the next state is ModRM. Failing that, if there are any immediate bytes to collect, the next state is Imm. If neither of those are true, then the instruction has been gathered and the machine resets.
ModRM State
Processes the modRM byte. This includes determining which registers will be used as part of address computation or as operands directly, and how many, if any, bytes will be used for the displacement. This state also determines whether to expect an SIB byte. If so, the next state is SIB. If not and there are displacement bytes to collect, the next state is Disp. If not and there are immediate bytes to collect (as determined in the Opcode state), the next state is Imm. Otherwise, this instruction is finished.
SIB State
Very similar to the ModRM except processing the SIB byte.
Disp State
Collects as many displacement bytes as it can from the bytes provided by the CPU. If the whole displacement has been collected, the state machine continues on to Imm if immediate bytes are needed, or it completes the current instruction.
Imm State
The same as Disp except for immediate bytes.
Decoder generation