Hazard (computer architecture)
Encyclopedia
Hazards are problems with the instruction pipeline
in central processing unit
(CPU) microarchitecture
s that potentially result in incorrect computation. There are typically three types of hazards:
There are several methods used to deal with hazards, including pipeline stalls, pipeline bubbling, register forwarding, and in the case of out-of-order execution
, the scoreboarding
method and the Tomasulo algorithm
.
s, and instructions may be executed out-of-order
. A hazard occurs when two or more of these simultaneous (possibly out of order) instructions conflict.
s (sometimes known as race hazards). There are three situations in which a data hazard can occur:
consider two instructions i and j, with i occurring before j in program order.
A read after write (RAW) data hazard refers to a situation where an instruction refers to a result that has not yet been calculated or retrieved. This can occur because even though an instruction is executed after a previous instruction, the previous instruction has not been completely processed through the pipeline.
For example:
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3 + r4
The first instruction is calculating a value to be saved in register 2, and the second is going to use this value to compute a result for register 4. However, in a pipeline, when we fetch the operands for the 2nd operation, the results from the first will not yet have been saved, and hence we have a data dependency.
We say that there is a data dependency with instruction 2, as it is dependent on the completion of instruction 1.
A write after read (WAR) data hazard represents a problem with concurrent execution.
For example:
i1. R4 <- R1 + R3
i2. R3 <- R1 + R2
If we are in a situation that there is a chance that i2 may be completed before i1 (i.e. with concurrent execution) we must ensure that we do not store the result of register 3 before i1 has had a chance to fetch the operands.
A write after write (WAW) data hazard may occur in a concurrent execution
environment.
For example:
i1. R2 <- R4 + R7
i2. R2 <- R1 + R2
We must delay the WB (Write Back) of i2 until the execution of i1.
es. On many instruction pipeline microarchitectures, the processor will not know the outcome of the branch when it needs to insert a new instruction into the pipeline (normally the fetch stage).
s into the pipeline. Thus, before the next instruction (which would cause the hazard) is executed, the previous one will have had sufficient time to complete and prevent the hazard. If the number of NOPs is equal to the number of stages in the pipeline, the processor has been cleared of all instructions and can proceed free from hazards. This is called flushing the pipeline. All forms of stalling introduce a delay before the processor can resume execution.
In the case of out-of-order execution
, the algorithm used can be:
We can delegate the task of removing data dependencies to the compiler, which can fill in an appropriate number of NOP instructions between dependent instructions to ensure correct operation, or re-order instructions where possible.
For instance, let's say we want to write the value 3 to register 1, (which already contains a 6), and then add 7 to register 1 and store the result in register 2, i.e.:
Following execution, register 2 should contain the value 10. However, if Instruction 1 (write 3 to register 1) does not completely exit the pipeline before Instruction 2 starts execution, it means that Register 1 does not contain the value 3 when Instruction 2 performs its addition. In such an event, Instruction 2 adds 7 to the old value of register 1 (6), and so register 2 would contain 13 instead, i.e.:
Instruction 2: Register 2 = Register 1 + 7 = 13
Instruction 1: Register 1 = 3
This error occurs because Instruction 2 reads Register 1 before Instruction 1 has committed/stored the result of its write operation to Register 1. So when Instruction 2 is reading the contents of Register 1, register 1 still contains 6, not 3.
Forwarding (described below) helps correct such errors by depending on the fact that the output of Instruction 1 (which is 3) can be used by subsequent instructions before the value 3 is committed to/stored in Register 1.
Forwarding applied to our example means that we do not wait to commit/store the output of Instruction 1 in Register 1 (in this example, the output is 3) before making that output available to the subsequent instruction (in this case, Instruction 2). The effect is that Instruction 2 uses the correct (the more recent) value of Register 1: the commit/store was made immediately and not pipelined.
With forwarding enabled, the ID/EX or Instruction Decode/Execution stage of the pipeline now has two inputs: the value read from the register specified (in this example, the value 6 from Register 1), and the new value of Register 1 (in this example, this value is 3) which is sent from the next stage (EX/MEM) or Instruction Execute/Memory Access. Additional control logic is used to determine which input to use.
In the event that a branch causes a pipeline bubble after incorrect instructions have entered the pipeline, care must be taken to prevent any of the wrongly-loaded instructions from having any effect on the processor state excluding energy wasted processing them before they were discovered to be loaded incorrectly.
Instruction pipeline
An instruction pipeline is a technique used in the design of computers and other digital electronic devices to increase their instruction throughput ....
in central processing unit
Central processing unit
The central processing unit is the portion of a computer system that carries out the instructions of a computer program, to perform the basic arithmetical, logical, and input/output operations of the system. The CPU plays a role somewhat analogous to the brain in the computer. The term has been in...
(CPU) microarchitecture
Microarchitecture
In computer engineering, microarchitecture , also called computer organization, is the way a given instruction set architecture is implemented on a processor. A given ISA may be implemented with different microarchitectures. Implementations might vary due to different goals of a given design or...
s that potentially result in incorrect computation. There are typically three types of hazards:
- data hazards
- structural hazards
- control hazards (branching hazards)
There are several methods used to deal with hazards, including pipeline stalls, pipeline bubbling, register forwarding, and in the case of out-of-order execution
Out-of-order execution
In computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
, the scoreboarding
Scoreboarding
Scoreboarding is a centralized method, used in the CDC 6600 computer, for dynamically scheduling a pipeline so that the instructions can execute out of order when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies of every instruction are logged...
method and the Tomasulo algorithm
Tomasulo algorithm
The Tomasulo algorithm is a hardware algorithm developed in 1967 by Robert Tomasulo from IBM. It allows sequential instructions that would normally be stalled due to certain dependencies to execute non-sequentially...
.
Background
Instructions in a pipelined processor are performed in several stages, so that at any given time several instructions are being processed in the various stages of the pipeline, such as fetch and execute. There are many different instruction pipeline microarchitectureMicroarchitecture
In computer engineering, microarchitecture , also called computer organization, is the way a given instruction set architecture is implemented on a processor. A given ISA may be implemented with different microarchitectures. Implementations might vary due to different goals of a given design or...
s, and instructions may be executed out-of-order
Out-of-order execution
In computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
. A hazard occurs when two or more of these simultaneous (possibly out of order) instructions conflict.
Data hazards
Data hazards occur when instructions that exhibit data dependence modify data in different stages of a pipeline. Ignoring potential data hazards can result in race conditionRace condition
A race condition or race hazard is a flaw in an electronic system or process whereby the output or result of the process is unexpectedly and critically dependent on the sequence or timing of other events...
s (sometimes known as race hazards). There are three situations in which a data hazard can occur:
- read after write (RAW), a true dependency
- write after read (WAR)
- write after write (WAW)
consider two instructions i and j, with i occurring before j in program order.
Read After Write (RAW)
(j tries to read a source before i writes to it)A read after write (RAW) data hazard refers to a situation where an instruction refers to a result that has not yet been calculated or retrieved. This can occur because even though an instruction is executed after a previous instruction, the previous instruction has not been completely processed through the pipeline.
Example
For example:
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3 + r4
The first instruction is calculating a value to be saved in register 2, and the second is going to use this value to compute a result for register 4. However, in a pipeline, when we fetch the operands for the 2nd operation, the results from the first will not yet have been saved, and hence we have a data dependency.
We say that there is a data dependency with instruction 2, as it is dependent on the completion of instruction 1.
Write After Read (WAR)
(j tries to write a destination before it is read by i)A write after read (WAR) data hazard represents a problem with concurrent execution.
Example
For example:
i1. R4 <- R1 + R3
i2. R3 <- R1 + R2
If we are in a situation that there is a chance that i2 may be completed before i1 (i.e. with concurrent execution) we must ensure that we do not store the result of register 3 before i1 has had a chance to fetch the operands.
Write After Write (WAW)
(j tries to write an operand before it is written by i)A write after write (WAW) data hazard may occur in a concurrent execution
Concurrent computing
Concurrent computing is a form of computing in which programs are designed as collections of interacting computational processes that may be executed in parallel...
environment.
Example
For example:
i1. R2 <- R4 + R7
i2. R2 <- R1 + R2
We must delay the WB (Write Back) of i2 until the execution of i1.
Structural hazards
A structural hazard occurs when a part of the processor's hardware is needed by two or more instructions at the same time. A canonical example is a single memory unit that is accessed both in the fetch stage where an instruction is retrieved from memory, and the memory stage where data is written and/or read from memory. They can often be resolved by separating the component into orthogonal units (such as separate caches) or bubbling the pipeline.Control hazards (branch hazards)
Branching hazards (also known as control hazards) occur with branchBranch (computer science)
A branch is sequence of code in a computer program which is conditionally executed depending on whether the flow of control is altered or not . The term can be used when referring to programs in high level languages as well as program written in machine code or assembly language...
es. On many instruction pipeline microarchitectures, the processor will not know the outcome of the branch when it needs to insert a new instruction into the pipeline (normally the fetch stage).
Pipeline bubbling
Bubbling the pipeline, also known as a pipeline break or a pipeline stall, is a method for preventing data, structural, and branch hazards from occurring. As instructions are fetched, control logic determines whether a hazard could/will occur. If this is true, then the control logic inserts NOPNOP
In computer science, NOP or NOOP is an assembly language instruction, sequence of programming language statements, or computer protocol command that effectively does nothing at all....
s into the pipeline. Thus, before the next instruction (which would cause the hazard) is executed, the previous one will have had sufficient time to complete and prevent the hazard. If the number of NOPs is equal to the number of stages in the pipeline, the processor has been cleared of all instructions and can proceed free from hazards. This is called flushing the pipeline. All forms of stalling introduce a delay before the processor can resume execution.
Data hazards
There are several main solutions and algorithms used to resolve data hazards:- insert a pipeline bubble whenever a read after write (RAW) dependency is encountered, guaranteed to increase latency, or
- utilize out-of-order executionOut-of-order executionIn computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
to potentially prevent the need for pipeline bubbles - utilize register forwarding to use data from later stages in the pipeline
In the case of out-of-order execution
Out-of-order execution
In computer engineering, out-of-order execution is a paradigm used in most high-performance microprocessors to make use of instruction cycles that would otherwise be wasted by a certain type of costly delay...
, the algorithm used can be:
- scoreboardingScoreboardingScoreboarding is a centralized method, used in the CDC 6600 computer, for dynamically scheduling a pipeline so that the instructions can execute out of order when there are no conflicts and the hardware is available. In a scoreboard, the data dependencies of every instruction are logged...
, in which case a pipeline bubble will only be needed when there is no functional unit available - the Tomasulo algorithmTomasulo algorithmThe Tomasulo algorithm is a hardware algorithm developed in 1967 by Robert Tomasulo from IBM. It allows sequential instructions that would normally be stalled due to certain dependencies to execute non-sequentially...
, which utilizes register renamingRegister renamingIn computer architecture, register renaming refers to a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers by those operations.-Problem definition:...
allowing the continual issuing of instructions
We can delegate the task of removing data dependencies to the compiler, which can fill in an appropriate number of NOP instructions between dependent instructions to ensure correct operation, or re-order instructions where possible.
Register forwarding
Forwarding involves feeding output data into a previous stage of the pipeline. Forwarding is implemented by feeding back the output of an instruction into the previous stage(s) of the pipeline as soon as the output of that instruction is available.Example
- NOTE: In the following examples, computed values are in bold, while Register numbers are not.
For instance, let's say we want to write the value 3 to register 1, (which already contains a 6), and then add 7 to register 1 and store the result in register 2, i.e.:
- Instruction 0: Register 1 = 6
- Instruction 1: Register 1 = 3
- Instruction 2: Register 2 = Register 1 + 7 = 10
Following execution, register 2 should contain the value 10. However, if Instruction 1 (write 3 to register 1) does not completely exit the pipeline before Instruction 2 starts execution, it means that Register 1 does not contain the value 3 when Instruction 2 performs its addition. In such an event, Instruction 2 adds 7 to the old value of register 1 (6), and so register 2 would contain 13 instead, i.e.:
- Instruction 0: Register 1 =
This error occurs because Instruction 2 reads Register 1 before Instruction 1 has committed/stored the result of its write operation to Register 1. So when Instruction 2 is reading the contents of Register 1, register 1 still contains 6, not 3.
Forwarding (described below) helps correct such errors by depending on the fact that the output of Instruction 1 (which is 3) can be used by subsequent instructions before the value 3 is committed to/stored in Register 1.
Forwarding applied to our example means that we do not wait to commit/store the output of Instruction 1 in Register 1 (in this example, the output is 3) before making that output available to the subsequent instruction (in this case, Instruction 2). The effect is that Instruction 2 uses the correct (the more recent) value of Register 1: the commit/store was made immediately and not pipelined.
With forwarding enabled, the ID/EX or Instruction Decode/Execution stage of the pipeline now has two inputs: the value read from the register specified (in this example, the value 6 from Register 1), and the new value of Register 1 (in this example, this value is 3) which is sent from the next stage (EX/MEM) or Instruction Execute/Memory Access. Additional control logic is used to determine which input to use.
Control hazards (branch hazards)
To avoid control hazards microarchitectures can:- insert a pipeline bubble (discussed above), guaranteed to increase latencyLatencyLatency or latent may refer to:*Latency period , the time between exposure to a pathogen, chemical or radiation, and when symptoms first become apparent...
, or - use branch prediction and essentially guesstimateGuesstimateGuesstimate is an informal English contraction of guess and estimate, first used by American statisticians in 1934 or 1935. It is defined as an estimate made without using adequate or complete information, or, more strongly, as an estimate arrived at by guesswork or conjecture...
which instructions to insert, in which case a pipeline bubble will only be needed in the case of an incorrect prediction
In the event that a branch causes a pipeline bubble after incorrect instructions have entered the pipeline, care must be taken to prevent any of the wrongly-loaded instructions from having any effect on the processor state excluding energy wasted processing them before they were discovered to be loaded incorrectly.
See also
- feed-forwardFeed-forwardFeed-forward is a term describing an element or pathway within a control system which passes a controlling signal from a source in the control system's external environment, often a command signal from an external operator, to a load elsewhere in its external environment...
- Register renamingRegister renamingIn computer architecture, register renaming refers to a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers by those operations.-Problem definition:...
- Data dependencyData dependencyA data dependency in computer science is a situation in which a program statement refers to the data of a preceding statement. In compiler theory, the technique used to discover data dependencies among statements is called dependence analysis.There are three types of dependencies: data, name, and...
- Hazard (logic)Hazard (logic)In digital logic, a hazard in a system is an undesirable effect caused by either a deficiency in the system or external influences. Logic hazards are manifestations of a problem in which changes in the input variables do not change the output correctly due to some form of delay caused by logic...
- Speculative executionSpeculative executionSpeculative execution in computer systems is doing work, the result of which may not be needed. This performance optimization technique is used in pipelined processors and other systems.-Main idea:...
- Branch delayBranch delay slotIn computer architecture, a delay slot is an instruction slot that gets executed without the effects of a preceding instruction. The most common form is a single arbitrary instruction located immediately after a branch instruction on a RISC or DSP architecture; this instruction will execute even if...
- Branch predicationBranch predicationBranch predication is a strategy in computer architecture design for mitigating the costs usually associated with conditional branches, particularly branches to short sections of code...
- Branch predictorBranch predictorIn computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch will go before this is known for sure. The purpose of the branch predictor is to improve the flow in the instruction pipeline...