Peephole optimization
Encyclopedia
In compiler theory, peephole optimization is a kind of optimization
performed over a very small set of instructions in a segment of generated code. The set is called a "peephole" or a "window". It works by recognising sets of instructions that don't actually do anything, or that can be replaced by a leaner set of instructions.
There can, of course, be other types of peephole optimizations involving simplifying the target machine instructions, assuming that the target machine is known in advance. Advantages of a given architecture and instruction sets can be exploited in this case.
...
aload 1
aload 1
mul
...
can be replaced by
...
aload 1
dup
mul
...
This kind of optimization, like most peephole optimizations, makes certain assumptions about the efficiency of instructions. For instance, in this case, it is assumed that the
) is more efficient than the
identified as
a = b + c;
d = a + e;
is straightforwardly implemented as
but can be optimised to
Furthermore, if the compiler knew that the variable
Suppose the compiler generates the following Z80 instructions for each procedure call:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR
POP HL
POP DE
POP BC
POP AF
If there were two consecutive subroutine calls, they would look like this:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
The sequence POP regs followed by PUSH for the same registers is generally redundant. In cases where it is redundant, a peephole optimization would remove these instructions. In the example, this would cause another redundant POP/PUSH pair to appear in the peephole, and these would be removed in turn. Removing all of the redundant code in the example above would eventually leave the following code:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
programmers
to implement them using a pattern matching
algorithm
.
Optimization (computer science)
In computer science, program optimization or software optimization is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources...
performed over a very small set of instructions in a segment of generated code. The set is called a "peephole" or a "window". It works by recognising sets of instructions that don't actually do anything, or that can be replaced by a leaner set of instructions.
Replacement rules
Common techniques applied in peephole optimization:- Constant foldingConstant foldingConstant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.- Constant folding...
- Evaluate constant subexpressions in advance. - Strength reductionStrength reductionStrength reduction is a compiler optimization where expensive operations are replaced with equivalent but less expensive operations. The classic example of strength reduction converts "strong" multiplications inside a loop into "weaker" additions – something that frequently occurs in array...
- Replace slow operations with faster equivalents. - Null sequences - Delete useless operations
- Combine Operations: Replace several operations with one equivalent.
- Algebraic Laws: Use algebraic laws to simplify or reorder instructions.
- Special Case Instructions: Use instructions designed for special operand cases.
- Address Mode Operations: Use address modes to simplify code.
There can, of course, be other types of peephole optimizations involving simplifying the target machine instructions, assuming that the target machine is known in advance. Advantages of a given architecture and instruction sets can be exploited in this case.
Replacing slow instructions with faster ones
The following Java bytecodeJava bytecode
Java bytecode is the form of instructions that the Java virtual machine executes. Each bytecode opcode is one byte in length, although some require parameters, resulting in some multi-byte instructions. Not all of the possible 256 opcodes are used. 51 are reserved for future use...
...
aload 1
aload 1
mul
...
can be replaced by
...
aload 1
dup
mul
...
This kind of optimization, like most peephole optimizations, makes certain assumptions about the efficiency of instructions. For instance, in this case, it is assumed that the
dup
operation (which duplicates and pushes the top of the stackStack (data structure)
In computer science, a stack is a last in, first out abstract data type and linear data structure. A stack can have any abstract data type as an element, but is characterized by only three fundamental operations: push, pop and stack top. The push operation adds a new item to the top of the stack,...
) is more efficient than the
aload X
operation (which loads a local variableVariable (programming)
In computer programming, a variable is a symbolic name given to some known or unknown quantity or information, for the purpose of allowing the name to be used independently of the information it represents...
identified as
X
and pushes it on the stack).Removing redundant code
Another example is to eliminate redundant load stores.a = b + c;
d = a + e;
is straightforwardly implemented as
MOV b, R0 # Copy b to the register
ADD c, R0 # Add c to the register, the register is now b+c
MOV R0, a # Copy the register to a
MOV a, R0 # Copy a to the register
ADD e, R0 # Add e to the register, the register is now a+e [(b+c)+e]
MOV R0, d # Copy the register to d
but can be optimised to
MOV b, R0 # Copy b to the register
ADD c, R0 # Add c to the register, which is now b+c (a)
MOV R0, a # Copy the register to a
ADD e, R0 # Add e to the register, which is now b+c+e [(a)+e]
MOV R0, d # Copy the register to d
Furthermore, if the compiler knew that the variable
a
was not used again, the middle operation could be omitted.Removing redundant stack instructions
If the compiler saves registers on the stack before calling a subroutine and restores them when returning, consecutive calls to subroutines may have redundant stack instructions.Suppose the compiler generates the following Z80 instructions for each procedure call:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR
POP HL
POP DE
POP BC
POP AF
If there were two consecutive subroutine calls, they would look like this:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
POP HL
POP DE
POP BC
POP AF
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
The sequence POP regs followed by PUSH for the same registers is generally redundant. In cases where it is redundant, a peephole optimization would remove these instructions. In the example, this would cause another redundant POP/PUSH pair to appear in the peephole, and these would be removed in turn. Removing all of the redundant code in the example above would eventually leave the following code:
PUSH AF
PUSH BC
PUSH DE
PUSH HL
CALL _ADDR1
CALL _ADDR2
POP HL
POP DE
POP BC
POP AF
Implementation
Modern architectures typically allow for many hundreds of different kinds of peephole optimizations, and it is therefore often appropriate for compilerCompiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
programmers
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...
to implement them using a pattern matching
Pattern matching
In computer science, pattern matching is the act of checking some sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually has to be exact. The patterns generally have the form of either sequences or tree structures...
algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
.
See also
- Object code optimizers, discussion in relation to general algorithmic efficiencyAlgorithmic efficiencyIn computer science, efficiency is used to describe properties of an algorithm relating to how much of various types of resources it consumes. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process, where the goal is to reduce...
- Capex CorporationCapex CorporationCapex Corporation was a software house based in Phoenix, Arizona founded by three former employees of General Electric. It was subsequently acquired by Computer Associates.-Products:* Autotab - an early batch spreadsheet program...
- produced the COBOLCOBOLCOBOL is one of the oldest programming languages. Its name is an acronym for COmmon Business-Oriented Language, defining its primary domain in business, finance, and administrative systems for companies and governments....
optimizer, an early mainframe object code optimizerObject code optimizerAn object code optimizer, sometimes also known as a post pass optimizer or, for small sections of code, peephole optimizer, takes the output from a source language compile step - the object code or binary file - and tries to replace identifiable sections of the code with replacement code that is...
for IBMIBMInternational Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...
Cobol - SuperoptimizationSuperoptimizationSuperoptimization is the task of finding the optimal code sequence for a single, loop-free sequence of instructions. While garden-variety compiler optimizations really just improve code , a superoptimizer's goal is to find the optimal sequence at the outset.The term superoptimization was first...
External links
- [ftp://ftp.cs.princeton.edu/pub/lcc/contrib/copt.shar The copt general-purpose peephole optimizer by Christopher W. Fraser]
- The original paper