Disassembler
A disassembler is a computer program that translates machine language into assembly language, the inverse operation to that of an assembler. A disassembler differs from a decompiler, which targets a high-level language rather than an assembly language. Disassembly, the output of a disassembler, is often formatted for human readability rather than for suitability as input to an assembler, making it principally a reverse-engineering tool.
Assembly language source code generally permits the use of constants and programmer comments. These are usually removed from the assembled machine code by the assembler. If so, a disassembler operating on the machine code produces disassembly lacking these constants and comments, and the disassembled output is more difficult for a human to interpret than the original annotated source code. Some disassemblers make use of the symbolic debugging information present in object files such as ELF. The Interactive Disassembler allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code-writing process.
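The effect of user-supplied symbols can be illustrated with a minimal Python sketch. It is not how any particular disassembler works: the record layout, the `annotate` function, the `load` mnemonic, and the symbol names are all hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical disassembly records: (address, mnemonic, optional numeric operand).
Listing = list[tuple[int, str, Optional[int]]]


def annotate(listing: Listing, symbols: dict[int, str]) -> list[str]:
    """Render a listing, substituting symbol names for known addresses."""
    lines = []
    for address, mnemonic, operand in listing:
        # Emit a label line when the instruction's own address has a name.
        if address in symbols:
            lines.append(f"{symbols[address]}:")
        if operand is None:
            text = mnemonic
        else:
            # Use the symbol name if one exists, otherwise the raw hex value.
            text = f"{mnemonic} {symbols.get(operand, hex(operand))}"
        lines.append(f"  {address:#06x}  {text}")
    return lines


# Assumed example data, not taken from any real program.
listing = [
    (0x1000, "call", 0x1010),
    (0x1003, "ret", None),
    (0x1010, "load", 0x2000),
    (0x1013, "ret", None),
]
symbols = {0x1010: "init_buffer", 0x2000: "BUFFER_BASE"}

print("\n".join(annotate(listing, symbols)))
```

With an empty symbol table the same listing shows only raw addresses such as `0x1010` and `0x2000`, which is the harder-to-read form a disassembler produces when no symbolic information is available.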
Disassembly is not an exact science: on CISC platforms with variable-width instructions, or in the presence of self-modifying code, it is possible for a single program to have two or more reasonable disassemblies. Determining which instructions would actually be encountered during a run of the program is undecidable in general, because the provably unsolvable halting problem reduces to it.
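The ambiguity caused by variable-width instructions can be shown with a small Python sketch. The instruction set below (`TOY_ISA`) is invented for this example and does not correspond to any real processor; the decoder is a simple linear sweep, one of several possible strategies.

```python
# Hypothetical toy ISA: opcode byte -> (mnemonic, number of operand bytes).
TOY_ISA = {
    0x01: ("nop", 0),
    0x02: ("push", 1),
    0x03: ("jmp", 1),
    0x04: ("add", 2),
}


def linear_sweep(code: bytes, start: int = 0) -> list[str]:
    """Decode instructions back to back, beginning at byte offset `start`."""
    out = []
    pc = start
    while pc < len(code):
        entry = TOY_ISA.get(code[pc])
        if entry is None or pc + 1 + entry[1] > len(code):
            # Unknown opcode or truncated operands: emit the byte as data.
            out.append(f"{pc:02x}: .byte 0x{code[pc]:02x}")
            pc += 1
            continue
        mnemonic, n_operands = entry
        operands = code[pc + 1 : pc + 1 + n_operands]
        text = " ".join([mnemonic] + [f"0x{b:02x}" for b in operands])
        out.append(f"{pc:02x}: {text}")
        pc += 1 + n_operands
    return out


code = bytes([0x02, 0x04, 0x01, 0x03, 0x01])

print(linear_sweep(code, 0))  # decoding starts at the first byte
print(linear_sweep(code, 1))  # decoding starts one byte later
```

Starting at offset 0, the bytes decode as `push 0x04`, `nop`, `jmp 0x01`; starting at offset 1, the same bytes decode as `add 0x01 0x03`, `nop`. Both streams are valid in this toy encoding, so without knowing where execution actually enters the code, a disassembler cannot tell which reading is the intended one.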
Problems of disassembly
Writing a disassembler whose output, when reassembled, produces exactly the same binary is non-trivial; there are often differences. However, even when a totally correct disassembly is produced, problems remain if the program is to be modified. For example, the same machine language jump instruction can be generated by assembly code which jumps to a specified location (for example, to execute specific code), or which jumps by a specified number of bytes (for example, to skip over an unwanted branch). A disassembler cannot know which was intended, and may use either syntax, generating a disassembly which reproduces the original binary. However, if a programmer wants to add instructions between the jump instruction and its destination, it is necessary to understand the program's operation to determine whether the jump should be absolute or relative, i.e., whether its destination should remain at a fixed location, or be moved so as to skip both the original and added instructions.
Examples of disassemblers
A disassembler may be stand-alone or interactive. A stand-alone disassembler, when executed, generates an assembly language file which can be examined; an interactive one shows the effect of any change the user makes immediately. For example, the disassembler may initially not know that a section of the program is actually code, and treat it as data; if the user specifies that it is code, the resulting disassembled code is shown immediately, allowing the user to examine it and take further action during the same run.
Any interactive debugger will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example, objdump, part of GNU Binutils, is related to the interactive debugger gdb.
- IDA (the Interactive Disassembler)
- OllyDbg is a 32-bit assembler-level analysing debugger
- OLIVER and SIMON include disassemblers for Assembler, COBOL, and PL/1
External links
- List of disassemblers in Wikibooks
- transformation Wiki on disassembly
- Boomerang: a general, open-source, retargetable decompiler of machine code programs
- Online Disassembler: a free online disassembler of ARM, MIPS, PPC, and x86 code