Intel HEX
Intel HEX is a file format for conveying binary information for applications like programming microcontrollers, EPROMs, and other kinds of chips. It is one of the oldest file formats available for this purpose, having been in use since the 1970s.
In a typical application, a compiler or assembler converts a program (written, for example, in C or assembly language) to machine code and outputs it into a HEX file; that file is then imported by a programmer to "burn" the machine code into a chip.
Format
The format is a text file, with each line containing hexadecimal values that encode a sequence of data bytes and their starting offset or absolute address.
There are three types of Intel HEX: 8-bit, 16-bit, and 32-bit. They are distinguished by the record types they allow, and therefore by the address range they can cover; the 16-bit and 32-bit types may also differ in the byte order of the data field.
Each line of an Intel HEX file consists of six parts:
- Start code, one character, an ASCII colon ':'.
- Byte count, two hex digits, the number of bytes (hex digit pairs) in the data field. 16 (0x10) or 32 (0x20) bytes of data are the usual compromise values between line length and address overhead.
- Address, four hex digits, a 16-bit address of the beginning of the memory position for the data. A 16-bit address is limited to 64 kilobytes; the limit is worked around by specifying the higher bits via additional record types. This address is big endian.
- Record type, two hex digits, 00 to 05, defining the type of the data field.
- Data, a sequence of n bytes of the data themselves, represented by 2n hex digits.
- Checksum, two hex digits, the least significant byte of the two's complement of the sum of the values of all fields except fields 1 and 6 (the Start code ':' and the two hex digits of the Checksum itself). It is calculated by adding together the hex-encoded bytes (hex digit pairs), keeping only the least significant byte of the result, and taking its two's complement (either by subtracting the byte from 0x100, or by inverting it by XOR-ing with 0xFF and adding 0x01). If you are not working with 8-bit variables, you must suppress the overflow by AND-ing the result with 0xFF; the overflow can occur because both 0x100 - 0x00 and (0x00 XOR 0xFF) + 1 equal 0x100. If the checksum is correctly calculated, adding all the bytes together (the Byte count, both bytes of the Address, the Record type, each Data byte and the Checksum) always results in a value whose least significant byte is zero (0x00).
For example, for the record :0300300002337A1E:
03 + 00 + 30 + 00 + 02 + 33 + 7A = E2 (hexadecimal), and the two's complement of E2 is 1E.
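The field layout and checksum rule described above can be sketched in Python as follows. This is a minimal illustration and not a reference implementation; the names `checksum` and `parse_record` and the dictionary keys are chosen here for the example and are not part of the format.

```python
def checksum(record_bytes: bytes) -> int:
    """Two's complement of the sum of the bytes, truncated to one byte.

    `record_bytes` holds every byte of the record except the start
    code and the checksum itself. The final AND with 0xFF suppresses
    the overflow that occurs when the sum's low byte is 0x00.
    """
    return (-sum(record_bytes)) & 0xFF


def parse_record(line: str) -> dict:
    """Split one Intel HEX line into its fields and verify the checksum."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])  # raises ValueError on malformed hex
    if len(raw) < 5:
        raise ValueError("record too short")
    count = raw[0]
    if len(raw) != count + 5:  # count + address(2) + type + checksum
        raise ValueError("byte count does not match record length")
    if sum(raw) & 0xFF != 0:  # all bytes including checksum sum to 0x00
        raise ValueError("checksum mismatch")
    return {
        "count": count,
        "address": int.from_bytes(raw[1:3], "big"),  # big endian
        "type": raw[3],
        "data": raw[4:-1],
        "checksum": raw[-1],
    }
```

Applied to the record from the example, `checksum(bytes.fromhex("0300300002337A"))` yields 0x1E, and `parse_record(":0300300002337A1E")` reports a byte count of 3, address 0x0030 and record type 00.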
There are six record types:
- 00, data record, contains data and a 16-bit address. This is the format described above.
- 01, End Of File record. Must occur exactly once per file, in the last line of the file. The byte count is 00 and the data field is empty. Usually the address field is also 0000, in which case the complete line is ':00000001FF'. Originally the End Of File record could contain a start address for the program being loaded, e.g. :00AB2F0125 would cause a jump to address AB2F. This was convenient when programs were loaded from punched paper tape.
- 02, Extended Segment Address Record, carrying a segment-base address (two hex digit pairs in big endian order). Used when 16 bits are not enough; the scheme is identical to 80x86 real mode addressing. The value in the data field of the most recent 02 record is multiplied by 16 (shifted 4 bits left) and added to the addresses of the subsequent 00 records. This allows addressing of up to a megabyte of address space. The address field of this record has to be 0000 and the byte count is 02 (the segment is 16-bit). The least significant hex digit of the resulting base address is therefore always 0.
- 03, Start Segment Address Record. For 80x86 processors, it specifies the initial content of the CS:IP registers. The address field is 0000, the byte count is 04, the first two bytes are the CS value, and the latter two are the IP value.
- 04, Extended Linear Address Record, allowing full 32-bit addressing (up to 4 GiB). The address field is 0000 and the byte count is 02. The two data bytes (two hex digit pairs in big endian order) represent the upper 16 bits of the 32-bit address for all subsequent 00 records until the next 04 record. If there is no 04 record, the upper 16 bits default to 0000. To get the absolute address of a subsequent 00 record, the value from the most recent 04 record is shifted left by 16 bits and added to the 16-bit address of the 00 record.
- 05, Start Linear Address Record. The address field is 0000 and the byte count is 04. The 4 data bytes represent the 32-bit value loaded into the EIP register of the 80386 and higher CPUs.
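The way 02 and 04 records extend the 16-bit address of data records can be sketched as a small loader. This is an illustrative sketch under simplifying assumptions: the function name `load_hex` is hypothetical, checksums are not verified, start-address records (03 and 05) are skipped, and segment offsets are simply added to the base without modelling wrap-around inside a 64 KB segment.

```python
def load_hex(lines):
    """Return {absolute_address: byte_value} for the data records in `lines`."""
    memory = {}
    base = 0  # contribution of the most recent 02 or 04 record
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = bytes.fromhex(line[1:])  # skip the ':' start code
        count, rtype = raw[0], raw[3]
        address = int.from_bytes(raw[1:3], "big")
        data = raw[4:4 + count]
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                memory[base + address + i] = value
        elif rtype == 0x01:  # end of file
            break
        elif rtype == 0x02:  # extended segment address: segment * 16
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x04:  # extended linear address: upper 16 bits
            base = int.from_bytes(data, "big") << 16
        # 03 and 05 only carry a start address and are ignored here
    return memory
```

For instance, after the record :020000021000EC (segment 1000) a data record addressed at 0010 lands at absolute address 0x10010, while after :020000040800F2 (upper bits 0800) a data record addressed at 0000 lands at 0x08000000.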
There are various format subtypes:
- I8HEX or INTEL 8, 8-bit format. Allows usage of 00 and 01 records.
- I16HEX or INTEL 16, 16-bit format (supertype of the 8-bit format). Allows usage of 00, 01, 02 and 03 records. The data field endianness may be byte-swapped.
- I32HEX or INTEL 32, 32-bit format (supertype of the 16-bit format). Allows usage of 00, 01, 02, 03, 04 and 05 records. The data field endianness may be byte-swapped.
Beware: byte-swapped data can be confusing, and it is possible to misinterpret the byte order of I16HEX and I32HEX files.
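Since the subtypes differ in the record types they allow, the smallest subtype that can hold a given file can be guessed from the record types it uses. The following sketch assumes well-formed lines; the function name `classify` is made up for this example. Because each larger subtype is a supertype of the smaller one, a file reported as I8HEX is also a valid I16HEX and I32HEX file, and the byte order of the data cannot be determined this way.

```python
def classify(lines):
    """Return the smallest subtype whose allowed record types cover the file."""
    # The record type is the 4th byte: characters 7 and 8 after the ':'.
    types = {int(line.strip()[7:9], 16) for line in lines if line.strip()}
    if not types <= {0, 1, 2, 3, 4, 5}:
        raise ValueError("unknown record type")
    if types & {4, 5}:
        return "I32HEX"
    if types & {2, 3}:
        return "I16HEX"
    return "I8HEX"
```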
A similar encoding, with slightly different ASCII formatting, termed SREC, is used with Motorola processors.
Example
:10010000214601360121470136007EFE09D2190140
:100110002146017EB7C20001FF5F16002148011988
:10012000194E79234623965778239EDA3F01B2CAA7
:100130003F0156702B5E712B722B732146013421C7
:00000001FF
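Records like those in the example can be produced from raw bytes with a few lines of Python. This is a sketch limited to the 8-bit subtype (00 and 01 records with 16-bit addresses); the names `make_record` and `dump_hex` and the default of 16 data bytes per line are choices made for this illustration.

```python
def make_record(address, rtype, data=b""):
    """Build one record: ':' + count + address + type + data + checksum."""
    body = (bytes([len(data)])
            + address.to_bytes(2, "big")  # fails above 0xFFFF
            + bytes([rtype])
            + bytes(data))
    check = (-sum(body)) & 0xFF  # two's complement of the low byte of the sum
    return ":" + (body + bytes([check])).hex().upper()


def dump_hex(data, start=0, width=16):
    """Encode `data` as data records of `width` bytes plus an End Of File record."""
    lines = [make_record(start + offset, 0x00, data[offset:offset + width])
             for offset in range(0, len(data), width)]
    lines.append(make_record(0, 0x01))  # End Of File record
    return lines
```

Encoding the 16 data bytes of the first example line at start address 0x0100 reproduces that line, including its checksum 40, and the End Of File record comes out as :00000001FF.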
External links
- Intel Hexadecimal Object File Format Specification 1988 (PDF), Revision A, January 6, 1988
- Format description at PIC List
- Format description
- SRecord, multi-platform GPL'ed tool for manipulating EPROM load files.
- Binex, a converter between Intel HEX and binary.
- libgis, open source library to handle Intel HEX (and more).
- SB-Projects: fileformats: intel hex, clear and well structured reference with various examples