Bit field
Encyclopedia
A bit field is a common idiom used in computer programming
to compactly store multiple logical values as a short series of bit
s where each of the single bits can be addressed separately. A bit field is most commonly used to represent integral types of known, fixed bit-width. A well-known usage of bit-fields
is to represent single bit flag
s with each flag stored in a separate bit.
A bit field is distinguished from a bit array in that the latter is used to store a large set of bits indexed by integers and is often wider than any integral type supported by the language. Bit fields, on the other hand, typically fit within a machine word, and the denotation of bits is independent of their numerical index.
The bitwise operators AND, OR
, and NOT
are used in combination to set/unset flags, or to determine whether certain flags are set/unset, respectively. In order to do the latter, a bit mask is required with all its bits disabled except for those that are supposed to be tested. If AND-ing the value of the bit set with the to-be-tested bit mask results in the same bit mask, then all these bits are enabled in the original bit set.
:
This object-oriented class can be used as follows:
For the flag, it is recommended to use int instead of unsigned char because an unsigned char will only support values up to 255 which can be easily exceeded in bit fields as the values grow exponentially (2^n), i.e., this value will be exceeded after 8 items. A value higher than int will reduce performance as flags are usually processed using one assembly instruction. With a higher value than int, more instructions may be required on some architectures, notably x86. Also, most compilers implement enumerations (like the above) using the int as well. Therefore, higher values than 31 couldn't be represented in an enumeration anyway (log(2147483647)/log(2) = 31, whereas 2147483647 is the upper limit of an int, values <0 aren't used). In theory, compilers could have used an unsigned type to make the available range wider. However, this is not the case for most compilers.
Instead of using hardcoded numerical representations for the powers of two (
Kernighan
and Ritchie
's book, The C Programming Language
, describes a method for defining and accessing fields directly. Using this method, bitwise operators are not needed as bit members can be accessed the same as members of a structure without the need to create a object-oriented class like the one above. An example using C's
in memory is CPU dependent and memory padding rules can vary between compiler
s. In addition, less well-optimized compilers sometimes generate poor quality code for reading and writing bit members and there are potentially thread safety issues relating to bit fields because most machines cannot manipulate arbitrary sets of bits in memory, but must instead load and store whole words. For instance, the following code would not be thread-safe
, despite the use of a mutex for each member:
The root of the problem is that on most machines it is impossible to load and store
principle when combining
Quantities defined as the combination of bits are actually elements of the elementary abelian group
(Z/2Z)n; and the
relation defined as
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...
to compactly store multiple logical values as a short series of bit
Bit
A bit is the basic unit of information in computing and telecommunications; it is the amount of information stored by a digital device or other physical system that exists in one of two possible distinct states...
s where each of the single bits can be addressed separately. A bit field is most commonly used to represent integral types of known, fixed bit-width. A well-known usage of bit-fields
is to represent single bit flag
Flag (computing)
In computer programming, flag can refer to one or more bits that are used to store a binary value or code that has an assigned meaning, but can refer to uses of other data types...
s with each flag stored in a separate bit.
A bit field is distinguished from a bit array in that the latter is used to store a large set of bits indexed by integers and is often wider than any integral type supported by the language. Bit fields, on the other hand, typically fit within a machine word, and the denotation of bits is independent of their numerical index.
Implementation
Although languages such as C or C++ have built-in support for bit fields, these can be still implemented manually, even in languages lacking native bit field support. It suffices to have a set of integer constants, to which each a power of two is assigned, that semantically associates each individual bit with its respective semantic state.The bitwise operators AND, OR
Logical disjunction
In logic and mathematics, a two-place logical connective or, is a logical disjunction, also known as inclusive disjunction or alternation, that results in true whenever one or more of its operands are true. E.g. in this context, "A or B" is true if A is true, or if B is true, or if both A and B are...
, and NOT
Negation
In logic and mathematics, negation, also called logical complement, is an operation on propositions, truth values, or semantic values more generally. Intuitively, the negation of a proposition is true when that proposition is false, and vice versa. In classical logic negation is normally identified...
are used in combination to set/unset flags, or to determine whether certain flags are set/unset, respectively. In order to do the latter, a bit mask is required with all its bits disabled except for those that are supposed to be tested. If AND-ing the value of the bit set with the to-be-tested bit mask results in the same bit mask, then all these bits are enabled in the original bit set.
Examples
Example implementation in CC (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....
:
This object-oriented class can be used as follows:
For the flag, it is recommended to use int instead of unsigned char because an unsigned char will only support values up to 255 which can be easily exceeded in bit fields as the values grow exponentially (2^n), i.e., this value will be exceeded after 8 items. A value higher than int will reduce performance as flags are usually processed using one assembly instruction. With a higher value than int, more instructions may be required on some architectures, notably x86. Also, most compilers implement enumerations (like the above) using the int as well. Therefore, higher values than 31 couldn't be represented in an enumeration anyway (log(2147483647)/log(2) = 31, whereas 2147483647 is the upper limit of an int, values <0 aren't used). In theory, compilers could have used an unsigned type to make the available range wider. However, this is not the case for most compilers.
Instead of using hardcoded numerical representations for the powers of two (
0x08
), the use of the bit shift operator (1 << 3
) with an incrementing shift operand is used for easier readability.Kernighan
Brian Kernighan
Brian Wilson Kernighan is a Canadian computer scientist who worked at Bell Labs alongside Unix creators Ken Thompson and Dennis Ritchie and contributed to the development of Unix. He is also coauthor of the AWK and AMPL programming languages. The 'K' of K&R C and the 'K' in AWK both stand for...
and Ritchie
Dennis Ritchie
Dennis MacAlistair Ritchie , was an American computer scientist who "helped shape the digital era." He created the C programming language and, with long-time colleague Ken Thompson, the UNIX operating system...
's book, The C Programming Language
The C Programming Language (book)
The C Programming Language is a well-known programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language, as well as co-designed the Unix operating system with which development of the language was closely intertwined...
, describes a method for defining and accessing fields directly. Using this method, bitwise operators are not needed as bit members can be accessed the same as members of a structure without the need to create a object-oriented class like the one above. An example using C's
structStructstruct is a computer science term for a record that is used to store more than one value.struct is used in the following programming languages:* struct * struct vs. C++ classes...
keyword and C++'s bool
data type follows:Drawbacks of the structure-based approach
Bit members in structures as presented above have potential practical drawbacks. First, the ordering of bitsEndianness
In computing, the term endian or endianness refers to the ordering of individually addressable sub-components within the representation of a larger data item as stored in external memory . Each sub-component in the representation has a unique degree of significance, like the place value of digits...
in memory is CPU dependent and memory padding rules can vary between compiler
Compiler
A compiler is a computer program that transforms source code written in a programming language into another computer language...
s. In addition, less well-optimized compilers sometimes generate poor quality code for reading and writing bit members and there are potentially thread safety issues relating to bit fields because most machines cannot manipulate arbitrary sets of bits in memory, but must instead load and store whole words. For instance, the following code would not be thread-safe
Thread-safe
Thread safety is a computer programming concept applicable in the context of multi-threaded programs. A piece of code is thread-safe if it only manipulates shared data structures in a thread-safe manner, which enables safe execution by multiple threads at the same time...
, despite the use of a mutex for each member:
The root of the problem is that on most machines it is impossible to load and store
flag
and counter
separately, when both are stored in the same word. (In order for this to be thread-safe you should use a single mutex to protect both flag
and counter
, instead of two.)Homogeneity and mathematical structure
In the examples above, the individual power-of-two values are declared as macros (ending up being the machine word). Since bit fields are in essence destined to be combined with the bitwise OR operator, such code fails the type safetyType safety
In computer science, type safety is the extent to which a programming language discourages or prevents type errors. A type error is erroneous or undesirable program behaviour caused by a discrepancy between differing data types...
principle when combining
Preference_PlaysGolf | Preference_LikesIceCream
that does not belong to the enumeration.Quantities defined as the combination of bits are actually elements of the elementary abelian group
Elementary Abelian group
In group theory, an elementary abelian group is a finite abelian group, where every nontrivial element has order p, where p is a prime; in particular it is a p-group....
(Z/2Z)n; and the
relation defined as
a <= b
when (a & b) = a
only creates a partial order, i.e. 1011b
is greater than 1010b but
1011b
and 1101b
are not comparable (whereas the suffix b denotes that the numbers are binary).
This remark is of interest when designing variable importance debug channels (ranging from informative to fatal); the regular integer comparison cannot be used to filter out part of the messages.
Nevertheless, a bit field can be safely and elegantly implemented using a bit array where the bit indices for each flag are values of an enumerated typeEnumerated typeIn computer programming, an enumerated type is a data type consisting of a set of named values called elements, members or enumerators of the type. The enumerator names are usually identifiers that behave as constants in the language...
(like the class in Java); this avoids the dangers of direct bitwise manipulations.
See also
- Mask (computing)Mask (computing)In computer science, a mask is data that is used for bitwise operations.Using a mask, multiple bits in a byte, nibble, word can be set either on, off or inverted from on to off in a single bitwise operation.-Masking bits to 1:...
- BitboardBitboardA bitboard is a data structure commonly used in computer systems that play board games.A bitboard, often used for boardgames such as chess, checkers and othello, is a specialization of the bitset data structure, where each bit represents a game position or state, designed for optimization of speed...
, used in chess and similar games.
- Bit array
- Flag word
External links