Unusual software bug
Unusual software bugs are a class of software bugs that are considered exceptionally difficult to understand and repair. There are several kinds, mostly named after scientists who discovered counterintuitive things.

Bohrbug

A bohrbug (named after the Bohr atom model) is a bug that manifests itself consistently under a well-defined (but possibly unknown) set of conditions. Thus, in contrast with heisenbugs, a bohrbug does not disappear or alter its characteristics when it is investigated. Bohrbugs include the easiest bugs to fix (where the nature of the problem is obvious), but also bugs that are hard to find and fix and that remain in the software during the operational phase. Sometimes an error might occur only when a unique data set is entered, or unique circumstances are encountered. These kinds of bugs are often present in parts of source code that are not invoked very often and thus might remain undetected for an extended period of time, and are sometimes termed a ghost in the code.

For example, an overflow bug in a by-the-book binary search algorithm may exhibit itself only when the data array under search is very large and the item to be searched for is located near the end of the array. Because programmers tend to test their work using small arrays of data, and only recently have there existed machines with enough memory to hold a sufficiently large array, such a bug may go undetected for many years.
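The midpoint overflow described above can be sketched in a few lines. Python integers do not overflow, so this illustration assumes a hypothetical helper, to_int32, that wraps values the way a signed 32-bit register would; the function names and the sample indices are chosen for illustration only.

```python
def to_int32(value):
    """Wrap an integer to a signed 32-bit value, as fixed-width hardware would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def midpoint_naive(low, high):
    # Textbook midpoint: the sum low + high can exceed 2**31 - 1 and wrap around.
    return to_int32(low + high) // 2


def midpoint_safe(low, high):
    # Same midpoint, but no intermediate value is ever larger than high.
    return low + (high - low) // 2


# Small arrays, the kind used in most tests: both versions agree.
small = (midpoint_naive(0, 1000), midpoint_safe(0, 1000))

# Indices near the end of a very large array: the naive sum wraps around
# and the computed "midpoint" becomes negative.
low, high = 1_500_000_000, 2_000_000_000
large = (midpoint_naive(low, high), midpoint_safe(low, high))
```

With small indices the two functions return the same value, which is why ordinary tests pass. The difference appears only once low + high exceeds the 32-bit range.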

Mandelbug

A mandelbug (named after fractal innovator Benoît Mandelbrot) is a computer bug whose causes are so complex that its behavior appears chaotic or even non-deterministic. This word also implies that the speaker thinks it is a bohrbug rather than a heisenbug.

Mandelbug is sometimes used to describe a bug whose behavior does not appear chaotic, but whose causes are so complex that there is no practical solution. An example of this is a bug caused by a flaw in the fundamental design of the entire system.

In the literature, there are inconsistent statements about the relationships between bohrbug, heisenbug, and mandelbug: according to the above definition, mandelbugs are bohrbugs. Heisenbug and bohrbug are considered antonyms. Moreover, it is claimed that all heisenbugs are mandelbugs.

In a column in IEEE Computer, mandelbug is considered the complementary antonym to bohrbug; i.e., a software bug is either a bohrbug or a mandelbug. The apparently complex behavior of a mandelbug is assumed to be caused either by long delays between fault activation and the failure occurrence, or by influences of other software system elements (hardware, operating system, other applications) on the fault's behavior. Heisenbugs (whose behavior is influenced by a debugger, or other means of investigating the fault) are mandelbugs.

Heisenbug

A heisenbug (named after the Heisenberg uncertainty principle) is a computer bug that disappears or alters its characteristics when an attempt is made to study it.

One common example is a bug that occurs in a program that was compiled with an optimizing compiler, but not in the same program when compiled without optimization (e.g., for generating a debug-mode version). Another example is a bug caused by a race condition. A heisenbug may also appear in a system that does not conform to the command-query separation design guideline, since a routine called more than once could return different values each time, generating hard-to-reproduce bugs in a race condition scenario.
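As a minimal sketch of the command-query separation point, consider a routine that looks like a query but also changes state. The class and function names below are hypothetical, and the extra call stands in for any additional observation such as a log statement.

```python
class TicketCounter:
    """Hypothetical class whose 'query' also mutates state (violates the guideline)."""

    def __init__(self):
        self._next = 1

    def current(self):
        # Looks like a harmless getter, but it advances the counter.
        value = self._next
        self._next += 1
        return value


class SeparatedCounter:
    """The same idea with the query and the command kept apart."""

    def __init__(self):
        self._next = 1

    def peek(self):
        # Pure query: safe to call any number of times.
        return self._next

    def advance(self):
        # Command: changes state and returns nothing.
        self._next += 1


def issue_ticket(observe):
    """Issue one ticket, optionally looking at the counter first."""
    counter = TicketCounter()
    if observe:
        counter.current()  # the extra look silently consumes a ticket
    return counter.current()


unobserved = issue_ticket(observe=False)
observed = issue_ticket(observe=True)
```

Here the result depends on whether anyone looked at the counter first, whereas SeparatedCounter.peek can be called repeatedly without changing later results.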

The name heisenbug is a pun on the Heisenberg uncertainty principle, a quantum physics concept that is commonly (yet inaccurately) used to refer to the idea that, in the Copenhagen interpretation of quantum mechanics, observers affect what they are observing by the mere act of observing it. This is actually the observer effect, which is commonly confused with the Heisenberg uncertainty principle.

One common reason for heisenbug-like behaviour is that executing a program in debug mode often cleans memory before the program starts, and forces variables onto stack locations, instead of keeping them in registers. These differences in execution can alter the effect of bugs involving out-of-bounds member access, incorrect assumptions about the initial contents of memory, or floating-point comparisons (for instance, when a floating-point variable in a 32-bit stack location is compared to one in an 80-bit register). Another reason is that debuggers commonly provide watches or other user interfaces that cause additional code (such as property accessors) to be executed, which can, in turn, change the state of the program. Yet another reason is a fandango on core, the effect of a pointer running out of bounds. Many heisenbugs are caused by uninitialized values.
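The floating-point case can be illustrated with a small sketch. Python has no 80-bit register type, so this example assumes a 64-bit float as a stand-in for the wider register and uses a hypothetical helper, store_as_float32, to mimic spilling a value to a 32-bit memory slot.

```python
import struct


def store_as_float32(value):
    """Round-trip a value through a 32-bit float, as a 32-bit memory slot would."""
    return struct.unpack("f", struct.pack("f", value))[0]


x = 0.1                        # held at full (here 64-bit) precision
spilled = store_as_float32(x)  # the same value after a trip through 32 bits

# The comparison now depends on where each operand happened to be stored.
values_match = (spilled == x)

# Values exactly representable in 32 bits survive the round trip unchanged.
half_matches = (store_as_float32(0.5) == 0.5)
```

Whether such an equality test succeeds can therefore change when a compiler or debug build moves a variable between a register and memory, even though the source code is unchanged.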

Time can also be a factor in heisenbugs. Executing a program under control of a debugger can change the execution timing of the program as compared to normal execution. Time-sensitive bugs such as race conditions may not reproduce when the program is slowed down by single-stepping source lines in the debugger. This is particularly true when the behavior involves interaction with an entity not under the control of a debugger, such as when debugging network packet processing between two machines and only one is under debugger control.

In an interview Bruce Lindsay tells of being there when the term was first used, and that it was created because Heisenberg said, "The more closely you look at one thing, the less closely can you see something else."

This claim of origin is almost certainly wrong, as the term has been used for over two decades. For example, the earliest Google-archived mention is from the mailing list (later Usenet newsgroup) comp.risks, moderated by Peter G. Neumann. In RISKS Digest Volume 4, Issue 34, dated 23 December 1986, Zhahai Stewart contributes an item titled "Another heisenbug", noting that many such contributions have appeared in recent issues of RISKS Digest. The term, and especially the heisenbug/bohrbug distinction, was already mentioned in 1985 by Jim Gray in a paper about software failures.

Schrödinbug

A schrödinbug is a bug that manifests only after someone reading source code or using the program in an unusual way notices that it never should have worked in the first place, at which point the program promptly stops working for everybody until fixed. The Jargon File adds: "Though... this sounds impossible, it happens; some programs have harbored latent schrödinbugs for years."

The name schrödinbug was introduced in version 2.9.9 of the Jargon File, published in April 1992. It is derived from the Schrödinger's cat thought experiment. A well-written program executing in a reliable computing environment is expected to behave deterministically, so it is unexpected for the quantum questions of observability posited by Schrödinger (killing the cat by opening the box) to affect the operation of a program (breaking the program by reading the source code).

Repairing an obviously defective piece of code is often more important than determining what arcane set of circumstances caused it to work at all (or appear to work) in the first place, and why it then stopped. Because of this, many of these bugs are never fully understood. When a bug of this type is examined in enough detail, it can usually be reclassified as a bohrbug, heisenbug, or mandelbug.

Phase of the Moon bug

The phase of the moon is sometimes cited, half in jest, as a parameter on which a bug might depend, typically by a programmer exasperated after failing to isolate the true cause. The Jargon File documents two rare instances in which data processing problems were actually caused by phase-of-the-moon timing.

In general, programs that exhibit time-dependent behavior are vulnerable to time-dependent failures. These could occur during a certain part of a scheduled process, or at special times, such as on leap days or when a process crosses a daylight saving time, day, month, year, or century boundary (as with the Year 2000 bug).
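A leap-day failure of this kind can be sketched as follows. The function names are hypothetical, and the fallback to 28 February in the second function is only one possible policy, assumed here for illustration.

```python
from datetime import date


def same_day_next_year_naive(d):
    # Works for every date except 29 February, when the target date does not exist.
    return d.replace(year=d.year + 1)


def same_day_next_year_safe(d):
    # Assumed policy for this sketch: fall back to 28 February.
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


ordinary = same_day_next_year_naive(date(2023, 6, 15))

try:
    same_day_next_year_naive(date(2024, 2, 29))
    naive_failed = False
except ValueError:
    naive_failed = True

leap = same_day_next_year_safe(date(2024, 2, 29))
```

The naive version passes any test that is not run with a leap-day input, so the failure can stay hidden for years at a time.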

Statistical bug

Statistical bugs can only be detected in aggregates and not in single runs of a section of code. These are bugs that usually affect code that is supposed to produce random or pseudo-random output. An example is code meant to generate points uniformly distributed on the surface of a sphere that instead produces significantly more points in the northern hemisphere than in the southern one. Tracing in detail through a single run of the point generator can completely fail to shed light on the location of such a bug, because it is impossible to identify the output of any one run as wrong; after all, it is intended to be random. Only when many points are generated does the problem become apparent. Popular debugging techniques such as checking preconditions and postconditions can do little to help. Similar problems can also occur in numerical algorithms in which each individual operation is accurate to within a given tolerance but where numerical errors accumulate only after a large number of runs, especially if the errors have a systematic bias. A simple example of this is the strfry function in the GNU C Library.
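The following sketch shows a related statistical bug, not the hemisphere bias mentioned above: drawing latitude uniformly crowds points toward the poles. The function names, sample size, and seed are assumptions made for the illustration.

```python
import math
import random


def point_naive(rng):
    # Bug: a uniformly drawn latitude puts too many points near the poles.
    lat = rng.uniform(-math.pi / 2, math.pi / 2)
    lon = rng.uniform(-math.pi, math.pi)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def point_uniform(rng):
    # Drawing z uniformly in [-1, 1] gives a uniform distribution on the sphere.
    z = rng.uniform(-1.0, 1.0)
    lon = rng.uniform(-math.pi, math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(lon), r * math.sin(lon), z)


def polar_fraction(generator, n=20000, seed=1):
    """Fraction of points with |z| > 0.5; about one half for a uniform sphere."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if abs(generator(rng)[2]) > 0.5)
    return hits / n


naive_fraction = polar_fraction(point_naive)
uniform_fraction = polar_fraction(point_uniform)
```

Every point from either generator lies on the unit sphere, so a per-point postcondition check passes in both cases. Only the aggregate exposes the flaw: the naive generator places roughly two thirds of its points in the polar caps, against an expected half.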

Alpha particle bug (single event upset)

The term alpha particle bug derives from the historical phenomenon of soft errors caused by cosmic rays. These are energetic charged subatomic particles, originating from outer space. When cosmic rays collide with molecules in the atmosphere, they produce a shower of billions of high-energy radioactive particles. These particles could disturb an electron in RAM, and thus change a 0 to a 1, or vice versa. The term is therefore used to describe a class of bug where an issue was seen only once and was verifiable at the time, but source code analysis indicates that the bug should be impossible, leaving a disturbed electron as the only apparent explanation. The likely cause of such bugs is build or integration errors, or some form of unusual memory corruption. This bug is often referred to by spacecraft developers as a single event upset.

According to a study done by Intel in 1990, the number of errors caused by cosmic rays increases with the altitude of the computer and drops to zero if the computer is running in a cave. Therefore, computer chips in airplanes, spacecraft and other sensitive systems will have error checking in RAM while common desktop computers will not. In 1998, only one error per month per 256 MiB of RAM was expected for a desktop computer. However, as chip density increases, Intel expects the errors caused by cosmic rays to increase and be a limiting factor in design.

A 2011 Black Hat paper, "Bitsquatting - DNS Hijacking without Exploitation", discusses the real-life security implications of such bit-flips in the Internet's DNS. The paper found up to 3,434 incorrect requests a day due to bit-flip changes for various common domains. Many of these bit-flips would probably be attributable to hardware problems, but some could be attributed to alpha particles.
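To make the idea of a single bit-flip in a domain name concrete, here is a minimal sketch that lists every name one flipped bit away from a given name. It is an illustration only, not the method used in the paper; the function name, the character set, and the sample domain are assumptions.

```python
import string

# Characters assumed valid inside a hostname label for this sketch.
HOSTNAME_CHARS = set(string.ascii_lowercase + string.digits + "-")


def single_bit_variants(name):
    """Return names that differ from `name` by one flipped bit in one character."""
    variants = set()
    for i, ch in enumerate(name):
        if ch == ".":
            continue  # keep the label structure intact in this sketch
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit))
            if flipped in HOSTNAME_CHARS:
                variants.add(name[:i] + flipped + name[i + 1:])
    return variants


variants = single_bit_variants("example.com")
```

For instance, the letter e (0x65) becomes g (0x67) when its second-lowest bit flips, so gxample.com is one of the names in the set.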

See also

  • Cargo cult programming
  • CHESS, a tool for detecting and reproducing heisenbugs (Windows)
  • Memory debugger
  • Jinx, a tool that automatically explores executions likely to expose heisenbugs

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 