Software testing controversies
Encyclopedia
There is considerable variety among software testing
writers and consultants about what constitutes responsible software testing. Members of the "context-driven" school of testing believe that there are no "best practices" of testing, but rather that testing is a set of skills that allow the tester to select or invent testing practices to suit each unique situation. In addition, prominent members of the community consider much of the writing about software testing to be doctrine, mythology, and folklore. Some contend that this belief directly contradicts standards such as the IEEE 829
test documentation standard, and organizations such as the Food and Drug Administration
who promote them. The context-driven school's retort is that Lessons Learned in Software Testing includes one lesson supporting the use IEEE 829 and another opposing it; that not all software testing occurs in a regulated environment and that practices appropriate for such environments would be ruinously expensive, unnecessary, and inappropriate for other contexts; and that in any case the FDA generally promotes the principle of the least burdensome approach.
Some of the major controversies include:
. Instead of assuming that testers have full access to source code and complete specifications, these writers, including Kaner and James Bach, argued that testers must learn to work under conditions of uncertainty and constant change. Meanwhile, an opposing trend toward process "maturity" also gained ground, in the form of the Capability Maturity Model
. The agile testing movement (which includes but is not limited to forms of testing practiced on agile development
projects) has popularity mainly in commercial circles, whereas the CMM was embraced by government and military software providers.
However, saying that "maturity models" like CMM gained ground against or opposing Agile testing may not be right. Agile movement is a 'way of working', while CMM is a process improvement idea.
But another point of view must be considered: the operational culture of an organization. While it may be true that testers must have an ability to work in a world of uncertainty, it is also true that their flexibility must have direction. In many cases test cultures are self-directed and as a result fruitless; unproductive results can ensue. Furthermore, providing positive evidence of defects may either indicate that you have found the tip of a much larger problem, or that you have exhausted all possibilities. A framework is a test of Testing. It provides a boundary that can measure (validate) the capacity of our work. Both sides have, and will continue to argue the virtues of their work. The proof however is in each and every assessment of delivery quality. It does little good to test systematically if you are too narrowly focused. On the other hand, finding a bunch of errors is not an indicator that Agile methods was the driving force; you may simply have stumbled upon an obviously poor piece of work.
ing means simultaneous test design and test execution with an emphasis on learning. Scripted testing means that learning and test design happen prior to test execution, and quite often the learning has to be done again during test execution. Exploratory testing is very common, but in most writing and training about testing it is barely mentioned and generally misunderstood. Some writers consider it a primary and essential practice. Structured exploratory testing is a compromise when the testers are familiar with the software. A vague test plan, known as a test charter, is written up, describing what functionalities need to be tested but not how, allowing the individual testers to choose the method and steps of testing.
There are two main disadvantages associated with a primarily exploratory testing approach. The first is that there is no opportunity to prevent defects, which can happen when the designing of tests in advance serves as a form of structured static testing that often reveals problems in system requirements and design. The second is that, even with test charters, demonstrating test coverage and achieving repeatability of tests using a purely exploratory testing approach is difficult. For this reason, a blended approach of scripted and exploratory testing is often used to reap the benefits while mitigating each approach's disadvantages.
is so expensive relative to its value that it should be used sparingly. Others, such as advocates of agile development, recommend automating 100% of all tests. A challenge with automation is that automated testing requires automated test oracles (an oracle is a mechanism or principle by which a problem in the software can be recognized). Such tools have value in load testing software (by signing on to an application with hundreds or thousands of instances simultaneously), or in checking for intermittent errors in software. The success of automated software testing depends on complete and comprehensive test planning. Software development strategies such as test-driven development
are highly compatible with the idea of devoting a large part of an organization's testing resources to automated testing. Many large software organizations perform automated testing. Some have developed their own automated testing environments specifically for internal development, and not for resale.
Quis Custodiet Ipsos Custodes
(Who watches the watchmen?), or is alternatively referred
informally, as the "Heisenbug" concept (a common misconception that confuses Heisenberg
's uncertainty principle
with observer effect
). The idea is that any form of observation is also an interaction, that the act of testing can also affect that which is being tested.
In practical terms the test engineer is testing software (and sometimes hardware or firmware
) with other software (and hardware and firmware). The process can fail in ways that are not the result of defects in the target but rather result from defects in (or indeed intended features of) the testing tool.
There are metrics being developed to measure the effectiveness of testing. One method is by analyzing code coverage
(this is highly controversial) - where everyone can agree what areas are not being covered at all and try to improve coverage in these areas.
Bugs can also be placed into code on purpose, and the number of bugs that have not been found can be predicted based on the percentage of intentionally placed bugs that were found. The problem is that it assumes that the intentional bugs are the same type of bug as the unintentional ones.
Finally, there is the analysis of historical find-rates. By measuring how many bugs are found and comparing them to predicted numbers (based on past experience with similar projects), certain assumptions regarding the effectiveness of testing can be made. While not an absolute measurement of quality, if a project is halfway complete and there have been no defects found, then changes may be needed to the procedures being employed by QA.
Software testing
Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software...
writers and consultants about what constitutes responsible software testing. Members of the "context-driven" school of testing believe that there are no "best practices" of testing, but rather that testing is a set of skills that allow the tester to select or invent testing practices to suit each unique situation. In addition, prominent members of the community consider much of the writing about software testing to be doctrine, mythology, and folklore. Some contend that this belief directly contradicts standards such as the IEEE 829
IEEE 829
IEEE 829-1998, also known as the 829 Standard for Software Test Documentation, is an IEEE standard that specifies the form of a set of documents for use in eight defined stages of software testing, each stage potentially producing its own separate type of document...
test documentation standard, and organizations such as the Food and Drug Administration
Food and Drug Administration
The Food and Drug Administration is an agency of the United States Department of Health and Human Services, one of the United States federal executive departments...
who promote them. The context-driven school's retort is that Lessons Learned in Software Testing includes one lesson supporting the use IEEE 829 and another opposing it; that not all software testing occurs in a regulated environment and that practices appropriate for such environments would be ruinously expensive, unnecessary, and inappropriate for other contexts; and that in any case the FDA generally promotes the principle of the least burdensome approach.
Some of the major controversies include:
Agile vs. traditional
Starting around 1990, a new style of writing about testing began to challenge what had come before. The seminal work in this regard is widely considered to be Testing Computer Software, by Cem KanerCem Kaner
Cem Kaner J.D., Ph.D., is a Professor of Software Engineering at Florida Institute of Technology, and the Director of Florida Tech's Center for Software Testing Education & Research since 2004...
. Instead of assuming that testers have full access to source code and complete specifications, these writers, including Kaner and James Bach, argued that testers must learn to work under conditions of uncertainty and constant change. Meanwhile, an opposing trend toward process "maturity" also gained ground, in the form of the Capability Maturity Model
Capability Maturity Model
The Capability Maturity Model is a development model that was created after study of data collected from organizations that contracted with the U.S. Department of Defense, who funded the research. This model became the foundation from which CMU created the Software Engineering Institute...
. The agile testing movement (which includes but is not limited to forms of testing practiced on agile development
Agile software development
Agile software development is a group of software development methodologies based on iterative and incremental development, where requirements and solutions evolve through collaboration between self-organizing, cross-functional teams...
projects) has popularity mainly in commercial circles, whereas the CMM was embraced by government and military software providers.
However, saying that "maturity models" like CMM gained ground against or opposing Agile testing may not be right. Agile movement is a 'way of working', while CMM is a process improvement idea.
But another point of view must be considered: the operational culture of an organization. While it may be true that testers must have an ability to work in a world of uncertainty, it is also true that their flexibility must have direction. In many cases test cultures are self-directed and as a result fruitless; unproductive results can ensue. Furthermore, providing positive evidence of defects may either indicate that you have found the tip of a much larger problem, or that you have exhausted all possibilities. A framework is a test of Testing. It provides a boundary that can measure (validate) the capacity of our work. Both sides have, and will continue to argue the virtues of their work. The proof however is in each and every assessment of delivery quality. It does little good to test systematically if you are too narrowly focused. On the other hand, finding a bunch of errors is not an indicator that Agile methods was the driving force; you may simply have stumbled upon an obviously poor piece of work.
Exploratory vs. scripted
Exploratory testExploratory test
Exploratory testing is an approach to software testing that is concisely described as simultaneous learning, test design and test execution. Cem Kaner, who coined the term in 1983, now defines exploratory testing as "a style of software testing that emphasizes the personal freedom and...
ing means simultaneous test design and test execution with an emphasis on learning. Scripted testing means that learning and test design happen prior to test execution, and quite often the learning has to be done again during test execution. Exploratory testing is very common, but in most writing and training about testing it is barely mentioned and generally misunderstood. Some writers consider it a primary and essential practice. Structured exploratory testing is a compromise when the testers are familiar with the software. A vague test plan, known as a test charter, is written up, describing what functionalities need to be tested but not how, allowing the individual testers to choose the method and steps of testing.
There are two main disadvantages associated with a primarily exploratory testing approach. The first is that there is no opportunity to prevent defects, which can happen when the designing of tests in advance serves as a form of structured static testing that often reveals problems in system requirements and design. The second is that, even with test charters, demonstrating test coverage and achieving repeatability of tests using a purely exploratory testing approach is difficult. For this reason, a blended approach of scripted and exploratory testing is often used to reap the benefits while mitigating each approach's disadvantages.
Manual vs. automated
Some writers believe that test automationTest automation
Test automation is the use of software to control the execution of tests, the comparison of actual outcomes to predicted outcomes, the setting up of test preconditions, and other test control and test reporting functions...
is so expensive relative to its value that it should be used sparingly. Others, such as advocates of agile development, recommend automating 100% of all tests. A challenge with automation is that automated testing requires automated test oracles (an oracle is a mechanism or principle by which a problem in the software can be recognized). Such tools have value in load testing software (by signing on to an application with hundreds or thousands of instances simultaneously), or in checking for intermittent errors in software. The success of automated software testing depends on complete and comprehensive test planning. Software development strategies such as test-driven development
Test-driven development
Test-driven development is a software development process that relies on the repetition of a very short development cycle: first the developer writes a failing automated test case that defines a desired improvement or new function, then produces code to pass that test and finally refactors the new...
are highly compatible with the idea of devoting a large part of an organization's testing resources to automated testing. Many large software organizations perform automated testing. Some have developed their own automated testing environments specifically for internal development, and not for resale.
Software design vs. software implementation
Ideally, software testers should not be limited only to testing software implementation, but also to testing software design. With this assumption, the role and involvement of testers will change dramatically. In such an environment, the test cycle will change too. To test software design, testers would review requirement and design specifications together with designer and programmer, potentially helping to identify bugs earlier in software development.Who watches the watchmen?
One principle in software testing is summed up by the classical Latin question posed by Juvenal:Quis Custodiet Ipsos Custodes
Quis custodiet ipsos custodes?
' is a Latin phrase traditionally attributed to the Roman poet Juvenal from his Satires , which is literally translated as "Who will guard the guards themselves?" Also sometimes rendered as "Who watches the watchmen?", the phrase has other idiomatic translations and adaptations such as "Who will...
(Who watches the watchmen?), or is alternatively referred
informally, as the "Heisenbug" concept (a common misconception that confuses Heisenberg
Werner Heisenberg
Werner Karl Heisenberg was a German theoretical physicist who made foundational contributions to quantum mechanics and is best known for asserting the uncertainty principle of quantum theory...
's uncertainty principle
Uncertainty principle
In quantum mechanics, the Heisenberg uncertainty principle states a fundamental limit on the accuracy with which certain pairs of physical properties of a particle, such as position and momentum, can be simultaneously known...
with observer effect
Observer effect
Observer effect may refer to:* Observer effect , the impact of observing a process while it is running* Observer effect , the impact of observing a physical system...
). The idea is that any form of observation is also an interaction, that the act of testing can also affect that which is being tested.
In practical terms the test engineer is testing software (and sometimes hardware or firmware
Firmware
In electronic systems and computing, firmware is a term often used to denote the fixed, usually rather small, programs and/or data structures that internally control various electronic devices...
) with other software (and hardware and firmware). The process can fail in ways that are not the result of defects in the target but rather result from defects in (or indeed intended features of) the testing tool.
There are metrics being developed to measure the effectiveness of testing. One method is by analyzing code coverage
Code coverage
Code coverage is a measure used in software testing. It describes the degree to which the source code of a program has been tested. It is a form of testing that inspects the code directly and is therefore a form of white box testing....
(this is highly controversial) - where everyone can agree what areas are not being covered at all and try to improve coverage in these areas.
Bugs can also be placed into code on purpose, and the number of bugs that have not been found can be predicted based on the percentage of intentionally placed bugs that were found. The problem is that it assumes that the intentional bugs are the same type of bug as the unintentional ones.
Finally, there is the analysis of historical find-rates. By measuring how many bugs are found and comparing them to predicted numbers (based on past experience with similar projects), certain assumptions regarding the effectiveness of testing can be made. While not an absolute measurement of quality, if a project is halfway complete and there have been no defects found, then changes may be needed to the procedures being employed by QA.