Automated species identification
Automated species identification is a method of making the expertise of taxonomists available to ecologists, parataxonomists and others via computers, PDAs and other digital technology.

Introduction

The automated identification of biological objects such as insects (individuals) and/or groups (e.g., species, guilds, characters) has been a dream among systematists for centuries. The goal of some of the first multivariate biometric methods was to address the perennial problem of group discrimination and inter-group characterization. Despite much preliminary work in the 1950s and '60s, progress in designing and implementing practical systems for the fully automated identification of biological objects has proven frustratingly slow. As recently as 2004 Dan Janzen updated the dream for a new audience:

The spaceship lands. He steps out. He points it around. It says ‘friendly–unfriendly—edible–poisonous—safe– dangerous—living–inanimate’. On the next sweep it says ‘Quercus oleoides—Homo sapiens—Spondias mombin—Solanum nigrum—Crotalus durissus—Morpho peleides— serpentine’. This has been in my head since reading science fiction in ninth grade half a century ago.

The species identification problem

Janzen’s preferred solution to this classic problem involved building machines to identify species from their DNA. His predicted budget and proposed research team are “US$1 million and five bright people.” However, recent developments in computer architectures, as well as innovations in software design, have placed the tools needed to realize Janzen’s vision in the hands of the systematics community not several years hence, but now; and not just for DNA barcodes, but for digital images of organisms too. A recent survey of accuracy results for small-scale trials (<50 taxa) obtained by such systems shows an average reproducible accuracy of over 85 percent, with no significant correlation between accuracy and the number of included taxa or the type of group being assessed (e.g., butterflies, moths, bees, pollen, spores, foraminifera, dinoflagellates, vertebrates). Moreover, these identifications, often involving thousands of individual specimens, can be made in a fraction of the time required by human experts and can be done on site, on demand, anywhere in the world.

These developments could not have come at a better time. As the taxonomic community already knows, the world is running out of specialists who can identify the very biodiversity whose preservation has become a global concern. In commenting on this problem in palaeontology as long ago as 1993, Roger Kaesler recognized:

“… we are running out of systematic palaeontologists who have anything approaching synoptic knowledge of a major group of organisms ... Palaeontologists of the next century are unlikely to have the luxury of dealing at length with taxonomic problems … Palaeontology will have to sustain its level of excitement without the aid of systematists, who have contributed so much to its success.”

This expertise deficiency cuts as deeply into those commercial industries that rely on accurate identifications (e.g., agriculture, biostratigraphy) as it does into a wide range of pure and applied research programmes (e.g., conservation, biological oceanography, climatology, ecology). It is also commonly, though informally, acknowledged that the technical, taxonomic literature of all organismal groups is littered with examples of inconsistent and incorrect identifications. This is due to a variety of factors, including taxonomists being insufficiently trained and skilled in making identifications (e.g., using different rules of thumb in recognizing the boundaries between similar groups), insufficiently detailed original group descriptions and/or illustrations, inadequate access to current monographs and well-curated collections and, of course, taxonomists having different opinions regarding group concepts. Peer review only weeds out the most obvious errors of commission or omission in this area, and then only when an author provides adequate representations (e.g., illustrations, recordings, gene sequences) of the specimens in question.

Systematics too has much to gain, both practically and theoretically, from the further development and use of automated identification systems. It is now widely recognized that the days of systematics as a field populated by mildly eccentric individuals pursuing knowledge in splendid isolation from funding priorities and economic imperatives are rapidly drawing to a close. In order to attract both personnel and resources, systematics must transform itself into a “large, coordinated, international scientific enterprise”. Many have identified use of the Internet, especially via the World Wide Web, as the medium through which this transformation can be made. While establishment of a virtual, GenBank-like system for accessing morphological data, audio clips, video files and so forth would be a significant step in the right direction, improved access to observational information and/or text-based descriptions alone will not successfully address either the taxonomic impediment or the problem of low identification reproducibility. Instead, the inevitable subjectivity associated with making critical decisions on the basis of qualitative criteria must be reduced or, at the very least, embedded within a more formally analytic context.

Properly designed, flexible, and robust automated identification systems, organized around distributed computing architectures and referenced to authoritatively identified collections of training set data (e.g., images, gene sequences), can, in principle, provide all systematists with access to the electronic data archives and the analytic tools needed to handle routine identifications of common taxa. Properly designed systems can also recognize when their algorithms cannot make a reliable identification and refer the image in question to a specialist (whose address can be accessed from another database). Such systems can also include elements of artificial intelligence and so improve their performance the more they are used. Most tantalizingly, once morphological (or molecular) models of a species have been developed and demonstrated to be accurate, these models can be queried to determine which aspects of the observed patterns of variation and variation limits are being used to achieve the identification, thus opening the way for the discovery of new and (potentially) more reliable taxonomic characters.
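
The two behaviours described above, referring unreliable cases to a specialist and querying a model for the characters it relies on, can be illustrated with a deliberately simple sketch. It assumes a nearest-centroid classifier over numeric measurements; the function names, thresholds, species labels and measurements are all hypothetical and are not taken from SPIDA, DAISY, ABIS, DrawWing or any other real system.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """Average the feature vectors of each label.

    samples: list of (label, feature_vector) pairs from an
    authoritatively identified training set (hypothetical data here).
    """
    sums = {}
    counts = defaultdict(int)
    for label, vec in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def identify(centroids, vec, max_distance, min_margin):
    """Return (label, distance), or (None, distance) to request referral.

    A specimen is referred to a specialist when it is too far from every
    known group, or when the two nearest groups are almost equally close.
    Both thresholds are illustrative assumptions.
    """
    ranked = sorted((math.dist(c, vec), label)
                    for label, c in centroids.items())
    best_dist, best_label = ranked[0]
    margin = ranked[1][0] - best_dist if len(ranked) > 1 else math.inf
    if best_dist > max_distance or margin < min_margin:
        return None, best_dist
    return best_label, best_dist


def discriminating_features(centroids, label_a, label_b):
    """Rank feature indices by how strongly they separate two groups."""
    a, b = centroids[label_a], centroids[label_b]
    return sorted(range(len(a)), key=lambda i: abs(a[i] - b[i]), reverse=True)


# Hypothetical training data: (vein ratio, wing length) per specimen.
training = [
    ("species_a", [1.0, 5.0]), ("species_a", [1.2, 5.2]),
    ("species_b", [3.0, 5.1]), ("species_b", [3.2, 4.9]),
]
centroids = train_centroids(training)

confident = identify(centroids, [1.15, 5.0], max_distance=0.5, min_margin=0.5)
ambiguous = identify(centroids, [2.1, 5.05], max_distance=0.5, min_margin=0.5)
ranking = discriminating_features(centroids, "species_a", "species_b")
```

In this toy data set the first specimen lies close to one centroid and is identified, the second lies midway between the two groups and is returned as `None` for referral, and the feature ranking places the vein ratio ahead of wing length. Real systems use far richer models, but the referral and interrogation steps follow the same pattern.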

External links

Here are some links to the home pages of species identification systems. While all were initially designed to identify speciose invertebrate groups, the SPIDA and DAISY systems are essentially generic and capable of classifying any image material presented. The ABIS and DrawWing systems are restricted to insects with membranous wings, as they operate by matching a specific set of characters based on wing venation.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 