Shannon index
The Shannon index, sometimes referred to as the Shannon-Wiener Index or the Shannon-Weaver Index, is one of several diversity indices used to measure diversity in categorical data. It is simply the information entropy of the distribution, treating species as symbols and their relative population sizes as the probabilities. This article treats its use in the measurement of biodiversity. The advantage of this index is that it takes into account both the number of species and the evenness of the species. The index is increased either by having additional unique species, or by having a greater species evenness.
The "Shannon-Weaver" name is a misnomer; apparently some biologists jumped to the conclusion that Warren Weaver, author of an influential preface to the book form of Claude Shannon's 1948 paper founding information theory, was a cofounder of this theory. Weaver did play a crucial role in the rapid postwar development of information theory in a different way, however; as an influential early administrator of the Rockefeller Foundation, he ensured that the first information theorists received generous research grants. Norbert Wiener had no hand in the index either, although his influential popularisation of cybernetics was often conflated with information theory in the 1950s.
Definitions
- $n_i$: The number of individuals in species $i$; the abundance of species $i$.
- $S$: The number of species. Also called species richness.
- $N$: The total number of all individuals.
- $p_i$: The relative abundance of each species, calculated as the proportion of individuals of a given species to the total number of individuals in the community: $p_i = n_i / N$.
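The quantities defined above can be sketched in a few lines of Python. This is only an illustration: the survey counts and the helper name `relative_abundances` are assumptions made for the example, not part of the index's definition.

```python
def relative_abundances(counts):
    """Return p_i = n_i / N for each species count n_i."""
    total = sum(counts)  # N, the total number of all individuals
    if total <= 0:
        raise ValueError("need at least one individual")
    return [n / total for n in counts]


# Hypothetical survey: number of individuals counted per species (n_i).
counts = [40, 30, 20, 10]
S = len(counts)                    # species richness
p = relative_abundances(counts)    # [0.4, 0.3, 0.2, 0.1]
print(S, p)
```

The relative abundances always sum to 1, which is what allows them to be treated as probabilities.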
Interpreting the Index
Typically the value of the index ranges from 1.5 (low species richness and evenness) to 3.5 (high species evenness and richness), though values beyond these limits may be encountered.
Because the Shannon Index gives a measure of both species numbers and the evenness of their abundance, the resulting figure does not give an absolute description of a site's biodiversity. It is particularly useful when comparing similar ecosystems or habitats, as it can highlight one example being richer or more even than another. The data must still be inspected, or another index used, to identify the true reasons for a difference.
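A small sketch can illustrate why a single figure needs unpacking. It uses the entropy form of the index, $-\sum p_i \ln p_i$ with natural logarithms; the two sites and their counts are invented for the example.

```python
import math


def shannon_index(counts):
    """H' = -sum(p_i * ln p_i), skipping species with zero individuals."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


# Hypothetical sites: site_a has fewer species but perfectly even counts,
# site_b has twice as many species but is dominated by one of them.
site_a = [25, 25, 25, 25]
site_b = [70, 10, 5, 5, 4, 3, 2, 1]

h_a = shannon_index(site_a)
h_b = shannon_index(site_b)
print(f"site_a: S={len(site_a)}, H'={h_a:.3f}")
print(f"site_b: S={len(site_b)}, H'={h_b:.3f}")
```

Here the richer site scores lower because its abundances are so uneven, which the index value alone does not reveal.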
Computing the index
$$H' = -\sum_{i=1}^{S} p_i \ln p_i$$
where $S$ is the total number of species and $p_i$ is the frequency of the $i$th species (the probability that any given individual belongs to the species, hence $p$).
It can be shown that for any given number of species, there is a maximum possible $H'$, $H'_{\max} = \ln S$, which occurs when all species are present in equal numbers.
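As a minimal sketch of the computation, assuming the index is taken with natural logarithms and that species with no individuals contribute nothing, the following compares an even and a skewed community with the same number of species. The function name and the counts are assumptions for illustration.

```python
import math


def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)), with p_i = n_i / N."""
    total = sum(counts)
    # Species with zero individuals are skipped (0 * ln 0 is taken as 0).
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


even = [10, 10, 10, 10, 10]     # hypothetical: 5 species, equal numbers
skewed = [46, 1, 1, 1, 1]       # same N and S, one dominant species

h_max = math.log(len(even))     # ln S, the maximum for S = 5
print(shannon_index(even), h_max)
print(shannon_index(skewed))
```

The even community reaches $\ln S$ up to floating-point rounding, while the skewed one falls well below it.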
An alternative form is
The second half of this version is a correction factor.
Proof that maximum evenness maximizes the index
The following will prove that any given population will have a maximum Shannon index if and only if each species represented is composed of the same number of individuals.
Expanding the index:
$$H' = -\sum_{i=1}^{S} p_i \ln p_i = -\sum_{i=1}^{S} \frac{n_i}{N} \ln \frac{n_i}{N} = -\frac{1}{N} \sum_{i=1}^{S} n_i \left( \ln n_i - \ln N \right) = -\frac{1}{N} \sum_{i=1}^{S} n_i \ln n_i + \ln N$$
The last step uses $\sum_{i=1}^{S} n_i = N$.
Now, let's define $A = -\sum_{i=1}^{S} n_i \ln n_i$, so that $H' = A/N + \ln N$. Clearly, since $1/N$ is a positive constant for a given population size, and $\ln N$ is also a constant, maximizing $H'$ is equivalent to maximizing $A$.
Strategy
Let's split an arbitrarily sized population into two groups, with each group receiving an arbitrary number of individuals and an arbitrary number of species. Now, within each group, each species has the same number of individuals as any other species in that group, but the number of individuals per species in the first group may be different from the number of individuals per species in the second group.
Now, if it can be proven that $A$ is maximized when the number of individuals per species in the first group matches the number of individuals per species in the second group, then it has been proved that the population has a maximum index only when each species in the population is evenly represented. $A$ doesn't depend explicitly on the total population, so $A$ may be built by simply adding the indices of two sub-populations. Since the population size is arbitrary, this proves that if you have two species (the smallest number that can be considered two groups), their index is maximized if they are present in equal numbers. So the rules of mathematical induction have been satisfied.
Proof
Now, divide the species into two groups. Within each group, the population is evenly distributed among the species present.
- $N_2$: The number of individuals in the second group.
- $S_2$: The number of species in the second group.
- $n_2 = N_2 / S_2$: Number of individuals in each species in the second group.
- $N_1 = N - N_2$: The number of individuals in the first group.
- $S_1 = S - S_2$: The number of species in the first group.
- $n_1 = N_1 / S_1 = (N - N_2) / (S - S_2)$: The individuals in each species in the first group.
With these definitions,
$$A = -\left[ S_1 n_1 \ln n_1 + S_2 n_2 \ln n_2 \right] = -\left[ (N - N_2) \ln \frac{N - N_2}{S - S_2} + N_2 \ln \frac{N_2}{S_2} \right]$$
To find out which value of $N_2$ will maximize $A$, we must find the value of $N_2$ which satisfies the equation:
$$\frac{\partial A}{\partial N_2} = 0$$
Differentiating,
$$\ln \frac{N - N_2}{S - S_2} + 1 - \ln \frac{N_2}{S_2} - 1 = 0 \quad \Longrightarrow \quad \ln \frac{N - N_2}{S - S_2} = \ln \frac{N_2}{S_2}$$
The second derivative, $-\frac{1}{N - N_2} - \frac{1}{N_2}$, is negative, so this stationary point is a maximum.
Exponentiating:
$$\frac{N - N_2}{S - S_2} = \frac{N_2}{S_2}$$
Now by applying the definitions of $n_1$ and $n_2$, we get
$$n_1 = n_2$$
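The two-group argument can also be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: the names `N`, `S`, `N2` and `S2` (total individuals, total species, and the individuals and species assigned to the second group) and the population sizes are chosen for the example, and per-species counts are allowed to be fractional.

```python
import math


def shannon_two_groups(N, S, N2, S2):
    """H' when N2 individuals are spread evenly over S2 species and the
    remaining N - N2 individuals evenly over the other S - S2 species."""
    N1, S1 = N - N2, S - S2
    h = 0.0
    for group_total, group_species in ((N1, S1), (N2, S2)):
        p = group_total / group_species / N   # relative abundance per species
        h -= group_species * p * math.log(p)
    return h


N, S, S2 = 120, 6, 2   # hypothetical population: 120 individuals, 6 species
# Try every split that leaves at least one individual in each group.
best_N2 = max(range(1, N), key=lambda N2: shannon_two_groups(N, S, N2, S2))
print(best_N2, best_N2 / S2, (N - best_N2) / (S - S2))
```

For these numbers the best split gives both groups the same number of individuals per species, and the index at that split equals $\ln S$.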
Result
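Now we have accomplished the proof that the Shannon index is maximized when each species is present in equal numbers (see Strategy). But what is the index in that case? Well, $p_i = 1/S$ for every species, so
$$H' = -\sum_{i=1}^{S} \frac{1}{S} \ln \frac{1}{S}$$
Therefore:
$$H'_{\max} = \ln S$$
See also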
- Biodiversity
- Species richness
- Information theory