Allele frequency
Allele frequency, or gene frequency, is the proportion of all copies of a gene that is made up of a particular gene variant (allele). In other words, it is the number of copies of a particular allele divided by the number of copies of all alleles at the genetic place (locus) in a population. It can be expressed, for example, as a percentage. In population genetics, allele frequencies are used to depict the amount of genetic diversity at the individual, population, and species level. It is also the relative proportion of all alleles of a gene that are of a designated type.
Given the following:
- a particular locus on a chromosome and the gene occupying that locus
- a population of N individuals, each carrying n copies of that locus in its somatic cells (e.g. two copies in the cells of diploid species, which contain two sets of chromosomes)
- different alleles of the gene exist
- one allele exists in a copies
then the allele frequency is the fraction or percentage of all the occurrences of that locus that are occupied by a given allele, so the frequency of the allele present in a copies is a/(n × N).
For example, if the frequency of an allele is 20% in a given population, then among population members, one in five chromosomes will carry that allele. Four out of five will be occupied by other variant(s) of the gene.
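The definition a/(n × N) can be sketched in a few lines of Python. The function name, its parameters, and the population of 100 diploid individuals are illustrative assumptions, not part of the article.

```python
def allele_frequency(copies: int, individuals: int, ploidy: int = 2) -> float:
    """Return a / (n * N): copies of one allele over all copies of the locus.

    copies      -- a, the number of copies of the allele in the population
    individuals -- N, the number of individuals
    ploidy      -- n, copies of the locus per somatic cell (2 for diploids)
    """
    total = ploidy * individuals
    if total <= 0:
        raise ValueError("population must contain at least one copy of the locus")
    if not 0 <= copies <= total:
        raise ValueError("allele copies must lie between 0 and n * N")
    return copies / total


# Hypothetical population: 100 diploid individuals, 40 copies of the allele.
freq = allele_frequency(copies=40, individuals=100)
print(f"{freq:.0%}")  # 20%
```

With 40 copies among 200 chromosomes, one chromosome in five carries the allele, which matches the 20% example.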
Note that for diploid genes the fraction of individuals that carry this allele may be nearly two in five (36%). The reason for this is that if the allele distributes randomly, then the binomial theorem will apply: 32% of the population will be heterozygous for the allele (i.e. carry one copy of that allele and one copy of another in each somatic cell) and 4% will be homozygous (carrying two copies of the allele). Together, this means that 36% of diploid individuals would be expected to carry an allele that has a frequency of 20%. However, alleles distribute randomly only under certain assumptions, including the absence of selection. When these conditions apply, a population is said to be in Hardy–Weinberg equilibrium.
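The 36% figure can be checked with a short calculation. The sketch below assumes random distribution of alleles (Hardy–Weinberg proportions) in a diploid population; the function name and the returned keys are illustrative.

```python
def hardy_weinberg_carriers(p: float) -> dict:
    """Expected fractions of diploid individuals carrying an allele of frequency p.

    Assumes alleles pair at random, so genotype proportions follow the
    binomial expansion (p + q)^2 = p^2 + 2pq + q^2.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a frequency between 0 and 1")
    q = 1.0 - p
    homozygous = p * p          # two copies of the allele
    heterozygous = 2.0 * p * q  # one copy of the allele, one of another
    return {
        "homozygous": homozygous,
        "heterozygous": heterozygous,
        "carriers": homozygous + heterozygous,
    }


for name, value in hardy_weinberg_carriers(0.20).items():
    print(f"{name}: {value:.0%}")
```

For p = 0.20 this gives 4% homozygous and 32% heterozygous, or 36% carriers in total.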
The frequencies of all the alleles of a given gene are often graphed together as an allele frequency distribution histogram, or allele frequency spectrum. Population genetics studies the different "forces" that might lead to changes in the distribution and frequencies of alleles (in other words, to evolution). Besides selection, these forces include genetic drift, mutation and migration.
Calculation of allele frequencies from genotype frequencies
If $f(\mathrm{AA})$, $f(\mathrm{Aa})$, and $f(\mathrm{aa})$ are the frequencies of the three genotypes at a locus with two alleles, then the frequency p of the A-allele and the frequency q of the a-allele are obtained by counting alleles. Because each homozygote AA consists only of A-alleles, and because half of the alleles of each heterozygote Aa are A-alleles, the total frequency p of A-alleles in the population is calculated as
$$p = f(\mathrm{AA}) + \tfrac{1}{2} f(\mathrm{Aa})$$
Similarly, the frequency q of the a allele is given by
$$q = f(\mathrm{aa}) + \tfrac{1}{2} f(\mathrm{Aa})$$
It would be expected that p and q sum to 1, since they are the frequencies of the only two alleles present. Indeed they do:
$$p + q = f(\mathrm{AA}) + f(\mathrm{Aa}) + f(\mathrm{aa}) = 1$$
and from this we get:
$$q = 1 - p$$
and
$$p = 1 - q$$
If there are more than two different allelic forms, the frequency for each allele is simply the frequency of its homozygote plus half the sum of the frequencies for all the heterozygotes in which it appears.
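The rule "homozygote frequency plus half of each heterozygote frequency" extends to any number of alleles. A minimal Python sketch follows; the function name, the tuple-keyed dictionary, and the three-allele genotype frequencies are made-up illustrations rather than real data.

```python
from collections import defaultdict


def allele_frequencies(genotype_freqs: dict) -> dict:
    """Derive allele frequencies from diploid genotype frequencies.

    genotype_freqs maps a pair of allele names, e.g. ("A1", "A2"),
    to the frequency of that genotype. Frequencies must sum to 1.
    """
    if abs(sum(genotype_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("genotype frequencies must sum to 1")
    freqs = defaultdict(float)
    for (first, second), freq in genotype_freqs.items():
        # Each genotype contributes half its frequency per allele slot,
        # so a homozygote contributes its full frequency to one allele.
        freqs[first] += freq / 2
        freqs[second] += freq / 2
    return dict(freqs)


# Hypothetical three-allele locus.
example = {
    ("A1", "A1"): 0.10, ("A1", "A2"): 0.20, ("A1", "A3"): 0.30,
    ("A2", "A2"): 0.10, ("A2", "A3"): 0.10, ("A3", "A3"): 0.20,
}
for allele, freq in allele_frequencies(example).items():
    print(f"{allele}: {freq:.2f}")
```

Here A1 comes out at 0.10 + (0.20 + 0.30)/2 = 0.35, and the three allele frequencies sum to 1.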
Allele frequency can always be calculated from genotype frequency, whereas the reverse requires that the Hardy–Weinberg conditions of random mating apply. This is partly because, at a locus with two alleles, there are three genotype frequencies but only two allele frequencies: reducing three numbers to two is straightforward, but recovering three from two requires an additional assumption.
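The asymmetry can be illustrated with two hypothetical populations that have different genotype frequencies but the same allele frequency. In the Python sketch below, all names and numbers are assumptions chosen for illustration; going back from allele to genotype frequencies is only possible here because Hardy–Weinberg proportions are assumed.

```python
def allele_freq_A(f_AA: float, f_Aa: float, f_aa: float) -> float:
    """Frequency of allele A from the three genotype frequencies."""
    return f_AA + f_Aa / 2


def hardy_weinberg_genotypes(p: float) -> tuple:
    """Genotype frequencies (AA, Aa, aa) expected under random mating."""
    q = 1 - p
    return (p * p, 2 * p * q, q * q)


# Two made-up populations with different genotype mixes.
pop1 = (0.49, 0.42, 0.09)
pop2 = (0.60, 0.20, 0.20)

print(allele_freq_A(*pop1), allele_freq_A(*pop2))  # both close to 0.7
print(hardy_weinberg_genotypes(0.7))               # matches pop1, not pop2
```

Both populations reduce to the same allele frequency, so the allele frequency alone cannot tell them apart. Only the population that is in Hardy–Weinberg proportions is recovered by the reverse calculation.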
An example population
Consider a population of ten individuals and a given locus with two possible alleles, A and a. Suppose that the genotypes of the individuals are as follows:
- AA, Aa, AA, aa, Aa, AA, AA, Aa, Aa, and AA
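Then the allele frequencies of allele A and allele a are:
$$p = \frac{2+1+2+0+1+2+2+1+1+2}{20} = \frac{14}{20} = 0.7$$
$$q = \frac{0+1+0+2+1+0+0+1+1+0}{20} = \frac{6}{20} = 0.3$$
so if one copy of the locus is chosen at random from the population, there is a 70% chance it will be the A allele and a 30% chance it will be the a allele.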
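The same count can be reproduced by tallying allele symbols directly. This Python sketch assumes each genotype is written as a string of single-character allele symbols, as in the list above; the function name is illustrative.

```python
from collections import Counter

genotypes = ["AA", "Aa", "AA", "aa", "Aa", "AA", "AA", "Aa", "Aa", "AA"]


def count_allele_frequencies(genotypes: list) -> dict:
    """Count every allele symbol and divide by the total number of copies."""
    counts = Counter("".join(genotypes))  # one character per allele copy
    total = sum(counts.values())          # 2 copies x 10 individuals = 20
    return {allele: n / total for allele, n in counts.items()}


print(count_allele_frequencies(genotypes))  # {'A': 0.7, 'a': 0.3}
```

Counting characters works here only because every allele is a single letter. Multi-character allele names would need genotypes stored as pairs instead.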
The effect of mutation
Let $\mu$ be the mutation rate from allele A to some other allele a (the probability that a copy of gene A will become a during the DNA replication preceding meiosis). If $p_t$ is the frequency of the A allele in generation t, then $q_t = 1 - p_t$ is the frequency of the a allele in generation t, and if there are no other causes of gene frequency change (no natural selection, for example), then the change in allele frequency in one generation is
$$\Delta p = p_t - p_{t-1} = (p_{t-1} - \mu p_{t-1}) - p_{t-1} = -\mu p_{t-1}$$
where $p_{t-1}$ is the frequency of the preceding generation. This tells us that the frequency of A decreases (and the frequency of a increases) by an amount that is proportional to the mutation rate $\mu$ and to the proportion p of all the genes that are still available to mutate. Thus the magnitude of $\Delta p$ gets smaller as p itself decreases, because there are fewer and fewer A alleles to mutate into a alleles. We can make an approximation that, after n generations of mutation,
$$p_n = p_0 (1 - \mu)^n \approx p_0 e^{-n\mu}$$
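The generation-by-generation decline can be compared with an exponential approximation numerically. The Python sketch below assumes one-way mutation from A to a with no other forces acting. The function names, the variable `mu` for the mutation rate, and the chosen values (a rate of 1e-5 over 10,000 generations) are illustrative assumptions.

```python
import math


def frequency_after_mutation(p0: float, mu: float, generations: int) -> float:
    """Iterate the per-generation loss: each generation p drops by mu * p."""
    p = p0
    for _ in range(generations):
        p -= mu * p
    return p


def exponential_approximation(p0: float, mu: float, generations: int) -> float:
    """Approximate the same decline as p0 * exp(-generations * mu)."""
    return p0 * math.exp(-generations * mu)


# Hypothetical values: allele A starts fixed, mutation rate 1e-5 per generation.
p0, mu, n = 1.0, 1e-5, 10_000
exact = frequency_after_mutation(p0, mu, n)
approx = exponential_approximation(p0, mu, n)
print(f"iterated: {exact:.6f}  approximation: {approx:.6f}")
```

For small mutation rates the two values agree closely, because (1 - mu) raised to the power n is close to exp(-n * mu) when mu is much smaller than 1.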