Gene family
A gene family is a set of several similar genes, formed by duplication of a single original gene, and generally with similar biochemical functions. One such family comprises the genes for human haemoglobin subunits; the ten genes are in two clusters on different chromosomes, called the α-globin and β-globin loci.
Genes are categorized into families based on shared nucleotide or protein sequences. Phylogenetic techniques can be used as a more rigorous test. The positions of exons within the coding sequence can be used to infer common ancestry. Knowing the sequence of the protein encoded by a gene allows researchers to apply methods that find similarities among protein sequences, which can be more informative than similarities or differences among DNA sequences. Furthermore, knowledge of the protein's secondary structure gives further information about ancestry, since the organization of secondary structural elements would presumably be conserved even if the amino acid sequence changes considerably. These methods often rely upon predictions based upon the DNA sequence.
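As a rough illustration of grouping genes by shared sequence, the following Python sketch clusters sequences whose pairwise identity meets a cutoff. It assumes the sequences have already been aligned to equal length (with "-" marking gaps); the function names, the 0.5 threshold, and the toy sequences are all invented for this example and do not come from any particular tool or dataset. Real analyses would normally rely on a proper alignment step and statistical scoring rather than a raw identity cutoff.

```python
from itertools import combinations


def percent_identity(a, b):
    """Fraction of aligned, non-gap columns where the two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def group_into_families(seqs, threshold=0.5):
    """Single-linkage grouping: link any pair with identity >= threshold.

    `seqs` maps a gene name to its aligned sequence. The threshold is an
    arbitrary illustrative value, not a recommended cutoff.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the representative of this group.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical toy sequences, made up for illustration only.
toy = {
    "geneA": "MVLSPADKTN",
    "geneB": "MVLSAADKSN",
    "geneC": "MVHLTPEEKS",
    "geneX": "QQWRRGGCCY",
}
families = group_into_families(toy, threshold=0.5)
```

With these toy inputs, geneA and geneB agree at 8 of 10 positions and end up in one group, while geneC and geneX each remain alone. Single linkage is used here only because it is short to write; it can chain distantly related sequences together, which is one reason phylogenetic methods serve as the more rigorous test.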
If the genes of a gene family encode proteins, the term protein family is often used in an analogous manner to gene family.
The expansion or contraction of gene families along a specific lineage can be due to chance, or can be the result of natural selection. Distinguishing between these two cases is often difficult in practice. Recent work uses a combination of statistical models and algorithmic techniques to detect gene families that are under the effect of natural selection.
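To make the role of chance concrete, the sketch below simulates family size under a simple random gain-and-loss process and asks how often chance alone yields a family at least as large as an observed one. This is a generic illustration, not the specific models used in the work mentioned above; the per-copy rate, the number of steps, the starting size, and all function names are assumptions chosen for the example.

```python
import random


def simulate_family_size(start, steps, rate, rng):
    """Simulate one family: each copy may duplicate or be lost per step.

    Gains and losses are drawn independently for every copy with the same
    probability `rate`, so any change in size is due to chance alone.
    """
    size = start
    for _ in range(steps):
        gains = sum(rng.random() < rate for _ in range(size))
        losses = sum(rng.random() < rate for _ in range(size))
        size = size + gains - losses
        if size == 0:
            break  # the family has gone extinct in this lineage
    return size


def neutral_size_distribution(start=10, steps=50, rate=0.02,
                              replicates=2000, seed=0):
    """Sorted family sizes from repeated chance-only simulations."""
    rng = random.Random(seed)
    return sorted(
        simulate_family_size(start, steps, rate, rng)
        for _ in range(replicates)
    )


def empirical_tail_probability(observed, null_sizes):
    """Fraction of simulated families at least as large as `observed`."""
    return sum(s >= observed for s in null_sizes) / len(null_sizes)


null_sizes = neutral_size_distribution()
# Hypothetical observed size of 25 copies, chosen only for illustration.
p_large = empirical_tail_probability(25, null_sizes)
```

A small tail probability suggests the observed expansion would be unusual under this chance-only process, but it does not by itself demonstrate selection: the answer depends heavily on the assumed rate and elapsed time, which is part of why the two cases are hard to tell apart in practice.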