Pyrrolysine
Pyrrolysine is a naturally occurring, genetically coded amino acid used by some methanogenic archaea and one known bacterium in enzymes that are part of their methane-producing metabolism. It is similar to lysine, but with an added pyrroline ring linked to the end of the lysine side chain. Incorporated into proteins by a dedicated tRNA and aminoacyl-tRNA synthetase, it forms part of an unusual genetic code in these organisms, and is considered the 22nd proteinogenic amino acid.
The joint nomenclature committee of the IUPAC/IUBMB has officially recommended the three-letter symbol Pyl and the one-letter symbol O for pyrrolysine.
Introduction and context
One key function of the genome is to direct the production of proteins using genetic sequences that determine whether and when each protein will be produced, which cells will produce it, and where it is located in the cell. Proteins form much of the physical structure of the body and catalyze a wide variety of chemical reactions, giving the genome the ability to control the body's biochemistry. Nearly all proteins are made using only 20 standard building blocks called amino acids, which are often assembled in very long sequences according to a standard genetic code. Specialized chemical reactions often require alterations of proteins after the fact by posttranslational modification, or protein binding to specific cofactors. Yet the genetic code itself is exactly the same among very many organisms, so that when researchers sequence DNA from new or unknown sources they can often immediately draw conclusions about the chemical activity it carries out, based on the assumption that a standard genetic code applies. The discovery of unusual amino acids specified by an expansion of the genetic code can call this assumption into question, so it is important to understand any such aberrations. Additionally, these variations indicate that the process of evolution that led to the establishment of the genetic code did not end with the universal common ancestor, perhaps some three to four billion years ago, but remains accessible to study even in the present day.
Composition
As determined by X-ray crystallography and MALDI mass spectrometry, pyrrolysine is made up of 4-methylpyrroline-5-carboxylate in amide linkage with the ϵN of lysine.
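The amide linkage described above can be checked with simple formula arithmetic: forming an amide from an acid and an amine releases one water molecule. The sketch below is only illustrative. The elemental formulas for lysine (C6H14N2O2) and for the free-acid form of 4-methylpyrroline-5-carboxylate (C6H9NO2) are standard chemistry values supplied here as assumptions, not figures taken from this article, and the helper names `condense` and `hill_formula` are hypothetical.

```python
from collections import Counter

# Elemental compositions (assumed standard values, not stated in the article).
LYSINE = Counter({"C": 6, "H": 14, "N": 2, "O": 2})
# Free-acid form of 4-methylpyrroline-5-carboxylate.
METHYLPYRROLINE_CARBOXYLIC_ACID = Counter({"C": 6, "H": 9, "N": 1, "O": 2})
WATER = Counter({"H": 2, "O": 1})


def condense(amine, acid):
    """Return the composition of the amide formed from an amine and an acid.

    Amide bond formation is a condensation, so one H2O is subtracted.
    """
    return amine + acid - WATER


def hill_formula(counts):
    """Format a composition with carbon first, hydrogen second, then the rest."""
    order = ["C", "H"] + sorted(e for e in counts if e not in ("C", "H"))
    parts = []
    for element in order:
        n = counts[element]
        if n > 0:
            parts.append(element if n == 1 else f"{element}{n}")
    return "".join(parts)


PYRROLYSINE = condense(LYSINE, METHYLPYRROLINE_CARBOXYLIC_ACID)
print(hill_formula(PYRROLYSINE))  # C12H21N3O3
```

The result, C12H21N3O3, is the composition of the free amino acid; within a protein chain each residue loses a further water to the peptide bonds.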
Catalytic function
The extra pyrroline ring is incorporated into the active site of several methyltransferases, where it is thought to rotate relatively freely. The ring is believed to position and display the methyl group of methylamine for attack by a corrinoid cofactor. In the proposed model, a nearby carboxylic acid-bearing residue, glutamate, becomes protonated, and the proton can then be transferred to the imine ring nitrogen, exposing the adjacent ring carbon to nucleophilic addition by methylamine. The positively charged nitrogen created by this interaction may then interact with the deprotonated glutamate, causing a shift in ring orientation and exposing the methyl group derived from the methylamine to the binding cleft, where it can interact with the corrinoid. In this way a net CH3+ is transferred to the cofactor's cobalt atom, with a change of oxidation state from I to III. The methylamine-derived ammonia is then released, restoring the original imine.
Genetic coding
Unlike posttranslational modifications of lysine such as hydroxylysine, methyllysine, and hypusine, pyrrolysine is incorporated during translation (protein synthesis) as directed by the genetic code, just like the 20 standard amino acids. It is encoded in mRNA by the UAG codon, which in most organisms is the 'amber' stop codon. This requires only the presence of the pylT gene, which encodes an unusual transfer RNA (tRNA) with a CUA anticodon, and the pylS gene, which encodes a class II aminoacyl-tRNA synthetase that charges the pylT-derived tRNA with pyrrolysine. The UAG codon is followed by a PYLIS downstream sequence, which forms a stem-loop structure.
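The effect of this recoding on how a sequence is read can be sketched in a few lines. The example below is a deliberately simplified model, not a description of the cellular mechanism: it treats every in-frame UAG as pyrrolysine (one-letter symbol O) when the flag `pyl_system` is set, and as a stop otherwise. The function names, the flag, and the example mRNA are all hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in U, C, A, G order; '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon):
    """Return the anticodon (written 5'->3') that pairs with a codon."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))


def translate(mrna, pyl_system=False):
    """Translate an mRNA from its first base until the first stop codon.

    With pyl_system=True, UAG is read as pyrrolysine ('O') instead of stop.
    This ignores any sequence context; it is a simplifying assumption.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UAG" and pyl_system:
            protein.append("O")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


example = "AUGGCUUAGAAAUAA"  # made-up sequence: AUG GCU UAG AAA UAA
print(anticodon("UAG"))                     # CUA
print(translate(example))                   # MA
print(translate(example, pyl_system=True))  # MAOK
```

Under the standard code the example terminates at UAG, while the recoded reading continues through it to the next stop codon, which is why assuming a standard code can mispredict protein length in these organisms.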
This novel tRNA-aaRS pair ("orthogonal pair") is independent of other synthetases and tRNAs in Escherichia coli, and further possesses some flexibility in the range of amino acids processed, making it an attractive tool for placing a wide range of functional chemical groups at arbitrarily specified locations in modified proteins. For example, the system provided one of two fluorophores incorporated site-specifically within calmodulin to allow the real-time examination of changes within the protein by FRET spectroscopy, and has been used for site-specific introduction of a photocaged lysine derivative. (See Expanded genetic code.)
Evolution
The pylT and pylS genes are part of an operon of Methanosarcina barkeri, with homologues in other sequenced members of the Methanosarcinaceae family: M. acetivorans, M. mazei, and M. thermophila. Genes known to encode pyrrolysine-containing proteins include monomethylamine methyltransferase (mtmB), dimethylamine methyltransferase (mtbB), and trimethylamine methyltransferase (mttB). Homologs of pylS and pylT have also been found in an Antarctic archaeon, Methanococcoides burtonii, and a Gram-positive bacterium, Desulfitobacterium hafniense.
The occurrence in Desulfitobacterium is of special interest, because bacteria and archaea are separate domains in the three-domain system by which living things are classified. When use of the amino acid appeared confined to the Methanosarcinaceae, the system was described as a "late archaeal invention" by which a 21st amino acid was added to the genetic code. Afterward it was concluded that "PylRS was already present in the last universal common ancestor" some 3 billion years ago, but it only persisted in organisms using methylamines as energy sources. Another possibility is that evolution of the system involved a horizontal gene transfer between unrelated microorganisms. The other genes of the Pyl operon mediate pyrrolysine biosynthesis, leading to description of the operon as a "natural genetic code expansion cassette".
Some differences exist between the bacterial and archaeal systems studied. In D. hafniense, the pylS homolog is split into two separate proteins. Most notably, the UAG codon appears to act as a stop codon in many of that organism's proteins, with only a single established use in coding pyrrolysine. By contrast, in methanogenic archaea it was not possible to identify any unambiguous UAG stop signal. Because there was only one known site where pyrrolysine is added in D. hafniense, it was not possible to determine whether some additional sequence feature, analogous to the SECIS element for selenocysteine incorporation, might control when pyrrolysine is added. It was previously proposed that a specific downstream sequence, "PYLIS", forming a stem-loop in the mRNA, forced the incorporation of pyrrolysine instead of terminating translation in methanogenic archaea. However, the PYLIS model has lost favor in view of the lack of structural homology between PYLIS elements and the lack of UAG stops in those species.
Potential for an alternate translation
The tRNA(CUA) can be charged with lysine in vitro by the concerted action of the M. barkeri Class I and Class II lysyl-tRNA synthetases, which do not recognize pyrrolysine. Charging a tRNA(CUA) with lysine was originally hypothesized to be the first step in translating UAG amber codons as pyrrolysine, a mechanism analogous to that used for selenocysteine. More recent data favor direct charging of pyrrolysine onto the tRNA(CUA) by the protein product of the pylS gene, leading to the suggestion that the LysRS1:LysRS2 complex may participate in a parallel pathway designed to ensure that proteins containing the UAG codon can be fully translated using lysine as a substitute amino acid in the event of pyrrolysine deficiency. Further study found that the genes encoding LysRS1 and LysRS2 are not required for normal growth on methanol and methylamines with normal methyltransferase levels, and they cannot replace pylS in a recombinant system for UAG amber stop codon suppression.
See also
- Genetic code
- Translation
- Selenocysteine, the 21st genetically encoded amino acid