Protein mass spectrometry
Protein mass spectrometry refers to the application of mass spectrometry to the study of proteins. Mass spectrometry is an important emerging method for the characterization of proteins. The two primary methods for ionization of whole proteins are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). In keeping with the performance and mass range of available mass spectrometers, two approaches are used for characterizing proteins. In the first, intact proteins are ionized by either of the two techniques described above and then introduced to a mass analyzer. This approach is referred to as the "top-down" strategy of protein analysis. In the second, proteins are enzymatically digested into smaller peptides using a protease such as trypsin. These peptides are then introduced into the mass spectrometer and identified by peptide mass fingerprinting or tandem mass spectrometry. Hence, this latter approach (also called "bottom-up" proteomics) uses identification at the peptide level to infer the existence of proteins.

Whole protein mass analysis is primarily conducted using either time-of-flight (TOF) MS or Fourier transform ion cyclotron resonance (FT-ICR). These two types of instrument are preferable here because of their wide mass range and, in the case of FT-ICR, its high mass accuracy. Mass analysis of proteolytic peptides is a much more popular method of protein characterization, as cheaper instrument designs can be used. Additionally, sample preparation is easier once whole proteins have been digested into smaller peptide fragments. The most widely used instruments for peptide mass analysis are MALDI time-of-flight instruments, as they permit the acquisition of peptide mass fingerprints (PMFs) at a high pace (one PMF can be analyzed in approximately 10 seconds). Multiple-stage quadrupole time-of-flight and quadrupole ion trap instruments also find use in this application.
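
The digestion step of the bottom-up approach described above can be imitated in software ("in silico" digestion). The following is a minimal sketch, not a reference implementation. It assumes the commonly used simplified rule that trypsin cleaves on the carboxyl side of lysine (K) or arginine (R) unless the next residue is proline (P). The function name tryptic_digest and the missed_cleavages parameter are illustrative choices, not part of the article.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Split a protein sequence at simplified tryptic cleavage sites.

    Assumed rule: cleave after K or R, except when followed by P.
    missed_cleavages allows neighbouring pieces to stay joined.
    """
    pieces = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing piece that does not end in K or R (the C-terminus).
        pieces.append(sequence[start:])

    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


if __name__ == "__main__":
    # The K at position 3 is followed by P, so it is not cleaved.
    print(tryptic_digest("AAKPGGRCCKDD"))
    # prints ['AAKPGGR', 'CCK', 'DD']
```

Real digests are less tidy than this rule suggests, which is why search tools usually allow one or two missed cleavages.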

Protein and peptide fractionation coupled with mass spectrometry

Proteins of interest to biological researchers are usually part of a very complex mixture of other proteins and molecules that co-exist in the biological medium. This presents two significant problems. First, the two ionization techniques used for large molecules only work well when the mixture contains roughly equal amounts of constituents, while in biological samples, different proteins tend to be present in widely differing amounts. If such a mixture is ionized using electrospray or MALDI, the more abundant species have a tendency to "drown" or suppress signals from less abundant ones. The second problem is that the mass spectrum from a complex mixture is very difficult to interpret because of the overwhelming number of mixture components. This is exacerbated by the fact that enzymatic digestion of a protein gives rise to a large number of peptide products.

To contend with these problems, two methods are widely used to fractionate proteins, or their peptide products from an enzymatic digestion. The first method fractionates whole proteins and is called two-dimensional gel electrophoresis. The second method, high performance liquid chromatography, is used to fractionate peptides after enzymatic digestion. In some situations, it may be necessary to combine both of these techniques.

Gel spots identified on a 2D gel are usually attributable to one protein. If the identity of the protein is desired, the method of in-gel digestion is usually applied, where the protein spot of interest is excised and digested proteolytically. The peptide masses resulting from the digestion can be determined by mass spectrometry using peptide mass fingerprinting. If this information does not allow unequivocal identification of the protein, its peptides can be subjected to tandem mass spectrometry for de novo sequencing.

Characterization of protein mixtures using HPLC/MS is also called shotgun proteomics or MudPIT (multidimensional protein identification technology). A peptide mixture that results from digestion of a protein mixture is fractionated by one or two steps of liquid chromatography. The eluent from the chromatography stage can be either directly introduced to the mass spectrometer through electrospray ionization, or laid down on a series of small spots for later mass analysis using MALDI.

Protein identification

There are two main ways MS is used to identify proteins. Peptide mass fingerprinting (mentioned in the previous section) uses the masses of proteolytic peptides as input to a search of a database of predicted masses that would arise from digestion of a list of known proteins. If a protein sequence in the reference list gives rise to a significant number of predicted masses that match the experimental values, there is some evidence that this protein was present in the original sample.
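
The matching step of peptide mass fingerprinting can be sketched as follows. This is a simplified illustration under stated assumptions: the database is a hypothetical dictionary mapping protein names to already digested peptides, the score is a plain count of observed masses that fall within an absolute tolerance of a predicted mass, and the names peptide_mass, count_matches and rank_proteins are invented for this example. Real search engines use statistical scoring rather than a raw count.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, peptides, tol=0.01):
    """Number of observed masses within tol Da of any predicted mass."""
    predicted = [peptide_mass(p) for p in peptides]
    return sum(any(abs(m - p) <= tol for p in predicted) for m in observed)


def rank_proteins(observed, database, tol=0.01):
    """Rank candidate proteins by how many observed masses they explain."""
    scores = {name: count_matches(observed, peps, tol)
              for name, peps in database.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical two-protein database of pre-digested peptides.
    database = {"protA": ["AAK", "VLR"], "protB": ["CCK", "DDR"]}
    observed = [peptide_mass("AAK"), peptide_mass("VLR") + 0.005]
    print(rank_proteins(observed, database))
```

A protein that explains many observed masses ranks first, which mirrors the "significant number of matches" criterion described above.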

Tandem MS is becoming a more popular experimental method for identifying proteins. Collision-induced dissociation is used in mainstream applications to generate a set of fragments from a specific peptide ion. The fragmentation process primarily gives rise to cleavage products that break along peptide bonds. Because of this simplicity in fragmentation, it is possible to match the observed fragment masses against a database of predicted masses for one of many given peptide sequences. Tandem MS of whole protein ions has been investigated recently using electron capture dissociation and has demonstrated extensive sequence information in principle, but it is not in common practice. This is sometimes referred to as the "top-down" approach, in that it involves starting with the whole mass and then pulling it apart, rather than starting with pieces (proteolytic fragments) and piecing the protein back together using de novo repeat detection (bottom-up).
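
Because cleavage occurs mainly along peptide bonds, the expected fragment masses for a candidate sequence can be predicted by summing residue masses from either end. The sketch below assumes the conventional b-ion (N-terminal) and y-ion (C-terminal) nomenclature, which the text above does not name, and computes singly charged m/z values only. The function name fragment_ions is illustrative.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i-1] holds the first i residues plus a proton;
    y_ions[i-1] holds the last i residues plus water and a proton.
    """
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")
    for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{i} = {b_mz:.4f}   y{i} = {y_mz:.4f}")
```

A predicted list of this kind is what a database search compares against the observed fragment spectrum of a peptide ion.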

De novo (peptide) sequencing

De novo (peptide) sequencing for mass spectrometry is typically performed without prior knowledge of the amino acid sequence. It is the process of assigning amino acids from the peptide fragment masses of a protein. De novo sequencing has proven successful for confirming and expanding upon results from database searches.

As de novo sequencing is based on mass and some amino acids have identical masses (e.g. leucine and isoleucine), accurate manual sequencing can be difficult. Therefore it may be necessary to use a sequence homology search application that works in tandem with a database search and de novo sequencing to address this inherent limitation.
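
The mass ambiguity described above shows up directly when a fragment ladder is read residue by residue. The following sketch assumes an idealized, complete ladder of singly charged ions of one series and a fixed tolerance of 0.01 Da. The helper names residues_for_delta and read_ladder are hypothetical. Each gap between neighbouring ions is looked up in a table of residue masses, and every residue that fits is reported.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residues_for_delta(delta, tol=0.01):
    """All residues whose mass lies within tol Da of a ladder gap."""
    return sorted(aa for aa, mass in MONO.items() if abs(mass - delta) <= tol)


def read_ladder(ion_mz, tol=0.01):
    """Assign residues to the gaps between consecutive fragment ions.

    Ambiguous gaps are reported as 'X/Y'; unexplained gaps as '?'.
    """
    ordered = sorted(ion_mz)
    calls = []
    for low, high in zip(ordered, ordered[1:]):
        candidates = residues_for_delta(high - low, tol)
        calls.append("/".join(candidates) if candidates else "?")
    return calls


if __name__ == "__main__":
    # Idealized ladder with gaps of 113.08406 and 71.03711 Da.
    print(read_ladder([58.02874, 171.11280, 242.14991]))
    # prints ['I/L', 'A']
```

The 'I/L' call cannot be resolved from mass alone, which is why a homology or database search is often combined with de novo results.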

Database searching has the advantage of quickly identifying sequences, provided they have already been documented in a database. Inherent limitations of database searching include:
  • Sequence modifications/mutations: some database searches do not adequately account for alterations to the 'documented' sequence and thus can miss valuable information.
  • The unknown: if a sequence is not documented, it will not be found.
  • False positives.
  • Incomplete and corrupted data: a common, unnoticed problem.


An annotated peptide spectral library can also be used as a reference for protein/peptide identification. It offers the unique strength of reduced search space and increased specificity. The limitations include the following:
  • Spectra not included in the library will not be identified.
  • Spectra collected from different types of mass spectrometers can have quite distinct features.
  • Reference spectra in the library may contain noise peaks, which may lead to false positive identifications.
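
Library-based identification compares a query spectrum with stored reference spectra. The sketch below uses cosine similarity on unit-width m/z bins as the comparison. This is an assumption chosen for illustration, since the text above does not specify a scoring function. The library is a hypothetical dictionary of peptide names to peak lists, and all function names are invented for this example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Cosine of the angle between two binned spectra (0.0 to 1.0)."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_library_match(query, library, bin_width=1.0):
    """Return (name, score) of the most similar reference spectrum."""
    scored = ((name, cosine_similarity(query, ref, bin_width))
              for name, ref in library.items())
    return max(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical library; peaks are (m/z, intensity) pairs.
    library = {
        "peptide_1": [(175.1, 50.0), (288.2, 100.0), (401.3, 30.0)],
        "peptide_2": [(147.1, 80.0), (260.2, 40.0), (389.2, 90.0)],
    }
    query = [(175.1, 45.0), (288.2, 95.0), (401.3, 35.0), (500.0, 5.0)]
    print(best_library_match(query, library))
```

The small extra peak at m/z 500 in the query lowers the score slightly, which illustrates how noise peaks on either side can distort a match.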

Software

A number of different algorithmic approaches have been described to identify peptides and proteins from tandem mass spectrometry (MS/MS), peptide de novo sequencing and sequence tag-based searching.

Protein quantitation

Several recent methods allow for the quantitation of proteins by mass spectrometry (quantitative proteomics). Typically, stable (i.e. non-radioactive) heavier isotopes of carbon (13C) or nitrogen (15N) are incorporated into one sample while the other one is labeled with corresponding light isotopes (e.g. 12C and 14N). The two samples are mixed before the analysis. Peptides derived from the different samples can be distinguished due to their mass difference. The ratio of their peak intensities corresponds to the relative abundance ratio of the peptides (and proteins). The most popular methods for isotope labeling are SILAC (stable isotope labeling by amino acids in cell culture), trypsin-catalyzed 18O labeling, ICAT (isotope coded affinity tagging), and iTRAQ (isobaric tags for relative and absolute quantitation).

"Semi-quantitative" mass spectrometry can be performed without labeling of samples. Typically, this is done with MALDI analysis (in linear mode). The peak intensity, or the peak area, from individual molecules (typically proteins) is here correlated to the amount of protein in the sample. However, the individual signal depends on the primary structure of the protein, on the complexity of the sample, and on the settings of the instrument. Other types of "label-free" quantitative mass spectrometry use the spectral counts (or peptide counts) of digested proteins as a means for determining relative protein amounts.
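
For the isotope-labeling methods described in this section, the relative abundance follows from the ratio of heavy to light peak intensities. The sketch below is a simplified illustration: the input is a hypothetical list of (light, heavy) intensity pairs for the peptides of one protein, and summarizing them by the median is an assumption made here for robustness against outliers, not a method prescribed by the text.

```python
import math
from statistics import median


def peptide_ratios(pairs):
    """Heavy/light intensity ratio for each peptide pair.

    Pairs with a missing or zero intensity are skipped.
    """
    return [heavy / light for light, heavy in pairs
            if light > 0 and heavy > 0]


def protein_log2_ratio(pairs):
    """Log2 of the median peptide ratio, or None if no usable pairs."""
    ratios = peptide_ratios(pairs)
    if not ratios:
        return None
    return math.log2(median(ratios))


if __name__ == "__main__":
    # Hypothetical (light, heavy) peak intensities for three peptides.
    pairs = [(100.0, 200.0), (50.0, 100.0), (10.0, 40.0)]
    print(peptide_ratios(pairs))      # prints [2.0, 2.0, 4.0]
    print(protein_log2_ratio(pairs))  # prints 1.0
```

A log2 ratio of 1.0 means the protein is about twice as abundant in the heavy-labeled sample as in the light one.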

Protein structure

Characteristics indicative of the 3-dimensional structure of proteins can be probed with mass spectrometry in various ways. By using chemical crosslinking to couple parts of the protein that are close in space, but far apart in sequence, information about the overall structure can be inferred. By following the exchange of amide protons with deuterium from the solvent, it is possible to probe the solvent accessibility of various parts of the protein. Another interesting avenue in protein structural studies is laser-induced covalent labeling. In this technique, solvent-exposed sites of the protein are modified by hydroxyl radicals. Its combination with rapid mixing has been used in protein folding studies.

Biomarkers

The FDA defines a biomarker as, “A characteristic that is objectively measured and evaluated as an indicator of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention”. Mass spectrometry enables large-scale discovery of candidates for biomarkers.

Proteogenomics

In what is now commonly referred to as proteogenomics, proteomic technologies such as mass spectrometry are used for improving gene and protein annotations. Parallel analysis of the genome and the proteome facilitates discovery of post-translational modifications and proteolytic events, especially when comparing multiple species.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.