Peptide Spectral Library
A peptide spectral library is a curated, annotated, and non-redundant collection of LC-MS/MS peptide spectra. One essential use of a peptide spectral library is to provide consensus templates for identifying peptides and proteins, based on the correlation between those templates and experimental spectra. This identification process is called spectral library searching. Compared with the traditional approach to peptide spectrum identification, sequence database searching, spectral library searching offers several distinct benefits, which are described in the following sections.
Spectral libraries have been used to identify mass spectra of small molecules since the 1980s. In the early years of shotgun proteomics, pioneering investigations suggested that a similar approach might be applicable to peptide/protein identification. Only in recent years, however, with the availability of millions of confidently identified MS/MS spectra, has the peptide spectral library shown practical value.
Tandem MS and Shotgun Proteomics
Modern tandem MS instruments combine a fast duty cycle, exquisite sensitivity, and unprecedented mass accuracy, which makes them an ideal match for large-scale protein identification and quantification in complex biological systems. In a shotgun proteomics approach, proteins in a complex mixture are digested by proteolytic enzymes such as trypsin. One or more chromatographic separations are then applied to resolve the resulting peptides, which are ionized and analyzed in a mass spectrometer. To acquire a tandem mass spectrum, a particular peptide precursor is isolated and fragmented in the mass spectrometer, and the mass spectrum of the resulting fragments is recorded. Tandem mass spectra contain specific information about the sequence of the peptide precursor, which can aid peptide/protein identification.
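To illustrate how a tandem mass spectrum encodes sequence information, the following minimal sketch computes the m/z values of singly charged b- and y-type fragment ions for an unmodified peptide. The function name `fragment_ions` and the example peptide are assumptions made for illustration; the masses are standard monoisotopic values rounded to five decimals, and real spectra also contain other ion types, charge states, and neutral losses that this sketch ignores.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O (Da)
PROTON = 1.00728   # mass of a proton (Da)


def fragment_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z lists for an unmodified peptide.

    b_ions[i] is the N-terminal fragment of length i + 1;
    y_ions[i] is the C-terminal fragment of length i + 1.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        # b ion: N-terminal residues plus one proton.
        b_ions.append(sum(masses[:i]) + PROTON)
        # y ion: C-terminal residues plus water plus one proton.
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


if __name__ == "__main__":
    # "PEPTIDE" is an arbitrary example sequence, not taken from the text.
    b, y = fragment_ions("PEPTIDE")
    for n, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
        print(f"b{n} = {b_mz:.4f}   y{n} = {y_mz:.4f}")
```

The differences between consecutive b ions (or consecutive y ions) equal residue masses, which is why a fragment ladder can reveal the order of amino acids in the precursor.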
Protein Identification via Sequence Database Searching
Sequence database searching is currently widely used for mass-spectrum-based protein identification. In this approach, a protein sequence database is used to compute all putative peptide candidates under the given settings (proteolytic enzyme, missed cleavages, post-translational modifications). Sequence search engines use various heuristics to predict the fragmentation pattern of each peptide candidate. These predicted patterns serve as templates, and a sufficiently close match between a template and an experimental mass spectrum is the basis for peptide/protein identification. Many tools have been developed for this purpose and have supported many discoveries, e.g. SEQUEST and Mascot.
Shortcomings of the Sequence Database Searching Workflow
Due to the complex nature of peptide fragmentation in a mass spectrometer, predicted fragmentation patterns fall short of reproducing experimental mass spectra, especially the relative intensities of distinct fragments. Sequence database searching therefore faces a bottleneck of limited specificity. It also demands a vast search space, which still cannot cover every possible peptide form, such as those carrying post-translational modifications (PTMs), and so its efficiency is limited. The search is sometimes painfully slow and requires costly high-performance computers. In addition, the nature of sequence database searching disconnects the research discoveries made by different groups or at different times.
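The growth of the search space can be made concrete with a small sketch of in-silico digestion. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P), a configurable number of missed cleavages, and variable modifications that are either present or absent at each eligible residue. The function names, the toy protein sequence, and the choice of modifiable residues are illustrative assumptions, not the behavior of any particular search engine.

```python
import re


def tryptic_peptides(protein, max_missed=2, min_len=6):
    """Enumerate tryptic peptides, allowing up to max_missed missed cleavages."""
    # Zero-width split: after K or R, but not before P (requires Python 3.7+).
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = set()
    for i in range(len(pieces)):
        # Join 1..(max_missed + 1) consecutive pieces starting at piece i.
        for j in range(i, min(i + max_missed + 1, len(pieces))):
            pep = "".join(pieces[i:j + 1])
            if len(pep) >= min_len:
                peptides.add(pep)
    return peptides


def modified_form_count(peptide, variable_sites="STYM"):
    """Count peptide forms when each eligible residue may or may not be modified.

    Assumes one on/off modification per eligible residue, e.g. phosphorylation
    on S/T/Y and oxidation on M, giving 2 ** (number of sites) forms.
    """
    sites = sum(peptide.count(aa) for aa in variable_sites)
    return 2 ** sites


if __name__ == "__main__":
    toy_protein = "AAKPSTRMSYGGKTTSSR"  # made-up sequence for illustration
    for missed in (0, 1, 2):
        peps = tryptic_peptides(toy_protein, max_missed=missed, min_len=1)
        forms = sum(modified_form_count(p) for p in peps)
        print(f"missed cleavages <= {missed}: "
              f"{len(peps)} peptides, {forms} candidate forms")
```

Even for this short toy sequence, the number of candidate forms rises quickly as missed cleavages and variable modifications are allowed, which is the effect that makes whole-proteome searches expensive.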
Advantages and Limitations of Spectral Library Searching
First, the much smaller search space reduces search time. Second, by taking full advantage of all spectral features, including relative fragment intensities, neutral losses from fragments, and various additional specific fragments, spectral library searching is more specific and generally provides better discrimination between true and false matches.
Spectral library searching is not applicable when the goal is to discover novel peptides or proteins. Fortunately, the scientific community is collectively acquiring more and more high-quality mass spectra, which will continue to expand the coverage of peptide spectral libraries.
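The intensity-aware matching described in this section can be sketched as a normalized dot product between a query spectrum and library spectra whose precursor m/z lies within a tolerance. This is only one common style of similarity score, shown under stated assumptions: the unit-width binning, the square-root intensity scaling, the dictionary layout of library entries, and the names `dot_score` and `best_match` are all hypothetical choices for illustration, and the example spectra are invented.

```python
import math
from collections import defaultdict


def binned(peaks, bin_width=1.0):
    """Sum square-root-scaled intensities into fixed-width m/z bins."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += math.sqrt(intensity)
    return vec


def dot_score(query_peaks, library_peaks, bin_width=1.0):
    """Normalized dot product in [0, 1]; 1 means identical binned spectra."""
    q = binned(query_peaks, bin_width)
    lib = binned(library_peaks, bin_width)
    shared = sum(q[k] * lib[k] for k in q if k in lib)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in lib.values())))
    return shared / norm if norm else 0.0


def best_match(query_peaks, query_precursor_mz, library, tol=2.0):
    """Return (peptide, score) of the best library entry, or None.

    Only entries whose precursor m/z is within tol of the query are scored,
    which is what keeps the search space small.
    """
    best = None
    for entry in library:
        if abs(entry["precursor_mz"] - query_precursor_mz) > tol:
            continue
        score = dot_score(query_peaks, entry["peaks"])
        if best is None or score > best[1]:
            best = (entry["peptide"], score)
    return best


if __name__ == "__main__":
    # Invented library entries and query, for illustration only.
    library = [
        {"peptide": "PEPTIDEK", "precursor_mz": 464.7,
         "peaks": [(175.1, 100.0), (304.2, 50.0), (401.2, 25.0)]},
        {"peptide": "ANOTHERK", "precursor_mz": 464.9,
         "peaks": [(147.1, 100.0), (260.2, 80.0)]},
    ]
    query = [(175.1, 90.0), (304.2, 60.0), (401.2, 20.0)]
    print(best_match(query, 464.7, library))
```

Because the library templates carry measured relative intensities, a query that shares peak positions but not the intensity pattern scores lower than one that shares both, which is the source of the improved discrimination.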