MEROPS database
MEROPS is an on-line database for peptidases (also known as proteases) and their inhibitors. The classification scheme for peptidases was published by Rawlings & Barrett in 1993, and that for protein inhibitors by Rawlings et al. in 2004.
Overview
The classification is based on similarities at the tertiary and primary structural levels. Comparisons are restricted to that part of the sequence directly involved in the reaction, which in the case of a peptidase must include the active site, and for a protein inhibitor the reactive site.
The classification is hierarchical: sequences are assembled into families, and families are assembled into clans. A family is assembled around a type example, the sequence of a well-characterized peptidase or inhibitor. All other sequences in the family must be related to the family type example, either directly or through a transitive relationship involving one or more sequences already shown to be family members. Typically, FastA or BlastP are used to establish sequence relationships, with an expect value of 0.001 or lower taken to be statistically significant.
A clan is also assembled around a type example, this being the structure of a well-characterized peptidase or inhibitor. A family is included in a clan if the tertiary structure of a family member can be shown to be related to that of the clan type example. Typically, DALI is used to establish clan membership, with a z score of 6.00 standard deviation units or above considered to be statistically significant. For peptidases, other evidence to indicate that families are related when a tertiary structure is absent includes the same order of catalytic residues in the sequences.
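The family and clan rules described above can be sketched in a few lines of Python. This is only an illustration, not the MEROPS pipeline. The function names, the input layout, and the example scores are hypothetical. Only the two cutoffs (an expect value of 0.001 and a z score of 6.00) come from the text. The sketch assumes the pairwise expect values and z scores have already been computed elsewhere, for example by FastA, BlastP, or DALI.

```python
from collections import deque

E_VALUE_CUTOFF = 0.001  # from the text: expect value of 0.001 or lower
Z_SCORE_CUTOFF = 6.0    # from the text: z score of 6.00 or above


def assemble_family(type_example, e_values):
    """Return all sequences related to the type example, directly or transitively.

    e_values is a hypothetical precomputed mapping from a pair of sequence
    names (a, b) to the expect value reported for that comparison.
    """
    # Build an undirected graph that keeps only significant relationships.
    neighbours = {}
    for (a, b), e_value in e_values.items():
        if e_value <= E_VALUE_CUTOFF:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    # Breadth-first search outward from the type example, so a sequence
    # can join through any sequence already shown to be a family member.
    family = {type_example}
    queue = deque([type_example])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other not in family:
                family.add(other)
                queue.append(other)
    return family


def family_in_clan(z_scores_vs_clan_type):
    """True if any family member's structure meets the z score cutoff
    when compared with the clan type example."""
    return any(z >= Z_SCORE_CUTOFF for z in z_scores_vs_clan_type)


# Made-up scores: seqB is not significantly similar to the type example
# itself, but joins the family through seqA. seqC has no significant link.
example_e_values = {
    ("type", "seqA"): 1e-30,
    ("seqA", "seqB"): 5e-4,
    ("type", "seqB"): 0.2,
    ("type", "seqC"): 0.5,
}
example_family = assemble_family("type", example_e_values)
```

In this example `seqB` is admitted only through the transitive link via `seqA`, which is the case the family rule is designed to cover. The sketch leaves out the restriction of comparisons to the part of the sequence involved in the reaction, and the catalytic-residue-order evidence used when no tertiary structure is available.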