TopFIND
TopFIND, the Termini oriented protein Function Inferred Database, is an integrated knowledgebase focused on protein termini, their formation by proteases, and the functional implications. It contains information about the processing and the processing state of proteins, and the functional implications thereof, derived from research literature, contributions by the scientific community, and biological databases.
Background
Among the most fundamental characteristics of a protein are the N- and C-termini, which define the start and end of the polypeptide chain. While genetically encoded, protein termini isoforms are also often generated during translation. After translation, termini are highly dynamic and are frequently trimmed at their ends by a large array of exopeptidases. Neo-termini can also be generated by endopeptidases through precise and limited proteolysis, termed processing. Necessary for the maturation of many proteins, processing can also occur afterwards, often with dramatic functional consequences. Aberrant proteolysis can cause a wide range of diseases, such as arthritis or cancer. Hence the proteolytic generation of pleiotropic stable forms of proteins, the universal susceptibility of proteins to proteolysis, and its irreversibility distinguish proteolysis from many highly studied posttranslational modifications.
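How an endopeptidase cut gives rise to neo-termini can be illustrated with a minimal Python sketch. The `Fragment` class, the `cleave` function, and the example sequence are made up for illustration and are not part of TopFIND; positions are 1-based, and the cut is assumed to fall directly after the residue at `p1_position`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Fragment:
    """A stretch of the parent chain, using 1-based inclusive positions."""
    start: int      # position of the fragment's N-terminal residue
    end: int        # position of the fragment's C-terminal residue
    sequence: str


def cleave(sequence: str, p1_position: int) -> Tuple[Fragment, Fragment]:
    """Cut the chain between residue p1_position and p1_position + 1."""
    if not 1 <= p1_position < len(sequence):
        raise ValueError("cleavage must fall between two residues")
    n_fragment = Fragment(1, p1_position, sequence[:p1_position])
    c_fragment = Fragment(p1_position + 1, len(sequence), sequence[p1_position:])
    return n_fragment, c_fragment


# Hypothetical 16-residue sequence, cut after residue 6.
parent = "MKTAYIAKQRQISFVK"
left, right = cleave(parent, 6)

# left.end is the neo-C-terminus; right.start is the neo-N-terminus.
print(left)
print(right)
```

The original N-terminus (position 1) and C-terminus (the last residue) are unchanged; the cut adds one new C-terminus and one new N-terminus, which is the kind of positional record a termini-oriented resource needs to store.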
Knowledgebase content
TopFIND is a resource for comprehensive coverage of protein N- and C-termini discovered by all available in silico, in vitro, and in vivo methodologies. It builds on existing knowledge by integrating data from UniProt and MEROPS, and provides access to new data from community submissions and manual literature curation. It makes modifications of protein termini, such as acetylation and citrullination, easily accessible and searchable, and provides the means to identify and analyse the extent and distribution of terminal modifications across a protein.
Data access
The data are presented with a strong emphasis on the relation to curated background information and the underlying evidence that led to the observation of a terminus, its modification, or a proteolytic cleavage. In brief, each protein's information, domain structure, termini, terminus modifications, and proteolytic processing of and by other proteins are listed. All information is accompanied by metadata such as its original source, method of identification, confidence measurement, or related publication. A positional cross-correlation evaluation matches termini and cleavage sites with protein features (such as amino acid variants) and domains to highlight potential effects and dependencies. A network view of all proteins, showing their functional dependency as protease, substrate, or protease inhibitor and tied in with protein interactions, is also provided for evaluating network-wide effects. A filtering mechanism lets users restrict the presented data by parameters such as methodology used, in vivo relevance, confidence, or data source (e.g. limited to a single laboratory or publication). This provides the means to assess physiologically relevant data and to deduce functional information and hypotheses relevant to the bench scientist.
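The positional matching and evidence filtering described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, field names, method labels, and confidence values below are hypothetical and do not reflect TopFIND's actual schema or interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    """A protein feature or domain with 1-based inclusive bounds (hypothetical)."""
    name: str
    start: int
    end: int


@dataclass(frozen=True)
class CleavageEvidence:
    """One observed cut after residue p1_position, with hypothetical metadata."""
    p1_position: int
    method: str
    in_vivo: bool
    confidence: float


def split_features(site: CleavageEvidence, features: List[Feature]) -> List[str]:
    """Names of features that a cut after p1_position would divide in two."""
    return [f.name for f in features if f.start <= site.p1_position < f.end]


def filter_evidence(evidence: List[CleavageEvidence],
                    min_confidence: float = 0.0,
                    in_vivo_only: bool = False) -> List[CleavageEvidence]:
    """Keep evidence meeting the confidence and in vivo requirements."""
    return [e for e in evidence
            if e.confidence >= min_confidence and (e.in_vivo or not in_vivo_only)]


# Made-up annotations and observations for a single protein.
features = [Feature("signal peptide", 1, 20), Feature("catalytic domain", 45, 210)]
evidence = [
    CleavageEvidence(20, "in vitro assay", False, 0.4),
    CleavageEvidence(100, "mass spectrometry", True, 0.9),
]

for site in filter_evidence(evidence, in_vivo_only=True):
    print(site.p1_position, split_features(site, features))
```

A cut after residue 20 falls exactly on the signal peptide boundary and therefore splits no feature, whereas a cut after residue 100 lands inside the catalytic domain. Distinguishing these two cases is the point of matching cleavage positions against feature boundaries.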
See also
- MEROPS
- UniProt
- Cytoscape
- Computational genomics
- Metabolic network modelling
- Protein-protein interaction prediction