CATH
The CATH Protein Structure Classification is a semi-automatic, hierarchical classification of protein domains published in 1997 by Christine Orengo, Janet Thornton and their colleagues.

CATH shares many broad features with its principal rival, SCOP (the Structural Classification of Proteins database, a largely manual classification of protein structural domains based on similarities of their structures and amino acid sequences); however, there are also many areas in which the detailed classification differs greatly.

Hierarchy

The name CATH is an acronym of the four main levels of the classification hierarchy, which are as follows:
#  Level                   Description
1  Class                   the overall secondary-structure content of the domain
2  Architecture            a large-scale grouping of topologies which share particular structural features
3  Topology                high structural similarity but no evidence of homology; equivalent to a fold in SCOP
4  Homologous superfamily  indicative of a demonstrable evolutionary relationship; equivalent to the superfamily level of SCOP
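
As a minimal sketch of how the four levels nest, the snippet below models a domain's classification as an ordered set of labels and reports the deepest level that two domains share. The Classification type, the deepest_shared_level helper and the example labels are illustrative assumptions, not part of CATH's own software or identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# The four main levels, in order from broadest to most specific.
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


@dataclass(frozen=True)
class Classification:
    """Hypothetical container for one domain's label at each level."""
    cls: str
    architecture: str
    topology: str
    superfamily: str

    def as_tuple(self) -> tuple:
        return (self.cls, self.architecture, self.topology, self.superfamily)


def acronym() -> str:
    """Join the first letter of each level name."""
    return "".join(name[0] for name in LEVELS)


def deepest_shared_level(a: Classification, b: Classification) -> Optional[str]:
    """Return the most specific level at which a and b still agree, or None."""
    shared = None
    for level, x, y in zip(LEVELS, a.as_tuple(), b.as_tuple()):
        if x != y:
            break
        shared = level
    return shared


# Made-up labels, for illustration only.
d1 = Classification("alpha and beta", "barrel", "TIM barrel", "superfamily X")
d2 = Classification("alpha and beta", "barrel", "TIM barrel", "superfamily Y")
d3 = Classification("mostly-beta", "sandwich", "fold Z", "superfamily W")

print(acronym())
print(deepest_shared_level(d1, d2))
print(deepest_shared_level(d1, d3))
```

Because the levels are strictly nested, the comparison stops at the first disagreement: two domains cannot share a superfamily without also sharing the class, architecture and topology above it.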


CATH defines four classes: mostly-alpha, mostly-beta, alpha and beta, and few secondary structures.

To understand the CATH classification system, it helps to know how it is constructed: much of the work is done by automatic methods, but there are also important manual elements.

The very first step is to separate the proteins into domains. It is difficult to produce an unequivocal definition of a domain and this is one area in which CATH and SCOP differ.

The domains are automatically sorted into classes and clustered on the basis of sequence similarity. These groups form the H (Homologous superfamily) level of the classification. The Topology level is formed by structural comparisons of the homologous groups. Finally, the Architecture level is assigned manually.
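
The sequence-clustering step can be pictured with a toy single-linkage clustering. This is only a sketch under stated assumptions: the identity measure (fraction of matching positions between pre-aligned, equal-length strings), the 0.85 cutoff and the sequences are all hypothetical, and this is not the procedure or threshold that CATH itself uses.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of identical positions; assumes pre-aligned, equal-length strings."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def single_linkage(seqs: dict, cutoff: float) -> list:
    """Group sequence ids so that any pair at or above `cutoff` ends up together."""
    parent = {name: name for name in seqs}

    def find(n: str) -> str:
        # Walk up to the cluster representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(seqs, 2):
        if identity(seqs[a], seqs[b]) >= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pre-aligned domain sequences (illustrative cutoff: 0.85).
domains = {
    "d1": "MKVLAAGIVG",
    "d2": "MKVLAAGIVA",  # 9/10 identical to d1
    "d3": "MKVLSAGIVA",  # 9/10 identical to d2, only 8/10 to d1
    "d4": "GSPEWTRDNQ",  # unrelated to the others
}

groups = single_linkage(domains, cutoff=0.85)
print(groups)
```

In this toy data d1 and d3 fall below the cutoff when compared directly, yet they land in the same group because each is close to d2. That chaining behaviour is a property of single linkage, chosen here only for brevity.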

Class-level classification is based on four criteria:
  1. Secondary structure content;
  2. Secondary structure contacts;
  3. Secondary structure alternation score; and
  4. Percentage of parallel strands.
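
Two of these criteria are simple ratios and can be sketched directly. The example below assumes a per-residue secondary-structure string using the DSSP letters H (alpha helix) and E (beta strand), and it interprets the percentage of parallel strands as the share of adjacent strand pairs that run parallel. The function names, that interpretation and the input data are hypothetical, and the cutoffs CATH applies to these values are not given here.

```python
def secondary_structure_content(ss: str) -> dict:
    """Return the fraction of residues in helix (H) and in strand (E)."""
    if not ss:
        raise ValueError("empty secondary-structure string")
    n = len(ss)
    return {"helix": ss.count("H") / n, "strand": ss.count("E") / n}


def percent_parallel(pairings: list) -> float:
    """Percentage of adjacent strand pairs that run parallel.

    `pairings` holds one boolean per strand pair: True for parallel,
    False for antiparallel. An empty list yields 0.0.
    """
    if not pairings:
        return 0.0
    return 100.0 * sum(pairings) / len(pairings)


# Hypothetical 20-residue domain; '-' marks residues in neither state.
ss = "--HHHHHHHH--EEEE--EE"
content = secondary_structure_content(ss)
print(content)

# Hypothetical strand pairings: three parallel, one antiparallel.
print(percent_parallel([True, True, True, False]))
```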


More detail on this process, and on the comparison between SCOP, CATH and FSSP, can be found in Hadley & Jones (1999) and Day et al. (2003).

Example

  • alpha domains only
  • beta domains only
  • alpha and beta
    • roll
    • TIM barrel
    • sandwich
      • beta lactamase

See also

  • SCOP (Structural Classification of Proteins)
  • FSSP (Families of Structurally Similar Proteins)

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 