RefSeq
The Reference Sequence (RefSeq) database is an open-access, annotated and curated collection of publicly available nucleotide sequences (DNA, RNA) and their protein products. The database is built by the National Center for Biotechnology Information (NCBI) and, unlike GenBank, provides only a single record for each natural biological molecule (i.e. DNA, RNA or protein) for major organisms ranging from viruses to bacteria to eukaryotes.

For each model organism, RefSeq aims to provide separate and linked records for the genomic DNA, the gene transcripts, and the proteins arising from those transcripts. RefSeq is limited to major organisms for which sufficient data are available (more than 16,000 distinct "named" organisms as of September 2011), while GenBank includes sequences for any organism submitted (approximately 250,000 different named organisms).
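The idea of separate but linked records can be pictured with a small data-structure sketch. It is illustrative only and does not describe how NCBI stores RefSeq internally: the class names, the `linked_accessions` helper, and accession strings such as "NC_000000" are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProteinRecord:
    accession: str  # hypothetical placeholder, e.g. "NP_000000"


@dataclass
class TranscriptRecord:
    accession: str  # hypothetical placeholder, e.g. "NM_000000"
    # A transcript links to the protein arising from it;
    # None models a transcript with no protein product.
    protein: Optional[ProteinRecord] = None


@dataclass
class GenomicRecord:
    accession: str  # hypothetical placeholder, e.g. "NC_000000"
    # A genomic record links to the transcripts of its genes.
    transcripts: List[TranscriptRecord] = field(default_factory=list)

    def linked_accessions(self) -> List[str]:
        """Walk the links: genomic DNA, then each transcript and its protein."""
        found = [self.accession]
        for transcript in self.transcripts:
            found.append(transcript.accession)
            if transcript.protein is not None:
                found.append(transcript.protein.accession)
        return found
```

Each molecule keeps its own record, and the links let a reader move from the genomic DNA to a transcript and on to its protein without merging the three into one entry.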

RefSeq categories

Category  Description
NC        Complete genomic molecules
NG        Incomplete genomic region
NM        mRNA
NR        ncRNA
NP        Protein
XM        mRNA still in the curation process
XP        Protein still in the curation process
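As a rough illustration of how the category codes can be used, the sketch below maps an accession-like string to its category. It assumes that the two-letter code is followed by an underscore and an identifier, which the table itself does not state. The function name `refseq_category` and the example strings are hypothetical placeholders, not real records.

```python
# Category codes and descriptions taken from the table above.
REFSEQ_CATEGORIES = {
    "NC": "Complete genomic molecules",
    "NG": "Incomplete genomic region",
    "NM": "mRNA",
    "NR": "ncRNA",
    "NP": "Protein",
    "XM": "mRNA still in the curation process",
    "XP": "Protein still in the curation process",
}


def refseq_category(accession: str) -> str:
    """Return the category description for an accession-like string.

    Assumption: the category code is the text before the first underscore,
    as in the made-up placeholder "NM_000000".
    """
    prefix, separator, identifier = accession.strip().upper().partition("_")
    if not separator or not identifier or prefix not in REFSEQ_CATEGORIES:
        raise ValueError(f"not a recognised RefSeq-style accession: {accession!r}")
    return REFSEQ_CATEGORIES[prefix]


if __name__ == "__main__":
    for example in ("NC_000000", "NM_000000", "XP_000000"):
        print(example, "->", refseq_category(example))
```

Strings without a known code, such as one with no underscore, raise `ValueError` rather than being guessed at.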

See also

  • GenBank
  • Sequence analysis
  • Sequence profiling tool
  • Sequence motif
  • UniProt
  • List of sequenced eukaryotic genomes
  • List of sequenced archaeal genomes

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 