GeneRIF
A GeneRIF, or Gene Reference Into Function, is a short (255 characters or fewer) statement about the function of a gene. GeneRIFs provide a simple mechanism for allowing scientists to add to the functional annotation of genes described in the Entrez Gene database. In practice, function is construed quite broadly. For example, there are GeneRIFs that discuss the role of a gene in a disease, GeneRIFs that point the viewer towards a review article about the gene, and GeneRIFs that discuss the structure of a gene. However, the stated intent is for GeneRIFs to be about gene function. Currently over half a million GeneRIFs have been created for genes from almost 1000 different species.
GeneRIFs are always associated with specific entries in the Entrez Gene database. Each GeneRIF has a pointer to the PubMed ID (a type of document identifier) of a scientific publication that provides evidence for the statement made by the GeneRIF. GeneRIFs are often extracted directly from the document that is identified by the PubMed ID, very frequently from its title or from its final sentence.
GeneRIFs are usually produced by NCBI indexers, but anyone may submit a GeneRIF.
To be processed, a valid Gene ID must exist for the specific gene, or the Gene staff must have assigned an overall Gene ID to the species. The latter case is implemented via records in Gene with the symbol NEWENTRY. Once the Gene ID is identified, only three types of information are required to complete a submission:
- a concise phrase describing a function or functions (less than 255 characters in length, preferably more than a restatement of the title of the paper);
- a published paper describing that function, implemented by supplying the PubMed ID of a citation in PubMed;
- a valid e-mail address (which will remain confidential).
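
The requirements above can be sketched as a small validation routine. This is only an illustration and not NCBI's actual submission code: the class name, field names, and the loose e-mail check are all assumptions, and the length limit is treated as "at most 255 characters" even though the text states the bound in two slightly different ways.

```python
from dataclasses import dataclass

# Assumption: the limit comes from the text; treating exactly 255 as valid is a guess.
MAX_GENERIF_LENGTH = 255


@dataclass
class GeneRIFSubmission:
    """Hypothetical container for the information a submission needs."""
    gene_id: int    # Gene ID identified before submitting
    text: str       # concise phrase describing the function
    pubmed_id: int  # PubMed ID of the paper supporting the statement
    email: str      # submitter's e-mail address


def validation_errors(sub: GeneRIFSubmission) -> list[str]:
    """Return a list of problems; an empty list means this sketch accepts the submission."""
    errors = []
    if sub.gene_id <= 0:
        errors.append("gene_id must be a positive integer")
    text = sub.text.strip()
    if not text:
        errors.append("text is empty")
    elif len(text) > MAX_GENERIF_LENGTH:
        errors.append(f"text is {len(text)} characters; limit is {MAX_GENERIF_LENGTH}")
    if sub.pubmed_id <= 0:
        errors.append("pubmed_id must be a positive integer")
    # Deliberately loose check (assumption): exactly one "@" with text on both sides.
    local, at, domain = sub.email.partition("@")
    if not (at and local and domain and "@" not in domain):
        errors.append("email does not look like an address")
    return errors


example = GeneRIFSubmission(
    gene_id=7157,  # human TP53
    text="Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein.",
    pubmed_id=1,   # placeholder value, not the real citation for this statement
    email="lewis@example.net",
)
print(validation_errors(example))  # prints []
```

Returning a list of problems instead of raising on the first one lets a submitter see every issue at once.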
Example
Here are some GeneRIFs taken from Entrez Gene for GeneID 7157, the human gene TP53. The PubMed document identifiers have been omitted from the examples. Note the wide variability with respect to the presence or absence of punctuation and of sentence-initial capital letters.
- p53 and c-erbB-2 may have independent role in carcinogenesis of gall bladder cancer
- Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein.
- p53 codon 72 alleles influence the response to anticancer drugs in cells from aged people by regulating the cell cycle inhibitor p21WAF1
- Logistic regression analysis showed p53 and COX-2 as dependent predictors in pancreatic carcinogenesis, and a reciprocal relationship to neoplastic progression between p53 and COX-2.
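
The variability in these four examples can be checked mechanically. The following is a minimal sketch, with a hypothetical helper name, that reports only two surface features: whether a GeneRIF begins with a capital letter and whether it ends with a period.

```python
# The four TP53 GeneRIFs quoted above, copied verbatim.
EXAMPLES = [
    "p53 and c-erbB-2 may have independent role in carcinogenesis of gall bladder cancer",
    "Degradation of endogenous HIPK2 depends on the presence of a functional p53 protein.",
    "p53 codon 72 alleles influence the response to anticancer drugs in cells from aged "
    "people by regulating the cell cycle inhibitor p21WAF1",
    "Logistic regression analysis showed p53 and COX-2 as dependent predictors in pancreatic "
    "carcinogenesis, and a reciprocal relationship to neoplastic progression between p53 and COX-2.",
]


def surface_features(text: str) -> dict[str, bool]:
    """Report capitalization and final punctuation for one GeneRIF (hypothetical helper)."""
    stripped = text.strip()
    return {
        "starts_with_capital": stripped[:1].isupper(),
        "ends_with_period": stripped.endswith("."),
    }


for rif in EXAMPLES:
    print(surface_features(rif))
```

The first and third examples come out False on both features, and the second and fourth come out True on both. The lowercase openings are the gene symbol p53, so a normalizer that blindly capitalized the first letter would corrupt the symbol.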
GeneRIFs are an unusual textual genre, and they have recently been the subject of a number of articles from the natural language processing community.