Functional genomics
Functional genomics is a field of molecular biology that attempts to make use of the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions. Unlike genomics and proteomics, functional genomics focuses on the dynamic aspects such as gene transcription, translation, and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. Functional genomics attempts to answer questions about the function of DNA at the levels of genes, RNA transcripts, and protein products. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional “gene-by-gene” approach.

Goals of functional genomics

The goal of functional genomics is to understand the relationship between an organism's genome and its phenotype. The term functional genomics is often used broadly to refer to the many possible approaches to understanding the properties and function of the entirety of an organism's genes and gene products. This definition is somewhat variable; Gibson and Muse define it as "approaches under development to ascertain the biochemical, cellular, and/or physiological properties of each and every gene product", while Pevsner includes the study of nongenic elements in his definition: "the genome-wide study of the function of DNA (including genes and nongenic elements), as well as the nucleic acid and protein products encoded by DNA". Because of its genome-wide approach, functional genomics requires the use of high-throughput technologies capable of assaying many functions or relationships simultaneously. Functional genomics involves studies of natural variation in genes, RNA, and proteins over time (such as an organism's development) or space (such as its body regions), as well as studies of natural or experimental functional disruptions affecting genes, chromosomes, RNAs, or proteins.

The promise of functional genomics is to expand and synthesize genomic and proteomic knowledge into an understanding of the dynamic properties of an organism at cellular and/or organismal levels. This would provide a more complete picture of how biological function arises from the information encoded in an organism's genome. The possibility of understanding how a particular mutation leads to a given phenotype has important implications for human genetic diseases, as answering these questions could point scientists in the direction of a treatment or cure.

Techniques and applications

Functional genomics includes function-related aspects of the genome itself such as mutation and polymorphism (such as single-nucleotide polymorphism, SNP) analysis, as well as measurement of molecular activities. The latter comprise a number of "-omics" such as transcriptomics (gene expression), proteomics (protein expression), phosphoproteomics (a subset of proteomics) and metabolomics. Functional genomics uses mostly multiplex techniques to measure the abundance of many or all gene products such as mRNAs or proteins within a biological sample. Together these measurement modalities endeavor to quantitate the various biological processes and improve our understanding of gene and protein functions and interactions.

Genetic interaction mapping

Systematic pairwise deletion of genes or inhibition of gene expression can be used to identify genes with related function, even if they do not interact physically. Epistasis refers to the fact that effects for two different gene knockouts may not be additive; that is, the phenotype that results when two genes are inhibited may be different from the sum of the effects of single knockouts.
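
The non-additivity described above can be written as a simple score. The following is a minimal sketch, assuming a single quantitative phenotype (for example growth rate) and an additive expectation for the double knockout; the function name and the measurements are hypothetical and are not taken from any particular study.

```python
def epistasis_score(wild_type, single_a, single_b, double):
    """Deviation of the double-knockout effect from the sum of single effects.

    Each argument is a phenotype measurement (e.g. growth rate).
    A score near zero suggests the two knockouts act additively.
    """
    effect_a = single_a - wild_type    # effect of knocking out gene A alone
    effect_b = single_b - wild_type    # effect of knocking out gene B alone
    effect_ab = double - wild_type     # effect of knocking out both genes
    return effect_ab - (effect_a + effect_b)


# Hypothetical growth-rate measurements in arbitrary units.
score = epistasis_score(wild_type=1.0, single_a=0.9, single_b=0.8, double=0.3)
print(round(score, 3))
```

A clearly negative score, as in this made-up case, would indicate that the double knockout is more severe than the two single knockouts predict. Other expectation models (such as a multiplicative one) are also used in practice.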

The ENCODE project

The ENCODE (Encyclopedia of DNA elements) project is an in-depth analysis of the human genome whose goal is to identify all the functional elements of genomic DNA, in both coding and noncoding regions. At the time of writing, only the pilot phase of the study has been completed, involving hundreds of assays performed on 44 regions of known or unknown function comprising 1% of the human genome. Important results include evidence from genomic tiling arrays that most nucleotides are transcribed as coding transcripts, noncoding RNAs, or random transcripts; the discovery of additional transcriptional regulatory sites; and further elucidation of chromatin-modifying mechanisms.

Microarrays

Microarrays measure the amount of mRNA in a sample that corresponds to a given gene or probe DNA sequence. Probe sequences are immobilized on a solid surface and allowed to hybridize with fluorescently labeled “target” mRNA. The intensity of fluorescence of a spot is proportional to the amount of target sequence that has hybridized to that spot, and therefore to the abundance of that mRNA sequence in the sample. Microarrays allow for identification of candidate genes involved in a given process based on variation between transcript levels for different conditions and shared expression patterns with genes of known function.
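
Comparing transcript levels between conditions is often done on a log scale. The sketch below is an illustration only: it assumes background-corrected spot intensities are already available per gene, and the gene names, intensities, pseudocount, and threshold are all hypothetical choices.

```python
import math


def log2_fold_changes(control, treated, pseudocount=1.0):
    """Log2 ratio of treated to control intensity for genes present in both."""
    changes = {}
    for gene in control:
        if gene in treated:
            ratio = (treated[gene] + pseudocount) / (control[gene] + pseudocount)
            changes[gene] = math.log2(ratio)
    return changes


def candidate_genes(changes, threshold=1.0):
    """Genes whose intensity changed at least 2**threshold-fold in either direction."""
    return sorted(g for g, c in changes.items() if abs(c) >= threshold)


# Hypothetical spot intensities (arbitrary fluorescence units).
control = {"geneA": 99.0, "geneB": 199.0, "geneC": 399.0}
treated = {"geneA": 399.0, "geneB": 209.0, "geneC": 99.0}

changes = log2_fold_changes(control, treated)
print(candidate_genes(changes))
```

Real analyses also normalize between arrays and test for statistical significance across replicates, which this sketch omits.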

SAGE

SAGE (Serial analysis of gene expression) is an alternate method of gene expression analysis based on RNA sequencing rather than hybridization. SAGE relies on the sequencing of 10–17 base pair tags which are unique to each gene. These tags are produced from poly-A mRNA and ligated end-to-end before sequencing. SAGE gives an unbiased measurement of the number of transcripts per cell, since it does not depend on prior knowledge of what transcripts to study (as microarrays do).
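
The counting step can be illustrated with a toy example. This is a simplified sketch: it assumes tags of one fixed length joined directly end-to-end, whereas real SAGE concatemers are parsed using restriction-enzyme recognition sites. The tag sequences and the tag-to-gene table are hypothetical.

```python
from collections import Counter


def count_tags(concatemer, tag_length=10):
    """Split a concatemer into fixed-length tags and count each tag."""
    usable = len(concatemer) - len(concatemer) % tag_length  # drop any partial tag
    tags = (concatemer[i:i + tag_length] for i in range(0, usable, tag_length))
    return Counter(tags)


# Hypothetical lookup table from tag sequence to gene.
tag_to_gene = {"ACGTACGTAC": "geneA", "TTGCATGCAA": "geneB"}

concatemer = "ACGTACGTAC" + "TTGCATGCAA" + "ACGTACGTAC" + "ACGTACGTAC"
counts = count_tags(concatemer)

# Sum tag counts per gene; tags without a known gene are grouped together.
expression = Counter()
for tag, n in counts.items():
    expression[tag_to_gene.get(tag, "unassigned")] += n
print(dict(expression))
```

The tag count per gene serves as a digital estimate of relative transcript abundance.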

Yeast two-hybrid system

A yeast two-hybrid (Y2H) screen tests a "bait" protein against many potential interacting proteins ("prey") to identify physical protein–protein interactions. This system is based on a transcription factor, originally GAL4, whose separate DNA-binding and transcription activation domains are both required in order for the protein to cause transcription of a reporter gene. In a Y2H screen, the "bait" protein is fused to the binding domain of GAL4, and a library of potential "prey" (interacting) proteins is recombinantly expressed in a vector with the activation domain. In vivo interaction of bait and prey proteins in a yeast cell brings the activation and binding domains of GAL4 close enough together to result in expression of a reporter gene. It is also possible to systematically test a library of bait proteins against a library of prey proteins to identify all possible interactions in a cell.

AP/MS

Affinity purification and mass spectrometry (AP/MS) is able to identify proteins that interact with one another in complexes. Complexes of proteins are allowed to form around a particular “bait” protein. The bait protein is captured using an antibody or a recombinant tag, which allows it to be extracted along with any proteins that have formed a complex with it. The proteins are then digested into short peptide fragments and mass spectrometry is used to identify the proteins based on the mass-to-charge ratios of those fragments.
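
To make the mass-to-charge idea concrete, the sketch below computes the theoretical m/z of a protonated peptide from its sequence. It is an illustration under stated assumptions: monoisotopic residue masses rounded to five decimals (check them against an authoritative table before any real use), no modifications, and a hypothetical peptide sequence. Identification software compares such theoretical values with observed spectra.

```python
# Monoisotopic residue masses in daltons, rounded; treat as approximate.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056    # one water per peptide (free N- and C-termini)
PROTON = 1.007276   # mass added per positive charge


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """Theoretical m/z of the peptide carrying `charge` extra protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


peptide = "GAVLK"  # hypothetical peptide fragment
for z in (1, 2):
    print(z, round(mass_to_charge(peptide, z), 4))
```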

Mutagenesis

Gene function can be investigated by systematically “knocking out” genes one by one. This is done by either deletion or disruption of function (such as by insertional mutagenesis) and the resulting organisms are screened for phenotypes that provide clues to the function of the disrupted gene.

RNAi

RNA interference (RNAi) methods can be used to transiently silence or knock down gene expression using ~20 base-pair double-stranded RNA typically delivered by transfection of synthetic ~20-mer short-interfering RNA molecules (siRNAs) or by virally encoded short-hairpin RNAs (shRNAs). RNAi screens, typically performed in cell culture-based assays or experimental organisms (such as C. elegans), can be used to systematically disrupt nearly every gene in a genome or subsets of genes (sub-genomes); possible functions of disrupted genes can be assigned based on observed phenotypes.

Genome annotation

Putative genes can be identified by scanning a genome for regions likely to encode proteins, based on characteristics such as long open reading frames, transcriptional initiation sequences, and polyadenylation sites. A sequence identified as a putative gene must be confirmed by further evidence, such as similarity to cDNA or EST sequences from the same organism, similarity of the predicted protein sequence to known proteins, association with promoter sequences, or evidence that mutating the sequence produces an observable phenotype.
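
Scanning for long open reading frames is the simplest of these criteria to show in code. The sketch below is a minimal illustration, assuming the standard start codon ATG and stop codons TAA, TAG, and TGA, and searching only the three forward-strand frames. The function name, the length cutoff, and the test sequence are hypothetical. A real gene finder would also scan the reverse strand and weigh the other signals mentioned above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=30):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon; `end` is exclusive and includes the stop codon. `min_codons`
    counts codons before the stop.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(sequence) - 2, 3):
            codon = sequence[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # resume scanning after the stop
    return orfs


# Toy sequence: ATG followed by four GCT codons and a TAA stop, in frame 2.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```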

Rosetta stone approach

The Rosetta stone approach is a computational method of de novo protein function prediction, based on the hypothesis that some proteins involved in a given physiological process may exist as two separate genes in one organism and as a single gene in another. Genomes are scanned for sequences that are independent in one organism and in a single open reading frame in another. If two genes have fused, it is predicted that they have similar biological functions that make such coregulation advantageous.
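
The scanning logic can be sketched on toy data. This is a simplified illustration that assumes homology has already been reduced to protein-family labels (in practice it would come from sequence similarity searches); the gene names, family labels, and function name are hypothetical.

```python
from itertools import combinations


def rosetta_pairs(separate_genes, fused_genes):
    """Predict functional links between genes that appear fused elsewhere.

    separate_genes: gene -> protein family, for the organism where the
                    genes are independent.
    fused_genes:    gene -> set of protein families found within a single
                    open reading frame in the other organism.
    Returns (gene_a, gene_b, fused_gene) triples.
    """
    predictions = []
    for gene_a, gene_b in combinations(sorted(separate_genes), 2):
        fam_a, fam_b = separate_genes[gene_a], separate_genes[gene_b]
        if fam_a == fam_b:
            continue  # same family twice is not evidence of a fusion
        for fused, families in sorted(fused_genes.items()):
            if fam_a in families and fam_b in families:
                predictions.append((gene_a, gene_b, fused))
    return predictions


# Hypothetical annotations for two organisms.
organism_1 = {"geneP": "famX", "geneQ": "famY", "geneR": "famZ"}
organism_2 = {"geneF": {"famX", "famY"}, "geneG": {"famZ"}}

print(rosetta_pairs(organism_1, organism_2))
```

Here geneP and geneQ would be predicted to be functionally related because their families occur together in the single gene geneF.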

Functional genomics and bioinformatics

Because of the large quantity of data produced by these techniques and the desire to find biologically meaningful patterns, bioinformatics is crucial to analysis of functional genomics data. Examples of techniques in this class are data clustering or principal component analysis for unsupervised machine learning (class detection) as well as artificial neural networks or support vector machines for supervised machine learning (class prediction, classification).
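
As an illustration of unsupervised class detection, the sketch below groups genes whose expression profiles are strongly correlated. It is one simple choice among many: single-linkage grouping at a fixed Pearson correlation threshold, written with the standard library only. The profiles, gene names, and threshold are hypothetical, and profiles with zero variance are not handled.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def cluster_by_correlation(profiles, threshold=0.9):
    """Merge genes into one cluster whenever their correlation >= threshold."""
    genes = sorted(profiles)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the cluster representative.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(clusters.values())


# Hypothetical expression levels across four conditions.
profiles = {
    "gene1": [1.0, 2.0, 3.0, 4.0],
    "gene2": [2.0, 4.0, 6.0, 8.5],
    "gene3": [4.0, 3.0, 2.0, 1.0],
    "gene4": [8.0, 6.0, 4.0, 2.5],
}
print(cluster_by_correlation(profiles))
```

Genes that cluster with genes of known function become candidates for sharing that function. Dedicated libraries offer more robust clustering and principal component analysis than this sketch.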

See also

  • Systems biology
  • Structural genomics
  • Comparative genomics
  • Pharmacogenomics
  • MGED Society
  • Epigenetics
  • Bioinformatics
  • Epistasis and functional genomics


External links

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 