Single nucleotide polymorphism
A single-nucleotide polymorphism (SNP, pronounced snip) is a DNA sequence variation occurring when a single nucleotide (A, T, C or G) in the genome (or other shared sequence) differs between members of a biological species or between paired chromosomes in an individual. For example, two sequenced DNA fragments from different individuals, AAGCCTA and AAGCTTA, contain a difference in a single nucleotide. In this case we say that there are two alleles: C and T. Almost all common SNPs have only two alleles. SNPs can occur in both coding and non-coding regions of the genome.
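The AAGCCTA / AAGCTTA comparison can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the two fragments are already aligned and of equal length, and the function name `single_base_differences` is hypothetical, not part of any library.

```python
def single_base_differences(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch.

    Assumes the two sequences are already aligned and have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# The two fragments from the text differ at one position, with alleles C and T.
differences = single_base_differences("AAGCCTA", "AAGCTTA")
print(differences)  # [(5, 'C', 'T')]
```

Real variant discovery works on many reads aligned to a reference and has to account for sequencing error, so a position-by-position comparison like this only conveys the basic idea.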
Within a population, SNPs can be assigned a minor allele frequency: the lowest allele frequency at a locus that is observed in a particular population. This is simply the lesser of the two allele frequencies for single-nucleotide polymorphisms. There are variations between human populations, so a SNP allele that is common in one geographical or ethnic group may be much rarer in another.
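As a rough sketch of the calculation, the minor allele frequency of a biallelic SNP can be computed by counting alleles across genotyped individuals. The genotype list below is invented for illustration, and `minor_allele_frequency` is a hypothetical helper that assumes diploid genotypes written as two-letter strings.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the less common allele at one biallelic locus.

    `genotypes` is an iterable of two-letter strings such as "CT", one per
    diploid individual (an assumed, simplified input format).
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("this sketch only handles biallelic loci")
    if len(counts) < 2:
        return 0.0  # only one allele observed, so there is no minor allele
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of five individuals: 6 C alleles and 4 T alleles.
sample = ["CC", "CT", "CC", "TT", "CT"]
print(minor_allele_frequency(sample))  # 0.4
```

Running the same function on samples from two different populations would give the population-specific frequencies the paragraph above refers to.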
These genetic variations between individuals (particularly in the non-coding parts of the genome) are exploited in DNA fingerprinting, which is used in criminology. These variations also underlie differences in our susceptibility to, or protection from, all kinds of diseases. The severity of illness and the way our body responds to treatments are also manifestations of genetic variations. For example, a single base difference in the Apolipoprotein E gene is associated with Alzheimer's disease.
Types
Single-nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions (regions between genes). SNPs within a coding sequence do not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code.
A SNP in which both alleles produce the same polypeptide sequence is called a synonymous polymorphism (sometimes called a silent mutation). If a different polypeptide sequence is produced, the polymorphism is a replacement polymorphism. A replacement polymorphism may be either missense, which results in a different amino acid, or nonsense, which results in a premature stop codon. Over half of all known disease mutations come from replacement polymorphisms.
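The distinction between synonymous, missense and nonsense changes can be illustrated with a short Python sketch that translates the reference and variant codons using the standard genetic code. The function name `classify_coding_snp` and the example codons are assumptions made for illustration; the sketch works on DNA codons of the coding strand and ignores cases such as the loss of an existing stop codon.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base change between two coding-strand DNA codons."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be exactly three bases long")
    mismatches = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if mismatches != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":  # "*" marks a stop codon
        return "nonsense"
    return "missense"  # note: a lost stop codon would also land here


print(classify_coding_snp("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snp("GAA", "GTA"))  # missense (Glu -> Val)
print(classify_coding_snp("TGG", "TGA"))  # nonsense (Trp -> stop)
```

The first example shows the degeneracy of the genetic code: GAA and GAG both encode glutamate, so the protein is unchanged.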
SNPs that are not in protein-coding regions may still affect gene splicing, transcription factor binding, or the sequence of non-coding RNA. A SNP of this type that affects gene expression is referred to as an eSNP (expression SNP) and may be upstream or downstream from the gene.
Use and importance
Variations in the DNA sequences of humans can affect how humans develop diseases and respond to pathogens, chemicals, drugs, vaccines, and other agents. SNPs are also thought to be key enablers in realizing the concept of personalized medicine. However, their greatest importance in biomedical research is for comparing regions of the genome between cohorts (such as matched cohorts with and without a disease) in genome-wide association studies.
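To give a feel for the cohort comparison, the sketch below tests one SNP for association by comparing allele counts in cases and controls with a Pearson chi-square test on a 2x2 table. It is a simplified illustration, not the method of any particular study: the counts are invented, `allelic_association` is a hypothetical helper, all four cells are assumed to be non-zero, and a real genome-wide study would repeat such a test across many SNPs and correct for multiple testing.

```python
import math


def allelic_association(case_minor, case_major, control_minor, control_major):
    """Return (odds ratio, chi-square statistic, p-value) for a 2x2 allele table.

    Assumes every cell count is greater than zero.
    """
    table = [[case_minor, case_major], [control_minor, control_major]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    column_totals = [table[0][j] + table[1][j] for j in range(2)]

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * column_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    # With one degree of freedom, the upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, chi_square, p_value


# Hypothetical allele counts from 100 cases and 100 controls (200 alleles each).
odds_ratio, chi_square, p_value = allelic_association(60, 140, 40, 160)
print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```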
The study of SNPs is also important in crop and livestock breeding programs (see genotyping). See SNP genotyping for details on the various methods used to identify SNPs.
They are usually biallelic and thus easily assayed.
SNPs do not usually function individually; rather, they work in coordination with other SNPs to manifest a disease condition, as has been seen in osteoporosis.
Examples
- rs6311 and rs6313 are SNPs in the HTR2A gene on human chromosome 13.
- A SNP in the F5 gene causes a hypercoagulability disorder with the variant Factor V Leiden.
- rs3091244 is an example of a triallelic SNP in the CRP gene on human chromosome 1.
- TAS2R38 codes for PTC tasting ability, and contains 6 annotated SNPs.
Databases
As there are for genes, bioinformatics databases exist for SNPs. dbSNP is a SNP database from the National Center for Biotechnology Information (NCBI). SNPedia is a wiki-style database from a hybrid organization. The OMIM database describes the association between polymorphisms and diseases (e.g., gives diseases in text form). The Human Gene Mutation Database provides gene mutations causing or associated with human inherited diseases and functional SNPs. GWAS Central allows users to visually interrogate the actual summary-level association data in one or more genetic association studies.
Nomenclature
The nomenclature for SNPs can be confusing: several variations can exist for an individual SNP and consensus has not yet been achieved. One approach is to write SNPs with a prefix, period and "greater than" sign showing the wild-type and altered nucleotide or amino acid; for example, c.76A>T.
SNP analysis
Analytical methods to discover novel SNPs and detect known SNPs include:
- DNA sequencing;
- capillary electrophoresis;
- mass spectrometry;
- single-strand conformation polymorphism (SSCP);
- electrochemical analysis;
- denaturing HPLC and gel electrophoresis;
- restriction fragment length polymorphism;
- hybridization analysis;
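As one illustration of the methods listed above, restriction fragment length polymorphism analysis relies on a SNP creating or destroying a restriction enzyme recognition site, which changes the fragment sizes seen after digestion. The sketch below simulates this for the EcoRI site GAATTC, which is cut after the first G. The two sequences are invented, and `fragment_lengths` is a hypothetical helper that treats the DNA as a single linear strand.

```python
def fragment_lengths(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    `cut_offset` is the number of bases into the site at which the cut falls
    (1 for EcoRI, which cuts between G and AATTC).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical alleles: the A>G change at position 7 destroys the EcoRI site.
reference_allele = "ACGTGAATTCTTAGGC"
variant_allele = "ACGTGAGTTCTTAGGC"

print(fragment_lengths(reference_allele))  # [5, 11]
print(fragment_lengths(variant_allele))    # [16]
```

On a gel, the reference allele would appear as two shorter bands and the variant allele as one longer band, which is how the genotype is read out.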
See also
- Single-base extension
- SNP array
- Variome
- TaqMan
- Affymetrix
- International HapMap Project
- tag SNP
- Short tandem repeat (STR)
- SNPSTR
External links
- NCBI resources — Introduction to SNPs from NCBI
- The SNP Consortium LTD — SNP search
- NCBI dbSNP database — "a central repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms"
- HGMD — the Human Gene Mutation Database, includes rare mutations and functional SNPs
- SNPedia - a wiki devoted to the medical consequences of DNA variations, including software to analyze personal genomes
- International HapMap Project — "a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals"
- GWAS Central — a central database of summary-level genetic association findings
- 1000 Genomes Project — A Deep Catalog of Human Genetic Variation
- SIFT — "An online tool that predicts on the effect of SNPs on protein function"
- PolyPhen-2 - "An online tool that predicts the effect of nonsynonymous SNPs on protein function"
- MutationTaster - "Evaluates disease-causing potential of sequence alterations"
- WatCut — an online tool for the design of SNP-RFLP assays
- SNPStats — SNPStats, a web tool for analysis of genetic association studies
- Restriction HomePage — a set of tools for DNA restriction and SNP detection, including design of mutagenic primers
- American Association for Cancer Research Cancer Concepts Factsheet on SNPs
- PharmGKB — The Pharmacogenetics and Pharmacogenomics Knowledge Base, a resource for SNPs associated with drug response and disease outcomes.
- GEN-SNiP — Online tool that identifies polymorphisms in test DNA sequences.
- Online tool that predicts on the effects of SNPs on protein function
- Rules for Nomenclature of Genes, Genetic Markers, Alleles, and Mutations in Mouse and Rat
- HGNC Guidelines for Human Gene Nomenclature
- SNP effect predictor with galaxy integration
- open SNP