Nested association mapping
Nested association mapping (NAM) is a technique designed by the labs of Edward Buckler, James Holland, and Michael McMullen for identifying and dissecting the genetic architecture of complex traits in corn (Zea mays). Unlike association mapping, nested association mapping is a specific technique that cannot be performed outside of a specifically designed population such as the Maize NAM population, the details of which are described below.
Theory behind NAM
NAM was created as a means of combining the advantages and eliminating the disadvantages of two traditional methods for identifying quantitative trait loci (QTLs): linkage analysis and association mapping. Linkage analysis depends upon recent genetic recombination between two different plant lines (as the result of a genetic cross) to identify general regions of interest. It has the advantages of requiring few genetic markers to ensure genome-wide coverage and of offering high statistical power per allele. Linkage analysis, however, has the disadvantages of low mapping resolution and low allele richness. Association mapping, by contrast, takes advantage of historic recombination, and is performed by scanning a genome for SNPs in linkage disequilibrium with a trait of interest. Association mapping has advantages over linkage analysis in that it can map with high resolution and has high allelic richness; however, it also requires extensive knowledge of SNPs within the genome and is thus only now becoming possible in diverse species such as maize.
NAM takes advantage of both historic and recent recombination events in order to combine the advantages of low marker density requirements, high allele richness, high mapping resolution, and high statistical power, while avoiding the disadvantages of both linkage analysis and association mapping.
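To make the notion of linkage disequilibrium used above more concrete, the following minimal Python sketch computes the common r² measure between two biallelic loci from phased haplotypes. The function name, the 0/1 allele coding, and the toy haplotypes are illustrative assumptions and are not taken from the NAM studies.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic loci.

    haplotypes: list of (a, b) pairs, each allele coded 0 or 1
    (hypothetical coding chosen for this illustration).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Toy data: alleles always travel together vs. all four combinations equally common.
complete_ld = [(1, 1)] * 5 + [(0, 0)] * 5
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]

r2_complete = ld_r_squared(complete_ld)
r2_none = ld_r_squared(no_ld)
print(r2_complete, r2_none)
```

An r² near 1 means a genotyped SNP is a good proxy for a nearby variant, which is the property association mapping relies on; historic recombination erodes r² over short distances, which is why resolution is high but many SNPs are needed.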
Creation of the Maize NAM population
Twenty-five diverse corn lines were chosen as the parental lines for the NAM population in order to encompass the remarkable diversity of maize and preserve historic linkage disequilibrium. Each parental line was crossed to the B73 maize inbred (chosen as a reference line due to its use in the public maize sequencing project and its wide deployment as one of the most successful commercial inbred lines) to create the F1 population. The F1 plants were then self-fertilized for six generations to create 200 homozygous recombinant inbred lines (RILs) per family, for a total of 5000 RILs within the NAM population. The lines are publicly available through the USDA-ARS Maize Stock Center.
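The arithmetic behind this design can be checked with a short Python sketch. It assumes simple Mendelian segregation with no selection, so that the fraction of segregating loci still heterozygous halves with each generation of selfing; the constant and function names are illustrative.

```python
from fractions import Fraction

N_FAMILIES = 25          # diverse parental lines, each crossed to B73
RILS_PER_FAMILY = 200    # RILs derived from each cross
SELFING_GENERATIONS = 6  # generations of self-fertilization after the F1


def expected_heterozygosity(selfing_generations: int) -> Fraction:
    """Expected fraction of segregating loci still heterozygous.

    The F1 is heterozygous at every locus where the two inbred parents
    differ; each selfing generation halves that fraction (assumption:
    Mendelian segregation, no selection).
    """
    return Fraction(1, 2) ** selfing_generations


total_rils = N_FAMILIES * RILS_PER_FAMILY
residual_het = expected_heterozygosity(SELFING_GENERATIONS)

print(f"Total RILs: {total_rils}")
print(f"Expected residual heterozygosity: {residual_het} of segregating loci")
```

Under these assumptions, roughly 1 in 64 segregating loci would be expected to remain heterozygous in a given line, which is why the RILs can be treated as essentially homozygous.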
Each RIL was then genotyped with the same 1106 molecular markers (for this to be possible, the researchers selected markers for which B73 had a rare allele) in order to identify recombination blocks. After genotyping with the 1106 markers, each of the parental lines was either sequenced or high-density genotyped, and the results of that sequencing or genotyping were overlaid on the recombination blocks identified for each RIL. The result was 5000 RILs that were, in effect, either fully sequenced or high-density genotyped and that, because they were all genotyped with the common 1106 markers, could be compared to each other and analyzed together (Figure 1).
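The overlay step described above can be sketched in Python as follows. This is a simplified illustration, not the procedure used by the NAM authors: it assumes a single chromosome, error-free marker calls, and a rule that leaves a SNP unassigned whenever its two flanking markers come from different parents. All names and the toy data are hypothetical.

```python
from bisect import bisect_right


def project_parental_snps(marker_pos, ril_calls, snp_pos, donor_alleles, b73_alleles):
    """Project parental SNP alleles onto one RIL using its marker calls.

    marker_pos    : sorted positions of the shared markers on one chromosome
    ril_calls     : per marker, "B73" or "donor" (parent of origin in this RIL)
    snp_pos       : positions of the high-density parental SNPs
    donor_alleles : allele of the non-B73 parent at each SNP
    b73_alleles   : B73 allele at each SNP
    Returns one allele per SNP, or None where the flanking markers disagree
    (a crossover lies somewhere in that interval).
    """
    projected = []
    last = len(marker_pos) - 1
    for pos, donor, b73 in zip(snp_pos, donor_alleles, b73_alleles):
        right = bisect_right(marker_pos, pos)  # first marker beyond the SNP
        left = max(right - 1, 0)               # clamp at the chromosome start
        right = min(right, last)               # clamp at the chromosome end
        if ril_calls[left] != ril_calls[right]:
            projected.append(None)             # recombination inside the interval
        elif ril_calls[left] == "donor":
            projected.append(donor)
        else:
            projected.append(b73)
    return projected


# Toy example: three markers, with a crossover between the first and second.
marker_pos = [10, 50, 90]
ril_calls = ["B73", "donor", "donor"]
snp_pos = [5, 30, 70, 95]
donor_alleles = ["A", "C", "G", "T"]
b73_alleles = ["G", "T", "A", "C"]

projected = project_parental_snps(marker_pos, ril_calls, snp_pos, donor_alleles, b73_alleles)
print(projected)  # ['G', None, 'G', 'T']
```

A fuller treatment would replace the None entries with a probability weighted by distance to each flanking marker; the hard call is used here only to keep the sketch short.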
The second aspect of the NAM population characterization is the sequencing of the parental lines. This captures information on the natural variation that went into the population and provides a record of the extensive recombination captured in the history of maize variation. The first phase of this sequencing used reduced representation sequencing with next-generation sequencing technology, as reported by Gore, Chia et al. in 2009. This initial sequencing discovered 1.6 million variable regions in maize, which are now facilitating the analysis of a wide range of traits.
Process of NAM
As with traditional QTL mapping strategies, the general goal in nested association mapping is to correlate a phenotype of interest with specific genotypes. One of the creators’ stated goals for the NAM population was to be able to perform genome-wide association studies in maize by looking for associations between SNPs within the NAM population and quantitative traits of interest (e.g. flowering time, plant height, carotene content). As of 2009, however, the sequencing of the original parental lines was not yet completed to the degree necessary to perform these analyses. The NAM population has nonetheless been used successfully for linkage analysis. In the linkage study that has been released, the unique structure of the NAM population, described in the previous section, allowed for joint stepwise regression and joint inclusive composite interval mapping of the combined NAM families to identify QTLs for flowering time.
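The idea of a joint analysis across families can be illustrated with a small Python sketch using NumPy. It implements only the forward-selection half of stepwise regression, fits one intercept per family and one marker effect per family (marker effects nested within family), and stops when the best remaining marker explains less than a fixed share of the within-family variation. The stopping rule, the simulated data, and all names are assumptions made for illustration; the published analysis used its own significance thresholds and real, linked markers.

```python
import numpy as np


def family_design(genotypes, family, n_families, markers):
    """Family intercepts plus one effect column per family for each marker."""
    n = len(family)
    fam = np.zeros((n, n_families))
    fam[np.arange(n), family] = 1.0
    cols = [fam] + [fam * genotypes[:, [m]] for m in markers]
    return np.hstack(cols)


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def joint_forward_selection(genotypes, family, y, n_families, max_terms=5, min_gain=0.05):
    """Greedily add the marker that most reduces the joint RSS."""
    selected = []
    current = rss(family_design(genotypes, family, n_families, selected), y)
    total = current  # within-family variation before any marker is fitted
    while len(selected) < max_terms:
        best_marker, best_rss = None, current
        for m in range(genotypes.shape[1]):
            if m in selected:
                continue
            r = rss(family_design(genotypes, family, n_families, selected + [m]), y)
            if r < best_rss:
                best_marker, best_rss = m, r
        # Hypothetical stopping rule: require a minimum share of variation explained.
        if best_marker is None or (current - best_rss) / total < min_gain:
            break
        selected.append(best_marker)
        current = best_rss
    return selected


# Simulated toy data: 3 families x 100 RILs, 20 unlinked markers coded 0/1.
rng = np.random.default_rng(0)
n_families, per_family, n_markers = 3, 100, 20
family = np.repeat(np.arange(n_families), per_family)
genotypes = rng.integers(0, 2, size=(family.size, n_markers)).astype(float)
family_means = np.array([70.0, 75.0, 80.0])
effects_m3 = np.array([2.0, 1.5, 2.5])      # family-specific effects at marker 3
effects_m12 = np.array([-1.5, -2.0, -1.0])  # family-specific effects at marker 12
y = (family_means[family]
     + effects_m3[family] * genotypes[:, 3]
     + effects_m12[family] * genotypes[:, 12]
     + rng.normal(0.0, 0.3, family.size))

selected = joint_forward_selection(genotypes, family, y, n_families)
print(sorted(selected))
```

Because every family shares the same marker set, a single model can be fitted to all families at once while still allowing each family its own allele effect at a given marker, which is the feature of the NAM design that this kind of joint analysis exploits.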
Current use of NAM
The first (and currently only) publication in which NAM was used to identify QTLs was authored by the Buckler lab on the genetic architecture of maize flowering time, and was published in the summer of 2009. In this study, the authors scored days to silking, days to anthesis, and the silking-anthesis interval for nearly one million plants, then performed single and joint stepwise regression and inclusive composite interval mapping (ICIM) to identify 39 QTLs explaining 89% of the variance in days to silking and days to anthesis and 29 QTLs explaining 64% of the variance in the silking-anthesis interval.
Ninety-eight percent of the flowering time QTLs identified in this paper were found to affect flowering time by less than one day (as compared to the B73 reference). These relatively small QTL effects, however, were also shown to sum within each family to large differences in days to silking. Furthermore, while most QTLs were shared between families, each family appears to have functionally distinct alleles for most QTLs. These observations led the authors to propose a model of “Common genes with uncommon variants” to explain flowering time diversity in maize. They tested their model by documenting an allelic series in the previously studied maize flowering time QTL Vgt1 (Vegetative to generative transition 1), controlling for genetic background and estimating the effects of Vgt1 in each family. They then went on to identify specific sequence variants that corresponded to the allelic series, including one allele containing a miniature transposon strongly associated with early flowering, and other alleles containing SNPs associated with later flowering.
Implications of NAM
Nested association mapping has tremendous potential for the investigation of agronomic traits in maize and other species. As the initial flowering time study demonstrates, NAM has the power to identify QTLs for agriculturally relevant traits and to relate those QTLs to homologs and candidate genes in non-maize species. Furthermore, the NAM lines have become a powerful public resource for the maize community, and an opportunity for the sharing of maize germplasm as well as the results of maize studies via common databases (see external links), further facilitating future research into maize agricultural traits. Given that maize is one of the most important agricultural crops worldwide, such research has powerful implications for the genetic improvement of crops and, subsequently, for worldwide food security.
Similar designs are also being created for wheat, barley, sorghum, and Arabidopsis thaliana.