Molecular evolution of dimeric α-amylase inhibitor genes in wild emmer wheat and its ecological association

Background α-Amylase inhibitors are attractive candidates for the control of seed weevils, as these insects are highly dependent on starch as an energy source. In this study, we aimed to reveal the structure and diversity of dimeric α-amylase inhibitor genes in wild emmer wheat from Israel and to elucidate the relationship between the emmer wheat genes and ecological factors using single nucleotide polymorphism (SNP) markers. Another objective of this study was to determine whether SNPs in functional protein-coding genes are correlated with the environment. Results The influence of ecological factors on the genetic structure of dimeric α-amylase inhibitor genes was evaluated by specific SNP markers. A total of 244 dimeric α-amylase inhibitor genes were obtained from 13 accessions in 10 populations. Seventy-five polymorphic positions and 74 haplotypes were defined by sequence analysis. Sixteen SNP markers were designed from the 75 polymorphic positions to detect SNP variations in wild emmer wheat accessions from different populations in Israel. The proportion of polymorphic loci P (5%), the expected heterozygosity He, and Shannon's information index in the 16 populations were 0.887, 0.404, and 0.589, respectively. The populations of wild emmer wheat showed great diversity in gene loci both between and within populations. Based on the SNP marker data, the genetic distance of pair-wise comparisons of the 16 populations displayed a sharp genetic differentiation over long geographic distances. The values of P, He, and Shannon's information index were negatively correlated with three climatic moisture factors, whereas Spearman rank correlation analysis showed that the same values were positively correlated with some of the other ecological factors. Conclusion The populations of wild emmer wheat showed a wide range of diversity in dimeric α-amylase inhibitors, both between and within populations.
We suggested that SNP markers are useful for the estimation of genetic diversity of functional genes in wild emmer wheat. These results show significant correlations between SNPs in the α-amylase inhibitor genes and ecological factors affecting diversity. Ecological factors, singly or in combination, explained a significant proportion of the variations in the SNPs, and the SNPs could be classified into several categories as ecogeographical predictors. It was suggested that the SNPs in the α-amylase inhibitor genes have been subjected to natural selection, and ecological factors had an important evolutionary influence on gene differentiation at specific loci.


Background
Wild emmer wheat, Triticum dicoccoides, the progenitor of bread and pasta wheats, presumably originated in, and adaptively diversified from, northeastern Israel into the Near East Fertile Crescent [1]. In this center of diversity, wild emmer wheat harbors rich genetic diversity and resources [1]. Previous studies in T. dicoccoides and other cereals have shown significant nonrandom adaptive molecular genetic differentiation at single and multilocus structures in either protein-coding regions or randomly amplified polymorphic DNAs among micro-ecological environments [2,3]. It was also determined that wild emmer wheat is genetically variable and that the genetic differentiation of populations included regional and local patterns with sharp genetic differentiation over short distances [4]. Genetic polymorphisms of α- and β-amylase in wild emmer wheat have been characterized, and it was found that diversity of climatic and edaphic natural selection, rather than stochasticity or migration, was the major evolutionary force driving amylase differentiation [5].
The estimates of molecular diversity derived from PCR-based techniques such as amplified fragment length polymorphism (AFLP), microsatellites (simple sequence repeats or SSR), single nucleotide polymorphism (SNP), and sequence comparisons are several-fold higher than enzymatic diversity [6]. Substantial private and public effort has been undertaken to characterize SNPs for studies of genetic diversity. Because SNPs can be identified in expressed sequence tags (ESTs), the polymorphisms can be used directly to map functional and expressed genes, unlike the DNA sequences derived from conventional RAPD and AFLP techniques, which typically do not correspond to functional genes [7][8][9]. The majority of SNPs in coding regions (cSNPs) are single-base substitutions, which may or may not result in amino acid changes. Some cSNPs may alter a functionally important amino acid residue, and these are of interest for their potential links with phenotypes [10].
α-Amylases are a family of enzymes that hydrolyze α-D-(1,4)-glucan linkages and play an important role in the carbohydrate metabolism of many autotrophic and heterotrophic organisms [11]. Heterotrophic organisms use α-amylase primarily to digest starch in their food sources [12]. Several kinds of α-amylase and proteinase inhibitors in seeds and vegetative organs act to regulate the numbers of phytophagous insects [13][14][15]. α-Amylase inhibitors are attractive candidates for the control of seed weevils as these insects are highly dependent on starch as an energy source [16]. In cereal seeds, α-amylase inhibitor proteins with 120-130 amino acids, which include trypsin inhibitors as well as α-amylase inhibitors, can be grouped into one large family on the basis of the homology between their amino acid sequences [17]. In this family, the dimeric α-amylase inhibitor has been well characterized. For weevil control, α-amylase inhibitors could be manipulated through plant genetic engineering. However, many insects have several α-amylases that differ in specificity, and successful utilization of a food source is dependent on the expression of an α-amylase for which there is no specific inhibitor [12]. The dimeric α-amylase inhibitor genes were located on chromosomes 3BS and 3DS; there was no known evidence of a homoeologous locus or loci on chromosome 3AS of the polyploid wheats [18,19]. Therefore, the tetraploid wheats, which lack the D genome, have only the inhibitor genes on chromosome 3BS [19].
Evolutionary pressures of various kinds have often been hypothesized to cause active and rapid evolutionary changes. In a co-evolving system of plant-insect interactions, plants synthesize a variety of toxic proteinaceous and nonproteinaceous molecules for their protection against insects [20,21]. Proteinase inhibitors are therefore a potential model system in which to study basic evolutionary processes, such as functional diversification [22].
It is well established that multiple forms of proteins are active on exogenous or endogenous α-amylases in the wheat kernel, and proteinaceous dimeric α-amylase inhibitors could function against α-amylase from various origins [23]. It is known that the bulk of wheat albumins consist of a few amylase iso-inhibitor families that are very likely phylogenetically related and coded by a small number of parental genes [24]. The α-amylase inhibitors have long been proposed as possible important weapons against pests whose diets make them highly dependent on α-amylase activity. In vitro and in vivo trials using α-amylase inhibitors, including those made under field conditions, have now fully confirmed their potential for increasing yields by controlling insect populations [16].
Two conflicting views confront ecologists and evolutionary biologists on the degree of symmetry in interactions between plants and phytophagous insects [25]. The symmetrical view holds that insects and plants have strong effects on one another's evolutionary and ecological dynamics. The asymmetrical view acknowledges that plants have major effects on insects but claims that insects seldom impose significant effects on plants [25]. Plant defense mechanisms have been the subject of intense investigation [26]. The genome-shaping events and processes occurring at dimeric α-amylase inhibitor gene loci from the B and S genomes of wheat and Aegilops section Sitopsis, respectively, have been characterized. A phylogenetic median-joining network of the haplotypes and a neighbor-joining tree analysis have indicated that the inhibitor gene sequences from common wheat and T. dicoccoides are closely related to those from Ae. speltoides [27]. However, little is known about their evolution under the influence of ecology. The molecular diversity of α-amylase inhibitor genes, as well as their divergence among 16 populations of wild emmer wheat from Israel, was investigated to gain insight into the correlation between plant defense proteinaceous inhibitors and ecological factors.

Isolation of the ORF of dimeric α-amylase inhibitors
Using two cloning primers, genomic PCR amplifications were conducted, and one desired DNA band was detected in each accession of wild emmer wheat. Cloning the fragments yielded 244 positive clones from 13 accessions (randomly selected from 10 populations), which were subsequently sequenced (data not shown). Only three of the 244 dimeric α-amylase inhibitor genes had a common 3-bp deletion, and those three genes were obtained from one accession derived from Mt. Hermon, whereas the other cloned fragments were 426 bp long (data not shown). It was predicted that all of the 426-bp sequences would encode functional dimeric α-amylase inhibitors. Alignment of the gene sequences from emmer wheat with sequences from the species of Aegilops section Sitopsis (including Ae. speltoides, Ae. bicornis, Ae. longissima, Ae. searsii, and Ae. sharonensis), Ae. tauschii, einkorn wheats, and common wheat clearly indicated that the emmer wheat sequences were derived from the B genome [27].

SNP and haplotype analyses of dimeric α-amylase inhibitor genes
The frequency of SNPs in the dimeric α-amylase inhibitor genes in emmer wheat was 1 out of 5.7 bases, which was higher than the SNPs observed for kunitz-type α-amylase inhibitor and α-amylase/subtilisin inhibitor genes in barley and dimeric α-amylase inhibitor genes in common wheat [28][29][30]. Among the 426 nucleotides, there were 351 conserved positions and 75 variable positions among the 244 α-amylase inhibitor genes sequenced from 13 accessions.
A total of 74 haplotypes were revealed by sequence analysis (Figure 1); 53 of these were each found in only a single sequence. Haplotype 41 was observed at the highest frequency, i.e., in 38 gene sequences, followed by haplotype 27 in 33 sequences (Figure 1).
The relationship between SNPs and amino acid changes in the α-amylase inhibitor proteins is summarized in Table 1. The 75 SNPs resulted in 38 amino acid substitutions. The position of each SNP in the sequence was determined, along with whether the predicted change was synonymous (silent) or nonsynonymous (replacement). Forty percent of SNPs were found to occur at the third codon position, and as expected, most of these were synonymous (Table 1). A number of changes were also identified in codon positions 1 and 2, and these accounted for more than 95% of the nonsynonymous changes (Table 1). In total, 60% of the SNPs resulted in nonsynonymous changes.
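The classification of a SNP by codon position and by synonymous versus nonsynonymous effect can be illustrated with a minimal Python sketch. It assumes the standard genetic code and an ORF that starts at position 1; the function name `classify_snp` and the nine-base toy ORF are hypothetical and are not taken from the sequenced genes.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(orf, pos, alt):
    """Classify a single-base change at 1-based ORF position `pos`.

    Returns (codon_position, reference_aa, alternative_aa, kind).
    """
    idx = pos - 1
    offset = idx % 3                      # 0, 1, or 2 within the codon
    start = idx - offset                  # first base of the affected codon
    ref_codon = orf[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return offset + 1, ref_aa, alt_aa, kind


# Hypothetical toy ORF: Met-His-Cys (illustration only).
toy_orf = "ATGCATTGC"
third_position_change = classify_snp(toy_orf, 6, "C")   # CAT -> CAC
first_position_change = classify_snp(toy_orf, 4, "G")   # CAT -> GAT
```

In the toy ORF, the third-position change leaves His unchanged, whereas the first-position change replaces His with Asp, which mirrors the general pattern reported in Table 1.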

Primer design and SNP mining of wild emmer wheat
Using the information from the 75 SNPs identified in the α-amylase inhibitor genes, 16 primers (combined with the reverse cloning primer, R, as SNP markers) were successfully designed to detect the SNPs in 205 accessions from 18 populations. The primers, with the SNP (bold letters) at the 3' end and an extra mismatched nucleotide (underlined) at the third nucleotide from the end, are listed in Table 2. A total of 14 SNPs were detected with the 16 SNP markers from position 19 to 288 of the α-amylase inhibitor gene, and the size of the amplified fragments ranged from 158 to 426 bp. The data were then organized in terms of genotypic frequencies ("0" or "1") to assess the population structure.
There were only 5 and 2 accessions from Yehudiyya and Achihood, respectively. Thus, the data for Yehudiyya and Achihood were not used in further analyses. The positive fragment frequency for each primer in the 16 populations is listed in Additional file 1.

Genetic diversity and distance of α-amylase inhibitor genes
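Some genetic parameters of the 16 populations of wild emmer wheat are summarized in Table 3. The proportion of polymorphic loci P (5%), the expected heterozygosity He, and Shannon's information index in the 16 populations were 0.887, 0.404, and 0.589, respectively.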

Principal components & multiple regression analysis of environmental variables and SNPs
To assess whether some of the ecological factors were correlated with each other, principal component analysis (PCA) was carried out using 23 ecological factors as variables. According to the eigenvalues of the correlation matrix, the first four components together gave a high cumulative percentage (88.81%) (see Additional file 3A; 4) and could be used to explain the ecological associations. The main ecological factors of the first component were Rd and Ev (two water availability factors); this component accounted for 40.60% of the eigenvalue total, and the second component accounted for 28.66% (see Additional file 3). The PCA also indicated that the accessions from Mt. Hermon were affected the most by ecological factors (see Additional file 3C).
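The percentages reported for the PCA can be related back to eigenvalues with a short calculation. The sketch below assumes that the percentages are shares of the total variance and that the PCA was run on a correlation matrix of 23 standardized variables, so that the eigenvalues sum to 23; the variable names are illustrative, and the individual shares of the third and fourth components are not given in the text, so only their sum is derived.

```python
# Reported PCA summary values (percent of total variance).
N_VARIABLES = 23                 # ecological factors used as variables
pct = {"PC1": 40.60, "PC2": 28.66}
cumulative_first_four = 88.81

# For a correlation matrix the eigenvalues sum to the number of variables,
# so each share can be converted into an implied eigenvalue (assumption).
implied_eigenvalues = {pc: share / 100 * N_VARIABLES for pc, share in pct.items()}

# Combined share of the third and fourth components.
pc3_plus_pc4 = cumulative_first_four - sum(pct.values())

print({pc: round(ev, 3) for pc, ev in implied_eigenvalues.items()})
print(round(pc3_plus_pc4, 2))
```

Under these assumptions the first two components correspond to eigenvalues of roughly 9.3 and 6.6, and the third and fourth components together contribute about 19.55% of the variance.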
After the factors were analyzed by projection of the variables on the factor plane (see Additional file 4) and the correlations among these factors were consulted (see Additional file 5), 11 independent ecological factors were chosen. Multiple regression analysis was then carried out using these 11 factors to investigate the relationship between environmental variables and SNPs.
The geographical, temperature, water, and solar radiation factors in Table 4, singly or in combination, explained a significant proportion of the diversity in the SNPs (Table 5). The best single predictor of P, He, and Shannon's information index, significantly explaining 0.264-0.355 of the variance, was the water availability factor Hu-an. The combination of three variable predictors accounting for geographic and water availability factors Hu-an, Ev, (Table 5).

Spearman rank correlations of SNP positions with environment
The Spearman rank correlations of the average genetic indices (P, He, and Shannon's information index) and of the He of each SNP position with ecogeographical variables appear in Table 6. We recorded the ecological variables for the populations. P, He, and Shannon's information index were negatively correlated with the three water factors: mean annual humidity (Hu-an), mean humidity at 14:00 h (Hu-14), ... (Table 6).
*SNP positions were on the 3' end of the primers and are identified in bold letters. Extra mismatched nucleotides were also incorporated in the primers (underlined).
Population numbers and ecological factor definitions are according to Nevo and Beiles 1989. * Populations Yehudiyya (7) and Achihood (26)

SNPs in the α-amylase inhibitor genes
In sequence comparisons, the 244 dimeric α-amylase inhibitor genes from wild emmer wheat had a high level of similarity, indicating that the primary structure of these genes was similar to those of the known dimeric α-amylase inhibitors 0.19 (WDAI-0.19) and 0.53 (WDAI-0.53). The predicted protein sequences of the 244 cloned α-amylase inhibitor genes from wild emmer wheat showed the presence of 10 Cys residues, the amino acids most important to the structure and function of the mature protein [31]. Changes in the structure of α-amylase inhibitor proteins would affect their specificity and activity against different mammalian and insect α-amylases [32]. A comparison of sequences between members of the α-amylase inhibitor 0.19 group indicated that not only the 10 Cys residues but also Asp110, Lys116, Asn29, Glu35, Ser94, Leu90, Trp51, His47, and Gln13 were important in forming the structure of those inhibitors [33].
Most of the SNPs did not occur at highly conserved positions, which ensures that the α-amylase inhibitors keep the correct 3D structure to combine with the α-amylase. However, Gln13, His47, Ser49, Leu90, Val105, and Asp110 were changed by SNPs in some of the cloned α-amylase inhibitor genes (Table 1). It is noteworthy that only the α-amylase inhibitors from the D genome of Ae. tauschii and common wheat, which were closely related to inhibitor 0.19, had His47 [30], whereas His47 was replaced by Asp or Tyr in 98% of the inhibitor genes from wild emmer wheat.

Genetic diversity of the α-amylase inhibitor genes in wild emmer wheat
The genetic diversity of the α-amylase inhibitor genes of 198 wild emmer wheat accessions from 16 populations in Israel was revealed by 16 SNP markers. Individual accessions from different populations could not be distinguished clearly by the sequences of their α-amylase inhibitor genes, whereas, using the SNP-specific primers, all wild emmer wheat populations were distinguishable, even closely related populations originating in proximate geographic locations (Table 3). Our results demonstrated that the polymorphism of α-amylase inhibitor genes in wild emmer wheat correlated with the ecogeographic distribution of the accessions. The results suggest that the gene was subjected to strong natural selection.

Genetic indices
The observations were consistent with previous results obtained with high- and low-molecular-weight glutenin subunits, which are also seed storage proteins [34][35][36]. In other studies, DNA diversity of glutenin subunits was shown to be correlated with environmental factors and variation [34].
The genetic diversity profiles in this study were compared with earlier allozyme studies [1], RAPD loci [37], and microsatellite studies [38] in wild emmer wheat populations. Although the SNP markers in the protein-coding genes yielded lower diversity values than the other methods, the results in this study were able to reveal the correlations of SNP variations in specific functional genes with ecological factors.
Central populations used in this study were collected in warm, humid environments on the Golan Plateau and near the Sea of Galilee. Marginal steppic populations were collected across a wide geographic area on the northern, eastern, and southern borders of wild emmer distribution involving hot, cold, and xeric peripheries, while marginal mesic (Mediterranean) populations were collected from the western border of wild emmer distribution [1,39]. The present study included 198 accessions collected from 16 different sites in Israel and covered a wide range of ecogeographical conditions across the distribution range of the species. Specific SNP positions detected in the α-amylase inhibitor genes were found to be highly effective in distinguishing genotypes and populations of wild emmer wheat originating from diverse ecogeographic sites in Israel. High levels of polymorphic loci (P), expected heterozygosity (He), and Shannon's information index (Table 3) with high genetic distance values between populations could be found (see Additional file 2). These results suggest that the genetic variation at these SNP positions in the dimeric α-amylase inhibitor genes was somewhat ecologically determined for these populations.

Genetic distance versus geographical distance
The relationship between SNP-based genetic distance and geographical distance was investigated, and the estimates of genetic distance (D) were found to be geographically independent, as was previously found for allozymes, RAPD loci, and microsatellite analyses [1,37,38]. Quite often, a greater genetic difference is found between proximal populations than among populations that are far apart. This was clearly demonstrated by local short transects of different soil types at Tabigha [40] and by the micro-differences of sun-shade differentiation at Yehudiyya [41]. Sharp genetic divergence (large D) over very short geographic distances, in contrast to small genetic divergence (small D) over large geographic distances, was observed in wild emmer populations (see Additional file 2). For example, the genetic distance between the population at Gitit and the population at Kokhav Hashahar (located only about 10 km apart), D = 0.1513, was 2.66 times higher than the genetic distance between the population at Mt. Hermon and the population at J'aba (separated by 160 km, D = 0.0569). In other words, the geographic distance between the first two populations was 1/16 of that between the second two populations (Figure 2 and Additional file 2). The genetic structure of wild emmer wheat populations in Israel is mosaic [36]. This patchy genetic distribution appears to reflect the underlying ecological heterogeneity at both micro- and macro-scales [1,37,38,40,41]. Thus, the higher polymorphisms and genetic variations of dimeric α-amylase inhibitor genes within and between populations could be explained by natural selection.
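The comparison between the Gitit/Kokhav Hashahar pair and the Mt. Hermon/J'aba pair can be checked with a few lines of arithmetic. The snippet uses only the distances quoted in the text; the variable names and the derived "genetic distance per kilometre" figure are illustrative additions, not quantities reported in the study.

```python
# Values quoted in the text.
d_near, km_near = 0.1513, 10    # Gitit vs. Kokhav Hashahar
d_far, km_far = 0.0569, 160     # Mt. Hermon vs. J'aba

ratio_d = d_near / d_far        # how much larger the short-range D is
ratio_km = km_near / km_far     # how much shorter the geographic distance is

# Illustrative derived quantity: genetic distance accumulated per kilometre.
per_km_near = d_near / km_near
per_km_far = d_far / km_far

print(round(ratio_d, 2))                 # 2.66
print(ratio_km)                          # 0.0625, i.e. 1/16
print(round(per_km_near / per_km_far, 1))
```

Per kilometre, the nearby pair is therefore more than 40 times as divergent as the distant pair, which is the sense in which D is described as geographically independent.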

Ecological genetics of SNPs in dimeric α-amylase inhibitor genes
Natural populations of wild emmer are highly polymorphic in morphological characters, as well as for various economically important traits [3,5,34]. Although major collection areas such as Mt. Hermon, Rosh Pinna, Gamla, Bat-Shelomo, and Tabigha are at similar longitude and latitude, they differ significantly in altitude. These locations, for example, are respectively at 1300, 700, 200, 75, and 0 m above sea level (Table 4). Along with these features, several other environmental factors differ for these locations [1,39].
In this study, the mean values of P, He, and Shannon's information index were negatively correlated with the three water factors and positively correlated with the other six factors (Table 6). It was noteworthy that the significant ecological factors (Ln, Ta, Td, Hu-14, Hu-an, Dw, and Rv) revealed by a Spearman rank correlation matrix between allozymes and climate were similar to the results in this study [1]. This similarity might arise because ecological factors correlate with coding sequences or proteins (allozymes) differently than with non-coding sequences. Moreover, the correlation between photosynthetic performance and ecogeographical variables indicated that ecological factors, e.g., sharav (Sh), dewy nights (Dw), radiation (Rad), rainy days (Rd), altitude (Al), and latitude (Lt), were distinctly correlated with photosynthetic factors [42]. Photosynthetic efficiency depends on specific ecological factors, especially light.
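The sign of such a correlation can be illustrated with a self-contained Spearman rank correlation in Python. This is a minimal sketch, not the software used in the study: the helper names are hypothetical, ties are handled with average ranks, and the humidity and He values below are invented toy numbers rather than data from Table 6.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-population values (illustration only).
humidity = [50, 55, 60, 65, 70]
he = [0.45, 0.40, 0.42, 0.30, 0.25]
rho = spearman(humidity, he)
```

For these toy values rho is negative, the same direction as the relationship reported between the genetic indices and the water factors.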
In this study, the SNP variations showed significant correlations with ecological factors (Table 5; Table 6). Geographical, temperature, water availability, edaphic, biotic, and solar radiation factors (Sz, So, Rad, Al, Rn, Lt, Sh, Rv, Ln, Td, Hu-14, Hu-an, Dw, and Ma), singly or in combination, explained a significant proportion of the diversity in the SNPs of α-amylase inhibitor genes. The association of these factors with SNPs was similar to the association of latitude/altitude with RAPD and microsatellite diversity [37,38]. It could be explained by the change in ecological factors, e.g., Al and the sharp gradient of climatic conditions from north to south in Israel, with increasing temperatures and decreasing water availability towards the semiarid zones in southern Israel. Also, the ecological factors used in this study were not representative of all the possible components involved in the determination of the real climate.
The SNPs that could determine amino acid changes in the mature protein of α-amylase inhibitors were of great importance. Six specific primers (W125G, W127G, W190C, W259A, W263TC, and W263TA) were designed, based on the SNPs at five positions associated with amino acid changes (Table 1). These SNPs were significantly correlated with water availability factors (Rd and Hu-an), temperature factors (Sh and Td), geographical factors (Ln, Al, and Lt), and solar radiation (Rad), more strongly than with the other factors (Table 5 and Table 6). Environmental stress can greatly influence plant susceptibility to herbivores and pathogens, and drought stress can promote outbreaks of fungal diseases and plant-eating insects [43,44]. Louda and Collinge (1992) reported guild-specific insect responses following soil water manipulations, and Larsson (1989) has clearly articulated why the actual response of insect herbivores to plant stress should be feeding-guild specific [45,46]. The results in this study indicated that water availability is the main factor that could affect the dimeric α-amylase inhibitor genes and, thus, the concordance between insect and plant.
Recently, based on SNP analysis, highly significant correlations were also found between diversity at the barley Isa locus (coding for a bi-functional α-amylase/subtilisin inhibitor) and key water variables (evaporation, rainfall, and humidity) plus latitude [47]. The soil fungi may influence the survival of wild barley seed in soil and the subsequent establishment of plant populations. The higher diversity of soil fungi in dry environments may select for a higher diversity of defense proteins encoded by the Isa locus in the seed [47].
The insect herbivores and the level of herbivore pressure may vary with ecological factors, so that the wheat is under different herbivore-related selection pressures at each site. The different environmental pressures at each site are directly related to the climate, but the wheat α-amylase inhibitors respond indirectly to those factors. There might be some evolutionary mechanisms that underlie the relationship between the diversity of α-amylase inhibitors and water factors. Historical events may have given rise to diversity patterns that correlate coincidentally with the ecogeographical variables tested in this study. However, it is far more likely that the variation in genetic diversity of this gene between populations is a product of selective forces. Selection pressure at this locus is likely to be caused by insects.

Conclusion
The populations of wild emmer wheat showed great diversity in dimeric α-amylase inhibitors, both between and within populations. We suggest that SNP markers are useful for the estimation of genetic diversity of protein-coding genes in wild emmer wheat. These results show significant correlations between SNPs in the α-amylase inhibitor genes and ecological diversities. Ecological factors, singly or in combination, explained a significant proportion of the variations in the SNPs, and the SNPs could be classified into several categories as ecogeographical predictors. A sharp genetic divergence (large D) over very short geographic distances, in contrast to small genetic divergence (small D) over large geographic distances, was found in wild emmer populations. It was suggested that the SNPs in the α-amylase inhibitor genes were subjected to natural selection, and ecological factors had an important evolutionary role in gene differentiation at the gene loci.
Figure 2 Geographic distribution of the tested populations of wild emmer wheat. The numbered populations are according to Nevo and Beiles (1989) [1] and details about the populations can be found in Table 4.

Plant material and ecological background of wild emmer wheat
Wild emmer wheat is a tetraploid and predominantly self-pollinated wheat, which is distributed over the Near East Fertile Crescent (Israel, Jordan, Lebanon, Syria, eastern Turkey, northern Iraq, and western Iran) [48]. The center of distribution and diversity of emmer wheat was found in the catchment area of the upper Jordan Valley (Golan Heights, eastern Upper Galilee Mountains, etc.) in Israel and its vicinity [1]. Wild emmer wheat covers wide ranges of eco-geographical conditions in Israel. However, towards the marginal and peripheral areas of its range, both in Israel and Turkey, wild emmer wheat populations become semi-isolated or isolated, and smaller in size. This distributional pattern has a dramatic effect on their population genetic structure and differentiation [1]. Individual plants of emmer wheat were collected at random, at least 1 m apart, from populations differing in major ecological properties. These collection sites and populations have been described in detail elsewhere [1,39]. The genotypes used for the present study are conserved in the cereal gene bank of the Institute of Evolution, University of Haifa.
In this study we examined 205 T. dicoccoides accessions representing 18 populations collected from various locations in Israel, which represent a wide range of ecological conditions of soil, temperature, altitude, and water availability. The populations used in this study, along with their geographic origin and climatic conditions, are listed in Table 4. A full description of these populations was reported in Nevo et al. [1,39], and the map locations of these populations are provided in Figure 2.

DNA isolation and PCR amplification
Ten seeds of each accession were germinated in the dark at room temperature. Genomic DNA was extracted from plant leaves at about 2 weeks of age with a modified CTAB protocol, as described in Murray and Thompson [49].

Sequence analysis of α-amylase inhibitor
Amplification products were separated in 2% agarose gels. The desired DNA fragments were recovered from gels and ligated to the pBluescript SK (+) T-vector plasmid (Stratagene), and then the positive clones were screened and sequenced. The analysis of full-length sequence and the construction of subsequent nucleotide sequence were carried out with DNAman 5.2.2 [50], and the multiple sequence alignment software Clustal W [51] was used for the SNP assessment. The α-amylase inhibitor ORFs were translated into amino acid sequences using the ORF Finder program at the NCBI [52]. In the subsequent analysis, the polymorphic positions were used instead of all of the mutation positions, which include positions with a change that was observed only once in the dataset.

Specific primer design and analysis of SNP
Polymorphic positions were identified by MEGA version 3.1 [53]. Sixteen specific PCR forward primers (combined with the cloning reverse primer R) were designed based on the alignments of dimeric α-amylase inhibitor gene sequences obtained from wild emmer wheat (Table 2). The SNPs were positioned at the 3' end of the primers, based on the fact that a 3' mismatch makes PCR more specific at the selected annealing temperature [54,55]. The power of the oligonucleotide for allele discrimination was enhanced by introducing an artificial mismatch near the 3' terminus, at the third nucleotide from the 3' end [56]. Sequences for the specific primers for dimeric α-amylase inhibitor genes and the basic cycling conditions are listed in Table 2. PCR was performed on genomic DNAs from all accessions of the 18 populations.
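The primer layout described here (SNP allele as the 3'-terminal base, plus a deliberate mismatch at the third base from the 3' end) can be sketched as follows. This is an illustrative sketch only: the function name, the default primer length, the transversion rule used to choose the mismatched base, and the toy template are all assumptions and do not reproduce the primers in Table 2.

```python
# Hypothetical rule for choosing the deliberate mismatch (a transversion).
TRANSVERSION = {"A": "C", "C": "A", "G": "T", "T": "G"}


def allele_specific_primer(template, snp_pos, allele, length=20):
    """Build a forward primer ending at 1-based `snp_pos`.

    The 3'-terminal base is set to `allele`, and the third base from the
    3' end is replaced by a deliberate mismatch to sharpen discrimination.
    """
    if snp_pos < length:
        raise ValueError("not enough upstream sequence for this primer length")
    primer = list(template[snp_pos - length:snp_pos])
    primer[-1] = allele                          # SNP allele at the 3' end
    primer[-3] = TRANSVERSION[primer[-3]]        # extra mismatch, 3rd from 3' end
    return "".join(primer)


# Toy template (not a real inhibitor gene sequence).
toy_template = "ACGTACGTACGTACGTACGTACGT"
primer = allele_specific_primer(toy_template, snp_pos=22, allele="T", length=10)
```

With the toy template, the primer covers positions 13 to 22, ends in the chosen allele, and differs from the template at one additional internal base.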

Data acquisition and analysis
The gels were scored for the presence or absence of bands that showed a reproducible pattern among genotypes, and each band was treated as a SNP position with two alternative alleles: present (1) or absent (0). For wild emmer wheat, which is a self-pollinating species with a quite limited rate of outcrossing (estimated t approximately 0.005), we assumed 100% homozygosity. The identification of 16 SNP positions led to the construction of a data matrix of 198 accessions (two populations with fewer than 8 accessions were not used in this analysis) × 16 loci, which was analyzed for diversity within and between populations. POPGENE 1.32 [57] was used to compute genetic polymorphism (P), expected heterozygosity (Nei's gene diversity) (He), and Shannon's information index (I) for each SNP position and population. Spearman rank correlation coefficients were used to assess the associations between the genetic indices (P, He, and Shannon's information index) and climatic variables in the 16 populations. STATISTICA version 6.0 [58] was used to perform the PCA and to conduct stepwise multiple regression (MR). Multiple regression analysis was conducted to test the best predictors of P, He, and Shannon's information index in the 16 populations, using these genetic indices as dependent variables and the ecogeographic factors as independent variables at each of the polymorphic SNP loci.
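The three genetic indices can be illustrated from a 0/1 score matrix with a short Python sketch. It assumes, in line with the 100% homozygosity assumption above, that the frequency of band presence can be used directly as an allele frequency, and it applies the 5% criterion by calling a locus polymorphic when the most common allele does not exceed a frequency of 0.95. The function names and the toy matrix are hypothetical, and POPGENE's exact estimators may differ in detail.

```python
from math import log


def locus_indices(scores):
    """Return (He, Shannon I, is_polymorphic) for one locus of 0/1 scores."""
    p = sum(scores) / len(scores)                 # frequency of allele "1"
    freqs = [f for f in (p, 1 - p) if f > 0]      # skip absent alleles
    he = 1 - sum(f * f for f in freqs)            # Nei's gene diversity
    shannon = -sum(f * log(f) for f in freqs)     # Shannon's information index
    polymorphic = max(p, 1 - p) <= 0.95           # 5% criterion
    return he, shannon, polymorphic


def population_indices(matrix):
    """Average the per-locus indices over an accessions x loci matrix."""
    per_locus = [locus_indices(column) for column in zip(*matrix)]
    n = len(per_locus)
    return {
        "P": sum(1 for _, _, poly in per_locus if poly) / n,
        "He": sum(he for he, _, _ in per_locus) / n,
        "I": sum(i for _, i, _ in per_locus) / n,
    }


# Hypothetical toy population: 4 accessions scored at 3 SNP loci.
toy_matrix = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [0, 1, 1],
]
toy_result = population_indices(toy_matrix)
```

In the toy matrix, two of the three loci are polymorphic with equal allele frequencies and the third is fixed, so P is 2/3, mean He is 1/3, and mean I is two thirds of ln 2.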