Polymorphism at the apical membrane antigen 1 locus reflects the world population history of Plasmodium vivax

Background In malaria parasites (genus Plasmodium), ama-1 is a highly polymorphic locus encoding the Apical Membrane Antigen-1, and there is evidence that the polymorphism at this locus is selectively maintained. We tested the hypothesis that polymorphism at the ama-1 locus reflects population history in Plasmodium vivax, which is believed to have originated in Southeast Asia and is widely geographically distributed. In particular, we tested for a signature of the introduction of P. vivax into the New World at the time of the European conquest and African slave trade and subsequent population expansion. Results One hundred and five ama-1 sequences were generated and analyzed from samples from six different Brazilian states and compared with database sequences from the Old World. Old World populations of P. vivax showed substantial evidence of population substructure, with high sequence divergence among localities at both synonymous and nonsynonymous sites, while Brazilian isolates showed reduced diversity and little population substructure. Conclusion These results show that genetic diversity in P. vivax AMA-1 reflects population history, with population substructure characterizing long-established Old World populations, whereas Brazilian populations show evidence of loss of diversity and recent population expansion. Note Nucleotide sequence data reported in this paper are available in the GenBank™ database under the accession numbers EF031154 – EF031216 and EF057446 – EF057487.


Background
Studies of the population diversity of malaria parasites have practical significance for the development of strategies for disease control, including vaccine development [1]. Moreover, the characterization of genes responsible for resistance to therapeutic agents in both Plasmodium falciparum and Plasmodium vivax depends on a thorough knowledge of each parasite's genetic diversity in natural populations [2]. The primary factors affecting genetic diversity at such loci are natural selection [3,4] and genetic drift. Genetic drift reflects population history, including population bottlenecks, and it has a substantial effect on genetic diversity even at loci subject to balancing selection [5][6][7]. Thus, knowledge of the parasite's population history and its genetic diversity is important for a full understanding of the epidemiology of malaria and the potential response of the parasite to therapeutic strategies.
P. vivax is widely distributed geographically, being present in both tropical and temperate areas; this species is responsible for about 80 million annual cases of human malaria, especially in Latin America, Asia and Oceania [8]. It is the prevalent species in a great number of countries and territories, including Brazil, which accounted for about 81% of the approximately 460,000 cases reported in 2007 [9]. P. vivax infections rarely culminate in the death of the patient but are a very important cause of morbidity and socioeconomic loss [10]. P. vivax is believed to have first entered hominid populations in Southeast Asia and to have spread from there throughout the Old World, based on its close relation to malaria parasites of nonhuman primates from Southeast Asia [11]. However, there is archaeological evidence supporting the hypothesis that both P. vivax and P. falciparum were absent from the New World in pre-Columbian times and were introduced after European colonization, presumably as a result of the African slave trade [12]. Thus P. vivax in the New World might be expected to have a somewhat reduced effective population size and thus reduced genetic diversity in comparison to Old World populations, as a result of founder effects in the sampling of Old World populations. Microsatellite markers have shown evidence of a substantial reduction of genetic diversity in the case of South American P. falciparum [13]. On the other hand, in P. vivax, microsatellite markers revealed only a rather modest reduction in genetic diversity in South America in comparison to Asia [14].
At the ama-1 locus in P. falciparum, polymorphisms occur non-randomly along the coding region, and the highest polymorphism is found in the three ectodomains, especially in domain I [44]. Moreover, the number of nonsynonymous nucleotide substitutions per nonsynonymous site (d N ) exceeds that of synonymous nucleotide substitutions per synonymous site (d S ), providing evidence that positive Darwinian selection has acted at this locus [4]. Combined with the evidence of a high level of polymorphism at this locus, this result supports the hypothesis that balancing selection has acted to maintain polymorphisms at this locus [4]. It has been proposed that host immune system pressure is responsible for this selection [4]; and, consistent with this hypothesis, there is evidence that polymorphisms at this locus are responsible for evasion of host antibody-mediated inhibition in P. falciparum [45]. In P. vivax, d N has been found to exceed d S in partial ama-1 sequences [20], suggesting that this locus is subject to balancing selection in P. vivax as well.
Extensive data on ama-1 polymorphism in P. vivax (pvama-1) have been obtained from Asia, Oceania and Africa [20,31,32,33,35,36], but there is a relative lack of data from South America, including Brazil. The only sequence data from Brazil involve domain I of 20 isolates; 13 polymorphic sites and eight haplotypes were reported in three Brazilian states [34].
The intention of the present study was to characterize the worldwide genetic diversity of the polymorphic domain of pvama-1. In addition to published sequences from throughout the world, we obtained sequences from patients in different endemic areas in the Brazilian Amazon. By examining polymorphism at this locus in Brazil and comparing it to other populations throughout the world, we tested the hypothesis that the pattern of genetic diversity at pvama-1 reflects population history, in particular a reduction of the effective population size of P. vivax in the New World. Theoretically, it is expected that effective population size will be the major factor determining gene diversity even at a locus under balancing selection, if the mutation rate and selection coefficient are constant [5][6][7]. A more complete understanding of the parasite's history in the New World in turn has implications for the epidemiology and control of this parasite in Brazil, where it has become a major public health problem in recent years due to the rapid peopling of the Brazilian Amazon [46][47][48][49].

Results
A phylogenetic tree of Brazilian and worldwide sequences (Figure 1) showed no tendency toward geographic clustering of isolates. Rather, isolates from different parts of the world were found throughout the phylogenetic tree (Figure 1). The Brazilian sequences thus appeared to represent a sample from worldwide genetic diversity, rather than from any particular lineage of worldwide pvama-1 sequences.
In order to compare nucleotide diversity within geographic regions, we computed π, π S , and π N for all pairwise comparisons within Brazilian states and within worldwide regions (Figure 2). Likewise, we computed mean d, d S , and d N for all pairwise comparisons between Brazilian states and between worldwide regions. In worldwide comparisons, mean π, π S , and π N within regions were always significantly lower than the corresponding values of d, d S , and d N between regions (Figure 2). By contrast, mean π, π S , and π N within Brazilian states were not significantly different from the corresponding values of d, d S , and d N between states (Figure 2). Thus, pvama-1 did not show the degree of sequence divergence among the Brazilian states that was seen among different regions of the world. Mean π, π S , and π N within Brazilian states were significantly lower than the corresponding values within world regions (Figure 2). Likewise, mean d, d S , and d N between Brazilian states were significantly lower than the corresponding values between world regions (Figure 2). These results show that sequence divergence in pvama-1 among states in Brazil was lower than that in comparisons of different Old World populations.
Similar results were obtained from estimation of pairwise F ST , which provides an index of the genetic differentiation between populations. F ST values among different world regions were often significantly greater than zero, indicating genetic differentiation between populations (Table 1). By contrast, estimates of F ST among Brazilian states were never significantly different from zero, indicating a lack of genetic differentiation among the Brazilian states (Table 2).
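As a rough illustration of what a pairwise F ST measures, the sketch below uses one common sequence-based form, F ST = 1 - (mean within-population distance) / (mean between-population distance). This section does not state which estimator was actually used, so the formula, the function names (`p_distance`, `pairwise_fst`) and the toy sequences are all assumptions made for illustration only.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pairwise_fst(pop1, pop2, dist=p_distance):
    """Illustrative F_ST = 1 - within / between (an assumed estimator).

    Each population must contain at least two aligned sequences.
    The two within-population means are averaged without weighting.
    """
    within = mean(
        [mean(dist(a, b) for a, b in combinations(pop, 2)) for pop in (pop1, pop2)]
    )
    between = mean(dist(a, b) for a, b in product(pop1, pop2))
    if between == 0:
        return 0.0  # identical populations: no differentiation to measure
    return 1.0 - within / between


# Hypothetical toy populations (not data from this study).
fixed_difference = pairwise_fst(["AAAA", "AAAA"], ["TTTT", "TTTT"])
same_alleles = pairwise_fst(["AAAA", "TTTT"], ["AAAA", "TTTT"])
print(fixed_difference, same_alleles)
```

With an estimator of this form, values can fall below zero when sequences differ more within populations than between them, which is how slightly negative estimates can arise when populations are essentially undifferentiated.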
We plotted F ST against the geographical distance between the sites where samples were collected separately for data from Brazil and data from Asia and Oceania (Figure 3). In the data from Asia and Oceania, there was not a significant correlation between F ST and geographical distance (r = -0.196; n.s.; randomization test; Figure 3). By contrast, in Brazil, there was a strong positive correlation between F ST and geographical distance (r = 0.780; P < 0.01; randomization test; Figure 3). The correlation coefficient for the Brazilian data was significantly different from that for the Asian and Oceanian data (p < 0.01; randomization test).
The range of geographical distances among the Brazilian samples (450-2340 km) overlapped only with the lower nine values from the Asian and Oceanian sample (range 600-2240 km; Figure 3). If we considered only the nine data points in the Asian and Oceanian sample that overlapped the Brazilian data, there was again no significant correlation between F ST and geographical distance (r = 0.091; n.s.; randomization test; Figure 3). Moreover, for the nine Asian and Oceanian comparisons with geographical distances comparable to those in Brazil, mean F ST (0.111) was significantly greater than that for the 10 Brazilian comparisons (mean F ST = -0.011; randomization test; P < 0.01).

Discussion
Here, we characterized domain I of the polymorphic pvama-1 gene in Plasmodium vivax isolates from patients in the Brazilian Amazon, where this species poses an important public health problem, and compared those sequences with previously published sequences from the Old World. Although most branches in a phylogeny of pvama-1 sequences were not well resolved, it was clear that Brazilian sequences did not cluster separately from Old World sequences (Figure 1). This pattern supports the hypothesis that any reduction in population size accompanying the invasion of the Americas [12] was not so severe that only one or a few lineages of pvama-1 alleles survived in the New World. Rather, pvama-1 sequences from Brazil were found throughout the phylogenetic tree, consistent with the hypothesis that the alleles that became established in the New World represented a sample of worldwide genetic diversity at this locus. There was evidence of reduced genetic diversity at the pvama-1 locus in Brazil, consistent with some reduction in effective population size of P. vivax in the New World after its introduction.
There were very low values of F ST among the Brazilian states, with none being significantly greater than zero. The latter was in marked contrast to the Old World, particularly Southeast Asia, where high F ST values were consistently observed. Of course, the geographical distances among the Brazilian states sampled were low in comparison to many of the geographical distances among samples from Asia and Oceania (Figure 3). Nonetheless, mean F ST among the Brazilian samples was much lower than that among those samples from Asia and Oceania taken from comparable geographical distances. Thus, pvama-1 shows strikingly less geographical differentiation in Brazil than in Southeast Asia, consistent with a recent and rapid spread of the parasite in Brazil.
Figure 1 Phylogenetic tree of unique Brazilian and worldwide pvama-1 sequences.
There is evidence that the polymorphism at the ama-1 locus is selectively maintained in P. falciparum, with host immune recognition likely being responsible for that selection [44]. Given the high polymorphism and prevalence of nonsynonymous polymorphisms at the pvama-1 locus, it seems likely that the same is true in P. vivax [20]. An alternative hypothesis to account for the reduced polymorphism in Brazilian pvama-1 sequences might be that selection at this locus has been relaxed in the New World. In the Brazilian Amazon, P. vivax has achieved high levels of infection in an ethnically diverse and rapidly growing host population [46]. If the selection on pvama-1 arises primarily from interaction with the human host immune system, it seems unlikely that selection would be relaxed under such circumstances. However, as long as the basis of natural selection on pvama-1 remains poorly known, it is impossible to rule out some role of natural selection in the pattern of sequence diversity observed in the New World.
In spite of the overall low F ST values in the Brazilian samples, there was evidence in Brazil of a strong positive relationship between F ST and geographical distance. By contrast, in the Old World, even though F ST values were high, there was no correlation between F ST and geographical distance. The latter was observed both in an extensive sample of populations from Asia and Oceania and when we examined only populations whose geographical distances were comparable to those among Brazilian states. The results from Brazil can be explained as reflecting effects of recent spread of the parasite, whereas those from the Old World appear to reflect a very ancient selectively maintained polymorphism. In the latter case, different populations are expected to show substantially different allelic frequency distributions due to divergent population histories, including the effects of genetic drift. Such a pattern is seen, for example, in the case of vertebrate major histocompatibility complex loci [50], at which high levels of polymorphism are maintained by balancing selection [51,52].

Conclusion
Our results are consistent with the hypothesis that patterns of genetic diversity at highly polymorphic protein-coding loci of malaria parasites can show the effects of population history. Polymorphism at loci such as pvama-1 that are evidently subject to immune-driven selection may be an important factor in the epidemiology of infection by P. vivax. Understanding the factors governing the extent and pattern of polymorphism at such loci may thus have implications for the development of effective control strategies [53].
Figure 2 Means of (A) π, (B) π S , and (C) π N within Brazilian states and within worldwide regions; and of (A) d, (B) d S , and (C) d N between Brazilian states and between worldwide regions. Test of the hypothesis that a value for Brazil equals the corresponding value for worldwide comparisons: * P < 0.05; *** P < 0.001. Tests of the hypothesis that mean value within regions equals the corresponding value between regions: +++ P < 0.001.

DNA extraction and amplification of pvama-1
Blood samples were stored in 4 M guanidine and kept at -20°C. DNA was extracted from 300 μL of whole blood following the manufacturer's instructions for the Genomic DNA Purification Kit (Puregene®). The pvama-1 gene was amplified following a previously described protocol [20].
Figure 3 Plot of F ST vs. geographical distance for Brazilian samples (green) and samples from Asia and Oceania (red). In the data from Brazil, there was a significant correlation between F ST and geographical distance (r = 0.780; P < 0.01; randomization test). In the data from Asia and Oceania, there was not a significant correlation between F ST and geographical distance (r = -0.196; n.s.; randomization test).

Sequence analysis
New sequences from Brazil were combined with a database of 215 sequences from Asia, Africa, and Oceania (see Additional file 1) and aligned with the ClustalW software [54]. A 399-bp region was analyzed, corresponding to bases 322-720 (amino acid residues 108-240) of GenBank accession L27503. We used the MEGA 3.1 program [55] to estimate nucleotide diversity and evolutionary distances and to build phylogenetic trees by the neighbor-joining method [56], using the Jukes-Cantor distance [57]. The reliability of clustering patterns in the phylogenetic trees was assessed by bootstrapping [58] with 1000 bootstrap pseudo-samples. Before conducting the phylogenetic analysis, we tested for inter-allelic recombination using the maximum chi-square method [59] as implemented in the RDP2 program [60]. No recombination events were detected. The number of nucleotide substitutions per site (d) was estimated by Jukes and Cantor's method [57]. [...] (5 sequences); South Korea (4 sequences); Sri Lanka (3 sequences); Thailand (7 sequences); Vanuatu (2 sequences). Finally, in order to analyze nucleotide diversity within geographic regions other than Brazil, we computed means of d, d S , and d N for all pairwise comparisons within each of the above regions that was represented by at least two sequences. Following general usage, means of d, d S , and d N within populations were designated respectively π, π S , and π N .
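The distance and diversity measures described above can be sketched in a few lines. The example below is a minimal illustration, not the MEGA implementation: it computes the Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites, and takes π as the mean of d over all pairs within one population. The function names and the decision to skip sites with gaps or ambiguous bases are assumptions.

```python
import math
from itertools import combinations


def jukes_cantor_distance(seq_a, seq_b):
    """Jukes-Cantor estimate of substitutions per site for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: compare only sites where both sequences carry A, C, G or T.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # proportion of differing sites
    if p >= 0.75:
        raise ValueError("p >= 0.75: the Jukes-Cantor distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def nucleotide_diversity(seqs):
    """pi: mean pairwise Jukes-Cantor distance within one population."""
    dists = [jukes_cantor_distance(a, b) for a, b in combinations(seqs, 2)]
    if not dists:
        raise ValueError("need at least two sequences")
    return sum(dists) / len(dists)


# Hypothetical toy alignment (not data from this study).
print(nucleotide_diversity(["ACGT", "ACGT", "ACGA"]))
```

The same pairwise function applied to sequences drawn from two different populations gives the between-population mean d that is compared with π in Figure 2.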
Pairwise comparisons of d, d S , and d N are not statistically independent. Therefore, we tested hypotheses about the means of these variables using randomization (Monte Carlo) tests. Given N comparisons categorized (e.g., as within-region or between-region) by a classificatory variable X, in order to conduct simultaneous pairwise comparisons between categories with respect to the median of some continuous scalar variable Y measured on each of the N units (e.g., d, d S , or d N ), we created 1000 pseudo data sets of N units each by randomly sampling (with replacement) independently from the vector of X values and from the vector of Y values. For a two-tailed test, the level of significance of the difference between two group medians was obtained by comparing the observed absolute difference with the distribution of absolute differences obtained for the corresponding groups in the 1000 pseudo data sets.
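The resampling scheme just described can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: the function names, the `stat` parameter (median by default, as in the description, with the mean as an alternative), and the choice to skip pseudo data sets in which one group happens to be empty are all assumptions.

```python
import random
from statistics import mean, median


def group_difference(labels, values, group_a, group_b, stat=median):
    """Absolute difference in `stat` between two groups, or None if a group is empty."""
    a = [v for lab, v in zip(labels, values) if lab == group_a]
    b = [v for lab, v in zip(labels, values) if lab == group_b]
    if not a or not b:
        return None
    return abs(stat(a) - stat(b))


def randomization_test(labels, values, group_a, group_b,
                       n_pseudo=1000, stat=median, seed=None):
    """Two-tailed randomization test; returns (observed difference, P value)."""
    rng = random.Random(seed)
    observed = group_difference(labels, values, group_a, group_b, stat)
    if observed is None:
        raise ValueError("both groups must be present in the data")
    n = len(values)
    usable = extreme = 0
    for _ in range(n_pseudo):
        # Resample X and Y independently, with replacement, to break any association.
        pseudo_labels = rng.choices(labels, k=n)
        pseudo_values = rng.choices(values, k=n)
        diff = group_difference(pseudo_labels, pseudo_values, group_a, group_b, stat)
        if diff is None:
            continue  # assumption: ignore pseudo data sets that lack one group
        usable += 1
        if diff >= observed:
            extreme += 1
    if usable == 0:
        raise ValueError("no usable pseudo data sets")
    return observed, extreme / usable


# Hypothetical values (not data from this study).
labels = ["within"] * 20 + ["between"] * 20
values = [0.01 * i for i in range(20)] + [1.0 + 0.01 * i for i in range(20)]
print(randomization_test(labels, values, "within", "between", stat=mean, seed=1))
```

Because the null distribution is built by resampling the observed values themselves, the test does not require the pairwise comparisons to be independent or normally distributed.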
We used a similar randomization procedure to test the significance of correlation coefficients between pairwise measures of F ST and geographical distance. We created 1000 pseudo data sets by sampling at random with replacement in order to generate a null distribution against which observed values were compared. We used a similar procedure to test the equality of mean F ST in the Brazilian data with those from Asian and Oceanian populations of comparable geographic distance.
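A minimal sketch of such a test for a correlation coefficient is given below. It assumes a Pearson correlation, independent resampling of the two variables with replacement, and a two-tailed comparison of absolute values; none of these details is spelled out in this section, and the function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient, or None if either variable is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlation_randomization_test(x, y, n_pseudo=1000, seed=None):
    """Returns (observed r, two-tailed P value from the resampled null distribution)."""
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    if observed is None:
        raise ValueError("correlation is undefined for constant data")
    n = len(x)
    usable = extreme = 0
    for _ in range(n_pseudo):
        # Independent resampling of x and y removes any real association.
        r = pearson_r(rng.choices(x, k=n), rng.choices(y, k=n))
        if r is None:
            continue  # assumption: skip degenerate pseudo data sets
        usable += 1
        if abs(r) >= abs(observed):
            extreme += 1
    if usable == 0:
        raise ValueError("no usable pseudo data sets")
    return observed, extreme / usable


# Hypothetical distances (km) and F_ST values (not data from this study).
distance_km = [450, 800, 1200, 1500, 1900, 2340]
fst = [-0.03, -0.01, 0.00, 0.01, 0.02, 0.04]
print(correlation_randomization_test(distance_km, fst, seed=1))
```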

Authors' contributions
PG carried out the molecular analyses, participated in statistical analyses, and drafted the manuscript. CJFF was responsible for acquisition of the data. ALH participated in design of the study and performed statistical analyses. EMB conceived the study and participated in its design and coordination.