Phylogeny of seven Bulinus species originating from endemic areas in three African countries, in relation to the human blood fluke Schistosoma haematobium

Background Snails species belonging to the genus Bulinus (Planorbidae) serve as intermediate host for flukes belonging to the genus Schistosoma (Digenea, Platyhelminthes). Despite its importance in the transmission of these parasites, the evolutionary history of this genus is still obscure. In the present study, we used the partial mitochondrial cytochrome oxidase subunit I (cox1) gene, and the nuclear ribosomal ITS, 18S and 28S genes to investigate the haplotype diversity and phylogeny of seven Bulinus species originating from three endemic countries in Africa (Cameroon, Senegal and Egypt). Results The cox1 region showed much more variation than the ribosomal markers within Bulinus sequences. High levels of genetic diversity were detected at all loci in the seven studied species, with clear segregation between individuals and appearance of different haplotypes, even within same species from the same locality. Sequences clustered into two lineages; (A) groups Bulinus truncatus, B. tropicus, B. globosus and B. umbilicatus; while (B) groups B. forskalii, B. senegalensis and B. camerunensis. Interesting patterns emerge regarding schistosome susceptibility: Bulinus species with lower genetic diversity are predicted to have higher infection prevalence than those with greater diversity in host susceptibility. Conclusion The results reported in this study are very important since a detailed understanding of the population genetic structure of Bulinus is essential to understand the epidemiology of many schistosome parasites.


Background
Snails of the genus Bulinus (Müller, 1781) serve as intermediate hosts for larval development of the parasite species belonging to the Schistosoma haematobium species group in Africa, the Eastern Mediterranean and Madagascar [1], as for other Trematode species like paramphistomes. The S. haematobium blood flukes are responsible for 112 million infections in Africa with an incidence exceeding 50% in some communities [2]. The geographic distribution of these parasites is determined by the presence of their intermediate hosts [3]. Indeed, members of the Bulinus genus have an extensive distribution throughout Africa, some areas of the Middle East and the Mediterranean countries [4]. It is divided into four species groups: B. africanus, B. forskalii, B. reticulatus and the B. truncatus/tropicus complex [5]. Among the 37 species that compose these four groups, several are known or suspected to act as intermediate hosts for larval stages of the Schistosoma parasites. However, the role of each species in the transmission of the disease varies considerably between and within countries [4]. For example, in Cameroon B. truncatus is more implicated than B. globosus in the transmission of urinary schistosomiasis [6], while this is not the case in Senegal, where B. globosus, B. senegalensis and B. umbilicatus are known as the main intermediate hosts for S. haematobium [7].
The interpretation of these specific snail-parasite relationships requires a deeper understanding of the past events that have affected the population diversity and distribution of Bulinus species. Furthermore, such information is essential for assessing the geographical distribution of variation in genes associated with resistance in bulinid snails. Previous studies showed that the above four Bulinus species groups were stable throughout the majority of phylogenetic analyses and demonstrated the monophyletic status of these groups with both variable and highly conserved genes [8][9][10]. However, these studies were restricted to the basic phylogenetic level and no phylogeographic studies including estimates of haplotype and nucleotide diversity were performed.
In this investigation, we used a suite of population genetic and phylogenetic analyses to characterize populations of seven Bulinus species collected from endemic areas in three African countries (Cameroon, Senegal and Egypt), using one mitochondrial gene (the protein coding cytochrome oxidase subunit I (cox1) gene and three nuclear genes (18S rRNA, 28S rRNA, and the Internal Transcribed Spacer (ITS)). We tried to explain their present diversity in relation with schistosomiasis transmission.

Population genetic analysis
Mitochondrial genome diversity A data matrix of 1031 bp was used in this analysis (644 bp from the Folmer region and 387 bp from the Asmit region). All mtDNA sequences were found to be highly variable. Within species, the global haplotype diversity was very high in our samples ranging from 0.85 in B. umbilicatus to 0.99 for B. senegalensis. Furthermore, our estimate of nucleotide diversity ranged from 0.016 in B. camerunensis population to 0.052 for B. senegalensis (Table 1).
Within countries, as expected haplotype diversity is lower but exceeds 0.800 in the three countries ( Table 1). The number of B. truncatus unique haplotypes found in the Northern region of Africa (Egypt) is lower than that found in the Sub-Saharan Africa (Cameroon and Senegal). There is also higher nucleotide diversity found within B. truncatus populations sampled from Cameroon and Senegal compared to Egypt.

Ribosomal genome diversity
Ribosomal RNA (18S) A data matrix of 852 bp was used in this analysis and the sequences were found to be highly similar. The highest haplotypes diversity was again observed within B. senegalensis species (Hd = 0.86), while the lowest one was observed within B. truncatus (Hd = 0.157), confirming the results obtained from the cox1 data (Table 1).
Ribosomal RNA (28S) A data matrix of 855 bp was used in this analysis, the sequences were found to be more variable than 18S, and therefore higher values of haplotype and nucleotide diversity were observed (Table 1). Haplotype diversity exceeds 0.5 in all species, but again the highest Hd was detected in B. senegalensis (Hd = 0.95).
Ribosomal RNA (ITS) ITS data was only available for six species from 17 different localities. Therefore, these data were only analyzed at a basic phylogenetic level and not at the population level. ITS sequences had an extreme amount of length variation, with a range of 600 to 990 bp. The shortest ITS lengths were noted in B. globosus, thus indicating probable deletion events. The largest ITS sequences were found in B. tropicus species. The sequences were found to be highly variable.

Genetic divergence between and within species
The diversification index (Fst) between the seven studied species was calculated using the combined data matrix ( Table 2). The largest divergence was observed between B. forskalii and B. umbilicatus populations (Fst = 0.903), whereas those of B. truncatus and B. tropicus are the most closely related (Fst = 0.206), as expected since these two species are part of the same group.
The average within species diversity using the combined data matrix ranged from 0.005 to 0.040 (Table 3). The most heterogeneous species were B. senegalensis and B. forskalii, while individuals from B. umbilicatus are the most homogeneous, but this might be the result of the small simple size of the latter species.
Demographic history DnaSP v5.10.1 was used to perform Tajima's and Fu's neutrality tests in order to detect violations of mutationdrift equilibrium caused by selection or changes in population size. Within B. truncatus, the two tests were significantly negative using cox1 sequences, while no significant results were obtained when samples from individual countries were analyzed, perhaps in part because of the smaller sample sizes. Tajima's D was significantly negative in the 28S dataset. Although these two statistics were originally introduced as tests of neutrality, Fu's test is considered a powerful test to detect past population expansion. Therefore, significant negative values, such as those seen here, can indicate population expansion especially in Cameroonian and Senegalese B. truncatus populations. Within other species, the estimates of the two statistics were not significant in all regions, which indicate neutral evolution for these species in the studied countries.

Phylogeny reconstruction
The phylogenetic trees produced for all gene fragments together and for each gene separately were very similar, as were the threes obtained using three different treebuilding algorithms (maximum likelihood (ML), maximum parsimony (MP) and neighbor-joining (NJ)). All analyses confirm the monophyletic status of the Bulinus genus (bootstrap support of 93.3%; see Figure 1).

Combined data matrix
The results from all genes were combined together for 40 Bulinus specimens, as this improved resolution and resulted in higher bootstrap values for internal nodes compared to the single gene trees. The three different treebuilding algorithms were largely congruent ( Figure 1). All seven species were supported by high bootstrap values and split into two clusters: cluster A comprised members of the B. truncatus/tropicus complex and the B. africanus group, while cluster B represented the B. forskalii group.
The cluster A is split into two subclades (bootstrap value of 94%), the first subclade C contains sequences that are restricted to either B. truncatus or B. tropicus, with Cameroonian B. truncatus haplotypes branching off earlier than those from Senegal and Egypt. This could indicate that these B. truncatus haplotypes were introduced into Egypt from Sub-Saharan countries. The second subclade D includes B. globosus and B. umbilicatus haplotypes.
As for the second cluster B, the haplotypes split again into two subclades supported by bootstrap values of 99%. Subclade E includes sequences from B. forskalii and B. camerunensis, while subclade F contains sequences that are restricted to B. senegalensis, which is itself split into two groups: the first with Cameroonian haplotypes and the second with Senegalese haplotypes ( Figure 1).

mtDNA (Cox1 Folmer and Asmit regions)
The cox1 region carried a strong phylogenetic signal, with clear haplotype clustering and high bootstrap support, although with higher bootstrap values for the Folmer region ( Figure 2). The phylogenetic analysis indicated the same clustering as found with the combined data matrix. However, due to the higher number of cox1 sequences individuals could be grouped according to the country of origin. For example three clear B. truncatus clusters could be identified using both Folmer and Asmit regions, with again Cameroonian B. truncatus haplotypes branching off earlier than those from Senegal and Egypt ( Figure 2). In addition, the B. senegalensis haplotypes split into two lineages, the first originated from Cameroon and the second from Senegal, the clustering within Senegal was however less supported ( Figure 2).

Nuclear rRNA (18S, 28S and ITS)
The phylogenetic trees generated using the nuclear rRNA (18S, 28S and ITS) confirm the major clades described by both the combined data matrix and the cox1 region ( Figure 3). However, the 18S region was not able to classify the sequences according to their species. This is the case of the B. africanus group where there was no clear segregation between B. globosus and B. umbilicatus. Similarly, the three species within the B. forskalii group, B. senegalensis, B. camerunensis and B. forskalii all grouped together ( Figure 3).

Population genetic analysis
The mtDNA sequences showed high haplotype and nucleotide diversity, with a clear segregation according to  species and locality. Many different haplotypes were found, with approximately one out of three to four snail specimens having a different haplotype within the same locality. This high level of genetic diversity was higher than that found by Nalugwa et al. [11] in B. truncatus and B. forskalii (from Cameroon) where one snail out of six to seven individuals had different mtDNA haplotypes using the same fragment [11]. Of the nuclear fragments, 18S was the least variable, while 28S showed an intermediate level of nucleotide variation compared to the ITS fragment which was the most variable. These results are comparable to that found in the work of Jørgensen et al., [8] where they also observed a lower variation in 18S compared to 28S in 26 Bulinus taxa representing the four species groups currently recognized in this genus from different African countries [8]. Despite this intermediate nucleotide diversity in 28S, the haplotype diversity was high and comparable to that found with the cox1 region.
This high genetic diversity detected at all loci, both within and between species is probably influenced by several factors: i) Firstly, geographical isolation is probably the most dominant factor to explain the global diversity within species from different countries. In fact, Figure 1 Detailed view of the seven Bulinus species within the maximum-likelihood estimates (1000 bootstrap replicates) using combined data matrix (All genes together with TVM + G as selected model). An asterisk indicates node support ≥ 80 for all methods used (parsimony, distance, and likelihood bootstrap analyses), with Indoplanorbis exustus as outgroup. Scale bar indicates 2 substitutions per 100 sites. Each terminal branch in the tree is marked using a letter country code (C: Cameroon, E: Egypt, and S: Senegal). Each internal branch represent monophyletic groups. Bulinus populations investigated in the present study originated from geographically distant countries, limiting gene flow between them. This reproductive isolation might have allowed independent evolution and divergence of these populations leading to distinct lineages. Indeed, our results showed that B. truncatus and B. senegalensis populations from the same country clustered together and formed different lineages (Figure 2). ii) Secondly, Bulinus species are hermaphrodite and can reproduce both by cross-fertilization and/or self-fertilization [12], and each of these modalities (See figure on previous page.) Figure 2 Detailed view of the seven Bulinus species within the maximum-likelihood estimates (1000 bootstrap replicates) using cox1 Folmer and Asmit region (All genes together with GTR + G as selected model). An asterisk indicates node support ≥ 80 for all methods used (parsimony, distance, and likelihood bootstrap analyses), with Indoplanorbis exustus as outgroup. Scale bar indicates 2 substitutions per 100 sites for Cox Folmer region and 5 substitutions per 100 sites for Cox Asmit region. induces different genetic consequences [13]. B. globosus and B. senegalensis are mainly outcrossing and therefore they showed high levels of genetic variability. iii) Thirdly, despite the fact that B. truncatus is preferentially self-fertilizing, it is a tetraploid species [14], and studies have shown that diversification is high in polyploid species [15], which might explain this relatively high haplotype diversity detected within this snail species.

Snail genetic diversity and host parasite compatibility
The relationship between Schistosoma and Bulinus species is very specific and varies in according to geographic location [7,16]. For schistosome -snail interactions, empirical data suggest local adaptation of parasites which result in higher infection levels of local snail populations [17]. This in turn could affect the genetic diversity of these snails from one infection site to another [17]. In Cameroon, our results showed lower levels of genetic diversity for B. truncatus than for B. globosus and B. senegalensis at all loci. In fact, according to Schmid-Hempel [15], and Coltman et al., [19], a loss of genetic variation in host population can increase susceptibility to parasitism [18,19], basing on this statement higher infection and transmission rates should be expected in those B. truncatus populations. Indeed, experimental infections of B. globosus and B. truncatus from Cameroon with different isolates of local miracidia were carried out to study snail-parasite compatibility [6]. Their findings suggest that B. truncatus might be more implicated than B. globosus in S. haematobium transmission.
On the other hand, B. forskalii is ubiquitous in Cameroon and it is often present in association with B. senegalensis. It acts as intermediate host for S. guineensis [20]. The results reported in the present study revealed very high levels of genetic diversity in B. forskalii at all loci, which is in accordance with the work of Gow et al. [20], who have detected high level of microsatellite polymorphism and gene diversity within B. forskalii populations in Cameroon. Furthermore, according to Tchuem Tchuenté et al. [1], there is a decrease or cessation in S. guineensis transmission in some major foci in Cameroon, indicating a progressive extinction for this parasite species in this country [20]. This parasite extinction is most likely the result of hybridization between S. guineensis and S. haematobium, but also is maybe influenced by the high levels of genetic diversity within Cameroonian B. forskalii populations since a gain of genetic variation can decrease susceptibility to parasitism [19].
In Egypt, only B. truncatus acts as intermediate host for urinary schistosomiasis. B. truncatus snails are recorded in water bodies throughout Egypt. It is most abundant in large canals, and decreases in density as the water approaches and flows into drains [21]. In this country, urinary schistosomiasis is prevalent along the Nile Valley from the Delta region to Upper Egypt. A seasonal analysis showed that B. truncatus snails are yearround infected with schistosomes [22]. In the present study, we have noted a relatively high genetic diversity within Egyptian B. truncatus populations. In parallel, according to El-Khoby 2000, there is a decrease in the prevalence of S. haematobium in Egypt during the last years, and S. mansoni has almost totally replaced S. haematobium in Lower Egypt and is spreading into Upper Egypt [23]. Furthermore, the haplotypes detected from the Nile Delta were different from those found in Upper Egypt. Prevalence of S. haematobium in Upper Egypt sites was higher than that found in the Nile Delta [23]. This might be explained by a variety of factors that are probably related to a lower socioeconomic status of people who live in rural sites (Upper Egypt) and are more employed in agriculture so that they are more exposed to infection. Another factor might be related to higher genetic diversity within populations of B. truncatus in Nile Delta.
In Senegal, our results showed higher diversity in B. truncatus populations than those in Egypt and Cameroon. This might be linked to the fact that Senegalese B. truncatus populations are implicated in the transmission of S. bovis and S. haematobium x S. bovis hybrids, but not of pure S. haematobium [24], suggesting a strong role for parasite genetics in influencing the diversity of their intermediate hosts.

Phylogeny of bulinids from Cameroon, Egypt and Senegal in relation to schistosomiasis
Schistosomiasis transmission depends on the active role of the intermediate host. The disease is thus closely related to rural development of water resources and snail population expansion [25]. Therefore, we cannot talk about parasite distribution without referring to the distribution of its intermediate host. Our results showed lower diversity within B. truncatus populations originating from North Africa (Egypt) compared to those originating from Sub-Saharan Africa (Cameroon and Senegal) at all loci. Many factors can explain this lower diversity in Egypt such as selection, drift or probably a relatively recent introduction since specimens from Cameroon and Senegal are basal to them in lineage C (Figures 1  and 2), which may suggest that B. truncatus may have colonized Egypt from Sub-Saharan Africa. In addition, a previous hypothesis suggested that schistosomiasis has appeared earlier in various parts of sub-Saharan Africa and then became widely distributed in North Africa during prehistoric wet phases [26]. In fact, the physical and human environment of the Nile valley and delta provided increasingly favorable conditions for the transmission and

Taxon sampling and DNA extraction
Snails of the Bulinus genus were sampled from Cameroon, Egypt and Senegal. The climate is of Mediterranean type in Egypt, tropical or equatorial in Cameroon, and mainly tropical in Senegal. This allowed the discovery of as much genetic variability as possible even within the same species from different countries.

DNA amplification and sequencing
Four markers were selected for sequencing: the protein coding cytochrome c oxidase subunit I (cox1), and three nuclear ribosomal genes (ITS, 18S and 28S). Partial sequences from the four genes were obtained after PCR amplification. Various combinations of the used primers are presented in Table 5.
For cox1 and ITS, 25 μL PCR mixtures were set up with 0.5 U Taq polymerase (Qiagen), 5 μL 10x buffer, 2-4 μL MgCl2 (depending on template and primers), 0.8 μL dNTPs, 0.5 μL from each primer (20 ng/μL). For 18S and 28S rRNA; amplification was performed with High Fidelity Taq DNA polymerase (Roche) in a 50 μL total reaction volume with standard reaction conditions since this enzyme is designed to increase yield and fidelity when amplifying longer fragments. A standard concentration of 500 ng/μL was used as template for all PCR reactions.
The PCR conditions varied for each primer combination. Cycling conditions for both cox1 and ITS were as follows: preheat step at 95°C for 5 min followed by 35 cycles of denaturation at 94°C for 10 sec, annealing at 40°C for 30 sec and extension at 72°C for 1 min; final extension step at 72°C for 10 min. Amplification for both 18S and 28S involved i) a preheat step at 94°C for 2 min, ii) 10 cycles, each including denaturation at 94°C for 15 sec, annealing at 43°C for 30 sec and extension at 72°C for 45 sec, iii) 15 cycles including the same conditions except the extension step which was expanded to 5 sec per cycle, iv) a final extension step at 72°C for 10 min.
PCR products were purified using QIAquick PCR Purification Kit (Qiagen®, France), then sequenced using an ABI Prism BigDye® Terminator v1.1 Cycle Sequencing Ready Reaction Kit (PE, Applied Biosystems) following manufacturer's instructions, and run on an ABI 377 automated sequencer.

Data analyses
The obtained sequences were used to perform BLAST searches [30] via the National Center for Biotechnology Information (NCBI) GenBank to assure that the sequences matched the sequenced genes. Sequence chromatograms were checked using SeqScape v2.5. Nucleotide sequences were multiple aligned using CLUSTAL X [31], and submitted to GenBank database under the accession numbers [GenBank: KJ135287-KJ13508 and KJ157326-KJ157508].

Population genetic analysis
DnaSP v5.10.1 [32] was used to estimate global and regional levels of haplotype and nucleotide diversity for the seven species, and to conduct Tajima's and Fu's neutrality tests for patterns on non-neutral evolution within each species. We also used DnaSP to estimate the diversification index (Fst) between each pair of species. MEGA v5 [33] was used to estimate the average evolutionary divergence over sequence pairs within species for each gene. The optimal model of sequence evolution that best fits the data was evaluated using FindModel: for cox1-Folmer region (GTR + G), Cox Asmit-region (GTR + G), 18S (GTR + I), 28S (K81 + G) and combined data matrix (GTR + I + G). Standard error estimates were obtained by a bootstrap procedure (500 replicates).

Phylogenetic analyses
In the case of interpopulation diversity, data matrices were constructed (with unique haplotypes) for individual genes separately, and then for all combined genes. All phylogenetic analyses were carried out using MEGA. Evolutionary relationships between the haplotypes were inferred using maximum parsimony (MP), maximum likelihood (ML), and Neighbor-joining (NJ) methods. The different types of analyses were subjected to 1000 bootstrap replicates as a means for testing the reliability of individual branches within the generated tree.
Indoplanorbis exustus (Deshayes, 1834) (Planorbidae: Bulininae) was selected as outgroup in analyses of cox1 (Folmer and Asmit regions), 18S and 28S, as Morgan et al., [34] had placed this genus as sister group to Bulinus [34]. Trees using ITS data matrix were constructed with Biomphalaria helophila as outgroup since no ITS sequence of I. exustus was available. Parsimony inference was performed via a heuristic search using 1000 replicates of random sequence entry, tree-bisection-reconnection (TBR) branch swapping. Maximum likelihood and NJ trees were performed with the inferred substitution models as mentioned above.