Our study examined both intra- and inter-specific genetic variation of two sequence fragments each from reproductive genes Zp2 and Zp3, in seven representative species from the Bovini tribe. A general pattern of low levels of sequence polymorphism and divergence has been detected for those two sequenced fragments (Tables 1 and 2). Due to small sample size for some species studied (e.g., 2 individuals for European bison), the low level of intraspecific variation could be, at first sight, attributable to the effect of sample size. In fact, such a pattern of low polymorphism and divergence was not really ascribed to the sample size, except for an extreme case of European bison, but to the sequence conservation of those two sequenced fragments. Several observations provided strong support to this scenario. First, the sample sizes for domestic cattle (44 for B. taurus, 34 for B. indicus) were not such small, but the pattern of low levels of intraspecific variation still remained (Table 1). Second, the codon-based Z-tests revealed evidence for purifying selection in pairwise comparisons of the Zp2 and the Zp3 coding haplotype sequences (see Additional files 3 and 4), suggesting the evolution of the sequenced fragments of the two genes under functional and structural constraints. Finally, the sequenced fragments of the Zp2 (from amino acids 238 to 311) and the Zp3 (from amino acids 164 to 229) are located at zona pellucida domains, which have been demonstrated to be well conserved in a wide variety of mammalian species [51, 52].
It is worthy of note that several haplotypes were shared by multiple bovine species for both genes, for instance, the haplotype Zp2H2 (or Zp2cdh2) shared by five species from Bos and Bison (see Figures 1, 2a, and 3a). There were, at least, eight coding SNPs shared by species from Bos and Bison for the two genes, segregating from the outgroup species, i.e., the domestic goat and sheep (Figures 2a and 3a). Such a pattern of haplotype sharing can be explained by either introgressive hybridization or retention of ancestral polymorphism . The hypothesis of introgressive hybridization is plausible in that interspecific hybridization among species from Bos or Bison often occurs [32–36]. However, a recent study attempting to reconstruct phylogeny of the Bovini tribe revealed that shared haplotypes among bovine species might be the result of the retention of ancestral polymorphism .
To test whether retention of ancestral polymorphism contributed to haplotype sharing among species from Bos and Bison and to identify specific shared polymorphisms, we looked into patterns of mutations at both sequenced fragments of the Zp2 and the Zp3 genes. A fixed coding mutation (G > A) at position 5463 of the Zp2 was segregated between non-buffalo (Bos and Bison with 'A') species and buffalo (Bubalus and Syncerus with 'G') species or the outgroup species (Goat and sheep with 'G') (see Additional file 1); A fixed non-coding mutation (G > A) at position 3270 and a fixed coding mutation (C > T) at 3418 of the Zp3 were also completely differentiated between non-buffalo (Bos and Bison with 'A' at 3270 and 'T' at 3418) species and buffalo (Bubalus and Syncerus with 'G' at 3270 and 'C' at 3418) species or the outgroup species (Goat and sheep with 'G' at 3270 and 'C' at 3418) (see Additional file 2). It is clear that those three shared polymorphisms among Bos and Bison species were derived from their most recent common ancestor after divergence from buffalo (Bubalus and Syncerus) species, but were retained during divergence and evolution of Bos and Bison species. Thus, such shared polymorphisms provide unequivocal evidence that haplotype sharing among Bos and Bison species can be, at least, partially attributed to retention of ancestral polymorphism.
In reality, it is difficult to determine whether these shared haplotypes are remnants of ancestral polymorphism or traces of recent introgression events , but a recent study suggests that geographic distribution of shared haplotypes would help to disentangle introgression from ancestral polymorphism . To further define a specific shared haplotype due to one of two hypotheses, we mapped all coding haplotypes of both genes on their geographic distribution (Figures 2b and 3b). For the Zp2, the shared haplotypes Zp2cdh1 (equal to Zp2H1, Zp2H3 in Figure 1a) and Zp2cdh2 (equal to Zp2H2, Zp2H6 in Figure 1a) are found in samples of multiple Bos species across Africa (Morocco and Egypt), Middle East (Yemen), Central Asia (Turkmenistan, Kyrgyzstan, and Kazakhstan), South Asia (India and Myanmar), and Northeast Asia (Mongolia and China). For the Zp3, the shared haplotype Zp3cdh1 (equal to Zp3H1, Zp3H2, and Zp3H9 in Figure 1b) is also found in samples of multiple Bos species across Africa, Middle East, Central Asia, South Asia, and Northeast Asia. However, the shared haplotype Zp3cdh3 (equal to Zp3H4 in Figure 1b) is only detected in B. indicus from India and B. frontalis from Myanmar, while the shared haplotype Zp3cdh4 (equal to Zp3H5, Zp3H8, and Zp3H10 in Figure 1
b) is only present in B. indicus from India and B. grunniens from China. Given that, interspecific hybridization between Bos species mainly occurs in South Asia and Central Asia [32–35], and therefore the shared haplotypes Zp3cdh3 and Zp3cdh4 can be simply explained by hybridization events. Since there is no clear evidence for hybridization between Bos species and European bison (B. bonasus), although well-documented evidence for hybridization between B. taurus and American bison (B. bison) exists , the haplotype Zp2cdh2 shared by Bos and Bison species is most probably due to retention of ancestral polymorphism, rather than to introgressive hybridization. When scrutinizing phylogenetic relations among coding haplotypes of both genes as shown in networks (see Figures 2a and 3a), both the shared haplotypes Zp2cdh1 and Zp3cdh1 are located at external nodes of network, compared to the outgroup species through buffalo (Bubalus and Syncerus) species. Based on the assumption that on average the external haplotypes in the haplotype networks should be younger than those internal ones , the haplotypes Zp2cdh1 and Zp3cdh1 shared by multiple Bos species represent relatively recent derived haplotypes at high-frequency, being compatible with a scenario of recent hybridization. Indeed, the recent worldwide spread (less than 10,000 years) of domesticated Bos species (and their haplotypes) mediated by humans may have potentiated this scenario. Taken together, none of the two hypotheses alone can fully interpret the presence of shared sequences in the Zp2 and Zp3 genes among Bos and Bison species. Indeed, these two hypotheses are reciprocally non-exclusive and therefore both hypotheses can account for our observations of haplotype sharing in the Zp2 and Zp3 genes among Bos and Bison species. Our observations of haplotype sharing suggest that adaptive alleles can flow into domestic cattle gene pools from other bovine species via introgressive hybridization, and highlight South Asia as the diversity hotspot for bovine species.
A general genetic pattern of rapid evolution driven by positive nature selection has been revealed for reproductive proteins [15, 17, 18]. This pattern has also been detected in the ZP2 and the ZP3 glycoproteins of some mammalian species [19–22]. In contrast, we found evidence for purifying (or negative) selection in the sequenced fragments of the Zp2 (from amino acids 238 to 311) and the Zp3 (from amino acids 164 to 229) genes, rather than positive selection. The molecular signatures of negative selection detected in this study can be easily explained by functional and structural constraints acting on the two sequenced fragments each of the two genes, both of which are located at the conserved ZP domains [51, 52].
It should be aware that previous evidence of positive selection in ZP glycoproteins of mammalian species is not robust and consistent among different studies. For instance, a reanalysis of Mus Zp3 data from Jansa et al.  showed no evidence for positive selection after removal of the outgroup sequences . Berlin and Smith  analyzed a Zp3 data set of 15 mammalian species including those eight species analyzed by Swanson et al.  and found no evidence for positive selection in Zp3 gene of those 15 species. In addition, simulation analysis by Berlin and Smith  showed that the likelihood ratio test (LRT) comparing M7-M8 that was used by Swanson et al.  suffers false positives under fairly simple scenarios, suggesting that the M7-M8 LRT alone does not provide solid evidence for positive selection. It should also be noticed that the M0-M3 LRT used by Swanson et al.  is no longer recommended to test evidence of positive selection . To further test whether evidence of positive selection in mammalian Zp3 and Zp2 genes is robust and consistent, we applied the CODEML site models in PAML version 4.4  to two data sets each for mammalian Zp3 (15 species) and Zp2 (10 species) full coding sequences, collected from GenBank database (see Additional file 5). Both data sets of Zp3 and Zp2 genes include those eight mammalian species analyzed by Swanson et al. . Most strikingly, we also found no evidence of positive selection in the Zp3 data of 15 species, by the three LRTs comparing M1a-M2a, M7-M8, and M8a-M8 models, respectively (see Additional file 6), being in line with previous results by Berlin and Smith . When considering only those eight species analyzed by Swanson et al. , both the M1a-M2a and M8a-M8 LRTs were not significant, whereas the M7-M8 LRT was significant (2ΔlnL(M7-M8) = 7.194, p = 0.027). However, for the Zp2 data of 10 species, the M7-M8 and M8a-M8 LRTs were significant (2ΔlnL(M7-M8) = 9.474, p = 0.009; 2ΔlnL(M8a-M8) = 5.030, p = 0.025), but the M1a-M2a LRT was non-significant (2ΔlnL(M1a-M2a) = 3.432, p = 0.180). Likewise, when focusing on only those eight species analyzed by Swanson et al. , three LTRs yielded similar results as those from the Zp2 data of 10 species (see Additional file 6). Moreover, we identified one site (at 174) in the Zp2 data set of 10 species and four sites (at 38, 117, 174, and 342) in the Zp2 data of eight species under positive selection with posterior probabilities > 90%. Our analyses on real data sets of both mammalian Zp3 and Zp2 genes are congruent with previous simulation analyses showing that the M1a-M2a and M8a-M8 comparisons are more robust than the M7-M8 comparison [56, 58]. Taken these observations into account, we conclude that there is no evidence of positive selection in Zp3 gene of these mammalian species studied, and that there is some (not strong) evidence of positive selection in Zp2 gene of these mammalian species studied. It is therefore not strange to find evidence of purifying selection in the sequenced fragment of Zp3 gene in bovine species.
We also noticed that evidence for positive selection was detected in Zp3 gene of New Guinean and Australasian murine species, by using three LRTs comparing M1a-M2a, M7-M8, and M8a-M8 models . In addition, three amino acid sites (at 324, 325, and 341) within the polypeptide (residues 309-355) encoded by mouse Zp3 exon-7 were identified to be under positive selection. The polypeptide encoded by mouse Zp3 exon-7 containing sperm combing-site of ZP3 has been demonstrated to be essential for sperm-binding activity by in vitro functional experiments [8–10]. However, when considering only Australasian murine species, three LRTs were not significant at all . These results from murine rodents suggest that there is lack of a general pattern of rapid evolution driven by positive selection in Zp3 gene across mammalian species. Contrary to mammalian ZP3, we found evidence of positive selection in mammalian ZP2 with several positively selected amino acid sites (at 38, 117, 174, and 342), indicating that mammalian ZP2 might have played a more important role in sperm-egg interaction than previously thought. Although, some authors still support sperm-binding model mediated by ZP3 [2, 3, 59], such model has been called into questions and is gradually replaced by a three-dimensional ZP structure model [11–13]. The latter model is further supported by a recent study showing that sperm-egg recognition in mouse depends on the cleavage status of ZP2, but is unaffected by the ZP3 mutations in the sperm combining-site .