- Research article
- Open Access
Positive selection at sites of chemosensory genes is associated with the recent divergence and local ecological adaptation in cactophilic Drosophila
BMC Evolutionary Biology volume 18, Article number: 144 (2018)
Adaptation to new hosts in phytophagous insects often involves mechanisms of host recognition by genes of sensory pathways. Most often the molecular evolution of sensory genes has been explained in the context of the birth-and-death model. The role of positive selection is less understood, especially associated with host adaptation and specialization. Here we aim to contribute evidence for this latter hypothesis by considering the case of Drosophila mojavensis, a species with an evolutionary history shaped by multiple host shifts in a relatively short time scale, and its generalist sister species, D. arizonae.
We used a phylogenetic and population genetic analysis framework to test for positive selection in a subset of four chemoreceptor genes, one gustatory receptor (Gr) and three odorant receptors (Or), for which their expression has been previously associated with host shifts. We found strong evidence of positive selection at several amino acid sites in all genes investigated, most of which exhibited changes predicted to cause functional effects in these transmembrane proteins. A significant portion of the sites identified as evolving positively were largely found in the cytoplasmic region, although a few were also present in the extracellular domains.
The pattern of substitution observed suggests that some of these changes likely had an effect on signal transduction as well as odorant recognition and protein-protein interactions. These findings support the role of positive selection in shaping the pattern of variation at chemosensory receptors, both during the specialization onto one or a few related hosts, but as well as during the evolution and adaptation of generalist species into utilizing several hosts.
Studying host adaptation is critical to understanding the ecological and evolutionary history of phytophagous insects [1,2,3,4,5]. Host shifts represent a distinct and sometimes novel set of complex challenges since new host plants can differ in nutritional content, chemical composition, microorganisms and defensive quality [6,7,8]. The use of novel resources when shifting to a new host can therefore affect survival of larva and adult stages, fecundity of feeding females as well as behavioral patterns of oviposition and mating, causing ultimately substantial impact on fitness. This has led to the hypothesis that in addition to genes involved in performance relative to a new environment, adaptation to new hosts often involves mechanisms of host preference [8,9,10]. Given their role in odorant and gustatory sensory and ultimately in the discrimination of suitable hosts, chemoreceptors have been the focus of several studies examining the evolutionary mechanisms underlying host adaptation [11,12,13,14,15,16,17].
Chemical stimuli recognition occurs through odorant (Or) and gustatory receptor (Gr) proteins, converting volatile and soluble chemicals from the environment into electrical outputs through nerve impulses [18, 19]. In insects, these proteins have seven trans-membrane domains located on the dendritic membranes of sensory neurons housed in sensilla of receptor organs, with a reverse orientation when compared to mammalian topology (COOH-tail extracellular) [20, 21]. While the number of genes in these families seems to be largely stable across Drosophila evolution, there have been constant gene loss and gain events through the birth-and-death model in insects [22,23,24,25,26,27,28] where divergence arises from the selection of beneficial mutations between paralogous genes . This mode of evolution, as well as the clustered organization of these genes has provided strong support to the hypothesis that these loci play a critical role in ecological shifts and specialization [9, 11,12,13,14,15,16,17, 22, 23]. In particular, the ecological specialization observed within Drosophila has been suggested to have contributed to the higher pseudogenization levels observed in sensory genes [14, 15]. However, the analysis of 12 Drosophila species genomes , has shown that genome size or endemism could also predict pseudogenization rates, but the few specialist species compared and the lack of evolutionary data over short time scales has prevented efforts to disentangle these causes .
The dramatic heterogeneity in family size and divergence among chemosensory gene paralogs suggest that these gene families have evolved rapidly. However, the pattern of variation and divergence at a majority of genes are more consistent with purifying selection and only a small portion of genes have shown evidence of positive selection, some of which have been linked to specialization [14, 15, 29, 31]. Thus, specialist species such as D. sechellia and D. erecta have shown higher Ka/Ks ratios when compared with their closest generalists species . The majority of amino acid sites showing signatures of positive selection in these families have been found in the cytoplasmic domain, suggesting that selection changes are associated with signal transduction rather than the detection of odorant molecules . To date most comparisons have involved a few highly divergent species; here we examine the case of the cactophilic local specialist D. mojavensis and its generalist sister species D. arizonae to investigate the role of positive selection in four candidate chemoreceptor genes (one Gr and three Or). These species have adapted to develop and feed on necrotic cacti, showing partially overlapping distributions in xeric regions of southwestern United States and northwestern Mexico [33, 34]. The four candidate loci examined here (Or67c, Or83c1, Or83c2 and Gr63a) have shown a pattern of differential expression associated with host shifts suggesting their role in host preference, presumably by detecting volatiles associated with either the presence of nutrients or toxins produced in rotting cacti [35,36,37]. Or67c and Gr63a are known to play a role in detecting the by-products of alcoholic fermentation as well as CO2 [38, 39], while Or83c1 and Or83c2 have been suggested to play role in food detection [38, 40].
Drosophila mojavensis offers a unique case to study host adaptation since its biogeographical history has been shaped by multiple host shifts in a relatively short period of time . Since its divergence from its sister species D. arizonae, the D. mojavensis lineage has expanded through a rapid specialization process (< 0.5 Mya; [42, 43]) producing four host populations adapted to different cactus species (Fig. 1): agria (Stenocereus gummosus) in Baja California; coastal prickly pear (Opuntia littoralis) in Santa Catalina Island; red barrel (Ferocactus cylindraceus) in the Mojave Desert and organpipe (S. thurberi) in the Sonora Desert population [33, 34]. We investigated patterns of molecular evolution in four chemoreceptor genes across the four host populations of D. mojavensis and the generalist D. arizonae to test for positive selection using a multiple approach of gene-wide, polymorphic-based, codon-based and functional tests. Then, we analyzed the distribution of sites showing evidence of positive selection on the structure of these trans-membrane proteins in order to test whether selection has been biased towards specific domains and discussed the implication of these results in terms of the role of positive selection in host preference adaptation.
Genetic diversity, neutrality tests and divergence
For each locus examined, between 7 and 15 sequences were generated for each of the four D. mojavensis populations (Sonora Desert, SON; Santa Catalina Island, CAT; Mojave Desert, MOJ and Baja California, BAJ), as well as between 8 and 12 sequences of D. arizonae. After excluding introns, these amplicons had between 1170 bp for Or83c2 to 1479 bp for Gr63a (Table 1). Given the population structure in D. mojavensis (Additional file 1: Table S2) [44, 45], we sampled multiple individuals per population. Here we report average values per population of D. mojavensis to make them comparable with those in D. arizonae (Table 1). Between 12 and 35 polymorphic sites and 6 and 12 different haplotypes were detected across the loci and species examined (Table 1). Diversity estimates in D. arizonae often fall between the estimates for individual population in D. mojavensis (Additional file 1: Table S2). However, most genetic diversity estimates were higher in D. arizonae when compared with average estimates across D. mojavensis populations. Thus, haplotype diversity was higher in D. arizonae for all genes, genetic diversity for three genes and nucleotide diversity for two of genes analyzed (Table 1). Tajima’s D and Fu and Li’s D and F neutrality tests were not significant for most genes (Table 1). Fu’s FS statistic was significantly negative for the majority of genes in both species (Table 1). For one of the loci (Gr63a) and in one D. mojavensis population (MOJ) these tests were significantly positive (Additional file 1: Table S3), indicating a recent bottleneck causing a heterogeneous pattern in the demographic history of this species.
The genetic divergence assessed by comparing the number of synonymous (Ks) and nonsynonymous (Ka) substitutions ratio (Ka/Ks) with respect to the D. navojoa outgroup was often larger in D. arizonae for all loci with the exception of Or83c1 (Table 1). The genetic structure as assessed by ΦST indicated the existence of significant population structure in D. mojavensis, with ΦST values ranging between 0.26 and 0.59 (Table 1). Population structure was highly significant for all genes as well as the majority of pairwise comparisons between D. mojavensis populations (Additional file 1: Table S4).
Evidence of positive selection
All genes showed significant signatures of positive selection in at least one of the gene-wide (Table 2), codon-based and polymorphism-based (Table 3) tests performed. Codeml consistently showed all genes evolving under positive selection, ie. LRT favored the M8 selection model (Table 2). The genes Or67c and Or83c2 were also significant in one (PARRIS) and three (PARRIS, BSR, aBSREL) of the additional gene-wide selection tests performed respectively, as well as the McDonald-Kreitman test (Table 2). The BSR test identified selection in Or83c2 for specific branches belonging to the SON and CAT populations of D. mojavensis as well as branches belonging to D. arizonae lineage (Table 2). McDonald-Kreitman test was also performed for pairwise comparisons between host populations of D. mojavensis and D. arizonae, showing signatures of positive selection for SON and BAJ in Or67c and SON, BAJ and CAT in Or83c2 (Table 2 and Additional file 1: Table S5). Only the gene Or83c2 showed signatures of positive selection in both the D. mojavensis and D. arizonae lineage as determined by the McDonald-Kreitman tests and polarizing the fixed sites utilizing the outgroup species D. navojoa (Table 2 and Additional file 1: Table S5). Furthermore, all genes showed evidence of positive selection at individual sites in at least three of the codon-based selection tests performed and all tests provided significant values in at least two genes (Table 3). A combined analysis of these tests led us to identify four sites evolving under positive selection for the gene Gr63a (using the codemlBEB, REL, FUBAR and PRIME tests); nine sites in Or67c (using the codemlBEB, SLAC, FEL, REL, MEME, FUBAR and PRIME tests); eight sites in Or83c1 from (using the codemlBEB, FEL, IFEL, REL, MEME, FUBAR and PRIME tests); and eight sites positively selected in the gene Or83c1 (using the codemlBEB, SLAC, FEL, IFEL, REL, MEME, FUBAR and PRIME tests) (Table 3, Fig. 2). Though we identified several sites evolving under positive selection, we also investigated whether these changes led to changes in amino acid characteristics that might affect protein structure and hence function. Results of the PRIME analysis showed that 38% of the candidate sites across the four loci indicated possible substantial changes to protein structure (Table 3).
Since the performance of selection tests based on phylogenetic comparisons could be affected by recombination increasing the number of false positively selected sites [46,47,48], we tested for the presence of recombination breakpoints and found evidence for such events in Or83c1 and Or83c2. Therefore, we also performed the test of selection on these two loci within each linkage block. After controlling for recombination, two sites in Or83c1 and four sites in Or83c2 continued to indicate a pattern of positive selection. The remaining sites did not pass the threshold of the Bayes factors (BF > 100) in the REL test, but posterior probabilities remained over 0.8. Suggesting that, although the power of the test is decreased since blocks are analyzed instead of genes, a similar selective history affected the pattern of variation at these sites as well. For this reason and because the phylogenetic relationships among the three species did not change between partitions, suggesting that the assumptions of selection tests are not violated, only the results for the gene as a whole without partitions were further analyzed.
Signatures of selection among gene domains
Given the pattern observed at these loci suggesting positive selection between D. mojavensis and D. arizonae, we further tested whether selection had occurred within specific domains of these transmembrane proteins. Approximately 55% of the sites under positive selection are located in the cytoplasmic domain while only 24 and 21% are in the extracellular and transmembrane domains, respectively (Fig. 2). We assessed the distribution of selected sites across the protein domains of all four loci through a GLM model (Table 4). We found no significant gene effect, but a significant domain effect (Table 4), with the cytoplasmic domain enriched for positively selected sites when compared with the transmembrane and extracellular domains according to multiple comparisons (Fig. 3). We additionally tested for such structural heterogeneity in the level of selection by running the McDonald-Kreitman independently for each domain of each gene. These tests showed significant evidence of positive selection in the cytoplasmic domain for Or83c2 and marginally for Or67c, while no significant effects for the transmembrane and extracellular domains (Table 2). The proportion of sites under positive selection was just marginally significant when accounting for the domain size in the GLM model for Or83c2 and Or67c (Table 4), which points to selection also being influenced by domain size. This is expected since, larger domains have a higher number of sites available for selection to act on.
It has been suggested that specialist species not only lose a subset of sensory pathway genes during host specialization, but some of those that are retained undergo periods of positive selection [14, 15]. Here we investigated the role of positive selection in the molecular evolution of four chemoreceptor genes (one Gr and three Or) during the evolution of D. mojavensis, a species that has experienced recent bouts of specialization resulting in a set of four cactus host populations, and its generalist sister D. arizonae. We used multiple approaches to test for positive selection including codon-based, gene-wide, polymorphism and functional analyses. All genes investigated showed evidence of positive selection at codon and gene-wide approaches while two loci indicated positive selection when polymorphism information was analyzed. Although most sites at these genes are likely to be evolving under purifying selection, our approach suggests evidence of positive selection shaping the evolution of several amino acid sites involved in molecular and physiological functions, which allowed us to investigate the distribution of candidate sites under positive selection across the domains of these chemosensory receptor proteins.
The level of genetic diversity was higher in D. arizonae when compared with the average value across the D. mojavensis populations. This pattern is in agreement with previous studies of other nuclear genes in these species , which is likely a consequence of a higher overall effective population size in D. arizonae [49, 50]. This pattern of variation is however dissimilar to what has been observed from mitochondrial loci , which suggests that recent evolutionary forces are shaping nuclear and mitochondrial genes differently . Likewise, the Fu’s FS was significantly negative at most loci, which is evidence of excess number of alleles . Since we controlled for the detected population structure performing tests per population, this pattern is more likely indicating that both species are undergoing population expansions. Although we did not find such evidence from all neutrality tests, most values were consistently negative and the Fu’s FS statistic is particularly sensitive to recent changes in population sizes . This expansion in both species has been previously suggested to be associated with a shared biogeographical history in some of the D. mojavensis populations (SON and BAJ) and D. arizonae. As suggested by Machado et al. , these populations have experienced expansion and contraction events due to Pleistocene glaciation cycles that affected a great part of the biota [52,53,54]. This is consistent with our results per population since the sampled population of D. arizonae is from the Sonoran Desert and only the populations from Sonora and Baja California showed signatures of expansion in D. mojavensis. The sole significantly positive value indicates a deficiency of alleles in the Mojave Desert population of D. mojavensis at a single locus (Gr63a) . Given that demographic changes would be expected to affect the majority of loci, this particular case could possibly be a result of balancing selection. Further evidence is needed to obtain a more complete understanding of what is shaping variation at this locus in this population.
Most of the evidence for the role of positive selection in Or and Gr genes during host specialization comes from the genome-wide analysis in Drosophila species, particularly the case of D. sechellia and D. erecta [14, 15, 31]. McBride and Arguello  found higher Ka/Ks ratios in chemoreceptors of these specialist species when compared with their closest generalists, although such ratios could be alternatively explained by relaxed purifying and not necessarily positive selection. Such comparisons have been based on genome-wide studies, where the lack of polymorphic variation and specialization events in short time scales have made the efforts to disentangle these two hypotheses difficult. Gardiner et al.  found 20 genes with evidence of functional divergence in a genome-wide study of 12 Drosophila genomes, but only six were considered to be under positive selection. This suggests that the number of genes positively selected in these families is often low in insects, however, as suggested by Gardiner et al. , the low power of the phylogenetic-based tests should caution one against making broad generalizations. Despite the low Ka/Ks values of these genes, suggesting that most sites are evolving under purifying selection, we were able to identify a number of candidate codons under selection, some of which might contribute to significant effects on protein structure and function. These changes may be involved in detection of new odor molecules as well as protein-protein interactions during signal transduction. These findings suggest that the low number of sites under positive selection previously reported in sensory genes (as recognized by Gardiner et al. ) may be at least in part due to the low power of the particular selection tests previously utilized and the absence population level polymorphism data.
According to the specialization hypothesis , beneficial mutation associated with amino acid changes that improve the recognition capacity of smell and taste receptors should be favored in specialists, resulting in a narrowly tuned receptor . This proposed mechanism of evolution at chemosensory loci might also have shaped the variation of chemosensory loci during the specialization process of the four cactus host populations of D. mojavensis [33, 34, 55]. The necroses of each cactus differ in their chemical composition as well as the yeast and bacterial communities on which D. mojavensis feeds [56,57,58]. It has been shown that these populations not only differ in terms of their performance while utilizing different necrotic cacti , but also exhibit different behavioral and electrophysiological responses of olfactory organs, which suggests that their peripheral nervous system has been shaped by local ecological differences [59, 60].
Although we provide evidence supporting the hypothesis of positive selection shaping the evolution of amino acid sites involved in molecular functions of sensory genes during the host specialization in D. mojavensis, not all the analyses performed allow for the possibility to detect lineage specific changes. Nevertheless, for those analyses addressing specific lineages, the evidence supports the role of positive selection during the evolution of D. mojavensis. This was evident from the BSR and McDonald-Kreitman tests in Or83c2. While Or67c was not significant for the McDonald-Kreitman test when compared with D. navojoa, both species showed similar fixed differences. In addition, the changes specific to the D. mojavensis lineage are supported by strong genetic differentiation among D. mojavensis populations as estimated by ΦST, which are often larger than what has been previously reported for neutral SSR markers , suggesting divergent selection may be increasing ΦST values in coding sequences among D. mojavensis populations.
We also found selective changes exclusive to the D. arizonae lineage. This was evident from the Ka/Ks value using D. navojoa as an outgroup, which was often larger in D. arizonae as well as the McDonald-Kreitman tests for Or83c2 and Or67c (Table 2). These findings in D. arizonae pose an interesting question about the role of positive selection in the evolution of a generalist that has diverged from a specialized species. Cytological and molecular evidence suggests that the D. mojavensis/D. arizonae lineage diverged from the lineage of the Opuntia specialist D. navojoa approximately 4 Mya (reviewed in ). Hence D. arizonae has acquired the ability to use columnar cacti, as well as Opuntia, among other hosts. The potential role of positive selection during the acquisition of new hosts in the evolutionary transition between specialist to generalist species may favor mutations associated with amino acid changes involved in the chemical recognition of a wider repertoire of odors, a broadly tuned receptor . Alternatively, the changes observed could be associated with the modulation of odorant binding signals during the early stages of the signal transduction cascade [62, 63].
The majority of the sites under positive selection in the Or and Gr genes map to the cytoplasmic domain of these transmembrane proteins (Tables 2, 3 and 4). In general, this pattern was unexpected since this domain interacts with secondary messengers involved in signal transduction and was therefore expected to be conserved. Interestingly, this pattern seems to be more common than previously thought, since there has been a previous report of the cytoplasmic domain of chemosensory receptors evolving rapidly . We only analyzed one Gr, which prevents us from including gene family as a factor in the GLM model. Despite such limitations of the model, the sole Gr gene analyzed here showed clear differences in the distribution of candidate sites when compared with Or genes. For example, Gr63a had the lowest number of candidate sites and was the only gene with the same number of candidate sites in extracellular and cytoplasmic regions, with no evidence of positive selection in the transmembrane domain. Such differences still need to be confirmed by comparing more Gr genes, but again, this result agrees with those patterns reported elsewhere . As suggested by Gardiner et al. , this probably reflects the functional and molecular differences between Or and Gr receptors. Or genes are associated with volatile recognition while Gr recognize soluble ligands, pheromones and particularly for Gr63a, levels of CO2 [11, 20, 21, 38, 64] . In addition, the olfactory receptors requires the interaction with the universal receptor Or83b [11, 20, 21], while Gr63a requires interaction with Gr21a for its functionality [39, 65]. Such functional and molecular differences are likely to lead to distinct regions of constraint as reflected by the observed differences in the distribution of positively selected sites.
The majority of positively selected sites mapped to IL1 and IL2 loops, which would be expected since these loops are not involved in the dimerization with Or83b, the universal co-receptor . Or67c was the only gene showing a candidate site in the IL3 domain. This loop interacts with IL3 of Or83b during dimerization , which make it likely to be under constraint. We also found candidate sites in the COOH-tail for Or83c1 and Or67c, which contrasts with the results of Gardiner et al.  since they found no candidate sites in this region for the ten Or genes under positive selection among the 12 Drosophila species studied. This terminal region is also important for coupling with Or83b, being conserved even in highly divergent genes .
The higher number of positively selected sites in the cytoplasmic domain suggest that selection has been acting primarily on the transduction of signal captured by chemoreceptors and not only its detection. Nevertheless, after controlling for the number of amino acids per domain, the GLM analysis was just marginally significant. Additionally, Or83c1 and Or67c also show divergent selection in the extracellular region (COOH-tail) essential for signal detection , suggesting that both signal detection and transduction have been shaped by positive selection during the molecular evolution of these chemoreceptors.
Local ecological adaptation can shape the evolutionary trajectories of myriads of traits and their respective underlying genetic architecture. Chemosensory pathways, especially those involved in host preference in insects, can be shaped by the selective forces associated with changes in host utilization. We have provided evidence supporting the hypothesis of host specialization as an evolutionary scenario that favors positive selection at specific amino acid sites in genes of sensory pathways. Given that selection has been particularly biased towards cytoplasmic domains at amino acid sites involved in structural and physiological functions, these selective forces appear to be largely shaping the post-odorant binding role of these chemosensory receptors, i.e. signal transduction. Our approach highlights the advantage of using a set of recently diverged species and populations locally adapted to distinct ecological conditions in the understanding of the mechanisms of chemosensory evolution. Furthermore, the incorporation of population level polymorphism data which allowed multiple and powerful approaches for detecting positive selection in an ecological context proved to be highly effective.
Isofemale lines from all four D. mojavensis cactus host populations were utilized in this study: Baja California (BAJ), Sonora Desert (SON), Mojave Desert (MOJ) and Santa Catalina Island (CAT) (Fig. 1). Additionally, sequences from a D. arizonae population from the Sonora Desert and one line of D. navojoa (Drosophila Species Stock Center 15081–1374.11), a cactophilic (Opuntia spp) species, from Jalisco, Mexico was utilized for comparisons requiring an outgroup. For both D. mojavensis and D. arizonae approximately 7–15 independent isofemale lines per population/species were sampled. These isofemale lines have been maintained in the laboratory for 60–120 generations and are highly inbred (Fig. 1). All stocks were maintained in 8-dram vials with banana-molasses media  in a 25°C incubator on a 14:10 h light:dark cycle.
Molecular procedures and alignments
DNA samples for each species were amplified via PCR using Apex Taq DNA Polymerase (Genesee Scientific) following manufacturer’s recommendations in a 25 μl final volume, involving 1 μl of DNA and 0.2 μM of each primer. Amplifications were performed using a PCR program of 35 cycles involving 94 °C for 30 s followed by the corresponding annealing temperature for each gene for 30 s (Additional file 1: Table S1), 72 °C for 1 min, and then a final extension at 72 °C for 10 min. PCR products were purified using QIAquick PCR Purification columns (Qiagen) following manufacturer’s recommended procedures and sanger sequenced in both directions (Operon Eurofins). Sequences were visually inspected for quality using Geneious  and aligned based on their amino acid sequences using ClustalW and manual verification. Haplotype phase was inferred for each sample using the software PHASE [68,69,70].
Genetic diversity, neutrality tests and divergence
The genetic diversity at nucleotide and haplotype levels was estimated for each species through a set of descriptive parameters using the software DNAsp v5 , including the average number of sequences (N), total number of sites (NS), number of variable sites (S), number of haplotypes (h), haplotype diversity (Hd), nucleotide diversity (π) and the average number of nucleotide differences (θW). In order to infer any deviation from neutral expectations at the population level, genetic structure or drastic changes in population sizes (recent bottlenecks or population expansion), a set of neutrality tests were performed, including Tajima’s DT , D and F tests  and Fu’s FS statistic  using the software DNAsp. All genes were assessed for recombination events using the GARD test, a genetic algorithm for recombination detection [75, 76] implemented in the package HyPhy v2 .
Divergence between D. mojavensis populations and D. arizonae was estimated by the number of synonymous (Ks) and nonsynonymous (Ka) substitutions ratio (Ka/Ks) with respect to the D. navojoa outgroup using the software DNAsp v5 . The genetic structure among D. mojavensis populations was estimated through pairwise ΦST, an analogue version of the Wright’s fixation index FST [78, 79] which is estimated by an Analysis of Molecular Variance (AMOVA), taking into account information on the genetic distances among haplotypes as well as their frequencies following Excoffier et al. . Significance was based on 10,000 permutations in the software Arlequin version 3.5.2 .
Because several of the selection tests we performed require a phylogenetic tree of aligned sequences, each gene was individually analyzed using Bayesian inference in MrBayes  after testing for the best evolutionary model inferred in the software jModelTest v0.1.1 . Two independent runs were used from different starting points by a Metropolis-coupled Markov Chain Monte Carlo analysis with four chains, one cold and three incrementally heated (heating parameter = 0.1) for 10 million generations, sampling every 5,000th tree. Both runs were checked for convergence of the Markov chains with the standard deviation of split frequencies being less than 0.001. Parameter estimates were then analyzed in Tracer  to ensure that these had reached stable values with adequate mixing and ESSs above 200.
Assessing positive selection
Signatures of positive selection were assessed based on the nonsynonymous/synonymous substitution ratio (dN/dS = ω). When there is no selection, both classes of substitutions are expected to become fixed with the same probability (ω = 1). On the other hand, selection can increase fixation probabilities for nonsynonymous mutations because of selective advantages (Positive selection, ω > 1), or decrease these probabilities due to selective constrains (Purifying / negative selection, ω < 1). A number of methods have been developed for detecting positive selection based on ω ratios at different levels, whole alignment, phylogenetic branches, codons and combinations of those. Therefore, we tested for selection using multiple tests and four approaches: i) gene-wide; ii) polymorphic-based; iii) codon-based; iv) functional implications of candidate sites.
i) Gene-wide selection tests were performed in order to detect general signatures of selection at the gene level in the alignments or trees among sequences of the three species, without making any assumption about foreground branches. First, the software codeml, implemented in the package PAML v4  was used to compare a null model (M7) in which ω is assumed to be beta-distributed among sites and a selection model (M8), where codons are allowed to have an extra category of positively selected sites ω > 1. The significance of these comparisons was assessed using a likelihood-ratio test . We also used the package HyPhy v2  in order perform a set of gene-wide tests for positive selection to complement and provide a comparison with the LRTs in codeml. Thus, we used the PARRIS test (a partitioning approach for robust inference of selection), which makes maximum likelihood inference of positive selection robust to the presence of recombination. The BUSTED test (Branch-site Unrestricted Statistical Test for Episodic Diversification)  was used to specifically assesses whether a gene has experienced positive selection on at least one site and one branch given a phylogeny. The BSR (Branch-Site Random effects likelihood test)  was used to test for episodic diversifying selection detecting linages with ω > 1. The aBSREL test (adaptive Branch-Site Random Effects Likelihood), the improved version of the commonly-used “branch-site” models, was used to test if positive selection had occurred on a proportion of branches. aBSREL models both site-level and branch-level ω heterogeneity, testing whether a proportion of sites have evolved under positive selection. An FDR correction was performed to account for multiple comparisons across genes for each test.
ii) A polymorphism-based approach was used to test for positive selection comparing the ω ratios within species with those between species using the McDonald-Kreitman test . This test takes advantage of the intraspecific polymorphic variation, since the substitution ω ratios within species are expected to equal those ratios between species under a neutral scenario but differ under selective scenarios. An FDR correction was performed to account for multiple comparisons across genes for each test.
iii) We performed a set of codon-based selection tests for identifying specific sites within genes showing signatures of selection. Bayes Empirical Bayes (BEB)  was used to estimate the posterior probability of sites under positive selection following the M8 model from codeml results. Again, these results were compared with codon-based selection tests available in the package HyPhy, including SLAC (Single-Likelihood Ancestor Counting method) , FEL (Fixed Effects Likelihood method), IFEL (Internal Fixed Effects Likelihood), REL (Random Effects Likelihood method), MEME (Mixed Effects Model of Evolution)  and FUBAR tests (Fast Unconstrained Bayesian Approximation for inferring selection) .
iv) Finally, we investigated whether changes at the amino acid level were associated with altered biochemical properties using the PRIME test (PRoperty Informed Model of Evolution) following Conant-Stadler  and Atchley categories . The five predefined amino acid properties proposed by Conant et al.  are: chemical composition of the side chain (C), residue polarity (RP), volume of the residue side chain (VR), isoelectric point of the side chain (IC), and hydropathy (H). The five composite properties proposed by Atchley et al.  are: polarity index (P), secondary structure factor (S), volume (V), refractivity (R), and isoelectric point (I). Although the method performs multiple tests on each site, p-values are reported after Bonferroni correction to control for the number of false positives.
Sites with evidence of positive selection were mapped onto the receptor’s topology predicted with the software TMHMM Server v2 (http://www.cbs.dtu.dk/services/TMHMM/) and then diagrams of the 2D receptors structure were created using PROTTER .
Signatures of selection among gene domains
We used two approaches in order to investigate whether selection is acting on specific domains in transmembrane receptor genes (cytoplasmic, transmembrane or extracellular). First, the number of sites under positive selection was compared among genes and domains through a Generalized linear model using the R package GLM after data normalization using the logarithmic transformation. This analysis was done twice, for raw counts of sites under selection and after accounting for the size of each domain in order to test for the potential effect of domain size on the number of sites showing signatures of selection, since longer domains have higher chance to show sites under positive selection. Additionally, the McDonald-Kreitman test was performed for each domain independently.
- aBSREL :
adaptive Branch-Site random effects likelihood
- BAJ :
- BSR :
Branch-Site random effects likelihood test
- BUSTED :
Branch-site unrestricted statistical test for episodic diversification
Chemical composition of the side chain
- CAT :
Santa Catalina Island
- Codeml M8 :
Codeml comparison between M7 and M8 models
- Cyto :
- D :
Fu and Li’s D test
- D. ari :
- D. moj :
- D. nav :
- D T :
Tajima’s D test
- Extr :
- F :
Fu and Li’s F test
- F S :
Fu’s FS statistic
- h :
Number of haplotypes
- H d :
Isoelectric point of the side chain
- Ka/Ks :
Nonsynonymous (Ka)/synonymous (Ks) ratio
- MK Cyto :
McDonald-Kreitman test for cytoplasmic domain
- MK Extr :
McDonald-Kreitman test for extracellular domain
- MK Total :
McDonald-Kreitman test for whole gene
- MK Trans :
McDonald-Kreitman test for transmembrane domain
- MOJ :
- N :
Average number of sequences
- N S :
Total number of sites
- PARRIS :
Partitioning approach for robust inference of selection
PRoperty Informed Model of Evolution
- S :
Number of variable sites
Secondary structure factor
- SON :
- Trans :
Volume of the residue side chain
- θ W ns :
Theta Watterson for nonsynonymous sites
- θ W s :
Theta Watterson for synonymous sites
- θ W :
Total theta Watterson
- π :
Total nucleotide diversity
- π ns :
Nucleotide diversity for nonsynonymous sites
- π s :
Nucleotide diversity for synonymous sites
- Φ ST :
Genetic structure index
Bernays EA, Chapman R. In: Bernays EA, Chapman RF, editors. Host-plant selection by phytophagous insects. New York: Chapman & Hall; 1994.
Jaenike J. Host specialization in phytophagous insect. Annu Rev Ecol Syst. 1990;21:243–73.
Bruce TJA, Wadhams LJ, Woodcock CM. Insect host location: a volatile situation. Trends Plant Sci. 2005;10:269–74.
Matsubayashi KW, Ohshima I, Nosil P. Ecological speciation in phytophagous insects. Entomol Exp Appl. 2010;134:1–27.
Hardy NB, Otto SP. Specialization and generalization in the diversification of phytophagous insects: tests of the musical chairs and oscillation hypotheses. Proc R Soc B Biol Sci. 2014;281:20132960.
Agosta SJ. On ecological fitting, lant-insect associations, herbivore host shifts, and host plant selection. Oikos. 2006;114:556–65.
Forbes AA, Devine SN, Hippee AC, Tvedte ES, Ward AKG, Widmayer HA, et al. Revisiting the particular role of host shifts in initiating insect speciation. Evolution (NY). 2016;71:1126–37.
Thompson JN. Evolutionary ecology of the relationship between oviposition preference and performance of offspring in phytophagons insects. Entomol Exp Appl. 1988;47:3–14.
Vertacnik KL, Linnen CR. Evolutionary genetics of host shifts in herbivorous insects: insights from the age of genomics. Ann NY Acad Sci. 2017;1389:186–212.
Dworkin I, Jones CD. Genetic changes accompanying the evolution of host specialization in Drosophila sechellia. Genetics. 2009;181:721–36.
Andersson MN, Lofstedt C, Newcomb RD. Insect olfaction and the evolution of receptor tuning. Front Ecol Evol. 2015;3:1–14.
Bohbot JD, Pitts RJ. The narrowing olfactory landscape of insect odorant receptors. Front Ecol Evol. 2015;3:1–10.
Matsuo T. Genes for host-plant selection in Drosophila. J Neurogenet. 2008;22:195–210.
McBride CS. Rapid evolution of smell and taste receptor genes during host specialization in Drosophila sechellia. Proc Natl Acad Sci U S A. 2007;104:4996–5001.
McBride CS, Arguello JR. Five Drosophila genomes reveal nonneutral evolution and the signature of host specialization in the chemoreceptor superfamily. Genetics. 2007;177:1395–416.
Simon JC, D’alençon E, Guy E, Jacquin-Joly E, Jaquiéry J, Nouhaud P, et al. Genomics of adaptation to host-plants in herbivorous insects. Brief Funct Genomics. 2015;14:413–23.
Smadja CM, Canbäck B, Vitalis R, Gautier M, Ferrari J, Zhou JJ, et al. Large-scale candidate gene scan reveals the role of chemoreceptor genes in host plant specialization and speciation in the pea aphid. Evolution (NY). 2012;66:2723–38.
Amrein H, Thorne N. Gustatory perception and behavior in Drosophila melanogaster. Curr Biol. 2005;15:673–84.
Grabe V, Sachse S. Fundamental principles of the olfactory code. BioSystems. 2018;164:94–101.
Touhara K, Vosshall LB. Sensing odorants and pheromones with chemosensory receptors. Annu Rev Physiol. 2009;71:307–32.
Vosshall LB, Stocker RF. Molecular architecture of smell and taste in Drosophila. Annu Rev Neurosci. 2007;30:505–33.
Sánchez-Gracia A, Vieira FG, Rozas J. Molecular evolution of the major chemosensory gene families in insects. Heredity (Edinb). 2009;103:208–16.
Sánchez-Gracia A, Vieira FG, Almeida FC, Rozas J. Comparative genomics of the major chemosensory gene families in arthropods. Encyclopedia of Life Sciences (ELS). Wiley, Ltd: Chichester. 2011. https://doi.org/10.1002/9780470015902.a0022848.
Benton R. Multigene family evolution: perspectives from insect chemoreceptors. Trends Ecol Evol. 2015;30:590–600.
Guo S, Kim J. Molecular Evolution of Drosophila Odorant Receptor Genes. Mol Biol Evol. 2007;24:1198–207.
Almeida FC, Sánchez-Gracia A, Campos JL, Rozas J. Family size evolution in drosophila chemosensory gene families: a comparative analysis with a critical appraisal of methods. Genome Biol Evol. 2014;6:1669–82.
Nozawa M, Nei M. Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci U S A. 2007;104:7122–7.
Vieira FG, Rozas J. Comparative genomics of the odorant-binding and chemosensory protein gene families across the arthropoda: Origin and evolutionary history of the chemosensory system. Genome Biol Evol. 2011;3:476–90.
Vieira FG, Sánchez-Gracia A, Rozas J. Comparative genomic analysis of the odorant-binding protein family in 12 Drosophila genomes: Purifying selection and birth-and-death evolution. Genome Biol. 2007;8:R235.
Consortium (Drosophila 12 Genomes consortium). Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–18.
Gardiner A, Barker D, Butlin RK, Jordan WC, Ritchie MG. Drosophila chemoreceptor gene evolution: Selection, specialization and genome size. Mol Ecol. 2008;17:1648–57.
Gardiner A, Butlin RK, Jordan WC, Ritchie MG. Sites of evolutionary divergence differ between olfactory and gustatory receptors of Drosophila. Biol Lett. 2009;5:244–7.
Heed WB. Ecology and genetics of Sonoran Desert Drosophila. In: Brussard PF, editor. Ecol. Genet. Interface. New York: Springer-Verlag; 1978. p. 109–26.
Ruiz A, Heed WB, Wasserman M. Evolution of the mojavensis cluster of cactophilic Drosophila with descriptions of two new species. J Hered. 1990;81:30–42.
Matzkin LM, Markow TA. Transcriptional differentiation across the four subspecies of Drosophila mojavensis. In: Michalak P, editor. Speciat. Nat. Process. Genet. Biodivers: Nova Science Publishers, Inc; 2013. p. 119–35.
Matzkin LM. Population transcriptomics of cactus host shifts in Drosophila mojavensis. Mol Ecol. 2012;21:2428–39.
Matzkin LM, Watts TD, Bitler BG, Machado CA, Markow TA. Functional genomics of cactus host shifts in Drosophila mojavensis. Mol Ecol. 2006;15:4635–43.
Mansourian S, Stensmyr MC. The chemical ecology of the fly. Curr Opin Neurobiol. 2015;34:95–102.
Turner SL, Ray A. Modification of CO2 avoidance behaviour in Drosophila by inhibitory odorants. Nature. 2009;461:277–81.
Ronderos DS, Lin C-C, Potter CJ, Smith DP. Farnesol-detecting olfactory neurons in Drosophila. J Neurosci. 2014;34:3959–68.
Matzkin LM. Ecological genomics of host shifts in Drosophila mojavensis. In: Landry C, Aubin-Horth N, editors. Adv. Exp. Med. Biol, vol. 781. New York: Springer; 2014. p. 233–47.
Matzkin LM. The molecular basis of host adaptation in cactophilic drosophila: Molecular evolution of a glutathione S-transferase gene (GstD1) in Drosophila mojavensis. Genetics. 2008;178:1073–83.
Smith G, Lohse K, Etges WJ, Ritchie MG. Model-based comparisons of phylogeographic scenarios resolve the intraspecific divergence of cactophilic Drosophila mojavensis. Mol Ecol. 2012;21:3293–307.
Reed LK, Nyboer M, Markow TA. Evolutionary relationships of Drosophila mojavensis geographic host races and their sister species Drosophila arizonae. Mol Ecol. 2007;16:1007–22.
Machado CA, Matzkin LM, Reed LK, Markow TA. Multilocus nuclear sequences reveal intra- and interspecific relationships among chromosomally polymorphic species of cactophilic Drosophila. Mol Ecol. 2007;16:3009–24.
Anisimova M, Nielsen R, Yang Z. Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003;164:1229–36.
Arenas M, Posada D. Coalescent simulation of intracodon recombination. Genetics. 2010;184:429–37.
Arenas M, Posada D. The influence of recombination on the estimation of selection from coding sequence alignments. In: Fares MA, editor. Nat. Sel. Methods Appl. Boca Raton: CRC Press/Taylor & Francis; 2014. p. 112–25.
Matzkin LM. Population genetics and geographic variation of Alcohol dehydrogenase (Adh) paralogs and Glucose-6-Phosphate dehydrogenase (G6pd) in Drosophila mojavensis. Mol Biol Evol. 2004;21:276–85.
Matzkin LM, Eanes WF. Sequence variation of alcohol dehydrogenase (Adh) paralogs in cactophilic Drosophila. Genetics. 2003;163:181–94.
Ramírez-Soriano A, Ramos-Onsins SE, Rozas J, Calafell F, Navarro A. Statistical power analysis of neutrality tests under demographic expansions, contractions and bottlenecks with recombination. Genetics. 2008;179:555–67.
Nason JD, Hamrick JL, Fleming TH. Historical vicariance and postglacial colonization effects on the evolution of genetic structure in Lophocereus, a Sonoran desert columnar cactus. Evolution (N Y). 2002;56:2214–26.
Dyer RJ, Nason JD. Population Graphs: the graph theoretic shape of genetic structure. Mol Ecol. 2004;13:1713–27.
Smith CI, Farrell BD. Range expansions in the flightless longhorn cactus beetles, Moneilema gigas and Moneilema armatum, in response to Pleistocene climate changes. Mol Ecol. 2005;14:1025–44.
Fellows DP, Heed WB. Factors affecting host plant selection in Desert- adapted cactiphilic Drosophila. Ecology. 1972;53:850–8.
Kircher HW. Chemical composition of cacti and its relationship to Sonoran Desert Drosophila. In: Barker JSF, Starmer WT, editors. Ecol. Genet. Evol. Cactus-Yeast-Drosophila Syst. New York: Academic Press; 1982. p. 143–58.
Fogleman JC, Starmer WT, Heed WB. Larval selectivity for yeast species by Drosophila mojavensis in natural substrates. Proc Natl Acad Sci. 1981;78:4435–9.
Fogleman JC, Danielson PB. Chemical interactions in the cactus-microorganism- Drosophila model system of the Sonoran Desert. Am Zool. 2001;41:877–89.
Date P, Crowley-Gall A, Diefendorf AF, Rollmann SM. Population differences in host plant preference and the importance of yeast and plant substrate to volatile composition. Ecol Evol. 2017;7:3815–25.
Date P, Dweck HKM, Stensmyr MC, Shann J, Hansson BS, Rollmann SM. Divergence in olfactory host plant preference in D. mojavensis in response to cactus host use. PLoS One. 2013;8:1–10.
Ross CL, Markow TA. Microsatellite variation among diverging populations of Drosophila mojavensis. J Evol Biol. 2006;19:1691–700.
Koshland D, Goldbeter A, Stock J. Amplification and adaptation in regulatory and sensory systems. Science. 1982;217:220–5.
Mobashir M, Schraven B, Beyer T. Simulated evolution of signal transduction networks. PLoS One. 2012;7:1–10.
Benton R, Sachse S, Michnick SW, Vosshall LB. Atypical membrane topology and heteromeric function of Drosophila odorant receptors in vivo. PLoS Biol. 2006;4:240–57.
Isono K. Molecular and cellular designs of insect taste receptor system. Front Cell Neurosci. 2010;4:1–16.
Coleman JM, Benowitz KM, Jost AG, Matzkin LM. Behavioural evolution accompanying host shifts in cactophilic Drosophila larvae. Ecol Evol. 2018; https://doi.org/10.1002/ece3.4209.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–9.
Stephens M, Smith NJ, Donnelly P. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001;68:978–89.
Stephens M, Donnelly P. A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet. 2003;73:1162–9.
Stephens M, Scheet P. Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am J Hum Genet. 2005;76:449–62.
Librado P, Rozas J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.
Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–95.
Fu Y-X, Li W-H. Statistical tests of neutrality of mutations. Genetics. 1993;133:693–709.
Fu Y-X. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 1997;147:915–25.
Kosakovsky-Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. GARD: a genetic algorithm for recombination detection. Bioinformatics. 2006;22:3096–8.
Kosakovsky-Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. Automated phylogenetic detection of recombination using a genetic algorithm. Mol Biol Evol. 2006;23:1891–901.
Kosakovsky-Pond SL, Frost SDW, Muse SV. HyPhy: Hypothesis testing using phylogenies. Bioinformatics. 2005;21:676–9.
Wright S. The genetic structure of populations. Ann Eugen. 1951;15:323–54.
Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution (N. Y). 1984;38:1358–70.
Excoffier L, Smouse PE, Quattro JM. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics. 1992;131:479–91.
Excoffier L, Lischer HEL. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 2010;10:564–7.
Huelsenbeck JP, Ronquist F. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001;17:754–5.
Posada D. jModelTest: Phylogenetic model averaging. Mol Biol Evol. 2008;25:1253–6.
Drummond A, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7:1–8.
Yang Z. PAML 4: Phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.
Zhang J, Nielsen R, Yang Z. Evaluation of an improved Branch-Site Likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–9.
Murrell B, Weaver S, Smith MD, Wertheim JO, Murrell S, Aylward A, et al. Gene-wide identification of episodic selection. Mol Biol Evol. 2015;32:1365–71.
Kosakovsky-Pond SL, Murrell B, Fourment M, Frost SDW, Delport W, Scheffler K. A random effects Branch-Site model for detecting episodic diversifying selection. Mol Biol Evol. 2011;28:3033–43.
McDonald JH, Kreitman M. Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991;351:652–4.
Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–18.
Kosakovsky-Pond SL, Frost SDW. Not so different after all: A comparison of methods for detecting amino acid sites under selection. Mol Biol Evol. 2005;22:1208–22.
Murrell B, Wertheim JO, Moola S, Weighill T, Scheffler K, Kosakovsky-Pond SL. Detecting individual sites subject to episodic diversifying selection. PLoS Genet. 2012;8:e1002764.
Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky-Pond SL, et al. FUBAR: A fast, unconstrained bayesian Approximation for inferring selection. Mol Biol Evol. 2013;30:1196–205.
Conant GC, Wagner GP, Stadler PF. Modeling amino acid substitution patterns in orthologous and paralogous genes. Mol Phylogenet Evol. 2007;42:298–307.
Atchley WR, Zhao J, Fernandes AD, Druke T. Solving the protein sequence metric problem. Proc Natl Acad Sci. 2005;102:6395–400.
Omasits U, Ahrens CH, Müller S, Wollscheid B. Protter: Interactive protein feature visualization and integration with experimental proteomic data. Bioinformatics. 2014;30:884–6.
We would like to thank Laurel Brandsmeier for the laboratory work she performed on this project. We also would like to thank two anonymous reviewers whose comments improved this manuscript.
This project was initially supported by a National Science Foundation grant DEB-1219387 to LMM and later by a National Science Foundation grant IOS-1557697 to LMM.
Availability of data and materials
All nucleotide data are available for each gene in GenBank (Gr63a: MH750445 - MH750487, Or67c: MH750488 - MH750550, Or83c: MH750551 - MH750612 and Or83c2: MH750613 - MH750680).
Ethics approval and consent to participate
This section is not applicable to the present study.
Consent for publication
This section is not applicable to the present study.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Table S1. Primer sequences (Forward and Reverse) for each gene and species. Table S2. Descriptive parameters for haplotype, nucleotide diversity and neutrality tests. Results are shown for each gene in each D. mojavensis population and D. arizonae (See Methods for abbreviations). Table S3. Neutrality tests for each gene in each D. mojavensis population and D. arizonae (See Methods for abbreviations). Table S4. Divergence between D. mojavensis and D. arizonae species and populations within D. mojavensis. Ka/Ks ratios and genetic structure estimated by ΦST are given for pairwise comparisons between species and populations for each gene (See Methods for abbreviations). Table S5. McDonald-Kreitman test for each D. mojavensis and D. arizonae. A summary of P-values is shown for each gene using D. arizonae and D. navojoa as an outgroup (See Methods for abbreviations). (DOCX 50 kb)
About this article
Cite this article
Diaz, F., Allan, C.W. & Matzkin, L.M. Positive selection at sites of chemosensory genes is associated with the recent divergence and local ecological adaptation in cactophilic Drosophila. BMC Evol Biol 18, 144 (2018). https://doi.org/10.1186/s12862-018-1250-x
- Odorant receptor
- Gustatory receptor
- Population genetics
- Molecular evolution
- Cactophilic Drosophila