Divergent evolution and molecular adaptation in the Drosophila odorant-binding protein family: inferences from sequence variation at the OS-E and OS-F genes
© Sánchez-Gracia and Rozas; licensee BioMed Central Ltd. 2008
Received: 28 July 2008
Accepted: 27 November 2008
Published: 27 November 2008
The Drosophila Odorant-Binding Protein (Obp) genes constitute a multigene family with moderate gene number variation across species. The OS-E and OS-F genes are the two phylogenetically closest members of this family in the D. melanogaster genome. In this species, these genes are arranged in the same genomic cluster and likely arose by tandem gene duplication, the major mechanism proposed for the origin of new members in this olfactory-system family.
We have analyzed the genomic cluster encompassing the OS-E and OS-F genes (Obp83 genomic region) to determine the role of functional divergence and molecular adaptation in the evolution of Obp family size. We compared nucleotide and amino acid variation across 18 Drosophila and 4 mosquito species, applying a phylogeny-based maximum likelihood approach complemented with information on the OBP three-dimensional structure and function. We show that, although the OS-E and OS-F genes are currently subject to similar and strong selective constraints, they likely underwent divergent evolution. Positive selection was likely involved in the functional diversification of new copies in the early stages after the gene duplication event; moreover, it might have shaped nucleotide variation of the OS-E gene concomitantly with the loss of functionally related members. In addition, molecular adaptation likely affecting the functional OBP conformational changes was supported by the analysis of the evolution of physicochemical properties of the OS-E protein and by the location of the putatively positively selected amino acids on the OBP three-dimensional structure.
Our results support that positive selection was likely involved in the functional differentiation of new copies of the OBP multigene family in the early stages after their birth by gene duplication; likewise, it might shape variation of some members of the family concomitantly with the loss of functionally related genes. Thus, the stochastic gene gain/loss process coupled with the impact of natural selection would influence the observed OBP family size.
The olfactory system of animals allows individuals to detect enormously diverse information from the external environment and is, in most species, fundamental for survival and reproduction. Natural selection, therefore, likely plays an important role in the evolution of genes involved in olfaction. Indeed, there is compelling evidence for the action of positive selection in the evolution of these genes, both in insects and in vertebrates [e.g. 1–7]. In addition, olfactory-specific gene families might contribute to the host-specificity shifts occurring in the diversification of super-specialist Drosophila species [8, 9].
The primary step in olfactory perception is accomplished by the Odorant-Binding Proteins (OBPs). In spite of the similar global function of insect and vertebrate OBPs, these two protein families are evolutionarily unrelated. In insects, OBPs are small globular proteins that bind odorant molecules (including pheromones) at the pores of the chemosensory sensilla, transport them through the aqueous lymph, and deliver their ligands near the olfactory receptors (ORs) [11, 12]. In addition, OBPs might play a role in olfactory coding [13, 14], as well as in stimulus inactivation [15–17]. While some OBPs are co-expressed in the same individual sensilla, others have strikingly different expression patterns. The three-dimensional (3D) structures of OBPs from several insects have now been determined; these proteins share similar folds, although with significant structural differences (protein length, position and conformation of α-helices, loops and C-terminus), resulting in diverse solvent access properties.
The Obp repertoire in the genus Drosophila constitutes a multigene family composed of a moderately variable number of members (from 40 to 61 genes) [9, 18, 20]. Previous results have shown that the Obp genes evolve through a birth-and-death process; new members originate by tandem gene duplication and gradually diverge in sequence and likely in function. The OS-E (DmelObp83a) and OS-F (DmelObp83b) genes are the two closest paralogous Obp members of the D. melanogaster genome. These genes, located on chromosome 3R, are separated by an intergenic region of ~1 kb and have highly similar gene structures and protein sequences (the mature proteins share 70% amino acid identity). These genes are also co-expressed in the same specific subset of olfactory sensilla (mainly in the sensilla trichoidea) of the third antennal segment of D. melanogaster.
DNA polymorphism and divergence analyses at the OS-E and OS-F genes in the melanogaster and in the Old World obscura (Sánchez-Gracia and Rozas, unpublished data) subgroup species of Drosophila have shown that these olfactory genes might have evolved non-neutrally. Nevertheless, no firm conclusions regarding the precise evolutionary mechanism could be drawn; therefore, the specific role that natural selection might play in the evolutionary history of this gene duplication, and especially in the origin and maintenance of the duplicated copies, is still unknown.
Here, we investigate the mechanisms driving the evolution of the genomic cluster encompassing the OS-E and OS-F genes (the Obp83 genomic region) in 18 species of the Drosophila genus. We integrate amino acid and nucleotide-based divergence data with the analysis of selective constraints and information on the OBP 3D structure and function to infer the impact of positive and negative selection on the evolutionary history of these genes. We are especially interested in determining the origin and evolutionary fate of these Obp genes within the context of a multigene family subject to a birth-and-death process. We found that functional differentiation, with an active role of positive selection, might contribute to Obp family evolution across Drosophila species. We also show that the evolution of the physicochemical properties of these proteins suggests that functional divergence might arise through changes affecting the OBP conformational shift mechanisms, modifying the specificity, sensitivity or accessibility of OBPs to the odorants, to the ORs or to other molecules required for correct odorant perception.
The Obp83 genomic region in the genus Drosophila
Amino acid and nucleotide divergence
Bayesian trees based on protein and nucleotide divergence indicate that the OS-E and OS-F genes existed before the split of the Sophophora and Drosophila subgenera; later, the Drosophila subgenus lineage lost the OS-E gene (Figure 1). Even using an algorithm more sensitive than BLAST, we did not find any vestige of the OS-E gene in the genomes of the three Drosophila subgenus species, suggesting that this gene has been completely erased. In agreement with previous results, the distribution of nucleotide substitutions across the DNA sequence is not homogeneous: both genes are highly divergent in the so-called heterogeneous region (het1, encompassing amino acid positions 68 to 90 in the Drosophila OS-E protein) and in the first 30 amino acids of the N-terminal part of the mature protein (referred to here as het2).
Selective constraints and functional divergence
Likelihood Ratio Test (LRT) results and Bayesian prediction of amino acid sites under positive selection.
Positively selected sites (BEB, PP > 0.95)
74, 75, 78, 79, 81, 82, 94
4, 78, 115, 120
Sequence variation on the OBP 3D structure
We first determined the location of the variable regions in the 3D structure; the het1 region lies at α-helices D and E with their connecting loop, while the het2 variable region is at the N-terminal part of the protein, encompassing the complete first α-helix. Figure 8 also shows the location in the 3D structure of the relevant positively selected amino acid positions and of those inferred to contribute to type I functional divergence (with high PP values). Interestingly, most of the selectively relaxed amino acid sites inferred in the D. guanche OS-E protein and some of the positively selected positions are located in the het1 region, and are therefore close to each other in the 3D protein structure. This suggests that this part of the protein has an important functional role. Since the first α-helix and the C-terminal end of the protein also appear to be targets of positive selection, molecular adaptation appears to have affected different OBP domains. In contrast, candidate amino acid positions for type I functional divergence are more homogeneously distributed along the 3D structure (results not shown); two of the three positions with high PP values, however, are also in the predicted functionally important protein domains.
Origin of the OS-E/OS-F gene duplication
The OS-E and OS-F genes encode the two phylogenetically closest odorant-binding protein members of the D. melanogaster genome [9, 20]. We have identified orthologs of the OS-E gene only in the Sophophora subgenus species, with no evidence of them in D. virilis, D. mojavensis and D. grimshawi (Figure 1). A previous analysis led to the same result in D. virilis but detected two OS-E-like genes in Scaptodrosophila lebanonensis, a basal species of the Drosophila genus. In fact, it has been proposed that both the OS-E and OS-F genes have orthologous copies in A. gambiae. Our phylogeny-based analysis supports the hypothesis that the OS-E and OS-F gene duplication predates the split of the Drosophila and Sophophora subgenera, but does not agree with previously reported findings. Most likely, the Anopheles homologous sequences are in fact co-orthologs of the Drosophila OS-E and OS-F genes, i.e. the Drosophila gene copies arose by a gene duplication event after the split of the Nematocera and Brachycera taxa (about 250 Mya). We must note, however, that the different evolutionary rates of the OS-E and OS-F genes might constitute a confounding factor. The duplication might have occurred after the Drosophila-Sophophora subgenus split, followed by an acceleration of the evolutionary rate. This scenario would be consistent with a previous analysis; its authors, nevertheless, surveyed only a very short fragment (the het1 region), which, in addition, we have found might have evolved by positive selection. Consequently, the most plausible scenario is that the Obp83 genomic region contained the OS-X, OS-E and OS-F genes before the Sophophora-Drosophila subgenera split (Figure 1); nevertheless, further experimental analyses using more distant Drosophila species would be required for a complete assessment.
Functional divergence between OS-E and OS-F genes
Previous results have shown that the Obp gene family has evolved following a birth-and-death model. Under this model, in contrast to the concerted evolution model, new functional genes often evolve through regulatory or functional differentiation. Accordingly, a number of Obp family members differ in gene expression patterns [18, 20] or in functional constraints. In the classical view, the functional diversification of duplicated copies is driven by positive selection (i.e. the neofunctionalization model). Alternatively, gene duplicates might differentiate by acquiring independent sub-functions, all of which are required to carry out the original function. This functional subdivision might be promoted by positive selection [35, 36] or result from the accumulation of degenerative mutations (i.e., causing a complementary loss of function; the subfunctionalization model). This latter model is, nonetheless, usually considered in the context of the evolution of cis-regulatory elements.
In D. melanogaster, the OS-E and OS-F genes have the same temporal gene expression pattern [22, 38]. In fact, the OS-E, OS-F and LUSH proteins co-localize not only in the sensillar fluid, but also in the same intracellular compartment of the supporting cells before endocytosis. Therefore, it seems unlikely that the OS-E and OS-F genes differ in the temporal or spatial pattern of gene expression. Nevertheless, these genes might differ in their quantitative expression levels (unfortunately, the amount of protein produced by each gene is still unknown). It has been shown that gene expression levels are negatively correlated with evolutionary rates. Here we found that OS-E evolves more rapidly than OS-F. The highly conserved large first intron (the ~1.7 kb that separates the 5' untranslated exon and the first coding exon) present in all OS-F genes might help to explain this result. This intron is absent in the OS-E genes and has two highly conserved fragments (results not shown), which might contain regulatory regions. This feature, which has been associated with reduced evolutionary rates and high gene expression levels, might explain the evolutionary rate differences between OS-E and OS-F and perhaps putative differences in gene expression levels.
If the functional diversification between the OS-E and OS-F genes was promoted by changes in the coding region, these genes might have evolved with asymmetric evolutionary rates. The OS-E and OS-F gene duplication is recent enough to allow studying the evolutionary forces acting at the early stages after gene gains in the Obp family. Here, we have found that the OS-E and OS-F genes evolved with different substitution rates; nevertheless, the overall functional constraint level (measured as the ω parameter) is high and very similar in the two genes. Although relatively ancient duplicates can exhibit similar functional constraint levels, they might have differed for a short period of time after the duplication event. In this sense, we have detected both significant type I divergence among OBPs, likely resulting from site-specific relaxations, and the footprint of positive natural selection in the early stages after the OS-E/OS-F gene duplication. Hence, these two forces would affect the evolutionary fate of new Obp family members, initially originated by tandem gene duplication. Since the shifts in evolutionary rate (detected as type I functional divergence) occurred in different and complementary positions in each member of the duplicated pair, this would point to some functional subdivision, perhaps with a complementary loss (or relaxation) of function at different protein domains. Likewise, we found that positive selection might also act concomitantly with some lineage-specific losses of members from the same cluster, suggesting that these within-cluster OBPs are functionally connected. This functional connection might arise through the formation of OBP dimers under physiological conditions.
Natural selection might promote heterodimers by increasing the combinatorial potential of the OBPs (increasing either the spectrum of possible target odorants or the binding specificity), thereby maintaining the co-localization of different OBPs in the same cells. Interestingly, we did not find evidence of functional divergence between the OS-E and OS-X proteins, which show similar functional divergence patterns with respect to OS-F. Moreover, all extant species (except D. willistoni) have only one of these two genes. It is possible that OS-F dimerizes with either OS-E or OS-X in the sensillar fluid, generating quaternary structures with an equivalent functional role. Although it is unknown whether OS-E and OS-X were really co-expressed in the same cells in the past, it would be very interesting to investigate the expression pattern of the three Obp genes of D. willistoni (the only species where OS-E and OS-X currently coexist). The analysis of gene expression data and of the functional quaternary structure might give critical insights into the role of dimerization in the molecular evolution of the Obp genes.
Positively selected sites in the OBP structure
We have detected several sites that likely evolved by positive selection. One of these positions (23), located in helix A of the OBP, might alter the size and shape of the binding cavity by modifying the position of the first disulfide bridge [19, 44]; for instance, D. melanogaster LUSH, which has the first α-helix in a more internal position than A. mellifera ASP1, also has a small binding cavity. We also detected putatively positively selected sites located in the C-terminal end of the protein (115 and 120). In ASP1 this domain folds inside the protein, forming one wall of the binding cavity, and contains residues that interact directly with the ligand; conformational changes in this part of the protein trigger ligand release close to the odorant receptor [45, 46]. Moreover, in LUSH, amino acid substitutions in this part of the protein have also been related to the pheromone-induced conformational shift that triggers the firing of pheromone-sensitive neurons. Interestingly, the structure of this region is the most divergent among the 4 OBPs with resolved 3D structures: it differs in length and secondary structure, or is even missing. Replacements in these two regions can likely have a significant effect on the conformational and ligand-binding properties of the OBPs, making these regions a major target for adaptive changes.
The third protein region that might be shaped by molecular adaptation comprises α-helices D and E (Figure 8). This region includes both hydrophobic residues covering the binding cavity and exposed amino acids that might be involved in protein-protein interactions. Notably, most of the residues (7 out of 9) detected in the coevolution analysis (Figure 7) lie in this part of the protein and are exposed to the solvent. Indeed, in the proposed A. gambiae OBP17 homodimer, the dimeric interface primarily engages the fourth and fifth helices. Moreover, many D. guanche replacements with a distinctive evolutionary pattern are also located in this region (Table 1). Several authors have suggested [48, 49] that the very small effective population size of this insular endemic species would increase the fixation probability of slightly deleterious mutations (such as "unpreferred" synonymous mutations, or amino acid replacements). Under this scenario, the amino acid replacements detected in the OS-E protein of D. guanche would be slightly deleterious mutations fixed by genetic drift and, therefore, would indicate a relaxation of functional constraint rather than positive selection. Moreover, the selective relaxation might be related to an ecologically driven speciation, as has been suggested in other Drosophila species [8, 9]. Discriminating between positive selection and relaxed negative selection is not an easy task; even so, since the het1 region is functionally important, we cannot completely rule out that positive selection might in fact also drive the evolution of the D. guanche OS-E gene.
Physicochemical evolution and molecular adaptation
The evolutionary analysis of physicochemical properties might provide insights into the functional divergence that has occurred across OBPs. We show that the OS-E/OS-F duplicates behave markedly differently. While the OS-F protein has been largely affected by purifying selection, both stabilizing and adaptive positive selection were inferred for OS-E. Indeed, we identified the footprint of positive-destabilizing selection on two physicochemical properties (Figure 6). Pα is a conformational property related to the length and flexibility of alpha helices and, therefore, to the accessibility of specific amino acids (such as those involved in interacting motifs of the protein). OBPs are small globular α-helical proteins and, therefore, could be largely affected by these changes. pK' influences the association and dissociation constants of amino acids, affecting the characteristics of protein-protein or protein-ligand interactions. Thus, positive selection might promote functional divergence by modifying the binding specificities, or by altering the conformational changes involved in the OBP functional mechanism [e.g. 14, 45, 46, 51]. Additionally, our results indicate that positive stabilizing selection has acted on OS-E protein evolution. It has been shown that in globular proteins two (Pc and F) of the three physicochemical properties related to positive stabilizing selection are highly negatively correlated with protein compressibility, and would be essential in maintaining the globular structure and the buried nature of the binding cavity. Although the radical changes detected in our study likely produce important functional changes between the OS-E and OS-F proteins, we cannot rule out that other detected conservative replacements also have an important adaptive role.
Indeed, some of the conservative changes contributing to the excess of amino acid replacements detected in the OS-E protein might cause weak but relevant functional changes in either OBP binding activity or ligand specificity, suggesting, therefore, the action of adaptive rather than stabilizing selection.
The comparative genomic analysis of the Obp multigene family in Drosophila has revealed that a birth-and-death model could explain the differences in the number of Obp members across species. Here we found that molecular adaptation can also play an important role in the evolution of this olfactory gene family. Indeed, positive selection was likely involved in the functional differentiation of new copies in the early stages after the gene duplication event; likewise, it might shape variation of some members of the family concomitantly with the loss of functionally related genes. The stochastic gene gain/loss process coupled with the impact of natural selection would determine the observed family size. Nevertheless, further functional experiments would be required to demonstrate the adaptive character of the amino acids inferred as positively selected. The analysis of the non-coding flanking regions is also critical, particularly of the putative regulatory sequences of the OS-E and OS-F genes, to investigate putative fine-tuning differences in their gene expression patterns. All these studies and experiments will certainly contribute to a better understanding of the precise role of natural selection and molecular adaptation in the evolution of chemoreception.
Fly samples and databases
We studied 18 species of the Drosophila genus (15 and 3 species of the Sophophora and Drosophila subgenera, respectively; Figure 1). We used highly inbred lines (10 generations of sib mating) of D. teissieri, D. yakuba, D. ananassae (species kindly provided by F. Lemeunier), D. pseudoobscura, D. persimilis, D. miranda (species kindly provided by R. C. Lewontin), D. subobscura, D. guanche, and D. madeirensis (species available in our laboratory). In addition, we analyzed DNA sequence data from D. melanogaster (AJ574644), D. simulans (AJ567753), D. mauritiana (AJ563750) and D. erecta (AJ574775-AJ574776).
DNA extraction and sequencing
Total genomic DNA of D. yakuba, D. teissieri, D. ananassae, D. pseudoobscura, D. miranda, D. persimilis, D. guanche, D. madeirensis and D. subobscura was extracted from live flies using a modification of a published protocol (protocol 48). DNA fragments, including the complete coding region of the OS-E and OS-F genes, were amplified by PCR. In addition to the primers previously used for the amplification of the OS region in D. melanogaster, D. simulans, D. mauritiana and D. erecta, we designed additional oligonucleotides for the amplification of the new species. Some of these primers were designed using information on genomic regions conserved between D. pseudoobscura and D. melanogaster (Berkeley Drosophila Genome Project, Release 4). Although the length of the amplified genomic regions varied among species, they always included the coding region of the two genes. PCR products were purified using the QIAquick PCR purification kit (QIAGEN, Chatsworth, CA), and cycle sequenced using primers spaced at intervals of ~400 nucleotides. Occasionally, a genome walking strategy was also required to complete the DNA sequence. Sequenced fragments were separated on ABI 377 and 3700 sequencers. For all species, the DNA sequence corresponding to the coding regions was determined on both strands. The new sequence data have been deposited in the EMBL Nucleotide Sequence Database under accession numbers FM210093–FM210110.
Available genomic data sources
We searched for the orthologous (and other homologous) copies of the OS-E and OS-F genes in other Drosophila species using available genome sequence information (D. sechellia, D. willistoni, D. virilis, D. mojavensis and D. grimshawi); in other insects with sequenced genomes (Aedes aegypti, Anopheles gambiae, Apis mellifera, Bombyx mori, Culex pipiens and Tribolium castaneum [59–63]; http://www.broad.mit.edu/annotation/genome/culex_pipiens/Home.html); and in available sequences from public databases (http://www.ncbi.nlm.nih.gov). Orthologous relationships were inferred by the TBLASTN reciprocal best-hit method, followed by reconciliation of gene trees and species trees. Proteins with an amino acid sequence identity higher than 21.5% (the average amino acid sequence identity between OS-E and OS-F and the DmelObp69 protein, the phylogenetically closest OBP in the D. melanogaster genome) were proposed as members of the Obp83 subfamily. We determined the gene structure of newly identified genes using information from the already known Obp genes as a guide. To better characterize the syntenic region in the new species, we compared the Obp83 region (~10 kb) of D. melanogaster with its orthologous counterpart in the new species, using the dot-plot method implemented in the zPicture tool. The SIM local alignment software and the LANLVIEW tool were used to search for possible OS-E gene vestiges in the genomic DNA sequence of the Drosophila subgenus species.
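The two screening rules described above (reciprocal best hits and the 21.5% identity cut-off) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hit scores are made-up numbers standing in for TBLASTN bit scores, the gene names `geneX` and `geneY` are hypothetical, and identity is computed over ungapped alignment columns, a convention the text does not specify.

```python
from typing import Dict, List, Optional, Tuple

# Threshold taken from the text; everything else below is illustrative.
IDENTITY_THRESHOLD = 21.5


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_hit(scores: Dict[str, float]) -> Optional[str]:
    """Return the subject with the highest score, or None if there are no hits."""
    return max(scores, key=scores.get) if scores else None


def reciprocal_best_hits(
    a_vs_b: Dict[str, Dict[str, float]],
    b_vs_a: Dict[str, Dict[str, float]],
) -> List[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    pairs = []
    for query, hits in a_vs_b.items():
        top = best_hit(hits)
        if top is not None and best_hit(b_vs_a.get(top, {})) == query:
            pairs.append((query, top))
    return pairs


# Hypothetical scores (higher is better), not real search results.
forward = {"OS-E": {"geneX": 210.0, "geneY": 150.0},
           "OS-F": {"geneX": 160.0, "geneY": 230.0}}
reverse = {"geneX": {"OS-E": 205.0, "OS-F": 158.0},
           "geneY": {"OS-E": 149.0, "OS-F": 228.0}}

rbh = reciprocal_best_hits(forward, reverse)
identity = percent_identity("MKV-LA", "MKVALG")
is_candidate_member = identity > IDENTITY_THRESHOLD
```

In the study itself the reciprocal best hits were further checked by reconciling gene trees with the species tree, a step this sketch does not attempt.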
Codon sequence analyses
We used SeqMan version 5.53 (DNASTAR, Inc.) to assemble the newly obtained DNA sequences. DNA sequences of the coding regions obtained experimentally plus those retrieved from the public databases were aligned using the MUSCLE software and edited with MacClade version 3.05. We estimated nucleotide sequence variation using DnaSP version 4.10 and MEGA version 4. Bayesian phylogenetic analysis was performed with MrBayes version 3.1.2, applying the best substitution model selected under the Akaike Information Criterion as implemented in MODELTEST 3.7. To determine whether the paralogous genes evolve at different substitution rates, we conducted the two-cluster Relative Rate Test (RRT) implemented in the LINTRE package.
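As a minimal sketch of the model-selection step, the Akaike criterion scores each candidate model as AIC = 2k - 2 ln L (k free parameters, ln L the maximized log-likelihood) and keeps the model with the lowest score. The log-likelihoods below are invented for illustration, and the parameter counts exclude branch lengths; neither comes from the study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 lnL (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def select_model(fits: Dict[str, Tuple[float, int]]) -> Tuple[str, Dict[str, float]]:
    """Return the name of the lowest-AIC model and all AIC scores.

    `fits` maps a model name to (maximized log-likelihood, free parameters).
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for three nucleotide substitution models.
example_fits = {
    "JC69": (-5230.4, 0),
    "HKY85": (-5105.2, 4),
    "GTR+G": (-5090.7, 9),
}
best_model, aic_scores = select_model(example_fits)
```

With these made-up numbers the richest model wins because its likelihood gain outweighs the penalty of 2 per extra parameter; with a smaller gain the simpler model would be preferred.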
We estimated the selective pressures acting on coding regions by applying a phylogeny-based Maximum Likelihood analysis. Since the new homologous Obp83 sequences identified in Drosophila and in mosquitoes are highly divergent from the OS-E and OS-F genes, they might differ significantly in nucleotide composition or codon frequencies. This feature might violate some assumptions of the Markov model of codon substitution and, therefore, might yield unreliable estimates of the relevant parameters. To minimize this problem, we used a multiple sequence alignment including only the OS-E and OS-F genes. Given that the alignment of the signal peptide region was unreliable in most homologous gene copies, we analyzed only the mature-protein coding region. ML estimates of the relevant parameters, such as branch lengths and the ratio of the nonsynonymous (dN) to synonymous (dS) substitution rates (ω = dN/dS), were obtained using the codeml program implemented in the PAML package version 4. The ω parameter was used as a measure of the protein selective constraints. These analyses were conducted under different competing evolutionary hypotheses. We first investigated whether the selective constraints acting on the OS-E and OS-F genes fluctuate across lineages; for that, we compared the fit to the data of the "one ratio" model (M0), which assumes a constant selective pressure across branches, with that of the "free ratios" model (FR), where the rate parameters are estimated independently in each lineage. We also examined other evolutionary scenarios: i) to detect putative changes in the functional constraints after the gene duplication event, we applied a "two clades" model (M0dup) to the data.
Under this model we assigned, a priori, two different ω ratios, one for each of the OS-E and OS-F clades; ii) to assess site-specific selection pressures (including putative positively selected sites), we used "site-specific" models (i.e., models that allow variation in the ω ratio across sites); iii) to detect positively selected sites in specific lineages, we applied the modified branch-site model A in two consecutive tests (test1 and test2); the multiple hypothesis testing problem was taken into account using Bonferroni's correction. The Likelihood Ratio Test was used to compare the fit to the data of two nested models, assuming that twice the log likelihood difference between the two models (2Δℓ) follows a χ2 distribution with a number of degrees of freedom equal to the difference in the number of free parameters. To prevent incorrect parameter estimates caused by local optima, the codeml program was run multiple times for the same model, specifying different initial values.
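The Likelihood Ratio Test and the Bonferroni correction described above can be sketched with the Python standard library alone. The log-likelihood values are hypothetical (they are not codeml output from this study), and `chi2_sf` is a small helper written for this example that evaluates the χ2 upper-tail probability for integer degrees of freedom through the standard recurrence for the regularized incomplete gamma function.

```python
import math
from typing import List, Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1), starting from
    Q(1, y) = exp(-y) for even df or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0.0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int) -> Tuple[float, float]:
    """Return (2*delta_lnL, p-value) for two nested models."""
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return statistic, chi2_sf(statistic, df)


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values: each p multiplied by the number of tests."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


# Hypothetical log-likelihoods for a null and an alternative model that
# differ by one free parameter.
statistic, p_value = likelihood_ratio_test(lnl_null=-2345.10, lnl_alt=-2340.02, df=1)
adjusted = bonferroni([p_value, 0.04, 0.5])
```

The χ2 approximation is used here because the text states it; the sketch treats every comparison the same way and makes no special provision for parameters fixed at a boundary.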
We used TreeSAAP version 3.2 to determine the OBP physicochemical properties affected by natural selection. This program compares the observed distribution of amino acid changes altering a particular physicochemical property (for a set of 31 different properties) with that expected assuming that changes modifying this property are equally likely (i.e., independent of the magnitude of the physicochemical change, as under neutral evolution). For each property, the observed and expected distributions are compared by a goodness-of-fit test. TreeSAAP also assigns the amino acid substitutions to categories according to the magnitude of their effect (each property is divided into 8 categories of equal magnitude, from more conservative to more radical physicochemical changes), and determines the statistical deviations from the expected numbers.
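The idea of binning changes into 8 equal-width magnitude categories and comparing observed with expected counts can be illustrated as follows. This is a simplified sketch and not TreeSAAP's implementation: the function names, the observed and expected counts, and the use of a Pearson chi-square statistic as the goodness-of-fit measure are all assumptions made for the example.

```python
from typing import List, Sequence

N_CATEGORIES = 8  # number of magnitude categories stated in the text


def magnitude_category(change: float, max_change: float,
                       n_categories: int = N_CATEGORIES) -> int:
    """Map an absolute property change to a 1-based category of equal width.

    Category 1 holds the most conservative changes and category
    `n_categories` the most radical ones; the maximum falls in the last.
    """
    if max_change <= 0:
        raise ValueError("max_change must be positive")
    width = max_change / n_categories
    index = int(abs(change) / width) + 1
    return min(index, n_categories)


def count_by_category(changes: Sequence[float], max_change: float) -> List[int]:
    """Count how many changes fall into each magnitude category."""
    counts = [0] * N_CATEGORIES
    for change in changes:
        counts[magnitude_category(change, max_change) - 1] += 1
    return counts


def chi_square_statistic(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of amino acid replacements per magnitude category.
observed_counts = [30, 20, 15, 10, 8, 7, 6, 4]
expected_counts = [25, 22, 16, 12, 9, 7, 5, 4]
statistic = chi_square_statistic(observed_counts, expected_counts)
degrees_of_freedom = N_CATEGORIES - 1
```

An excess of observed changes in the high-numbered (radical) categories would point toward destabilizing selection on that property, and an excess in the low-numbered categories toward stabilizing selection, in line with the interpretation used in the Discussion.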
Amino acid sequence analyses
Amino acid-based phylogenetic trees were reconstructed using MrBayes. We used the Diverge 2.0 software to estimate the type I (θλI) and type II (θλII) functional divergence coefficients [26, 27] among paralogous proteins. Type I and type II refer, respectively, to shifts in the substitution rates after gene duplication (indicative of changes in functional constraints), and to amino acid replacements completely fixed between duplicates (resulting in cluster-specific alterations of amino acid physicochemical properties). We used the CAPS program to identify groups of co-evolving positions at the intermolecular level; the program also allows estimating the correlated variation between amino acid sites after correcting for evolutionary distances and phylogenetic dependencies. The protein secondary structure of the gene products was inferred using the PredictProtein server. We also predicted the putative Drosophila OBP 3D structure using the SWISS-MODEL automated modeling server. The putative ancestral sequence of the Drosophila OS-E and OS-F proteins (inferred with PAML) was used to search the Protein Data Bank (PDB) for sequences with high amino acid sequence similarity and a resolved 3D structure. The 3D structure with the highest PSI-BLAST score was used for the modeling. The Swiss-PdbViewer program version 3.9b2 was used to visualize the 3D structure and to highlight the relevant amino acid replacements identified in the evolutionary analyses.
We thank J. Castresana and F. G. Vieira for helpful comments and suggestions on the manuscript and S. Guirao for her assistance with the 3D modelling and the evolutionary analysis of physicochemical properties. We also wish to express our gratitude to the members of the A. Ferrús laboratory for sharing their expertise about the Drosophila olfactory system organization and function. We also thank Serveis Cientifico-Tècnics, Universitat de Barcelona, for automated sequencing facilities. A. S. was supported by a predoctoral fellowship from the Universitat de Barcelona. This work was funded by grants BFU2004-02253 and BFU2007-62927 from the Ministerio de Educación y Ciencia (Spain), and 2005SRG-00166 from Comissió Interdepartamental de Recerca I Innovació Tecnològica (Spain).
- Ngai J, Dowling MM, Buck L, Axel R, Chess A: The family of genes encoding odorant receptors in the channel catfish. Cell. 1993, 72 (5): 657-666. 10.1016/0092-8674(93)90396-8.
- Willett CS: Evidence for directional selection acting on pheromone-binding proteins in the genus Choristoneura. Mol Biol Evol. 2000, 17 (4): 553-562.
- Krieger MJ, Ross KG: Identification of a major gene regulating complex social behavior. Science. 2002, 295 (5553): 328-332. 10.1126/science.1065247.
- Clark AG, Glanowski S, Nielsen R, Thomas PD, Kejariwal A, Todd MA, Tanenbaum DM, Civello D, Lu F, Murphy B, Ferriera S, Wang G, Zheng X, White TJ, Sninsky JJ, Adams MD, Cargill M: Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios. Science. 2003, 302 (5652): 1960-1963. 10.1126/science.1088821.
- Emes RD, Beatson SA, Ponting CP, Goodstadt L: Evolution and comparative genomics of odorant- and pheromone-associated genes in rodents. Genome Res. 2004, 14 (4): 591-602. 10.1101/gr.1940604.
- Watts RA, Palmer CA, Feldhoff RC, Feldhoff PW, Houck LD, Jones AG, Pfrender ME, Rollmann SM, Arnold SJ: Stabilizing selection on behavior and morphology masks positive selection on the signal in a salamander pheromone signaling complex. Mol Biol Evol. 2004, 21 (6): 1032-1041. 10.1093/molbev/msh093.
- Foret S, Maleszka R: Function and evolution of a gene family encoding odorant binding-like proteins in a social insect, the honey bee (Apis mellifera). Genome Res. 2006, 16 (11): 1404-1413. 10.1101/gr.5075706.
- McBride CS, Arguello JR, O'Meara BC: Five Drosophila genomes reveal nonneutral evolution and the signature of host specialization in the chemoreceptor superfamily. Genetics. 2007, 177 (3): 1395-1416. 10.1534/genetics.107.078683.
- Vieira FG, Sanchez-Gracia A, Rozas J: Comparative genomic analysis of the odorant-binding protein family in 12 Drosophila genomes: purifying selection and birth-and-death evolution. Genome Biol. 2007, 8 (11): R235. 10.1186/gb-2007-8-11-r235.
- Tegoni M, Pelosi P, Vincent F, Spinelli S, Campanacci V, Grolli S, Ramoni R, Cambillau C: Mammalian odorant binding proteins. Biochim Biophys Acta, Gene Struct Expression. 2000, 1482 (1–2): 229-240.
- Vogt RG, Riddiford LM: Pheromone binding and inactivation by moth antennae. Nature. 1981, 293: 161-163. 10.1038/293161a0.
- Pelosi P: Odorant-binding proteins. Crit Rev Biochem Mol Biol. 1994, 29 (3): 199-228. 10.3109/10409239409086801.
- Xu P, Atkinson R, Jones DN, Smith DP: Drosophila OBP LUSH is required for activity of pheromone-sensitive neurons. Neuron. 2005, 45 (2): 193-200. 10.1016/j.neuron.2004.12.031.
- Laughlin JD, Ha TS, Jones DN, Smith DP: Activation of pheromone-sensitive neurons is mediated by conformational activation of pheromone-binding protein. Cell. 2008, 133 (7): 1255-1265. 10.1016/j.cell.2008.04.046.
- Pelosi P, Maida R: Odorant-binding proteins in insects. Comp Biochem Physiol B: Biochem Mol Biol. 1995, 111 (3): 503-514. 10.1016/0305-0491(95)00019-5.
- Ziegelberger G: Redox-shift of the pheromone-binding protein in the silkmoth Antheraea polyphemus. Eur J Biochem. 1995, 232 (3): 706-711. 10.1111/j.1432-1033.1995.tb20864.x.
- Kaissling KE: Olfactory perireceptor and receptor events in moths: a kinetic model. Chem Senses. 2001, 26 (2): 125-150. 10.1093/chemse/26.2.125.
- Galindo K, Smith DP: A large family of divergent Drosophila odorant-binding proteins expressed in gustatory and olfactory sensilla. Genetics. 2001, 159 (3): 1059-1072.
- Tegoni M, Campanacci V, Cambillau C: Structural aspects of sexual attraction and chemical communication in insects. Trends Biochem Sci. 2004, 29 (5): 257-264. 10.1016/j.tibs.2004.03.003.
- Hekmat-Scafe DS, Scafe CR, McKinney AJ, Tanouye MA: Genome-wide analysis of the odorant-binding protein gene family in Drosophila melanogaster. Genome Res. 2002, 12 (9): 1357-1369. 10.1101/gr.239402.
- Hekmat-Scafe DS, Dorit RL, Carlson JR: Molecular evolution of odorant-binding protein genes OS-E and OS-F in Drosophila. Genetics. 2000, 155 (1): 117-127.
- McKenna MP, Hekmat-Scafe DS, Gaines P, Carlson JR: Putative Drosophila pheromone-binding proteins expressed in a subregion of the olfactory system. J Biol Chem. 1994, 269 (23): 16340-16347.
- Sanchez-Gracia A, Aguade M, Rozas J: Patterns of nucleotide polymorphism and divergence in the odorant-binding protein genes OS-E and OS-F: analysis in the melanogaster species subgroup of Drosophila. Genetics. 2003, 165 (3): 1279-1288.
- Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W, Iyer VN, et al: Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007, 450 (7167): 203-218. 10.1038/nature06341.
- Huang X, Miller W: A time-efficient, linear-space local similarity algorithm. Adv Appl Math. 1991, 12: 337-357. 10.1016/0196-8858(91)90017-D.
- Gu X: Statistical methods for testing functional divergence after gene duplication. Mol Biol Evol. 1999, 16 (12): 1664-1674.
- Gu X: Maximum-likelihood approach for gene family evolution under functional divergence. Mol Biol Evol. 2001, 18 (4): 453-464.
- McClellan DA, McCracken KG: Estimating the influence of selection on the variable amino acid sites of the cytochrome b protein functional domains. Mol Biol Evol. 2001, 18 (6): 917-925.
- McClellan DA, Palfreyman EJ, Smith MJ, Moss JL, Christensen RG, Sailsbery JK: Physicochemical evolution and molecular adaptation of the cetacean and artiodactyl cytochrome b proteins. Mol Biol Evol. 2005, 22 (3): 437-455. 10.1093/molbev/msi028.
- Schwede T, Kopp J, Guex N, Peitsch MC: SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res. 2003, 31 (13): 3381-3385. 10.1093/nar/gkg520.
- Laskowski RA, MacArthur MW, Moss DS, Thornton JM: PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst. 1993, 26: 283-291. 10.1107/S0021889892009944.
- Vogt RG: Odorant binding protein homologues of the malaria mosquito Anopheles gambiae; possible orthologues of the OS-E and OS-F OBPs of Drosophila melanogaster. J Chem Ecol. 2002, 28 (11): 2371-2376. 10.1023/A:1021009311977.
- Nei M, Hughes AL: Balanced polymorphism and evolution by the birth-and-death process in the MHC loci. 11th Histocompatibility Workshop and Conference: 1992. 1992, Oxford, UK: Oxford Univ. Press
- Ohno S: Evolution by gene duplication. 1970, Berlin: Springer
- Piatigorsky J, Wistow G: The recruitment of crystallins: new functions precede gene duplication. Science. 1991, 252 (5010): 1078-1079. 10.1126/science.252.5009.1078.
- Hughes AL: The evolution of functionally novel proteins after gene duplication. Proc R Soc London Ser B. 1994, 256 (1346): 119-124. 10.1098/rspb.1994.0058.
- Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait P: Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999, 151 (4): 1531-1545.
- Hekmat-Scafe DS, Steinbrecht RA, Carlson JR: Coexpression of two odorant-binding protein homologs in Drosophila: implications for olfactory coding. J Neurosci. 1997, 17 (5): 1616-1624.
- Shanbhag SR, Smith DP, Steinbrecht RA: Three odorant-binding proteins are co-expressed in sensilla trichodea of Drosophila melanogaster. Arthropod Struct Dev. 2005, 34 (2): 153-165. 10.1016/j.asd.2005.01.003.
- Lemos B, Bettencourt BR, Meiklejohn CD, Hartl DL: Evolution of proteins and gene expression levels are coupled in Drosophila and are independently associated with mRNA abundance, protein length, and number of protein-protein interactions. Mol Biol Evol. 2005, 22 (5): 1345-1354. 10.1093/molbev/msi122.
- Marais G, Nouvellet P, Keightley PD, Charlesworth B: Intron size and exon evolution in Drosophila. Genetics. 2005, 170 (1): 481-485. 10.1534/genetics.104.037333.
- Lynch M, Conery JS: The evolutionary fate and consequences of duplicate genes. Science. 2000, 290 (5494): 1151-1155. 10.1126/science.290.5494.1151.
- Andronopoulou E, Labropoulou V, Douris V, Woods DF, Biessmann H, Iatrou K: Specific interactions among odorant-binding proteins of the African malaria vector Anopheles gambiae. Insect Mol Biol. 2006, 15 (6): 797-811. 10.1111/j.1365-2583.2006.00685.x.
- Lartigue A, Gruez A, Spinelli S, Riviere S, Brossut R, Tegoni M, Cambillau C: The crystal structure of a cockroach pheromone-binding protein suggests a new ligand binding and release mechanism. J Biol Chem. 2003, 278 (32): 30213-30218. 10.1074/jbc.M304688200.
- Horst R, Damberger F, Luginbuhl P, Guntert P, Peng G, Nikonova L, Leal WS, Wuthrich K: NMR structure reveals intramolecular regulation mechanism for pheromone binding and release. Proc Natl Acad Sci USA. 2001, 98 (25): 14374-14379. 10.1073/pnas.251532998.
- Leal WS, Chen AM, Ishida Y, Chiang VP, Erickson ML, Morgan TI, Tsuruda JM: Kinetics and molecular properties of pheromone binding and release. Proc Natl Acad Sci USA. 2005, 102 (15): 5386-5391. 10.1073/pnas.0501447102.
- Wogulis M, Morgan T, Ishida Y, Leal WS, Wilson DK: The crystal structure of an odorant binding protein from Anopheles gambiae: Evidence for a common ligand release mechanism. Biochem Biophys Res Commun. 2006, 339 (1): 157-164. 10.1016/j.bbrc.2005.10.191.
- Llopart A, Aguade M: Synonymous rates at the RpII215 gene of Drosophila: variation among species and across the coding region. Genetics. 1999, 152 (1): 269-280.
- Perez JA, Munte A, Rozas J, Segarra C, Aguade M: Nucleotide polymorphism in the RpII215 gene region of the insular species Drosophila guanche: reduced efficacy of weak selection on synonymous variation. Mol Biol Evol. 2003, 20 (11): 1867-1875. 10.1093/molbev/msg199.
- Prabhakaran M, Ponnuswamy PK: The spatial distribution of physical, chemical, energetic and conformational properties of amino acid residues in globular proteins. J Theor Biol. 1979, 80 (4): 485-504. 10.1016/0022-5193(79)90090-0.
- Wojtasek H, Leal WS: Conformational change in the pheromone-binding protein from Bombyx mori induced by pH and by interaction with membranes. J Biol Chem. 1999, 274 (43): 30950-30956. 10.1074/jbc.274.43.30950.
- Gromiha MM, Ponnuswamy PK: Relationship between amino acid properties and protein compressibility. J Theor Biol. 1993, 165: 87-100. 10.1006/jtbi.1993.1178.
- Ivarsson Y, Mackey AJ, Edalat M, Pearson WR, Mannervik B: Identification of residues in glutathione transferase capable of driving functional diversification in evolution. A novel approach to protein redesign. J Biol Chem. 2003, 278 (10): 8733-8738. 10.1074/jbc.M211776200.
- Matsuo T: Rapid evolution of two odorant-binding protein genes, Obp57d and Obp57e, in the Drosophila melanogaster species group. Genetics. 2008, 178 (2): 1061-1072. 10.1534/genetics.107.079046.
- Ashburner M: Drosophila: A laboratory handbook. 1989, New York: Cold Spring Harbor Laboratory Press
- Saiki RK, Gelfand DH, Stoffel S, Scharf SJ, Higuchi R, Horn GT, Mullis KB, Erlich HA: Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science. 1988, 239 (4839): 487-491. 10.1126/science.2448875.
- Richards S, Liu Y, Bettencourt BR, Hradecky P, Letovsky S, Nielsen R, Thornton K, Hubisz MJ, Chen R, Meisel RP, et al: Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome Res. 2005, 15 (1): 1-18. 10.1101/gr.3059305.
- Siebert PD, Chenchik A, Kellogg DE, Lukyanov KA, Lukyanov SA: An improved PCR method for walking in uncloned genomic DNA. Nucleic Acids Res. 1995, 23 (6): 1087-1088. 10.1093/nar/23.6.1087.
- Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JM, Wides R, et al: The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002, 298 (5591): 129-149. 10.1126/science.1076181.
- Mita K, Kasahara M, Sasaki S, Nagayasu Y, Yamada T, Kanamori H, Namiki N, Kitagawa M, Yamashita H, Yasukochi Y, Kadono-Okuda K, Yamamoto K, Ajimura M, Ravikumar G, Shimomura M, Nagamura Y, Shin IT, Abe H, Shimada T, Morishita S, Sasaki T: The genome sequence of silkworm, Bombyx mori. DNA Res. 2004, 11 (1): 27-35. 10.1093/dnares/11.1.27.
- The Honeybee Genome Sequencing Consortium: Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006, 443 (7114): 931-949. 10.1038/nature05260.
- Nene V, Wortman JR, Lawson D, Haas B, Kodira C, Tu ZJ, Loftus B, Xi Z, Megy K, Grabherr M, et al: Genome sequence of Aedes aegypti, a major arbovirus vector. Science. 2007, 316 (5832): 1718-1723. 10.1126/science.1138878.
- Richards S, Gibbs RA, Weinstock GM, Brown SJ, Denell R, Beeman RW, Gibbs R, Beeman RW, Brown SJ, Bucher G, et al: The genome of the model beetle and pest Tribolium castaneum. Nature. 2008, 452 (7190): 949-955. 10.1038/nature06784.
- Ovcharenko I, Loots GG, Hardison RC, Miller W, Stubbs L: zPicture: dynamic alignment and visualization tool for analyzing conservation profiles. Genome Res. 2004, 14 (3): 472-477. 10.1101/gr.2129504.
- Duret L, Gasteiger E, Perriere G: LALNVIEW: a graphical viewer for pairwise sequence alignments. Comput Appl Biosci. 1996, 12 (6): 507-510.
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5): 1792-1797. 10.1093/nar/gkh340.
- Maddison WP, Maddison DR: MacClade: Analysis of phylogeny and character evolution. Version 3. 1992, Sunderland, Massachusetts: Sinauer Associates
- Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R: DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 2003, 19 (18): 2496-2497. 10.1093/bioinformatics/btg359.
- Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007, 24 (8): 1596-1599. 10.1093/molbev/msm092.
- Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.
- Posada D, Crandall KA: MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998, 14 (9): 817-818. 10.1093/bioinformatics/14.9.817.
- Takezaki N, Rzhetsky A, Nei M: Phylogenetic test of the molecular clock and linearized trees. Mol Biol Evol. 1995, 12 (5): 823-833.
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24 (8): 1586-1591. 10.1093/molbev/msm088.
- Yang Z: Inference of selection from multiple species alignments. Curr Opin Genet Dev. 2002, 12 (6): 688-694. 10.1016/S0959-437X(02)00348-9.
- Yang Z, Nielsen R, Goldman N, Pedersen AM: Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000, 155 (1): 431-449.
- Yang Z, Wong WS, Nielsen R: Bayes empirical bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005, 22 (4): 1107-1118. 10.1093/molbev/msi097.
- Zhang J, Nielsen R, Yang Z: Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005, 22 (12): 2472-2479. 10.1093/molbev/msi237.
- Anisimova M, Yang Z: Multiple hypothesis testing to detect lineages under positive selection that affects only a few sites. Mol Biol Evol. 2007, 24 (5): 1219-1228. 10.1093/molbev/msm042.
- Miller RGJ: Simultaneous statistical inference. 1981, New York: Springer-Verlag
- Whelan S, Goldman N: Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics. Mol Biol Evol. 1999, 16 (9): 1292-1299.
- Woolley S, Johnson J, Smith MJ, Crandall KA, McClellan DA: TreeSAAP: selection on amino acid properties using phylogenetic trees. Bioinformatics. 2003, 19 (5): 671-672. 10.1093/bioinformatics/btg043.
- Fares MA, McNally D: CAPS: coevolution analysis using protein sequences. Bioinformatics. 2006, 22 (22): 2821-2822. 10.1093/bioinformatics/btl493.
- Rost B, Yachdav G, Liu J: The PredictProtein server. Nucleic Acids Res. 2004, 32 (Web Server): W321-326. 10.1093/nar/gkh377.
- Berman HM, Bhat TN, Bourne PE, Feng Z, Gilliland G, Weissig H, Westbrook J: The Protein Data Bank and the challenge of structural genomics. Nat Struct Biol. 2000, 7 (Suppl): 957-959. 10.1038/80734.
- Guex N, Peitsch MC: SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 1997, 18 (15): 2714-2723. 10.1002/elps.1150181505.
- Ramos-Onsins S, Segarra C, Rozas J, Aguade M: Molecular and chromosomal phylogeny in the obscura group of Drosophila inferred from sequences of the rp49 gene region. Mol Phylogenet Evol. 1998, 9 (1): 33-41. 10.1006/mpev.1997.0438.
- Tamura K, Subramanian S, Kumar S: Temporal patterns of fruit fly (Drosophila) evolution revealed by mutation clocks. Mol Biol Evol. 2004, 21 (1): 36-44. 10.1093/molbev/msg236.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.