Diversifying selection and host adaptation in two endosymbiont genomes
© Brownlie et al. 2007
Received: 21 November 2006
Accepted: 30 April 2007
Published: 30 April 2007
The endosymbiont Wolbachia pipientis infects a broad range of arthropod and filarial nematode hosts. These diverse associations form an attractive model for understanding host:symbiont coevolution. Wolbachia's ubiquity and ability to dramatically alter host reproductive biology also form the foundation of research strategies aimed at controlling insect pests and vector-borne disease. The Wolbachia strains that infect nematodes are phylogenetically distinct, strictly vertically transmitted, and required by their hosts for growth and reproduction. Insects, in contrast, form more fluid associations with Wolbachia. In these taxa, host populations are most often polymorphic for infection, horizontal transmission occurs between distantly related hosts, and direct fitness effects on hosts are mild. Despite extensive interest in the Wolbachia system for many years, relatively little is known about the molecular mechanisms that mediate its varied interactions with different hosts. We have compared the genomes of the Wolbachia that infect Drosophila melanogaster (w Mel) and the nematode Brugia malayi (w Bm) to that of an outgroup, Anaplasma marginale, to identify genes that have experienced diversifying selection in the Wolbachia lineages. The goal of the study was to identify likely molecular mechanisms of the symbiosis and to understand the nature of the diverse association across different hosts.
The prevalence of selection was far greater in w Mel than w Bm. Genes contributing to DNA metabolism, cofactor biosynthesis, and secretion were positively selected in both lineages. In w Mel there was a greater emphasis on DNA repair, cell division, protein stability, and cell envelope synthesis.
Secretion pathways and outer surface protein encoding genes are highly affected by selection, in keeping with host:parasite theory. If evidence of selection on various cofactor molecules reflects possible provisioning, then both insect and nematode Wolbachia may be providing substances to hosts. Selection on cell envelope synthesis, DNA replication and repair machinery, heat shock, and two-component switching suggests strategies insect Wolbachia may employ to cope with diverse host and intra-host environments.
Intracellular bacterial symbiont associations are extremely common in invertebrates. The capacity for these symbionts to shape host biology is immense and includes documented effects on host reproduction, food preference, locomotion, and interspecific competition. Teasing apart the contributions of insect and symbiont genomes to such multi-organism determined phenotypes is necessary if the evolution and ecology of both partners are to be understood. This can be challenging, because the complex biotic interaction also makes these systems less tractable experimentally. Comparative study of sequenced symbiont genomes and their relatives is offering new means to direct empirical study of symbiosis.
The endosymbiont Wolbachia pipientis infects a wide range of arthropod and filarial nematode hosts. Across its host range the microbe is associated with diverse phenotypic outcomes. The Wolbachia-nematode associations are mutualistic, while all other associations could be described as commensal or parasitic in nature. In nematodes the infection is confined to the reproductive tract and the hypodermal tissue, where the microbe plays an integral role in host viability and reproduction [6, 7]. Phylogenies of Wolbachia and their nematode hosts are congruent, reflecting a long history of strict vertical transmission. Tight associations like these are predicted to generate genome reduction, as host support of symbiont requirements leads to degradation and loss of the genes in these redundant pathways. Consistent with this prediction, the genome of the Wolbachia strain that infects Brugia malayi (w Bm) is much smaller and highly streamlined relative to the genomes of free-living bacteria and other Wolbachia [10, 11].
The Wolbachia-arthropod association, in contrast, is more fluid in nature. Infections are not fixed in populations and most appear to be mild in their effects on host fitness [12, 13]. Horizontal transmission among host lineages is common on a phylogenetic time scale, meaning closely related Wolbachia can be found in taxonomically diverse hosts. Infections can be found in numerous somatic tissues as well as the gonads. The presence of Wolbachia in insect hemolymph, in combination with recent experimental work, also suggests that the bacteria may be exposed to extracellular environments for sustained periods. Across the arthropods Wolbachia also induces a broad range of reproductive manipulations including feminization, male killing, cytoplasmic incompatibility, and parthenogenesis [1, 17, 18]. The patterns of Wolbachia tissue distribution, infection density, induced fitness effects, and reproductive manipulation vary greatly within the arthropods and are the result of host and bacterial genotype interactions [19–22].
Here we report the results of genome wide screens for the presence of diversifying selection in the Wolbachia that infect the filarial nematode Brugia malayi (w Bm) and the insect Drosophila melanogaster (w Mel). Per gene estimates of nonsynonymous substitutions per nonsynonymous site versus synonymous substitutions per synonymous site (d N /d S ) in the Wolbachia relative to the outgroup species, Anaplasma marginale, were used to infer past history of positive selection. This approach has been utilized previously to explore the genetic basis of complex phenotypes in a diverse range of taxa [24–29]. By identifying key molecular adaptations in each of the two Wolbachia lineages, we sought to shed light on the mechanistic basis of the Wolbachia symbiosis and how it might vary with respect to different hosts. We hypothesized that genes whose encoded proteins were involved with secretion or were localized to the Wolbachia cell surface would show evidence of strong selection due to their interaction with the host. We also expected to find evidence of selection on pathways that could be used for host provisioning in w Bm. The screen confirmed both these hypotheses. The genomic comparisons also revealed possible points of host provisioning in w Mel and strategies Wolbachia may have evolved for coping with diverse hosts and intra-host environments.
An examination of the d N /d S ratios also highlighted those genes experiencing extreme levels of purifying selection in either of the Wolbachia lineages. A total of 323 genes in the w Bm lineage and 250 genes in w Mel had a ratio < 1.0. We then examined only the most severely affected genes (d N /d S < 0.2) in each lineage and asked whether genes from any of the functional categories were over-represented. Most of the major functional classes were represented by only a small number of genes. The exceptions were the categories of synthesis and modification of ribosomal proteins in both genomes and the biosynthesis and degradation of cell envelope in w Bm only. The former represented ~15% of genes with d N /d S < 0.2 and the latter 8% of the genes for w Bm. The extreme conservation in ribosomal protein evolution is not surprising given their essential and conserved cellular functions across all kingdoms of life. Purifying selection on cell envelope component genes in w Bm is interesting given that these same genes are experiencing diversifying selection in w Mel (see Additional file 1). The Wolbachia cell envelope may be exposed to vastly different environments in the insect versus nematode hosts. Differences in how selection is operating on the genes encoding membrane proteins may reflect adaptation to lineage specific ecological niches (see Direct contact with the host below).
Both Wolbachia genomes lack complete pathways for de novo synthesis of coenzyme A, NAD, biotin, lipoate, ubiquinone, and folate; presumably the host supplements these compounds [10, 11]. Several genes that encode components of these disrupted biosynthetic pathways show evidence of positive selection in w Mel, which may reflect the molecular evolutionary process of integrating host and symbiont systems (Fig. 2; see Additional file 1). Selection on genes in these same pathways was also detected in w Bm, but under less stringent rejection criteria (see Additional file 1). Unlike the cofactors listed above, riboflavin biosynthesis pathways are complete in both Wolbachia strains. Evidence for positive selection on riboflavin synthesis was present in w Mel (Fig. 2, Model p < 0.001, & Fisher's p < 0.001) and again in w Bm under slightly less stringent criteria (see Additional file 1). Symbiont provisioning of riboflavin has been documented in both weevil-SOPE and aphid-Buchnera associations [36, 38]. Two members of the heme biosynthetic pathway (of seven genes in total) were affected by selection in w Mel. Additional genes in the heme biosynthesis pathway were also identified in both w Bm and w Mel when less stringent rejection criteria were applied (see Additional file 1). An examination of the Brugia malayi genome suggests that the nematode may be incapable of synthesizing its own heme; it is therefore possible that w Bm may be provisioning its host with heme intermediates. Although insect hosts are not dependent on Wolbachia for heme biosynthesis, the microbe may supplement host stores or play an additional role in iron homeostasis.
In addition to the provision of metabolic cofactors, invertebrate hosts may also benefit from an additional source of nucleotides provided by Wolbachia. Multiple genes in this functional category (seven in w Bm and five in w Mel, Fig. 1) were affected by positive selection (see Additional file 1). Other endosymbionts, including the parasitic Rickettsia and the beneficial Buchnera, scavenge nucleotides from the host environment via ATP/ADP translocases. Wolbachia, however, encodes complete purine and pyrimidine biosynthetic pathways and lacks the nucleotide translocase found in the closely related Rickettsia [10, 11]. The provision of nucleotides by w Bm and w Mel could benefit their hosts during periods of rapid DNA replication and cellular division, such as during oogenesis and embryogenesis. Lastly, there is widespread evidence of diversifying selection in both genomes on amino acid biosynthetic pathway genes (Fig. 1; see Additional file 1). Wolbachia lack many genes in the biosynthetic pathways for amino acids, and it is therefore less likely that they are provisioning hosts in this regard [10, 11].
The coordination of symbiont replication with host cell division is required to prevent either loss of the symbiont within the host or over-replication leading to pathology within the host, such as that occurring with w MelPop. The mechanisms underlying this balancing act in Wolbachia-host associations are unknown. Filarial Wolbachia densities increase when the infection passes from the insect vector into the mammalian host [39, 40]. Arthropod Wolbachia are also present at different densities depending on host species, host developmental phase, and tissue distribution [15, 42]. For a number of insect species, Wolbachia has the additional challenge of dealing with host diapause, where the microbe's replication would have to be slowed or stopped temporarily to maintain synchrony with host cell division.
The accumulation of mildly deleterious mutations in symbionts due to repeated bottlenecking during transmission between hosts has been used to predict the irreversible degradation of symbiont genomes via the process of Muller's ratchet. Selection for more effective repair or recombination systems may mitigate the effects of the ratchet upstream in the process. Both Wolbachia genomes appear to contain a functional set of DNA repair enzymes. Two genes in w Bm and five genes in w Mel encoding recombination and/or repair proteins were affected by positive selection. Muller's ratchet could be mitigated by genetic recombination among divergent strains of Wolbachia that infect a single host. However, this is not likely to occur for w Bm, where multiple divergent strains of Wolbachia do not coexist within a single host. Multiple genes involved with aminoacylation of tRNAs were also affected by positive selection (see Additional file 1). These proteins ensure fidelity of translation by providing error correction. The prevalence of selection was roughly equal in w Bm and w Mel (six vs. nine genes, respectively) and could represent another strategy for minimizing effects of other sources of error on protein performance.
In w Mel one of the genes encoding part of the two-component system also exhibited evidence of positive selection (see Additional file 1, signal transduction). The two-component system forms the basis of a small-molecule signaling pathway and is thought to play a role in quorum sensing. In other bacteria these pathways affect exopolysaccharide synthesis, biofilm formation, motility, cell differentiation, and virulence. Genes comprising quorum-sensing systems have previously been shown to be targets of selection. Selection on this pathway in w Mel may indicate a mechanism for rapidly inducing widespread transcriptional changes in response to shifting habitats.
Ankyrin repeat (ANK) domain-containing proteins are common in eukaryotes and viruses and are thought to mediate protein-protein interactions. ANK encoding genes are unusually common in the Wolbachia genome relative to other bacteria. The ANK containing proteins are especially interesting in the Wolbachia system given their possible involvement in determining reproductive phenotypes or host specificity [64, 65]. In Anaplasma phagocytophilum, one of these proteins is secreted into the host cell, where it binds host chromatin and may affect host gene expression. Only one gene encoding an ANK protein exhibited diversifying selection in our screen (Fig. 3). The functional role of this protein in Wolbachia is not known.
There are a number of caveats associated with the interpretation of genome wide screens for selection. The methods employed here should be fairly conservative given the use of per gene measures of dN/dS, which are more likely to detect only dominant features of a gene, the statistical tests of difference between dN and dS, and the use of multiple test correction procedures. We cannot completely exclude issues of saturation and increased fixation of nonsynonymous mutations in populations with small Ne. The results are also highly dependent on the choice of outgroup. As more genome sequences become available, future screens between strains within the Wolbachia genus may provide finer scale comparison among lineages. The trends identified here in terms of biological process, while not proof of adaptation, highlight the most likely points of interaction between hosts and symbionts. These areas may be targeted for empirical study in hopes of better understanding the mechanistic basis of Wolbachia symbiosis.
From this screen we suggest the following hypotheses for further empirical testing. Both w Mel and w Bm may provision hosts with the following compounds or their intermediates: heme, riboflavin, ubiquinone, folate, and nucleotides. Rearing the hosts under restricted diets or more natural field conditions could reveal as yet undescribed Wolbachia associated fitness benefits, particularly in insects. Regulating DNA replication and cell division may not only be a requirement for successful intracellularity, but also the key to adaptation to diverse cellular environments, temperatures, and host ranges in insect hosts. Enhanced DNA repair, improved translation fidelity, and the heat shock response may be adaptive responses to the action of Muller's ratchet in these small bottlenecked populations. The heat shock response in combination with two-component switching may be employed by insect Wolbachia to cope with variable host environments. Communication with the host is fundamental for both Wolbachia, and evidence of diversifying selection is present in multiple secretion pathways. Only insect Wolbachia are experiencing selection on cell envelope synthesis genes. This may reflect a greater exposure to effector molecules of the host immune system.
Anaplasma marginale (St. Maries strain) was selected as the outgroup as it is the closest known relative to Wolbachia. A member of the α-proteobacteria, A. marginale is a pathogen of cattle that is vectored primarily by ticks. Sequences of w Mel, w Bm and A. marginale protein encoding genes (1195, 805 and 949, respectively) were obtained from the RefSeq database. Proteins were considered orthologous if each combination of Blast searches (six in this three-way comparison) identified the same gene as the best scoring match [25, 69]. Ambiguous matches with little sequence similarity and very short alignments were eliminated by accepting only Blast hits with e-values less than or equal to 1 × 10⁻⁶. All known pseudogenes and phage sequences were excluded. The amino acid sequences for the 591 orthologs selected by the above procedure were then aligned with ClustalW ver. 1.83 using default parameters, and the resulting alignments were back-translated into their DNA sequences, preserving patterns of indels from the protein alignments.
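The ortholog criterion described above (the same gene recovered as best match in all six directed searches, with an e-value cutoff) can be sketched roughly as follows. This is an illustrative sketch only, not the pipeline used in the study: the function name `reciprocal_triplets`, the genome labels, and the dictionary layout for best hits are all assumptions introduced here, and parsing of actual Blast output is left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout (an assumption for illustration):
# best_hit[(query_genome, target_genome)][query_gene] -> (best_target_gene, e_value)
BestHits = Dict[Tuple[str, str], Dict[str, Tuple[str, float]]]

E_VALUE_CUTOFF = 1e-6  # cutoff stated in the text


def reciprocal_triplets(
    best_hit: BestHits,
    genomes: Tuple[str, str, str] = ("wMel", "wBm", "Am"),  # labels are assumptions
) -> List[Tuple[str, str, str]]:
    """Return gene triplets that are mutual best hits in all six directions."""
    a, b, c = genomes

    def hit(src: str, dst: str, gene: str) -> Optional[str]:
        # Best hit of `gene` from genome `src` in genome `dst`, or None if
        # there is no hit or the hit fails the e-value cutoff.
        entry = best_hit.get((src, dst), {}).get(gene)
        if entry is None or entry[1] > E_VALUE_CUTOFF:
            return None
        return entry[0]

    triplets = []
    for ga in best_hit.get((a, b), {}):
        gb = hit(a, b, ga)
        gc = hit(a, c, ga)
        if gb is None or gc is None:
            continue
        # The remaining four directions must point back to the same genes.
        if (hit(b, a, gb) == ga and hit(c, a, gc) == ga
                and hit(b, c, gb) == gc and hit(c, b, gc) == gb):
            triplets.append((ga, gb, gc))
    return triplets
```

A gene is dropped as soon as any one of the six directions disagrees or exceeds the cutoff, which mirrors the conservative intent of the criterion in the text.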
The likelihood ratio test of the null hypothesis of constant rates of nonsynonymous substitutions per nonsynonymous site over synonymous substitutions per synonymous site (d N /d S ) among all three lineages was performed on each triplet of genes using codon-based maximum likelihood models. The models were implemented using codeml, a program for codon-based substitution models from the PAML package ver. 3.14. All models were implemented to utilize one d N /d S ratio among all amino acid sites. The likelihood test was performed as a one-sided chi-square test of the null hypothesis H0, assuming one d N /d S ratio among all three lineages, versus alternative hypotheses HA and HB, allowing for two d N /d S ratios: one for w Bm or w Mel, respectively, and a second for the remaining two lineages (branch-specific model).
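As a rough illustration of the test described above, the statistic for one gene can be computed from the two maximised log likelihoods and compared with a chi-square distribution with one degree of freedom. The sketch below assumes the log likelihoods have already been extracted from the codeml output; the function name `lrt_branch_model` and its arguments are hypothetical, and the closed form for the p-value relies on the identity that a chi-square variable with one degree of freedom is a squared standard normal.

```python
import math
from typing import Tuple


def lrt_branch_model(lnl_null: float, lnl_alt: float) -> Tuple[float, float]:
    """Likelihood ratio test of a one-ratio model (H0) against a
    two-ratio branch model (HA or HB), with one degree of freedom.

    lnl_null, lnl_alt: maximised log likelihoods (assumed inputs).
    Returns (statistic, p_value).
    """
    # Twice the log likelihood difference; clamp at zero because small
    # negative values can arise from incomplete optimisation.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of chi-square with 1 d.f.: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

For example, a log likelihood improvement of 2.0 units gives a statistic of 4.0 and a p-value of about 0.046; the resulting per-gene p-values would then still need multiple test correction.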
The log likelihood ratios obtained were tested for significance using the upper critical value of the chi-square distribution for one degree of freedom. The null hypothesis of a constant d N /d S ratio among all three lineages was rejected when twice the difference in log likelihood between the models was greater than 3.84. A Benjamini & Hochberg multiple test correction was employed in combination with a critical rejection value, α = 0.001. As random numbers are used to start the maximum likelihood iterations, we repeated the above analysis five times to check for convergence of the models. The average value and standard deviation of the focal lineages' d N /d S ratios were used to assess model convergence. The supplemental tables report mean d N and d S values across the five replicate analyses. A number of genes with very small mean d S produced artificially inflated ratios at the reportable limit of codeml (999). In these cases the ratios themselves are not particularly informative (Fig. 2, d N /d S = 999, unreportable). We therefore used Fisher's exact test (p ≤ 0.001 & Benjamini & Hochberg multiple test correction) for all loci to identify genes where d N was significantly different from (and larger than) d S . All genes of interest reported here have therefore met both the significance criteria under the appropriate model of selection and possess a mean d N that is significantly different from, and greater than, the mean d S .
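The Benjamini & Hochberg correction mentioned above can be sketched as a standard step-up procedure over the per-gene p-values. This is a generic textbook formulation written for illustration, not the authors' code; the function name `benjamini_hochberg` is an assumption, and the default `alpha` simply reuses the α = 0.001 stated in the text.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.001) -> List[bool]:
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the input order, marking which null
    hypotheses are rejected at false discovery rate `alpha`.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below that largest rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

Because the procedure is step-up, a p-value that misses its own rank threshold is still rejected when a larger p-value further down the sorted list meets its threshold.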
The assumptions of codeml include similarity of base composition and codon usage patterns as well as calculable genetic distances across the sequences being compared. The w Bm and w Mel genomes have very similar base compositions, 34.1% and 35.2% GC, respectively. Anaplasma marginale is 48.9% GC. A comparison of codon usage patterns between the three genomes by paired t-tests revealed no statistical differences (data not shown). Mean dN and dS values were 0.056 ± 0.002 and 0.14 ± 0.008 for w Bm/HA and 0.049 ± 0.001 and 0.10 ± 0.007 for w Mel/HB, respectively. Genetic distances are large enough that d N /d S should not suffer from a time lag. Alternatively, genes experiencing a high degree of divergence, and more specifically saturation, could lead to overestimates of d N /d S . Anisimova et al. modeled the effects of various parameters, including divergence, on both power and accuracy of the likelihood ratio test. Our datasets (three taxa, mean gene length in codons ≈ 343, transition/transversion ratio ≈ 4.0, and median d N /d S ≈ 0.3 for both HA and HB) are most similar to the reported results of experiment C. These simulations identified no type I error at α = 0.01. This study relies on a more stringent α, and inspection of the data indicates that most significant genes possess high d N values relative to d S (see Additional file 1) and are therefore not likely to be artifacts of saturation.
The authors wish to thank Gavin Huttley, Cynthia Riginos, and five anonymous reviewers for advice regarding data analysis and Scott O'Neill for his ongoing collaboration. The research was supported by The Australian Research Council through Discovery Project Grant DP557987 to CIs Brownlie and McGraw.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.