Assuming that bdelloid rotifers are tetraploid and inherited alpha tubulin copies from a diploid common ancestor with monogononts, we expected bdelloids to have twice as many copies of alpha tubulin as diploid monogononts, or fewer than this if monogononts have copies with sex-specific functions that were lost in bdelloids. Instead, we found that alpha tubulin has undergone proliferation in bdelloid rotifers relative to the one or two copies found in monogonont rotifers, and beyond the numbers expected solely on the basis of degenerate tetraploidy. Each species contains between 11 and 13 copies that can be divided into five main classes supported both by exon phylogeny and intron structure. Roughly half the copies belong to a single class (class 1) and have amino acid sequences conserved both within the class and relative to copies present in monogonont rotifers. Within class 1, copies fall into four distinct groups, which we call copy types, that differ from one another by more than 10% in nucleotide sequence in all four species. The remaining alpha tubulin copies belong to four classes that have diverged significantly in intron structure, amino acid sequence and, in the case of classes 3 to 5, in predicted biochemical properties. The four copy types within class 1 and the four divergent classes 2 to 5 together make eight divergent copy types per species. Because all five classes (including multiple copy types in class 1) are present in species from both sampled families of bdelloids, we propose that bdelloids inherited these eight divergent copy types from their common ancestor, followed by the loss of some copy types in some species.
The repeated observation of pairs of similar sequences (<1.5% divergent) of a given copy type in the same species is consistent with pairs of copies being maintained as templates for DNA repair through gene conversion, as proposed to occur between collinear pairs in other bdelloid genome regions. An alternative explanation would be recent duplication of copies within species, but this would be unlikely to generate a widespread pattern of recent divergence between similar copies across classes, unless a similar event had happened independently in all four species. Cases in which only a single copy of a divergent copy type is present could be due to very recent conversion generating identical copies, or to the loss of some duplicate copies. Only in one case were more than two similar copies present within the same species (class 5 in A. vaga), and two of these sequences were just outside our error threshold, so they may reflect sequencing error.
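The two divergence thresholds used above (pairs below 1.5%, copy types above 10%) can be illustrated with a minimal Python sketch. It assumes an uncorrected p-distance on pre-aligned nucleotide sequences; the function names are hypothetical, and the distance measure actually used in the analysis may differ.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences.

    Alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def classify_pair(seq_a, seq_b, pair_threshold=0.015, type_threshold=0.10):
    """Label a pair of copies using the thresholds quoted in the text.

    The thresholds are taken from the text; the 'intermediate' label for
    values between them is an assumption made for this illustration.
    """
    d = p_distance(seq_a, seq_b)
    if d < pair_threshold:
        return "similar pair"
    if d > type_threshold:
        return "different copy types"
    return "intermediate"
```

Under these assumptions, two copies differing at 1 site in 200 would be labelled a similar pair, whereas copies differing at 50 sites in 200 would be assigned to different copy types.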
Given the presence of paired copies within species’ copy types, we propose that bdelloids inherited sixteen copies of alpha tubulin from their common ancestor: eight divergent copy types, each present as a pair of similar sequences. Some copies were later lost or homogenized in particular lineages, leading to the observed numbers of copies in the sampled species. Alternative scenarios invoking duplication events since the species diverged cannot readily explain why all four species share these patterns, unless copies originating independently in separate species were either transferred horizontally to other species (which seems extremely unlikely, even with bdelloids’ known ability to take up foreign DNA) or had converged in both amino acid sequence and intron structure, which is also unlikely.
Our results provide robust evidence for numbers of copies and likely inheritance from their common ancestor, yet reconstructing the ancestry of alpha tubulin copies using phylogenetics proved challenging for two reasons. First, we detected major differences in base composition among sequences, which we attempted to deal with using models of composition heterogeneity in p4. Second, functional divergence between different classes has led to extreme rate heterogeneity and departures from the assumptions of neutral models of sequence evolution. For example, the relationships of class 1 copies relative to classes 2 to 5 were weakly supported and unstable among alternative reconstructions. The preferred tree using five base composition vectors had classes 4 and 5 as sister to one A. vaga class 1 copy type, which cannot reflect the true ancestry of the genes for reasons argued above (namely that horizontal transfer from one species followed by diversification across the other species seems highly unlikely). The amino acid tree does have class 1 copies as monophyletic and classes 2 to 5 copies as sister clade to them, which is consistent with inheritance of multiple copies from a common ancestor. However, the amino acid tree lacks resolution to infer ancestry among class 1 copies. Together, these problems prevented the use of formal genealogical methods for reconstructing patterns of gene duplication and loss (such as Notung 2.6). Also, the homogenizing effect of gene conversion is not taken into account in current approaches, and therefore the existence of recent pairs of copies within species would have been inferred as multiple duplications occurring in each species, rather than ancestrally inherited copies maintained by gene conversion (consistent with existing knowledge of bdelloid genomes). Our dataset presents a useful test case for developers of more complex models of sequence evolution.
Most monogonont species had one copy of alpha tubulin, which likely reflects a pair of identical alleles in these sexually reproducing diploids. This low copy number eliminates the possibility of sex-specific roles for alpha tubulin in most monogononts, or that multiple copies in bdelloids arose from formerly sex-specific copies in their sexual ancestor. Assuming that the background level of ploidy in bdelloids conforms to the tetraploidy demonstrated for two separate genome regions and in both bdelloid families we studied here, then if the ancestor contained four copies of alpha tubulin (one pair of alleles in the sexually reproducing ancestor, duplicated by either genome duplication or hybridization at the origin of tetraploidy), a further two rounds of duplication of all genes are required to explain our findings. One possible mechanism would be tandem duplication, possibly facilitated by DNA breakage and repair during desiccation. Tandem duplication has been observed for alpha tubulin in species ranging from Trypanosoma brucei to Zea mays. If the ancestral sexual progenitor of bdelloids had two distinct copies of alpha tubulin, as found in the B. plicatilis transcriptome, then only one round of additional duplication events would be needed to explain our findings.
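The copy-number arithmetic behind these two scenarios can be written out as a short Python sketch. It assumes, for illustration only, that each round of duplication exactly doubles every copy and that no copies are lost; the function and parameter names are hypothetical.

```python
def expected_copies(ancestral_loci, duplication_rounds,
                    alleles_per_locus=2, tetraploidy=True):
    """Expected number of alpha tubulin copies under idealized doubling.

    ancestral_loci: distinct genes in the diploid sexual ancestor.
    duplication_rounds: further whole-set duplications after tetraploidy.
    Assumes every round doubles all copies and that no copy is lost.
    """
    copies = ancestral_loci * alleles_per_locus
    if tetraploidy:
        copies *= 2  # genome duplication or hybridization doubles the set
    return copies * 2 ** duplication_rounds


# One ancestral gene: 2 alleles -> 4 after tetraploidy -> 16 after two rounds.
one_gene_scenario = expected_copies(ancestral_loci=1, duplication_rounds=2)

# Two ancestral genes: 4 alleles -> 8 after tetraploidy -> 16 after one round.
two_gene_scenario = expected_copies(ancestral_loci=2, duplication_rounds=1)
```

Both scenarios arrive at the sixteen ancestral copies proposed above; the observed 11 to 13 copies per species would then reflect subsequent loss or homogenization.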
Why might bdelloids have so many copies of alpha tubulin? First, multiple identical copies might provide redundancy and increase the rate of production of proteins. Gene duplicates have been shown to provide robustness against null mutations in Saccharomyces cerevisiae. Second, different copies might be expressed in different tissues or at different developmental stages. For example, Bombyx mori and Drosophila melanogaster alpha tubulin isotypes fall into classes that are ubiquitously expressed throughout the body at all life stages, classes that are expressed only at certain stages of the life-cycle, and classes expressed only in certain tissues, such as the testes or ovaries. Third, different copies might have specialist functions in different cellular processes or locations. Modifications in amino acid sequence can result in changes in the stability and kinetics of microtubule assemblies and in the binding of different microtubule associated proteins. Fourth, different copies might provide robustness against environmental variation, by maintaining microtubule function under extreme environmental conditions. For example, nine copies of alpha tubulin expressed in brain tissue of the Antarctic fish Notothenia coriiceps vary in their stability and polymerization rates at extreme temperatures.
Which of these mechanisms might explain alpha tubulin proliferation in bdelloids? Class 1 copies have conserved protein structure and predicted biochemical properties shared with alpha tubulin copies in monogononts. Yet each bdelloid species has four divergent types of this class of copies, plus additional pairs of similar copies. Interestingly, only two of the six class 1 copies were recovered in the transcriptome of A. ricciae, which was pooled from several laboratory cultures in both hydrated and desiccated conditions. The missing copies might be expressed in small amounts, or only in life-stages (e.g. eggs) or particular environments (e.g. rehydrating adults) that were not sampled in the experiments. Because no biochemical differences were detected among class 1 copies, these copies might be truly functionally redundant. Functional redundancy of housekeeping genes might provide extra protection in the face of DNA breakage by increasing the probability that at least one copy still functions after desiccation. Alternatively, multiple copies might provide the machinery to enhance rapid production of protein during rehydration. In Arabidopsis thaliana, alpha tubulin is up-regulated upon rehydration after a period of dehydration. Tubulin has also been shown to form aggregates and dense networks in Brassica napus seedlings during dehydration. It is therefore possible that having multiple functionally redundant tubulin copies might enable rapid up-regulation of tubulin expression at important times.
In contrast, the remaining classes are divergent in amino acid sequence, and those present in A. ricciae were all recovered in the transcriptome and hence had been expressed under laboratory conditions. There was significant evidence for excess amino acid divergence between classes relative to the strong purifying selection within them. This is consistent with a standard model of functional specialization after gene duplication in which neo- or sub-functionalized copies evolve, perhaps following a period of relaxed selection, and are then subject to purifying selection. None of the amino acid positions identified as significantly divergent between classes are within regions identified a priori as functionally important for alpha tubulin, and very few are in structural regions, indicating that selection has not acted to alter fundamentally the functioning of the copies as microtubule proteins. These positions did, however, have significantly higher MAPP scores than random selections of other polymorphic codons, which indicates that changes at these positions are predicted to have larger consequences for protein function than changes at randomly selected amino acid positions. This finding adds further evidence that divergence at these sites is associated with functional divergence between classes. Class 2 alpha tubulin copies differed in amino acid sequence but have the same predicted chemistry as class 1 for the parameters examined here. However, this does not preclude a different function, as small amino acid changes can alter the function of tubulin. Classes 3 to 5 have diverged significantly both in amino acid sequence and in predicted protein chemistry, and are therefore likely to play different roles.
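The comparison of MAPP scores at divergent positions against random selections of other polymorphic codons can be sketched as a simple resampling test. The Python below is a hedged illustration, not the procedure used in the study: the function name, the use of the mean as the test statistic, and the one-sided p-value formula are all assumptions.

```python
import random
from statistics import mean


def resampling_p_value(focal_scores, background_scores, n_draws=9999, seed=0):
    """One-sided resampling test for an elevated mean score.

    focal_scores: scores at the positions of interest (hypothetical input).
    background_scores: scores at the other polymorphic positions.
    Returns the proportion of random same-sized draws from the background
    whose mean is at least as large as the focal mean, with the usual +1
    correction so that the p-value is never exactly zero.
    """
    k = len(focal_scores)
    if k == 0 or k > len(background_scores):
        raise ValueError("need 1 <= len(focal_scores) <= len(background_scores)")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(focal_scores)
    as_extreme = sum(
        1
        for _ in range(n_draws)
        if mean(rng.sample(background_scores, k)) >= observed
    )
    return (as_extreme + 1) / (n_draws + 1)
```

A small p-value from such a test would indicate that the focal positions score higher than expected for a random set of polymorphic positions of the same size.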
Increased hydrophobicity of the domain that interacts with beta tubulin is predicted to increase microtubule stability. Although there was no evidence of excess amino acid divergence in this region, contact surface hydrophobicity did differ significantly between classes: class 3 copies are predicted to form the most stable microtubule arrays and class 5 copies the least stable. Amino acid variation can also be involved in sub-cellular localization and cellular function. Variation in isoelectric point has been noted in tubulin copies with different sub-cellular distributions, so the significantly higher isoelectric point in class 5 relative to the others might point to variation in sub-cellular localization or function between classes. Future experiments are needed to test these alternative hypotheses for the function of divergent copies.
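As an illustration of how the hydrophobicity of a contact region might be summarized, the following Python sketch computes the mean hydropathy of a peptide on the Kyte-Doolittle scale. The choice of scale is an assumption made here for illustration (the hydrophobicity measure used in the analysis is not specified in this section), and the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values; higher values are more hydrophobic.
# Using this particular scale is an assumption for illustration.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average Kyte-Doolittle hydropathy over the residues of a peptide."""
    residues = peptide.upper()
    if not residues:
        raise ValueError("peptide is empty")
    try:
        total = sum(KYTE_DOOLITTLE[r] for r in residues)
    except KeyError as exc:
        raise ValueError(f"unknown residue: {exc.args[0]}") from None
    return total / len(residues)
```

Applied to the residues forming the contact surface in each class, a higher mean under this scale would correspond to a more hydrophobic interface.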
Another striking difference between classes was in intron structure. In contrast to the lack of introns in monogonont alpha tubulin, many introns are present and their number and location vary markedly between classes. This finding matches broader evidence that intron loss and gain occur at a significantly faster rate between paralogs than between orthologs [65, 66]. Genetic manipulation experiments in rice and yeast have shown that introns influence the timing and cellular location of expression of alpha tubulin [67, 68]. Such experiments are not feasible at present in bdelloid rotifers, for which no method of genetic modification is currently available. However, the significant pattern of divergence of intron structure between classes, and conservation within classes, adds further evidence for functional divergence among classes.