Evidence for the adaptive significance of an LTR retrotransposon sequence in a Drosophila heterochromatic gene

Background The potential adaptive significance of transposable elements (TEs) to the host genomes in which they reside is a topic that has been hotly debated by molecular evolutionists for more than two decades. Recent genomic analyses have demonstrated that TE fragments are associated with functional genes in plants and animals. These findings suggest that TEs may contribute significantly to gene evolution. Results We have analyzed two transposable elements associated with genes in the sequenced Drosophila melanogaster y; cn bw sp strain. A fragment of the Antonia long terminal repeat (LTR) retrotransposon is present in the intron of Chitinase 3 (Cht3), a gene located within the constitutive heterochromatin of chromosome 2L. Within the euchromatin of chromosome 2R, a full-length Burdock LTR retrotransposon is located immediately 3' to cathD, a gene encoding cathepsin D. We tested for the presence of these two TE/gene associations in strains representing 12 geographically diverse populations of D. melanogaster. While the cathD insertion variant was detected only in the sequenced y; cn bw sp strain, the insertion variant present in the heterochromatic Cht3 gene was found to be fixed throughout the 12 D. melanogaster populations and in a D. mauritiana strain, suggesting that it may be of adaptive significance. To further test this hypothesis, we sequenced a 685 bp region spanning the LTR fragment in the intron of Cht3 in strains representative of the two sibling species D. melanogaster and D. mauritiana (~2.7 million years divergent). The level of sequence divergence between the two species within this region was significantly lower than expected from the neutral substitution rate and lower than the divergence observed in a randomly selected intron of the Drosophila Alcohol dehydrogenase gene (Adh).
Conclusions Our results suggest that a 359 bp fragment of an Antonia retrotransposon (complete LTR is 659 bp) located within the intron of the Drosophila melanogaster Cht3 gene is of adaptive evolutionary significance. Our results are consistent with previous suggestions that the presence of TEs in constitutive heterochromatin may be of significance to the expression of heterochromatic genes.


Background
The potential adaptive significance of transposable elements (TEs) to the host genomes in which they reside is a topic that has been hotly debated by molecular evolutionists for more than two decades. While the biological importance of TEs seemed self-evident to those scientists involved in their initial discovery [e.g., [1,2]], the subsequent realization that TEs could be maintained in populations even while imparting a slight selective disadvantage to their hosts [e.g., [3][4][5]] called into question the presumption of adaptive significance. However, even if TEs can be maintained in populations without providing a selective advantage, this does not preclude the possibility that the insertion of TEs in or near genes may, in some instances, be adaptive.
If TE insertion variants have contributed to adaptive gene evolution, such variants might be expected to be at high frequency or fixed in populations and species. Initial surveys of natural populations of Drosophila melanogaster, which showed that TE insertion alleles occur at uniformly low frequency, seemed to negate the adaptive hypothesis [6]. However, the sporadic discovery of degenerate TEs or TE fragments as critical components of functional genes in both plants and animals was sufficient to keep the adaptive hypothesis alive throughout the pre-genomic era [7][8][9][10][11].
The current availability of the complete or nearly complete sequence of select genomes representing a variety of species is providing an unprecedented opportunity to examine the frequency and distribution of TEs in eukaryotic genomes. The results have been dramatic. TEs not only comprise a significant fraction of nearly all eukaryotic genomes thus far sequenced, but they have also been found to be components of the regulatory and/or coding regions of a surprisingly large number of genes [e.g., [12]]. For example, a recent genomic analysis of 13,799 human genes revealed that approximately 4% harbored retrotransposon sequences within protein-coding regions [13]. Similar results have recently been reported for the nematode Caenorhabditis elegans [14]. Here we analyze the polymorphism of two LTR retrotransposon/host gene associations across geographically widespread D. melanogaster populations and a representative population of the D. melanogaster sibling species, Drosophila mauritiana.

Results
We have initiated a genomic analysis of LTR retrotransposons present in the Drosophila melanogaster genome [e.g., [15]]. Of particular interest is identifying genes that harbor TEs and determining whether these insertion alleles are at high frequency or fixed among natural populations, as would be expected from the adaptive hypothesis. We report here the results of an analysis of two LTR retrotransposon-containing genes located on the second chromosome of the sequenced D. melanogaster y; cn bw sp strain. These two genes present an interesting contrast in that one of them, Chitinase 3 (Cht3), is located within constitutive heterochromatin (GenBank accession: AE002743) while the other, cathD, is located in a euchromatic region of the chromosome (GenBank accession: AE003839). Our findings demonstrate that while the euchromatic cathD insertion variant was not detected in any of the natural populations examined, the insertion variant present in the heterochromatic Cht3 gene was found to be apparently fixed throughout the species. These results are consistent with the view that the presence of TEs in constitutive heterochromatin may have relevance to the expression of heterochromatic genes [e.g., [16,17]].
Genomic analysis of the sequenced y; cn bw sp strain of Drosophila melanogaster identified a full-length Burdock LTR retrotransposon located just 3' to the cathD gene and a 359 bp LTR fragment (complete LTR is 659 bp) of an Antonia LTR retrotransposon [15] located within an intron of the Cht3 gene (Figure 1). A set of PCR primers was designed to amplify regions of both genes and retrotransposon sequences. Appropriate pairs of gene and element primers were used to detect the presence or absence of the respective retrotransposon inserts associated with each gene in strains representing 12 geographically dispersed populations of D. melanogaster. The results presented in Figure 2 and Table 1 demonstrate that while the Burdock insertion located just 3' to the cathD gene is not present in any of the natural populations examined, the Antonia LTR fragment within the Cht3 intron is present in all of them.
It is formally possible that the presence of the Antonia LTR within the Cht3 intron was the result of a chance fixation event prior to the expansion of D. melanogaster around the world. Thus, to further test the adaptive hypothesis, we compared the level of sequence divergence within the LTR and its flanking intronic sequence between the two sibling species Drosophila melanogaster and Drosophila mauritiana.
If the LTR-containing intron is under stabilizing selection, a lower than neutral rate of substitution would be expected. A total of 685 bp of the Cht3 intron was sequenced. This region spans 264 bp of the 359 bp Antonia LTR fragment. The sequence of this region in a D. melanogaster strain (Dimonika, Africa) and a D. mauritiana strain (Mauritius, Africa) was aligned with the homologous region in the sequenced D. melanogaster y; cn bw sp strain (Figure 3). The two D. melanogaster sequences were 100% identical. The D. melanogaster sequences were found to be only 1.3% (9 substitutions/685 nucleotide sites) diverged from that of D. mauritiana. This value is significantly less than half of the expected 4.3% (± 2.7) divergence based on the Drosophila neutral substitution rate of 0.016 (± 0.005) substitutions/site/million years [18] over the estimated 2.7 million years separating the two species [19].
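The comparison above reduces to simple arithmetic, sketched here in Python. The variable names are illustrative and not part of the original analysis. The sketch uses only the central values quoted in the text and does not attempt to reproduce the quoted ± range.

```python
# Central values quoted in the text; variable names are illustrative only.
NEUTRAL_RATE = 0.016       # substitutions per site per million years [18]
DIVERGENCE_TIME_MY = 2.7   # million years separating the two species [19]
SUBSTITUTIONS = 9          # differences observed in the Cht3 intron region
SITES = 685                # nucleotide sites compared

# Expected divergence is the neutral rate multiplied by the divergence time.
expected_pct = NEUTRAL_RATE * DIVERGENCE_TIME_MY * 100

# Observed divergence is the fraction of sites that differ.
observed_pct = SUBSTITUTIONS / SITES * 100

print(f"expected divergence: {expected_pct:.1f}%")
print(f"observed divergence: {observed_pct:.1f}%")
print("observed is below half of expected:", observed_pct < expected_pct / 2)
```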
To directly compare the substitution rate for the Cht3 intron with that of another Drosophila gene intron, we randomly selected intron 1 of the Drosophila alcohol dehydrogenase (Adh) gene. Adh is a widely studied Drosophila gene and it has been sequenced in several Drosophila species including D. melanogaster (accession X60793 [20]) and D. mauritiana (accession M19264 [21]). The sequence divergence between D. melanogaster and D. mauritiana in Adh intron 1 (7.9%, Figure 4) is higher than that for the LTR-containing Cht3 intron (1.3%). These results strongly suggest that conservative selection has been operating on the LTR-containing intron associated with the Drosophila Cht3 gene over the past 2.7 million years.
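A percent divergence of this kind can be computed from any pair of aligned sequences. The following is a minimal Python sketch, not the procedure used in this study: the function name `pairwise_divergence` and the toy sequences are hypothetical, and the sketch assumes that gapped alignment columns are skipped, whereas the tally described under Sequencing also counted indel sites.

```python
def pairwise_divergence(seq_a: str, seq_b: str) -> float:
    """Return the fraction of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are skipped, not scored
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


# Toy sequences for illustration only; these are not the Cht3 or Adh data.
toy_a = "ACGTACGTAC"
toy_b = "ACGTTCGTAC"
print(f"divergence: {pairwise_divergence(toy_a, toy_b):.1%}")
```

Applied to the real alignments, the same calculation would be run once on the Cht3 intron region and once on Adh intron 1, and the two fractions compared.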

Discussion
For many years, constitutive heterochromatin was considered to be of little or no functional significance [22]. This view seemed to be supported by early molecular studies showing that heterochromatin consists almost exclusively of highly repeated and middle repetitive DNA [e.g., [23,24]]. The middle repetitive fraction was viewed as the descendants of once-active TEs that had the misfortune of inserting into transcriptionally inert heterochromatin at some point in their evolutionary history [e.g., [6,20]]. The view of heterochromatin as a genetic wasteland gradually changed with the mapping of a number of functionally important Drosophila genes to constitutive heterochromatin [e.g., [24][25][26][27][28][29][30][31]]. Reexamination of Drosophila constitutive heterochromatin revealed that long stretches of highly repetitive DNA are interrupted by "islands" of retrotransposon sequences [e.g., [32,33]]. Drosophila genes in heterochromatin are typically associated with these islands of retrotransposons [2,31,[34][35][36]]. It has been suggested that transposable elements inserted into heterochromatin may locally alter chromatin structure [e.g., [16]]. Our results suggest that in at least some instances, the association of heterochromatic genes with transposable element sequences may be of adaptive significance.

Conclusions
The results presented here are consistent with the hypothesis that a 359 bp fragment of the Antonia retrotransposon located within the intron of the heterochromatic Drosophila melanogaster Cht3 gene may be of adaptive evolutionary significance. Further genomic and molecular analyses will be required to assess the general importance of LTR retrotransposon sequences to the evolution of heterochromatic gene structure and function.

cathD PCR
The reaction mix and program used for all sets of primers are the same as those described for primer set cht3(f) and cht3(r) and primer set Antonia LTR(f) and Antonia LTR(r) in the Cht3 PCR (above). The annealing temperature is 58°C for primer set cathD(f) and cathD(r), 59°C for primer set Burdock LTR(f) and Burdock element(r), and 56°C for primer set cathD(f) and Burdock element(r).

Sequencing
PCR products of the Cht3 intron were sequenced in the Molecular Genetics Instrumentation Facility at the University of Georgia. Sequences were aligned with MacVector 7.0 and compared to the published sequence of the y; cn bw sp strain. Substitutions and insertion/deletion sites (indels) were summed for each sequence product and compared to the expected divergence based upon the neutral substitution rate. The expected number of differences between D. melanogaster and D. mauritiana was calculated based on the Drosophila neutral substitution rate of 0.016 (± 0.005) substitutions per site per million years [18] on 685 bp over a divergence time of 2.7 million years [19].
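As a rough illustration of the expected-count calculation, the Python sketch below multiplies the quoted rate by the divergence time and the region length. The function name `expected_differences` is hypothetical. Propagating the ± 0.005 on the rate through a fixed 2.7-million-year divergence time is an assumption made here for illustration and need not match how the interval quoted in the Results was derived.

```python
RATE = 0.016        # substitutions per site per million years [18]
RATE_ERROR = 0.005  # quoted uncertainty on the rate
TIME_MY = 2.7       # divergence time in million years [19]
SITES = 685         # length of the sequenced region in bp


def expected_differences(rate: float) -> float:
    """Expected number of differing sites for a given per-site rate."""
    return rate * TIME_MY * SITES


central = expected_differences(RATE)
# Assumption: only the rate uncertainty is propagated; time is held fixed.
low = expected_differences(RATE - RATE_ERROR)
high = expected_differences(RATE + RATE_ERROR)

print(f"expected differences: {central:.1f} (range {low:.1f} to {high:.1f})")
```

Under these assumptions the central expectation is roughly 30 differences across the 685 bp region.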

Note added in proof
The two Cht3 intron fragments described in Figure 3 have the following provisional accession numbers in GenBank: D. melanogaster, Africa: AY081055; D. mauritiana: AY081054.