Signals of demographic expansion in Drosophila virilis
© Mirol et al. 2008
Received: 12 September 2007
Accepted: 25 February 2008
Published: 25 February 2008
Skip to main content
© Mirol et al. 2008
Received: 12 September 2007
Accepted: 25 February 2008
Published: 25 February 2008
The pattern of genetic variation within and among populations of a species is strongly affected by its phylogeographic history. Analyses based on putatively neutral markers provide data from which past events, such as population expansions and colonizations, can be inferred. Drosophila virilis is a cosmopolitan species belonging to the virilis group, where divergence times between different phylads go back to the early Miocene. We analysed mitochondrial DNA sequence variation among 35 Drosophila virilis strains covering the species' range in order to detect demographic events that could be used to understand the present characteristics of the species, as well as its differences from other members of the group.
Drosophila virilis showed very low nucleotide diversity with haplotypes distributed in a star-like network, consistent with a recent world-wide exponential expansion possibly associated either with domestication or post-glacial colonization. All analyses point towards a rapid population expansion. Coalescence models support this interpretation. The central haplotype in the network, which could be interpreted as ancestral, is widely distributed and gives no information about the geographical origin of the population expansion. The species showed no geographic structure in the distribution of mitochondrial haplotypes, in contrast to results of a recent microsatellite-based analysis.
The lack of geographic structure and the star-like topology depicted by the D. virilis haplotypes indicate a pattern of global demographic expansion, probably related to human movements, although this interpretation cannot be distinguished from a selective sweep in the mitochondrial DNA until nuclear sequence data become available. The particular behavioural traits of this species, including weak species-discrimination and intraspecific mate choice exercised by the females, can be understood from this perspective.
The development of methods to analyse intraspecific phylogenies has provided very valuable tools to understand how populations have been influenced by historical and contemporary processes [1, 2]. Mitochondrial DNA has proved a very useful tool for reconstructing phylogenies of species and historical demographic events [3–8]. Its usefulness for the study of closely related taxa and populations within species lies in its very low rate of recombination, maternal inheritance, conserved structure, reduced effective population size and relatively high rates of evolution [6, 7, 9]. However, after an explosion of studies using mitochondrial DNA for phylogeographic inference, there is nowadays rising concern about making inferences based in this single molecule, due to problems related with recombination, effective population size, mutation rates, introgression and neutrality [10, 11]. These problems influence the construction of species-level phylogenies most profoundly [e.g. ]. However, they may also confound estimations of demographic history and coalescence times within species. Nevertheless, if these drawbacks are taken into account when examining the evolutionary history of a taxon, mitochondrial DNA surveys can provide efficient means of detecting gene flow, levels of reproductive isolation, species boundaries and historical patterns of population structure .
The pattern of variation of mitochondrial DNA, permits inferences about a species' demographic history, including events such as population expansions and colonizations. The application of coalescent theory [14, 15] for the analysis of population data has allowed inferences about past and present population size. A change in population size over time will be reflected in the DNA sequences in such a way that the analysis of the substitutions can indicate the direction and the timing of the change. It has been shown that demographic expansions lead to star-shaped genealogies , an excess of rare mutations  and a unimodal mismatch distribution . Although spatial or range expansions also lead to an increase in the global effective size of a species, they lead to the same molecular signature as that of a purely demographic expansion if the migration rate between demes was large [19, 20]. Otherwise, for relatively low levels of gene flow, multimodal mismatch distributions should be expected . During the Pleistocene, global climate oscillations and the associated cycles of glaciation had a profound influence on the population distributions of many organisms [21–24]. Populations and species retracted their habitats to limited areas which served as refuges. Range expansions and colonizations followed by demographic expansions often happened after these events, when climate ameliorated, and it is possible to find the signals of those processes in contemporary populations.
The Drosophila virilis group comprises 13 species and subspecies divided into two clades, the virilis and montana phylads [25–30]. Throckmorton  suggested that the phylads diverged in the early Miocene, or not later than the Pliocene, when both of them entered the New World by way of Beringia. Divergence times between the phylads have been estimated to be from 7 Mya  to 11 Mya . In general, species within the montana phylad have evolved more in terms of chromosomal re-arrangements, and are also more variable regarding the number of inversions segregating within populations than members of the virilis phylad , although species in this group show a higher number of fusions.
Drosophila virilis group species have been used as models in evolutionary studies for decades and their role in this kind of research will become even more important now that the whole genome of D. virilis has been sequenced http://insects.eugenes.org/species/data/dvir/. Within the virilis phylad, Drosophila virilis is a domestic species of nearly cosmopolitan distribution. The primitive karyotype is found in D. virilis and natural populations are exceptional in having no inversion polymorphism, in contrast to other species of the group. It has been proposed that D. virilis originated somewhere in the ancient deciduous forests of China or in arid regions such as Iran or Afghanistan, and remained isolated from the remaining species of the group until relatively recently . Its ancestral role is supported by its monomorphic karyotype, an extremely high (about 50%) proportion of satellite DNA, and the distribution of various repetitive DNA elements .
Drosophila virilis also differs from other species of the group by the weak species-discrimination and intraspecific mate choice exercised by the females [e.g. [25, 33]]. Although females are able to recognize species-specific characters of the male song, they do not require to hear the song before mating . D. virilis strains originating from different parts of the species distribution have shown significant variation in male courtship song . Rapid expansion into a novel habitat, not occupied by related species, may explain these characteristics. The ability of D. virilis to survive in various kinds of environments can partly be explained by its high thermotolerance  and high tolerance for ethanol . Like several other cosmopolitan species, e.g. D. melanogaster, D. virilis possesses intraspecific genetic differences in thermotolerance with an obvious adaptive significance to local thermal conditions , which may make it easier for the species to occupy new habitats.
In this paper, we report an analysis of the demographic history of D. virilis using mitochondrial DNA sequence data, in order to provide an historical framework for evolutionary studies on life history and behavioural traits of the species and comparisons with other species in the group, especially D. montana, the most divergent species within the virilis group.
Lines of D. virilis used in the study, indicating, when it was available, year of collection and coordinates from which the line originates
43° 3'N, 141° 21'E
40° 22'N, 49° 48'E
46° 57'N, 37° 16'E
41° 39'N, 41° 39'E
41° 39'N, 41° 39'E
40° 12'N, 44° 31'E
41° 18'N, 69° 16'E
53° 47'N, 1° 32'W
Pasadena, California, USA
34° 8'N, 118° 8'W
Truckee, California, USA
39° 19'N, 120° 12'W
When analysing demographic scenarios, where migration between populations, recombination and sexual reproduction usually play an important role, population history might not be tree-like. In these cases intraspecific genealogies constitute the appropriate way to make evolutionary inferences. The genealogical approach, which estimates parameters of the genealogical process, provides a coherent framework in which to consider recombination, migration, selection and other processes . This approach takes into account several characteristics of the populations which make them different to species-level analysis, such as low divergence, ancestral nodes extant in the sampled population, multifurcations, reticulation and large sample sizes [1, 2].
Summary statistics of mitochondrial variation and results from the FLUCTUATE and GENIE analyses.
AICc Best model (second best model)
Θ0 (95% CI)
g (95% CI)
-2.0055 p < 0.01
-14.311 p < 0.0001
Exp 149.83 (Log 148.38)
The Drosophila virilis lines analysed in this study did not show any geographic structure among mitochondrial haplotypes. There was a very low number of substitutions differentiating haplotypes and they were commonly shared among locations. All analyses point towards a rapid population expansion over a short time-scale, which could be consistent with population growth following the end of the last glaciation and/or a shift into domestic environments. These are the most obvious possible causes of expansion, although they suggest population growth would have initiated less than 10,000 years ago which is more recent than the timing implied by the skyline plot, for example, which suggests continuous population growth for >50,000 years (using a substitution rate of 10-8 per year). However, it is not possible to rule out alternative models, such as recent expansion from a smaller but more stable refugial population and time estimation should be treated with caution given the wide confidence interval on the expansion rate estimate and uncertainty about the substitution rate in D. virilis.
In natural populations, ancestral haplotypes may persist and be sampled together with their descendants . The coalescence process predicts that high frequency haplotypes are likely to have been present in the population for a long time . Rare haplotypes represent more recent mutations and are more likely to be related to common haplotypes than to other rare variants . Therefore, ancestral haplotypes have greater probability of being central in a network, being common and having a broader geographic distribution . Following these considerations, the central haplotype in the network appears to be ancestral (Figure 1), and because it is widely distributed gives no information on the geographic origin of the population expansion.
In their study on sequence variation in six X-linked genes of 21 D. virilis strains from different continents, Vieira and Charlesworth  found no fixed differences between the Asian strains and strains originating from Europe or North- or South-America, all the variants found outside Asia being also present in Asia, but not vice versa. These data were consistent with either a large population centred in Asia and a smaller migrant population elsewhere, or a large migrant population that went through a bottleneck . Throckmorton  has previously suggested that D. virilis originated from an ancestral form in Asia, since the most primitive species of the virilis-repleta section of the genus Drosophila have been observed in Southeast Asia. This hypothesis recently received further support from a molecular phylogeny of the virilis section .
In a recent study, Huttunen et al.  analysed 48 microsatellite loci in 30 D. virilis strains, many of which were analysed here for mitochondrial DNA. Although a phylogenetic tree and STRUCTURE analysis showed only moderate clustering of the strains originating from Continental Asia, Europe, America and Japan, an assignment test using a priori information about the geographical origin of the strains gave high posterior probabilities for their correct assignment. Microsatellite variation also showed significant population differentiation, as measured by FST, with evidence for isolation by distance. Variability detected by microsatellites can have a more recent origin than variation measured by mitochondrial DNA. Consequently, the lack of population differentiation in mitochondrial DNA could indicate shared ancestry while microsatellite variation between populations indicates that differentiation is in progress because current gene flow is restricted. Alternatively, the homogeneity of mitochondrial DNA could be the result of a recent selective sweep which did not disturb the pre-existing population structure for nuclear loci. Unfortunately, the use of laboratory strains prohibited testing for a demographic expansion using the microsatellite data.
There are several papers describing genetic variation in another species of the virilis group, D. americana, which, together with D. lummei, represents dispersal of the virilis group into the Neartic region [43–46]. Drosophila americana is represented by two chromosomal forms, D. a. americana and D. a. texana, differing by a fusion between the X and fourth chromosome and the frequency of chromosomal inversions. As in the case of D. virilis, phylogenetic analysis failed to resolve distinct clades between the chromosomal forms and also between geographic regions sampled, with Fst values not significantly different from zero [43, 44]. The absence of differentiation among the geographically distinct populations indicated either that gene flow homogenizes neutral sequence variation, even when different chromosomal rearrangements are involved, or that large populations retain shared ancestral polymorphisms . Furthermore, and opposite to results obtained for D. virilis , an analysis of 27 microsatellite loci in 85 individuals from 6 different natural populations of D. americana showed no differentiation between populations , which confirms the existence of ongoing gene flow between populations of the species.
The pattern of geographic variation in haplotypes of D. virilis, and in D. americana, is in sharp contrast with the pattern recently described for another species of the group, D. montana . In this case there was clear differentiation, based on both mitochondrial DNA and microsatellites, between lines from populations in Finland, Canada and USA. Both markers indicated the presence of at least two distinct populations, one in Eurasia and the other one representing the expansion of the species to the New World, with a divergence time between them estimated from 450,000 to 900,000 years ago, within the Pleistocene. Although D. virilis and D. montana belong to the same species group in the genus Drosophila, they represent different phylads and differ in many characteristics, including chromosomal variation, habitat preferences, and courtship behaviour. Differences in biogeographic history, reflected in the pattern of mitochondrial variation in the two species, could have been important in the origin of these characteristics and constitute the basis for the interpretation of their evolution.
Mitochondrial DNA sequence variation in Drosophila virilis suggests a worldwide exponential population expansion during the Pleistocene, with extensive migration between demes. Slight differentiation at microsatellite markers results either from bottlenecking during or after expansion, or accumulation of differentiation across current barriers to gene exchange. The alternative possibility, that a selective sweep has homogenised mitochondrial sequences without disturbing pre-existing population structure for nuclear markers, cannot be excluded. However, a recent demographic expansion is consistent with the domestic habitat of D. virilis and helps to explain its lack of chromosomal polymorphism as well as evidence for a history of weak selection on mating behaviour by comparison with related species.
In total, 35 Drosophila virilis strains, covering the species' range, were selected for analysis (Table 1). The stocks were collected during a time period covering almost 90 years, from 1913 to flies sampled in China in 2001/2002. A single individual from each strain, either from laboratory stocks or freshly caught, was used to extract DNA and for PCR amplification of the COI and COII mitochondrial genes.
DNA was extracted from ethanol-preserved flies following a standard protocol , where the samples were homogenised in buffer and proteinase k, and DNA was extracted with chloroform-isoamyl alcohol and precipitated with isopropanol. The amplification of mitochondrial DNA was carried out with primers flanking the COII gene in the tRNALYS and tRNALEU [; TL2: 5'-ATGGCAGATTAGTGCAATGG-3', TKN: 5'-GTTTAAGAGACCAGTACTTG-3'], which amplify an 850 bp fragment that includes the 688 bp COII gene, and COI-1460-F: 5'-ATCTATCGCCTAAACTTCAGCC-3' and COI-2195-R: 5'-ACTTCAGGGTGACCAAAAAATC-3' [50, 51] which amplify the complete 670 bp corresponding to the COI gene. PCR reactions were performed in 50 μl volumes including 0.5 μM of each primer, 200 μM dNTPs, 1.5 mM MgCl2 and 1 U Taq polymerase (Bioline) in reaction buffer. Initial denaturation was for 7 minutes at 94°C followed by 35 cycles of 1 minute at 94°C, 1 minute at the annealing temperature (54°C for COI and 56°C for COII) and 1 minute at 72°C, and a final incubation of 5 minutes at 72°C. The products were purified using QIAquick columns (QIAGEN) and sequenced using the forward primer. No ambiguous sites were found in the sequences. Sequences (GenBank accession nos. DQ426800 to DQ426823) were aligned with CLUSTAL-V .
The partition homogeneity test, PHT , as implemented in PAUP 4.0, was used to test for incongruence between the COI and COII data sets. There was congruence between the data-sets and therefore the two fragments were combined for all subsequent analyses. ARLEQUIN 2.0  was used to calculate pairwise distances between haplotypes, the mismatch distributions and tests of the standard neutral model for a demographically stable population (Tajima's D  and Fu's F ).
The program FLUCTUATE  was used to make simultaneous estimates of present day θ and the population growth rate g, assuming an exponential model of growth and using a maximum likelihood approach. The parameters used for the simulations were obtained by running a hierarchy of likelihood-ratio tests in Modeltest 3.0  to choose the model of evolution with the best fit to the data. Skyline plots were constructed using GENIE v. 3.0 . The starting trees were obtained using maximum likelihood with molecular clock enforced. The software requires that the genealogy is estimated under the assumption of a molecular clock . GENIE was also used to calculate the fit to different models of population growth, with fit assessed using the corrected Akaike Information Criterion.
Networks of haplotypes were constructed based on statistical parsimony using the program TCS 1.06 .
We are grateful to the members of the 'Co-evolved Traits' Research Training Network for their valuable input to the work presented here and to the European Commission for funding the network (HPRN-CT-2002-00266). Special thanks are due to Zhang Wenxia for help in collecting D. virilis flies in China. Laboratory strains of the species were obtained also from Jorge Vieira, Michael Evgenev and Bowling Green stock center.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.