Genetic hitchhiking in a subdivided population of Mytilus edulis
© Faure et al; licensee BioMed Central Ltd. 2008
Received: 07 February 2008
Accepted: 30 May 2008
Published: 30 May 2008
Few models of genetic hitchhiking in subdivided populations have been developed and the rarity of empirical examples is even more striking. Here we provide evidence of genetic hitchhiking in a subdivided population of the marine mussel Mytilus edulis. In the Bay of Biscay (France), a patch of M. edulis populations happens to be separated from its North Sea conspecifics by a wide region occupied only by the sister species M. galloprovincialis. Although genetic differentiation between the two M. edulis regions is largely non-significant at ten marker loci (average FST ~ 0.007), strong genetic differentiation is observed at a single locus (FST = 0.25). We validated the outlier status of this locus, and analysed DNA sequence polymorphism in order to identify the nature of the selection responsible for the unusual differentiation.
We first showed that introgression of M. galloprovincialis alleles was very weak in both populations and did not significantly affect their differentiation. Secondly, we observed the genetic signature of a selective sweep within both M. edulis populations in the form of a star-shaped clade of alleles. This clade was nearly fixed in the North Sea and was segregating at a moderate frequency in the Bay of Biscay, explaining their genetic differentiation. Incomplete fixation reveals that selection was not direct on the locus but that the studied sequence recombined with a positively selected allele at a linked locus while it was on its way to fixation. Finally, using a deterministic model we showed that the wave of advance of a favourable allele at a linked locus, when crossing a strong enough barrier to gene flow, generates a step in neutral allele frequencies comparable to the step observed between the two M. edulis populations at the outlier locus. In our case, the position of the barrier is now materialised by a large patch of heterospecific M. galloprovincialis populations.
High FST outlier loci are usually interpreted as being the consequence of ongoing divergent local adaptation. Combining models and data we show that among-population differentiation can also dramatically increase following a selective sweep in a structured population. Our study illustrates how a striking geographical pattern of neutral diversity can emerge from past indirect hitchhiking selection in a structured population.
Nucleotide sequences reported in this paper are available in the GenBank™ database under the accession numbers EU684165 – EU684228.
The detection of adaptive evolution at the molecular level has essentially relied on indirect inferences. This is simply because genomes are very large and adaptive evolution probably occurs only at a few segregating mutations at a given time. However, adaptive evolution leaves a footprint on the pattern of neutral diversity, which widens both the genomic extent and the time scale on which adaptation can be detected. The theory is very well developed in the case of a single panmictic population. The hitchhiking model of Maynard Smith and Haigh predicts that the fixation of an advantageous mutation decreases the diversity at linked neutral loci. The effect of a so-called selective sweep on the allele frequency spectrum, and more generally on gene genealogies [5, 6], is also well established. Along with the development of a battery of statistical tests, empirical examples have accumulated [8–12]. Although the indirect path through which selection shapes genetic diversity bears many resemblances to demographic effects [13, 14], selection only acts on the chromosomal neighbourhood of the site targeted by selection while demography affects the whole genome [15, 16]. The recent development of genome scans now makes it possible to appreciate how genomes seem riddled with numerous signatures of adaptive evolution [17, 18].
Most models describe the spread of an advantageous mutation in a single isolated population, and examples of genetic hitchhiking remain confined to such an idealised model. However, natural populations are most of the time structured into geographically or genetically partially isolated populations. The hitchhiking effect in structured populations has been less intensively investigated. Few models have been developed [20–23] and they sometimes give contradictory results. Although a reduction of genetic diversity at the metapopulation level is a robust expectation, the effect of a selective sweep on the distribution of the genetic diversity within and between populations is less clear. Although statistical tests have been developed in order to identify loci showing more or less population differentiation than predicted under neutrality [24–27], the exact form of the selection responsible for extreme differentiation is hardly ever addressed. Indeed, loci with a higher than expected FST are simply assumed to be under divergent selection ('local adaptation') while loci with a lower than expected FST are assumed to be under balancing selection. In addition, the question of whether selection acted directly on the polymorphism observed or indirectly through genetic hitchhiking is often sidestepped. Regrettably, experimentalists have seldom noticed that the hitchhiking effect of an unconditionally favourable mutation that spreads from its deme of origin to other demes by migration ('hitchhiking in space') can sometimes enhance population differentiation as measured by FST. Few studies have attempted to investigate more explicitly how selection has operated on an outlier locus by examining DNA sequence variation. Although some genome scans were recently conducted with DNA sequence polymorphisms in humans and flies [17, 31], these studies emphasised the occurrence of hitchhiking within derived populations thought to have recently adapted to a new environment (i.e. the 'local adaptation' hypothesis).
Is EFbis an FSToutlier?
Genetic composition of samples
Sample sizes, molecular diversities and tests of neutrality
Sample name :
M. galloprovincialis Atlantic (FA+PR)
0.013 ± 0.003
M. edulis Bay of Biscay (LU)
0.014 ± 0.002
0.015 ± 0.002
M. edulis North Sea (WS)
0.007 ± 0.003
0.01 ± 0.005
Significant genetic differentiation was observed at the EFseq locus by permutation test between each pair of populations except between the M. galloprovincialis samples of Brittany and the Atlantic coast of the Iberian Peninsula, as was already the case with intron-length polymorphism. As a consequence the two samples were pooled together and simply labelled as M. galloprovincialis. Basic descriptors of polymorphism are presented in Table 1. A high level of nucleotide diversity was observed. The M. edulis sample of the North Sea, however, exhibited a significantly lower level of nucleotide diversity (θπ). In addition, the polymorphism of this sample was essentially composed of rare mutations (singletons). This result logically echoed the lack of diversity obtained in the North Sea patch with length-polymorphism in the third intron (Figure 2A).
Correspondence between length and sequence polymorphisms
Departure from neutrality in M. edulis samples
We examined whether the pattern of variation observed at locus EFbis within each population of M. edulis was consistent with the neutral model at mutation-drift equilibrium. To remove the effect of introgression, the single sequence of the B clade sampled in the population of the Bay of Biscay (LU) was removed from the analysis. The values of Tajima's D and Fay and Wu's H together with their significance are presented in Table 1. A strong departure from mutation-drift equilibrium was observed in the sample of the North Sea (WS). Sequences from WS differ mainly by singleton mutations, which results in a star-shaped genealogy (Figure 5) typical of a selective sweep, while a single sampled lineage survived the sweep. Indeed, the significant excess of derived variants at high frequency observed in such a genealogy (i.e. a significant Fay and Wu's test) has been advocated to be a unique pattern produced by hitchhiking. Whilst the same star-like clade (denoted A1* in Figure 5) was also present in sample LU, though at moderate frequency, the departure was not detected by the two tests (Table 1). Indeed, partial sweeps are not easily detected by these two tests [11, 40] because they generate an excess of both rare and intermediate-frequency mutations, which offset each other in Tajima's D test, and the frequency of derived variants is not high enough to be captured by Fay and Wu's H test. For instance, Santiago and Caballero showed that positive instead of negative Tajima's D values can sometimes be generated in some subpopulations after a selective sweep in a subdivided population. In order to detect more efficiently the distortion of the distribution of coalescence times in the two sample genealogies we applied the coalescence-based maximum-likelihood method of Galtier et al. We were able to easily apply this method to our data as they were compatible with the infinite mutation model (no recombination, Table 1).
For the two populations, the model with a recent reduction in effective size was significantly better supported than the mutation-drift equilibrium model (Likelihood ratio tests: p = 0.001 and p = 0.03 for samples WS and LU respectively). However, the estimated strength of this effect was stronger in the North Sea (S = 1.5 coalescence time units) than in the Bay of Biscay (S = 0.9 coalescence time units) while the estimated age of the event was roughly the same in the two samples (T~0.5 coalescence time units). These results provide statistical support to what can be easily visualised in the reconstructed sample genealogies presented in Figure 5.
Genetic hitchhiking past a barrier to gene flow
We hypothesised that the past fixation of the same favourable mutation could have been responsible for the distortion of allele genealogies in the two M. edulis patches, a hypothesis to which we will return in the Discussion section. Our analyses suggest that the distortion of allele genealogies and allele frequencies is less pronounced in the patch of the Bay of Biscay than in the patch of the North Sea. Although few models of genetic hitchhiking in subdivided populations are available, they have emphasised that this situation could produce spatial variation in genetic diversity even without spatial variation in selection regimes. In order to illustrate this effect in a context that matches our study system we modelled the hitchhiking effect of a favourable mutation that crosses a barrier to gene flow (see Methods). We considered two patches of about a hundred demes each. Within patches, demes were arranged as a stepping stone and connected by an appreciable level of migration (m > 0.1), while the migration rate between patches was much smaller (m_bar << 0.1). Our model therefore resembles the model of Barton within patches and the model of Slatkin and Wiehe between patches.
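The kind of model described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes haploid selection, a life cycle ordered as migration, selection, then recombination, reflecting patch edges, and arbitrary parameter values (s, r, m, m_bar, p0, u0, the number of demes and the number of generations are all illustrative choices). The function name and the haplotype ordering are likewise hypothetical.

```python
import numpy as np

def sweep_across_barrier(n_demes=20, s=0.1, r=0.002, m=0.2, m_bar=1e-6,
                         p0=1e-4, u0=0.05, generations=3000):
    """Deterministic two-locus hitchhiking in two stepping-stone patches.

    Returns (p, u): per-deme frequencies of the favourable allele B and of
    the neutral allele A on which B first appeared. All defaults are
    illustrative assumptions, not values taken from the paper.
    """
    n = 2 * n_demes
    # Haplotype frequencies per deme; rows are AB, Ab, aB, ab.
    x = np.zeros((4, n))
    x[1] = u0
    x[3] = 1.0 - u0
    # The favourable mutation B appears on an A background in deme 0.
    x[:, 0] = [p0, u0 * (1 - p0), 0.0, (1 - u0) * (1 - p0)]

    # Symmetric stepping-stone migration matrix; the link between the two
    # patches (demes n_demes-1 and n_demes) uses the much smaller m_bar.
    M = np.zeros((n, n))
    for i in range(n - 1):
        rate = m_bar if i == n_demes - 1 else m
        M[i, i + 1] = M[i + 1, i] = rate / 2
    M[np.arange(n), np.arange(n)] = 1.0 - M.sum(axis=1)

    for _ in range(generations):
        x = x @ M.T                      # migration
        x[[0, 2]] *= 1.0 + s             # selection favours B haplotypes
        x /= x.sum(axis=0)
        D = x[0] * x[3] - x[1] * x[2]    # linkage disequilibrium
        x[0] -= r * D                    # recombination
        x[3] -= r * D
        x[1] += r * D
        x[2] += r * D

    return x[0] + x[2], x[0] + x[1]
```

Calling `p, u = sweep_across_barrier()` and comparing `u[:20].mean()` with `u[20:].mean()` gives the step in neutral allele frequency between the patch of origin and the patch beyond the barrier once the favourable allele has spread everywhere. Lowering `m_bar` or raising `r / s` in this sketch is a simple way to explore how the step depends on barrier strength and linkage.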
Our eleven-locus FST scan of the Mytilus edulis genome provided clear results: one locus, EFbis, behaved very differently from the other 10 loci (Figure 2A) and the outlier test of Beaumont and Nichols  allowed us to validate the impact of selection on this locus (Figure 2B). The approach of scanning genomes for FST outliers is becoming a standard in molecular ecology  and the number of studies that inferred candidate loci for adaptation through this approach is increasing rapidly [43–45]. These studies often used molecular markers that do not easily allow reconstructing the genealogical relationship among alleles, such as microsatellites or AFLPs. Although selected loci are discovered, one can get little information on the exact form of selection that is/was responsible for the discrepancies observed at these loci. DNA sequence polymorphism allows a more precise analysis of selective effects but the approach of scanning genomes for regions of low nucleotide diversity [17, 31] is rarely simultaneously conducted with FST scans. Few studies have pursued the analysis of an FST-outlier locus by a more refined analysis of DNA sequence variation [30, 46]. The results we here obtained in the analysis of DNA sequence polymorphism at an FST-outlier locus were in remarkable agreement with the FST-scan approach as we observed a clear departure from mutation-drift equilibrium in the form of a clade of alleles with a star-shaped genealogy (clade A1* in Figure 5). We are now in a position to discuss the various possible selective scenarios which could have created the pattern of differentiation observed.
The genetic structure observed at the EF1α gene between M. edulis populations is not a consequence of differential introgression
When we undertook the analysis of DNA sequences at the EF1α gene we had in mind the hypothesis of adaptive introgression of a galloprovincialis allele within the M. edulis patch of the Bay of Biscay. We are now in a position to dismiss this interpretation, as our data establish that introgression does not interfere with differentiation between M. edulis populations at the EFbis locus. The phylogenetic analysis of DNA sequence polymorphism revealed the strong divergence that exists between alleles of the two species and only a single introgressed allele has been sampled (Figure 4). The analysis of allele frequencies allowed us to study introgression levels more precisely. Although introgression appeared to be slightly stronger in the enclosed patch of the Bay of Biscay than in the Northern peripheral patch, introgression is low in both patches and does not perturb the analysis of the differentiation between the two M. edulis patches.
Indirect hitchhiking selection
In this section, we would like to settle whether selection acted directly on the locus surveyed or indirectly on a linked locus. The EF1α gene is recognised to be under strong purifying selective constraints and is not a good candidate for short term adaptive evolution. In addition, the few polymorphic non-synonymous mutations observed were at low frequency (singletons or doubletons). Furthermore, as we did not detect recombination in our data, the hitchhiking effect of a favourable mutation localised within the sequence surveyed should have eliminated all linked variation and produced a perfect star genealogy on which new post-hitchhiking mutations would map. In the presence of recombination between a selected locus and the locus studied, hitchhiking should have been incomplete and ancestral lineages could have survived the sweep, producing a partially star-shaped genealogy. The genealogies obtained, in which such sweep-surviving lineages have been sampled (Figure 5), are therefore in much better agreement with the hypothesis of indirect selection.
As selection must have been indirect, an inevitable question is the chromosomal distance that separates the selected locus and the locus surveyed. With the data at hand we cannot answer this question precisely but can try to give some indications. The chromosomal length affected by a selective sweep mainly depends on the strength of selection, the recombination rate, the population size, and the time elapsed since the sweep [47–49]. Here, the sweep appears to be recent, as almost no coalescence events are younger than the multifurcation produced by the sweep (Figure 5). The population size, N, is negatively correlated with the magnitude of the hitchhiking effect. Given the very large population size of M. edulis, one may argue that a selective sweep would not have effects extending very far on either side of the adaptive substitution, which may therefore possibly belong to an unsequenced portion of the EF1α gene itself. However, thorough investigations of the effect of N on the hitchhiking effect showed that it is not as large as that determined by the r/s ratio [6, 49]. In fact, the main effect of N is to determine p0, the initial frequency of the favourable mutant (p0 = 1/2N), an effect already incorporated in the deterministic model of Maynard Smith and Haigh, while the effect of a finite population size is negligible as soon as N is not too small. In the single population deterministic model, the final frequency after the sweep, u*, of the neutral allele A that hitchhikes with the favourable mutation is:
$$u^{*} = u_0 + (1 - u_0)\,p_0^{r/s} \qquad (1)$$
where u0 is the initial frequency of A and p0 the initial frequency of the favourable mutation. Taking into account the gene diversity observed in the M. galloprovincialis sample (Hd = 0.99), which has not been influenced by hitchhiking selection, one can safely assume u0 to be small and simply approximate equation 1 by:
$$u^{*} \approx p_0^{r/s} \qquad (2)$$
Equation 2 allows us to have an idea of the r/s ratio required to produce the frequency of the A1* allele observed in the sample of the North Sea (WS), which was ~0.9. For instance, r/s would need to be 0.015 if N = 10^3, 0.008 if N = 10^6, 0.006 if N = 10^8. Although we have no idea of the local recombination rate in the region of the mussel genome in which EF1α is located, we know that the recombinational size of the M. edulis genome is ~1000 cM and the physical size is ~1500 Mb, which leads to an average recombination rate of 0.7 cM/Mb (as an indication, the same calculation in Drosophila melanogaster would give 1.5 cM/Mb). The distance, d, in Kb that separates the EFbis locus and the selected locus is therefore estimated to be ~1000 × s (e.g. 1 Kb when s = 0.001, 100 Kb when s = 0.1).
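The arithmetic in this paragraph can be reproduced with a short Python sketch. It assumes the approximation u* ≈ p0^(r/s) with p0 = 1/2N, so that r/s = ln(u*) / ln(p0), and it converts the genome-wide average of 0.7 cM/Mb into a recombination fraction per Kb. The function names are hypothetical, and the default r/s of 0.007 is simply a value in the middle of the range quoted above.

```python
import math

def r_over_s(u_star, N):
    """r/s implied by u* ~ p0**(r/s), with p0 = 1/(2N)."""
    p0 = 1.0 / (2.0 * N)
    return math.log(u_star) / math.log(p0)

def distance_kb(s, ratio=0.007, cM_per_Mb=0.7):
    """Distance (Kb) between marker and selected site for a given s.

    `ratio` is the assumed r/s value; `cM_per_Mb` is the average
    recombination rate, converted here to a recombination fraction per Kb.
    """
    r = ratio * s
    r_per_kb = cM_per_Mb / 100.0 / 1000.0
    return r / r_per_kb

for N in (1e3, 1e6, 1e8):
    print(f"N = {N:.0e}: r/s = {r_over_s(0.9, N):.4f}")
print(distance_kb(0.001), distance_kb(0.1))
```

With u* = 0.9 the ratios come out at roughly 0.014, 0.007 and 0.0055, close to the rounded values given in the text, and the distance scales as about 1000 × s Kb.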
Selective scenarios: contemporary local adaptation versus past unconditional positive selection
Usually high FST outliers are interpreted as being the consequence of local adaptation [25, 42, 52]. The simplest form is disruptive selection on a bi-allelic locus in a two-habitat model, one allele being favoured in one habitat and the other allele being favoured in the other habitat. It has been shown that local selection produces higher FST values than expected in its absence, and leaves footprints on the diversity at linked neutral loci [20, 43]. Under such a scenario, the derived favoured mutation at the selected locus responsible for the star-shaped clade of alleles at the neutral locus (A1* in Figure 5) would have been fixed by positive selection in populations of the North Sea only while being counter-selected in populations of the Bay of Biscay. The presence of alleles of the clade A1* in the enclosed patch of the Bay of Biscay would be a consequence of recombination and gene flow. However, our data revealed many A1* alleles in the Bay of Biscay and very few A2 alleles in the North Sea. In the local adaptation scenario, this would imply asymmetric effects of recombination and neutral gene flow (predominantly from the North Sea to the Bay of Biscay but not the reverse), which are hard to explain. Besides, although some may inevitably point to some environmental differences between the North Sea and the Bay of Biscay (e.g. temperature), it is not clear that these differences should be more pronounced than the environmental heterogeneity observed within each patch.
We therefore hypothesised that the same favourable mutation had gone to fixation in the two populations, although this hypothesis is rarely considered to explain high FSTs. The interest of this hypothesis is that it is inherently asymmetric, because the wave of advance of a favourable allele is directional from its patch of origin to the other. The simple model we developed to illustrate this scenario, together with the few available models of genetic hitchhiking in subdivided populations [21–23], shows that the hitchhiking effect is expected to diminish as the favourable mutation spreads from the deme in which it originates. The presence of a barrier to gene flow amplifies the effect (Figure 8) owing to the delay imposed on the spread of the favourable allele and to the peculiar behaviour of a wave of advance when crossing a barrier. This would explain the moderate frequency of clade A1* and the persistence of unswept alleles (e.g. clade A2) in the Bay of Biscay. The genealogy presented in Figure 5 fits very well with a genealogical interpretation of the model. Although our model is bi-allelic, one simply has to imagine that the unswept allele, a, is composed of old lineages that survived the sweep (non-A1* sequences) and the swept allele, A, is composed of young lineages belonging to the star genealogy (A1* sequences). The moderate frequency of the swept allele would also explain why statistics that summarise the mutation frequency spectrum such as Tajima's D and Fay and Wu's H did not capture the departure from the standard coalescent in the second patch, although methods that incorporate the additional information of linkage disequilibrium can. Furthermore, the delay required for the wave to cross the barrier should result in a younger coalescence (smaller terminal branches) of the lineages affected by the selective sweep (A1*) in the patch of the Bay of Biscay, a tendency that is actually observed in Figure 5.
However, the barrier needs to be strong in order to generate the observed step in allele frequency, Δp ~ 0.4 (Figure 9). Furthermore, we have seen that the r/s ratio needed to be small in order to generate a high frequency of the A1* allele in the North Sea (0.005 < r/s < 0.02). Combining the two observations allows us to deduce that the migration rate between the two patches needs to be lower than m_bar ~ 10^-8 (dot in Figure 9).
A necessary hypothesis of the hitchhiking in space scenario is that selection should be a stronger force than genetic drift. Otherwise the barrier to gene flow between the two patches of M. edulis would produce detectable FST at other neutral loci than EFbis. This assumption is present in the models of Slatkin and Wiehe  and ours that are deterministic. To employ the terminology developed by Gillespie , genetic drift should be a minor force relative to genetic draft, the impact of indirect selection on neutral variability. The scenario proposed would therefore mainly concern species with large effective population sizes. Although their effective size has often been debated , marine bivalves are known to reveal among the highest levels of genetic diversity ever observed within the animal kingdom [54–56]. Levels of diversity observed in Mytilus are in accordance with this view (Table 1). Indeed, marine bivalves are good candidates for the genetic draft model as was established for mitochondrial DNA  and suspected for nuclear genes .
Another question that emerges is the nature of the barrier to gene flow. A first possibility is geographic isolation, as is observed nowadays. The enclosed situation of the M. edulis patch of the Bay of Biscay isolates it from peripheral populations. Assuming that the local biogeography (i.e. the mosaic structure observed nowadays) has been stable for a while, the two M. edulis patches might be connected only by very rare events of long range dispersal. Alternatively, one can imagine that the favourable mutation has had to transit through the M. galloprovincialis genomic background before reaching the enclosed patch of M. edulis. Under this scenario the mutation would need to be neutral or slightly deleterious in the M. galloprovincialis background although favourable in the M. edulis background. It is also possible that the sweep predates the mosaic structure observed nowadays. We would therefore need to consider the possibility that the barrier was genetic instead of physical. The genetic barrier would simply have come to coincide secondarily with a region of low dispersal, as theoretically expected. In a sense, considering a genetic barrier amounts to considering the first scenario again: that the two M. edulis populations are structured into two backgrounds in such a way that, at least in the genomic region of the EF1α gene, mixtures of genes from the North Sea and the Bay of Biscay are counter-selected. However, even if divergent selection occurs in the background, a neutral allele of the screened locus is assumed to have hitchhiked with an unconditionally favourable mutation from its background of origin to the other.
To conclude, we have shown that, in a structured population, a selective sweep at a positively selected linked locus is a simple scenario to account for an unusually high level of differentiation at a marker locus. This scenario has rarely been considered in the literature and likely applies to the example of the EF1α gene in M. edulis. However, we cannot completely exclude more complex scenarios of local adaptation, whereby the selected allele responsible for the selective sweep would be confined to its patch of origin. A decisive (though very demanding) test would require walking along the chromosome toward the direct target of selection: under the scenario of local adaptation FST should increase, while under the scenario of unconditional positive selection FST should decrease (Figure 9).
Distinguishing the two scenarios ('local adaptation' and 'hitchhiking in space') is not an unnecessary subtlety: it has important consequences for our appreciation of the cost paid by species to adapt to local environmental variation. It is very different to conclude that ~10% (one of eleven randomly chosen loci) of the genome has been marked by the footprint of past positive selection than to conclude that ~10% of the genome is affected by a polymorphism maintained by selection in a heterogeneous environment. This is the reason why the 'hitchhiking in space' hypothesis should be considered more closely in genome scan studies.
On the basis of previous publications, we selected four geographic samples of 48 individuals to represent each of four species patches: FA (Faro, Algarve, Portugal) and PR (Primel, Brittany, France) respectively representative of M. galloprovincialis populations of the Atlantic Coasts of the Iberian Peninsula and Brittany, and LU (Lupin, Charente-Poitou, France) and WS (Wadden Sea, Holland) respectively representative of M. edulis populations of the Bay of Biscay and the North Sea (Figure 1). The PR sample was described in Bierne et al. . Samples FA and LU are new samples collected in the same sites as samples Faro and Brouage described in Bierne et al. . Sample WS is a new sample collected in the middle of a well characterised peripheral patch of the hybrid zone, well away from the transition zones. Additionally, we used a sample of M. trossulus (GD) from Gdansk (Poland) in the Baltic Sea as an outgroup. The new samples were treated as previously described [33, 59] except that we used the phenol-chloroform protocol to extract genomic DNA rather than the Chelex protocol.
In order to test whether the level of differentiation observed at the EFbis locus was significantly higher than the level of differentiation observed at other loci between the peripheral M. edulis population of the North Sea and the internal patch of the Bay of Biscay, we compiled data on 5 allozyme loci (EST D, LAP, PGI, OCT, MPI) from Coustau et al. and 6 nuclear DNA loci (EFbis, mac-1, Glu5', DAMP1, DAMP2, DAMP3) from Bierne et al. The patch of the Bay of Biscay was represented by samples "Noirmoutier" and "Royan" of Coustau et al. for allozyme loci and samples "Brouage" and "Boyar" of Bierne et al. for nuclear DNA markers. The patch of the North Sea was represented by samples "Caen" and "Danemark" of Coustau et al. for allozyme loci and samples "Grand-Fort-Philippe", "Tichwell" and "Cley" of Bierne et al. for nuclear DNA markers. We first depicted the diversity observed in each patch at each locus by computing unbiased gene diversities and estimating their confidence intervals by permutation techniques with the Genetix 4.0 software. Secondly, we used the method of Beaumont and Nichols to identify loci that depart from the expected neutral distribution of FST. The average and 95% confidence interval of single locus FST as a function of heterozygosity were obtained from simulations performed with the fdist2 program. In order to be as conservative as possible, we used several sets of parameters (number of demes, mutation model etc.) and retained the parameters that maximized the upper limit of the 95% confidence interval.
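As an illustration of the quantities involved, the following Python sketch computes an unbiased gene diversity from allele counts and a simple two-population FST. It is a simplified stand-in and an assumption on our part: it uses Nei's sample-size correction n/(n-1) and an uncorrected (HT - HS)/HT ratio with equal population weights, which is not the estimator implemented in fdist2 or Genetix. Function names and the example counts are hypothetical.

```python
def gene_diversity(counts):
    """Unbiased gene diversity, n/(n-1) * (1 - sum p_i^2), from allele counts."""
    n = sum(counts)
    return n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts))

def fst_two_pops(counts1, counts2):
    """Simple FST = (HT - HS) / HT for two populations, equal weights.

    Both count lists must list the same alleles in the same order.
    """
    p1 = [c / sum(counts1) for c in counts1]
    p2 = [c / sum(counts2) for c in counts2]
    # Mean within-population expected heterozygosity.
    hs = ((1.0 - sum(p * p for p in p1)) + (1.0 - sum(p * p for p in p2))) / 2.0
    # Expected heterozygosity of the pooled (averaged) allele frequencies.
    p_bar = [(a + b) / 2.0 for a, b in zip(p1, p2)]
    ht = 1.0 - sum(p * p for p in p_bar)
    return 0.0 if ht == 0.0 else (ht - hs) / ht

# Hypothetical allele counts at one locus in two patches.
print(gene_diversity([40, 8]), fst_two_pops([40, 8], [20, 28]))
```

Plotting such single-locus FST values against heterozygosity is the representation on which the outlier test operates; the neutral envelope itself still has to come from simulations.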
Analysis of the genetic composition of samples
The four newly sampled populations were analysed at the same three loci used by Bierne et al. , EFbis, mac-1 and Glu-5'. Polymerase chain reaction (PCR) and electrophoresis were performed as previously described. The fluorescent dye 5' end-labelled-primer technique was used, with dye 6-FAM (Sigma Genosys) for the forward primer of mac-1 and the primer Me15 of Glu-5' and TAMRA (Sigma Genosys) for the primer EFbis-F. Gels were scanned in a FMBIO II fluorescence imaging system (Hitachi Instruments) at 505 and 605 nm.
We performed a correspondence analysis (CA) on the matrix of allele counts per sample using the Genetix 4.0 software . This CA was performed in order to verify that the genetic composition of the four samples chosen for DNA sequencing matched the expectation from their geographic location. We used 11 previously analysed samples chosen to be representative of the four geographic patches: Faro (a previous sample from Portugal performed at the same location as sample FA), Setubal, Biarritz, Brouage, Boyard, Moguériec, Morlaix, Polzeath, Grand-Fort-Philippe, Cley, Tichwell . In addition, homogeneity of allelic frequencies between pairs of populations was tested by an exact test using the Genepop software  which allowed us to group samples with the same genetic composition. Finally, departure from Hardy-Weinberg and linkage equilibrium was tested with the Maximum-likelihood method of Barton  using species-specific compound alleles as described in Bierne et al. .
DNA sequences were obtained on a longer fragment of the EF1α gene than the EFbis locus, including the full region screened for length polymorphism. The new locus, named EFseq, was approximately one kilobase long (the shortest observed allele was 1013 bp long and the longest observed allele was 1349 bp long) and includes the second and the third introns, the third exon and portions of the second and fourth exons of the EF1α gene. EFseq was amplified with the same reverse primer as EFbis, EFbis-R, which was designed in the fourth exon of EF1α, and a newly designed forward primer, EFseq-F (5'-AGGCTCCTTCAAGTACGCCTGGG-3'), designed in the second exon. The protocol of PCR reactions was the same as the one described for the EFbis locus except that the annealing temperature was increased to 60°C and the elongation step was increased to one minute. The following steps (purification of PCR products, cloning and sequencing) were done following the mark-recapture (MR)-cloning protocol described in Bierne et al. Briefly, MR-cloning allowed us to perform a single cloning reaction per population sample. Each individual of a sample was PCR-amplified separately using 5'-tailed primers with small poly-nucleotide tags. PCR products of similar quantities were mixed together and cloned into a pGEM-T vector by using a Promega pGEM-T cloning kit (Promega, Madison, WI, USA). Clones were sequenced with universal plasmid primers. The individual from which a sequence comes was identified by the tag sequences upstream of each initial primer. A consequence of the MR-cloning protocol is that the sample size is not completely under control. Within a given number of sequenced clones, the same allele of the same individual (recognised using the nucleotide tag) can be cloned several times, while some alleles or individuals are absent.
Therefore the number of different sequences obtained is less than, although positively correlated to, the number of positive clones sequenced (called the effort of capture in ). However, an interesting side-effect of this protocol is the opportunity to assess the error rate due to mutations during the cloning and amplification process. Singleton mutations (which are important indicators of selective or demographic effects) are particularly sensitive to such artefacts; by restricting the analysis to sequences that were captured twice or more for the same individual, one can assess the potential impacts of artefacts .
Sequence alignment was performed with ClustalW in the BioEdit interface and verified by eye. For each sequence, the size of the region corresponding to the EFbis locus was computed. Alignment gaps were then excluded from the analyses. To represent the allele genealogy, phylogenetic reconstructions were obtained with Mega 3.1 using the neighbour-joining (NJ) algorithm based on the number of nucleotide differences. We used DnaSP to compute basic population genetic parameters: the number of polymorphic sites, the numbers of synonymous, non-synonymous and non-coding mutations, the level of nucleotide diversity estimated from the number of polymorphic sites, θW, or from pairwise differences, θπ, and the minimum number of recombination events, RM, estimated by the method of Hudson and Kaplan. DnaSP was also used to compute indicators of the distortion of the allele frequency spectrum from the neutral expectation at mutation-drift equilibrium. Tajima's D is a well-known statistic that has proved very efficient at detecting a shift of the allele frequency spectrum towards low-frequency polymorphism. Fay and Wu's H test is a more recent development that focuses on high-frequency derived mutations, an excess of which is a specific footprint of a selective sweep. Departure from the neutral expectation at mutation-drift equilibrium was tested by coalescent simulations without recombination, conditional on the number of segregating sites, which here proved to be the most conservative procedure. To the summary-statistic approaches we added the coalescent-based maximum-likelihood method of Galtier et al. This method is designed to detect a distortion in the shape of gene genealogies generated by a diversity-reducing event (hitchhiking or bottleneck). The likelihood of a model in which a drop in effective size of strength S occurred at time T in the past is compared to the likelihood of a constant-size model. Galtier et al. defined S as the time that would be required to generate the same amount of coalescence if the population size had not changed. S is therefore expressed in coalescent time units (i.e. units of 2N_e generations).
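For readers who wish to see how the basic summary statistics are obtained, the following minimal sketch computes the number of segregating sites, θW, θπ and Tajima's D from a gap-free alignment, using the standard formulas of Watterson and Tajima. It is an illustration only and is not the DnaSP implementation; the function name `summary_stats` and the toy alignment are hypothetical, and diversity is expressed per locus rather than per site.

```python
from itertools import combinations
from math import sqrt


def summary_stats(seqs):
    """Return (S, theta_w, theta_pi, tajima_d) for aligned, gap-free sequences."""
    n = len(seqs)
    length = len(seqs[0])
    # Segregating sites: alignment columns showing more than one nucleotide.
    seg_sites = [j for j in range(length) if len({s[j] for s in seqs}) > 1]
    S = len(seg_sites)
    # theta_pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(seqs, 2))
    theta_pi = sum(sum(a[j] != b[j] for j in seg_sites) for a, b in pairs) / len(pairs)
    # theta_W: number of segregating sites scaled by the harmonic number a1.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = S / a1
    # Constants of the variance of (theta_pi - theta_W), as given by Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    if S == 0:
        return S, theta_w, theta_pi, float("nan")
    tajima_d = (theta_pi - theta_w) / sqrt(e1 * S + e2 * S * (S - 1))
    return S, theta_w, theta_pi, tajima_d


# Toy alignment in which every mutation is a singleton (star-like genealogy).
toy = ["TAAA", "ATAA", "AATA", "AAAT"]
S, theta_w, theta_pi, tajima_d = summary_stats(toy)
```

In this toy alignment all mutations are singletons, so θπ is smaller than θW and D is negative, which is the direction of distortion expected after a selective sweep or a population expansion.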
Deterministic model of genetic hitchhiking in a subdivided population
We developed a simple simulation model to illustrate the impact of a barrier to gene flow on the hitchhiking process. We modelled the fixation of an advantageous mutation and its effect on a linked neutral locus in large populations with a deterministic approach [3, 21]. We used a classical linear one-dimensional stepping-stone model composed of 200 demes. The migration rate between adjacent demes was m (m/2 in either direction). A barrier to gene flow was positioned in the middle of the metapopulation: the migration rate between deme 100 and deme 101 was m_bar (m_bar << m). We considered two bi-allelic haploid loci with recombination rate c between them. One locus was neutral and the other was under positive selection. At the selected locus, allele B had a selective advantage s over the alternative allele b. Initially, all demes were fixed for b, and allele B was introduced in the first deme on the left side of the chain at frequency p0. At the neutral locus, we assumed that one allele, A, was initially at frequency u0 in all demes of the chain (the other allele, a, being at frequency 1-u0). Allele B at the selected locus was initially on a chromosome carrying A at the neutral locus. Genotypic frequencies in each deme at a given generation were deduced from the frequencies of the previous generation after accounting for migration, recombination and selection. We recorded the evolution of allele frequencies and the speed of the wave of advance. To calculate the speed, the centre of the wave was defined as the deme in which the frequency of allele B was closest to 0.5. A Borland Delphi 4.0 program is available from the authors upon request.
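The recursion described above can be sketched in a few lines of Python. This is not the authors' Delphi program but a minimal re-implementation under stated assumptions: the order of events within a generation is migration, recombination, then selection; the two end demes exchange migrants with their single neighbour only; fitness is 1 + s for haplotypes carrying B; and all default parameter values are illustrative choices, not values taken from the study. The names `one_generation` and `simulate` are hypothetical.

```python
def one_generation(demes, m, m_bar, barrier_link, c, s):
    """Apply migration, recombination and selection once to every deme.

    Each deme is a list of haplotype frequencies [AB, Ab, aB, ab].
    """
    n = len(demes)
    # Exchange rate on the link between deme i and deme i + 1 (reduced at the barrier).
    link = [(m_bar if i == barrier_link else m) / 2 for i in range(n - 1)]
    updated = []
    for i in range(n):
        left = link[i - 1] if i > 0 else 0.0
        right = link[i] if i < n - 1 else 0.0
        # Migration: residents that stay plus immigrants from both neighbours.
        x = [(1 - left - right) * f for f in demes[i]]
        if i > 0:
            x = [f + left * g for f, g in zip(x, demes[i - 1])]
        if i < n - 1:
            x = [f + right * g for f, g in zip(x, demes[i + 1])]
        # Recombination erodes the linkage disequilibrium d at rate c.
        d = x[0] * x[3] - x[1] * x[2]
        x = [x[0] - c * d, x[1] + c * d, x[2] + c * d, x[3] - c * d]
        # Selection: haplotypes AB and aB carry the favourable allele B.
        w = [1 + s, 1.0, 1 + s, 1.0]
        mean_w = sum(f * wi for f, wi in zip(x, w))
        updated.append([f * wi / mean_w for f, wi in zip(x, w)])
    return updated


def simulate(n_demes=200, m=0.5, m_bar=1e-4, c=0.01, s=0.1,
             p0=0.01, u0=0.1, max_gen=20000, done=0.999):
    """Run the sweep until B exceeds `done` in every deme (or max_gen is reached).

    Returns the frequency of A per deme, the frequency of B per deme and
    the position of the wave centre at each generation.
    """
    barrier_link = n_demes // 2 - 1  # link between demes 100 and 101 when n_demes = 200
    demes = [[0.0, u0, 0.0, 1 - u0] for _ in range(n_demes)]
    # B is introduced in the leftmost deme on a chromosome carrying A.
    demes[0] = [p0, u0 * (1 - p0), 0.0, (1 - u0) * (1 - p0)]
    centres = []
    for _ in range(max_gen):
        demes = one_generation(demes, m, m_bar, barrier_link, c, s)
        freq_b = [x[0] + x[2] for x in demes]
        # Wave centre: deme in which the frequency of B is closest to 0.5.
        centres.append(min(range(n_demes), key=lambda i: abs(freq_b[i] - 0.5)))
        if min(freq_b) > done:
            break
    freq_a = [x[0] + x[1] for x in demes]
    freq_b = [x[0] + x[2] for x in demes]
    return freq_a, freq_b, centres
```

With these names, the step in neutral allele frequency across the barrier is the difference between the frequency of A in the last deme on the left of the barrier and in the first deme on its right, and the speed of the wave can be read from the change of the centre position between generations.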
Acknowledgements
The authors are very grateful to Christine Coustau and Thierry De Meeus for sharing their allozyme datasets, to Khalid Belkhir for the permutation procedure to compute confidence intervals of heterozygosity, to two anonymous referees for helpful comments on the manuscript and to the National Sequencing Centre (Genoscope at Evry) and the IFR119 "Montpellier Environnement Biodiversité" for providing access to their sequencing platforms. This work was funded by the French national programme GIS "Génomique Marine" (The Bivalvomix Project, coord. N. Bierne) and the Marine Genomics Europe NoE (Fish & Shellfish node). This is article ISEM 2008-035 of Institut des Sciences de l'Evolution de Montpellier.
References
- Kreitman M, Akashi H: Molecular evidence for natural-selection. Annu Rev Ecol Syst. 1995, 26: 403-422.
- Kreitman M: Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet. 2000, 1: 539-559.
- Maynard Smith J, Haigh J: Hitch-hiking effect of a favorable gene. Genet Res. 1974, 23: 23-35.
- Fay JC, Wu CI: Hitchhiking under positive Darwinian selection. Genetics. 2000, 155: 1405-1413.
- Hudson RR, Kaplan NL: The coalescent process in models with selection and recombination. Genetics. 1988, 120: 831-840.
- Barton NH: The effect of hitch-hiking on neutral genealogies. Genet Res. 1998, 72: 123-133.
- Nielsen R: Statistical tests of selective neutrality in the age of genomics. Heredity. 2001, 86: 641-647.
- Aguade M, Miyashita N, Langley CH: Polymorphism and divergence in the Mst26a male accessory-gland gene region in Drosophila. Genetics. 1992, 132: 755-770.
- Depaulis F, Brazier L, Veuille M: Selective sweep at the Drosophila melanogaster Suppressor of Hairless locus and its association with the In(2L)t inversion polymorphism. Genetics. 1999, 152: 1017-1024.
- Wang RL, Stec A, Hey J, Lukens L, Doebley J: The limits of selection during maize domestication. Nature. 1999, 398: 236-239.
- Quesada H, Ramirez UEM, Rozas J, Aguade M: Large-scale adaptive hitchhiking upon high recombination in Drosophila simulans. Genetics. 2003, 165: 895-900.
- Li HP, Stephan W: Inferring the demographic history and rate of adaptive substitution in Drosophila. PLoS Genet. 2006, 2: 1580-1589.
- Felsenstein J: Evolutionary advantage of recombination. Genetics. 1974, 78: 737-756.
- Gillespie JH: Genetic drift in an infinite population: The pseudohitchhiking model. Genetics. 2000, 155: 909-919.
- Hudson RR, Kreitman M, Aguade M: A test of neutral molecular evolution based on nucleotide data. Genetics. 1987, 116: 153-159.
- Galtier N, Depaulis F, Barton NH: Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics. 2000, 155: 981-987.
- Glinka S, Ometto L, Mousset S, Stephan W, De Lorenzo D: Demography and natural selection have shaped genetic variation in Drosophila melanogaster: A multi-locus approach. Genetics. 2003, 165: 1269-1278.
- Nielsen R, Bustamante C, Clark AG, Glanowski S, Sackton TB, Hubisz MJ, Fledel-Alon A, Tanenbaum DM, Civello D, White TJ, Sninsky J, Adams MD, Cargill M: A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 2005, 3: e170.
- Charlesworth B, Charlesworth D, Barton NH: The effects of genetic and geographic structure on neutral variation. Annu Rev Ecol Syst. 2003, 34: 99-125.
- Charlesworth B, Nordborg M, Charlesworth D: The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet Res. 1997, 70: 155-174.
- Slatkin M, Wiehe T: Genetic hitch-hiking in a subdivided population. Genet Res. 1998, 71: 155-160.
- Barton NH: Genetic hitchhiking. Philos Trans R Soc Lond B Biol Sci. 2000, 355: 1553-1562.
- Santiago E, Caballero A: Variation after a selective sweep in a subdivided population. Genetics. 2005, 169: 475-483.
- Lewontin RC, Krakauer J: Distribution of gene frequency as a test of theory of selective neutrality of polymorphisms. Genetics. 1973, 74: 175-195.
- Beaumont MA, Nichols RA: Evaluating loci for use in the genetic analysis of population structure. Proc Biol Sci. 1996, 263: 1619-1626.
- Vitalis R, Dawson K, Boursot P: Interpretation of variation across marker loci as evidence of selection. Genetics. 2001, 158: 1811-1823.
- Beaumont MA, Balding DJ: Identifying adaptive genetic divergence among populations from genome scans. Mol Ecol. 2004, 13: 969-980.
- Bierne N, Daguin C, Bonhomme F, David P, Borsa P: Direct selection on allozymes is not required to explain heterogeneity among marker loci across a Mytilus hybrid zone. Mol Ecol. 2003, 12: 2505-2510.
- Wiehe T, Schmid K, Stephan W: Selective sweeps in structured populations - Empirical evidence and theoretical studies. Selective Sweep. Edited by: Nurminsky D. 2005, New York, Kluwer Academic, 104-113. Molecular Biology Intelligence Unit, Landes R
- Pogson GH, Mesa KA, Boutilier RG: Genetic population-structure and gene flow in the atlantic cod Gadus morhua - a comparison of allozyme and nuclear RFLP loci. Genetics. 1995, 139: 375-385.
- Akey JM, Eberle MA, Rieder MJ, Carlson CS, Shriver MD, Nickerson DA, Kruglyak L: Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2004, 2: 1591-1599.
- Skibinski DOF, Beardmore JA, Cross TF: Aspects of the population-genetics of Mytilus (Mytilidae, Mollusca) in the British-Isles. Biol J Linn Soc Lond. 1983, 19: 137-183.
- Bierne N, Borsa P, Daguin C, Jollivet D, Viard F, Bonhomme F, David P: Introgression patterns in the mosaic hybrid zone between Mytilus edulis and M. galloprovincialis. Mol Ecol. 2003, 12: 447-462.
- Coustau C, Renaud F, Delay B: Genetic-characterization of the hybridization between Mytilus edulis and Mytilus galloprovincialis on the Atlantic coast of France. Mar Biol. 1991, 111: 87-93.
- Barton NH: Estimating multilocus linkage disequilibria. Heredity. 2000, 84: 373-389.
- Bierne N, Tanguy A, Faure M, Faure B, David E, Boutet I, Boon E, Quere N, Plouviez S, Kemppainen P, Jollivet D, Boudry P, David P: Mark-recapture cloning: a straightforward and cost-effective cloning method for population genetics of single copy nuclear DNA sequences in diploids. Mol Ecol Notes. 2007, 7: 562-566.
- Nei M: Molecular evolutionary genetics. 1987, New York, Columbia University Press
- Bierne N, David P, Boudry P, Bonhomme F: Assortative fertilization and selection at larval stage in the mussels Mytilus edulis and M. galloprovincialis. Evolution. 2002, 56: 292-298.
- Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989, 123: 585-595.
- Pennings PS, Hermisson J: Soft sweeps III: The signature of positive selection from recurrent mutation. PLoS Genet. 2006, 2: 1998-2012.
- Pialek J, Barton NH: The spread of an advantageous allele across a barrier: the effects of random drift and selection against heterozygotes. Genetics. 1997, 145: 493-504.
- Luikart G, England PR, Tallmon D, Jordan S, Taberlet P: The power and promise of population genomics: From genotyping to genome typing. Nat Rev Genet. 2003, 4: 981-994.
- Pollinger JP, Bustamante CD, Fledel-Alon A, Schmutz S, Gray MM, Wayne RK: Selective sweep mapping of genes with large phenotypic effects. Genome Res. 2005, 15: 1809-1819.
- Bonin A, Taberlet P, Miaud C, Pompanon F: Explorative genome scan to detect candidate loci for adaptation along a gradient of altitude in the common frog (Rana temporaria). Mol Biol Evol. 2006, 23: 773-783.
- Murray MC, Hare MP: A genomic scan for divergent selection in a secondary contact zone between Atlantic and Gulf of Mexico oysters, Crassostrea virginica. Mol Ecol. 2006, 15: 4229-4242.
- Turner TL, Hahn MW, Nuzhdin SV: Genomic islands of speciation in Anopheles gambiae. PLoS Biol. 2005, 3: 1572-1578.
- Kaplan NL, Hudson RR, Langley CH: The "hitchhiking effect" revisited. Genetics. 1989, 123: 887-899.
- Wiehe TH, Stephan W: Analysis of a genetic hitchhiking model, and its application to DNA polymorphism data from Drosophila melanogaster. Mol Biol Evol. 1993, 10: 842-854.
- Kim Y, Stephan W: Detecting a local signature of genetic hitchhiking along a recombining chromosome. Genetics. 2002, 160: 765-777.
- Lallias D, Lapegue S, Hecquet C, Boudry P, Beaumont AR: AFLP-based genetic linkage maps of the blue mussel (Mytilus edulis). Anim Genet. 2007, 38: 340-349.
- Saavedra C, Bachere E: Bivalve genomics. Aquaculture. 2006, 256: 1-14.
- Schlotterer C: Towards a molecular characterization of adaptation in local populations. Curr Opin Genet Dev. 2002, 12: 683-687.
- Hedgecock D: Does variance in reproductive success limit effective population sizes of marine organisms?. Genetics and Evolution of Aquatic Organisms. Edited by: Beaumont AR. 1994, London, Chapman & Hall, 122-134.
- Ward RD, Skibinski DOF, Woodwark M: Protein heterozygosity, protein-structure, and taxonomic differentiation. Evol Biol. 1992, 26: 73-159.
- Bazin E, Glemin S, Galtier N: Population size does not influence mitochondrial genetic diversity in animals. Science. 2006, 312: 570-572.
- Sauvage C, Bierne N, Lapegue S, Boudry P: Single nucleotide polymorphisms and their relationship to codon usage bias in the Pacific oyster Crassostrea gigas. Gene. 2007, 406: 13-22.
- Eyre-Walker A: Evolution - Size does not matter for mitochondrial DNA. Science. 2006, 312: 537-538.
- Barton NH: The dynamics of hybrid zone. Heredity. 1979, 43: 341-359.
- Daguin C, Bonhomme F, Borsa P: The zone of sympatry and hybridization of Mytilus edulis and M. galloprovincialis, as described by intron length polymorphism at locus mac-1. Heredity. 2001, 86: 342-354.
- Bierne N, David P, Langlade A, Bonhomme F: Can habitat specialisation maintain a mosaic hybrid zone in marine bivalves?. Mar Ecol Prog Ser. 2002, 245: 157-170.
- Nei M: Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics. 1978, 89: 583-590.
- Belkhir K, Borsa P, Goudet J, Chikhi L, Bonhomme F: Genetix. 1999, Montpellier, Université de Montpellier 2, version 4.0
- Raymond M, Rousset F: Genepop (version-1.2) - Population-genetics software for exact tests and ecumenicism. J Hered. 1995, 86: 248-249.
- Thompson JD, Higgins DG, Gibson TJ: Clustal-W - Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680.
- Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser. 1999, 42: 95-98.
- Kumar S, Tamura K, Nei M: MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform. 2004, 5: 150-163.
- Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R: DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 2003, 19: 2496-2497.
- Watterson GA: On the number of segregating sites in genetical models without recombination. Theor Popul Biol. 1975, 7: 256-276.
- Hudson RR, Kaplan NL: Statistical properties of the number of recombination events in the history of a sample of DNA-sequences. Genetics. 1985, 111: 147-164.
- Depaulis F, Mousset S, Veuille M: Haplotype tests using coalescent simulations conditional on the number of segregating sites. Mol Biol Evol. 2001, 18: 1136-1138.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.