The methodological framework presented here is one of the first attempts to outline a statistically sound approach for estimating the effect of GCC on intraspecific genetic diversity as a consequence of projected range losses. As shown above, sampling all haplotypes or alleles present at even a single sampling site can be quite demanding in terms of the number of individuals that need to be screened to ensure sampling completeness. Sampling at this depth and level of detail over entire species ranges greatly exceeds current standards for thorough phylogeographic studies, and is most probably not possible for non-model organisms with current methodologies. On the other hand, thoroughly sampled phylogeographic studies can be a good starting point for assessing the effect of GCC on population genetic diversity, as shown with the worked Trochulus example. Additional studies with other taxa will have to show whether the pattern of rather limited loss of genetic diversity projected here is typical.
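To give a feeling for why complete sampling at a single site is demanding, the following minimal sketch computes how many individuals must be screened to detect one haplotype of a given frequency at least once. It rests on a simple binomial argument (independent random draws from a large population) and is not necessarily the procedure used in the analyses above; the function name and the default detection probability are assumptions made for illustration.

```python
import math


def min_sample_size(min_freq, detection_prob=0.95):
    """Smallest number of individuals n such that a haplotype occurring at
    frequency `min_freq` is sampled at least once with probability
    >= `detection_prob`.

    Assumes independent random draws from a large population, so that the
    probability of missing the haplotype in n draws is (1 - min_freq) ** n.
    """
    if not 0.0 < min_freq < 1.0:
        raise ValueError("min_freq must lie strictly between 0 and 1")
    if not 0.0 < detection_prob < 1.0:
        raise ValueError("detection_prob must lie strictly between 0 and 1")
    # Solve (1 - min_freq) ** n <= 1 - detection_prob for n.
    return math.ceil(math.log(1.0 - detection_prob) / math.log(1.0 - min_freq))


# Illustrative frequencies only (hypothetical, not taken from the data sets).
for freq in (0.10, 0.05, 0.01):
    print(freq, min_sample_size(freq))
```

Note that this bound refers to a single haplotype. Detecting all haplotypes at a site simultaneously requires at least as many individuals as the rarest one does, which is why the required effort grows so quickly when rare variants are present.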
Previous studies with similar aims lacked an explicit assessment of whether the chosen level of genetic diversity was completely sampled. These studies have suggested a much more severe loss of intraspecific genetic diversity, even at the level of independent evolutionary lineages , more or less analogous to the higher level clades we defined here. In situations where projections were made for haplotypes (e.g. ) or genotypes [20, 21], it is likely that sampling saturation was reached neither for the individual sampling sites nor for the entire species range (see Figure 5 for comparison). Thus, any quantitative or qualitative calculation of potential losses may be meaningless, because the baseline against which the projected losses were compared comprised only the haplotypes sampled and not all haplotypes actually existing. The use of haplotypes for projecting GCC effects also pertains to the second issue mentioned above, the choice of the level of genetic diversity relevant for evolutionary or adaptive processes. The number of haplotypes or alleles that can be found in a study depends on the particular marker chosen as well as on the number of base pairs screened. Using a locus with a different apparent mutation rate, or even fragments of varying length of the same marker, can yield more or fewer haplotypes/alleles and thus affect quantitative results. On the other hand, the use of higher order clades or evolutionary lineages is more or less independent of the marker(s) used, as these can be expected to converge on higher evolutionary unit levels [56–58]. We argue that making GCC inferences at higher clade levels or at the level of evolutionarily independent lineages is reasonable, less likely to be flawed by insufficient sampling, and potentially more relevant from an evolutionary point of view. As an alternative to defining the biologically significant higher clade levels, the barcode gap has the potential to delimit units for such purposes but varies strongly among different taxa e.g. .
The GMYC species concept may also provide a tool that allows comparative studies , but here again the delimitation of units differs strongly among taxa e.g. . All of the above approaches, however, are limited to defining evolutionary units based on haplotypes of a single gene region. Hierarchical genotype clustering methods (e.g. [58–60]) can be employed to incorporate multi-locus sequence data on the one hand, and multi-locus genotypic data based on microsatellites, SNPs, AFLPs, or allozymes on the other. For now, quantitative comparisons of the projected loss of molecular biodiversity among different species will remain difficult, because a standard for comparing evolutionarily relevant units below taxonomic rank is lacking.
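The dependence of haplotype counts on the number of base pairs screened, noted above, can be illustrated with a toy example. The sketch below collapses aligned sequences into haplotypes after truncating them to different fragment lengths. The sequences are invented for illustration and the function name is hypothetical; real data would additionally require handling of gaps and ambiguous bases.

```python
def count_haplotypes(sequences, length=None):
    """Number of distinct haplotypes among aligned sequences when only the
    first `length` positions are screened (all positions if `length` is None).
    """
    return len({seq[:length] for seq in sequences})


# Hypothetical aligned sequences from four individuals.
toy_alignment = [
    "ACGTACGTAA",
    "ACGTACGTAC",  # differs from the first sequence at the last position
    "ACGTTCGTAA",  # differs from the first sequence at position 5
    "ACGTACGTAA",  # identical to the first sequence
]

for fragment_length in (4, 5, None):
    print(fragment_length, count_haplotypes(toy_alignment, fragment_length))
```

In this toy alignment the same four individuals yield one, two, or three haplotypes depending only on fragment length, whereas a grouping at a deeper clade level would not be expected to change in the same way.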
Separate evolutionary entities may respond differently to changing environmental conditions, and it could therefore be useful to model their GCC responses separately [61, 62]. However, for one technical and one conceptual reason, we decided not to do so in the present case. The technical reason is that reliable SDM may not be possible for each separate evolutionary unit at higher resolution clade levels, because these units may not have been sampled at enough sites (~30 sites per evolutionary unit ). This is the case, for example, in both of the worked examples, including the high resolution sampling performed on T. villosus.
The second, conceptual reason is that modelling the lineages separately requires the assumption that the inferred lineages indeed react to GCC as a unit, and not only the genes responsible for their presumed local adaptation (see respective discussion above). The niche estimate for the species as a whole, however, is always a modelling-technique-dependent, more or less additive composite of the spatial distribution of all underlying entities, encompassing the respective niches of these subunits . Using such a composite estimate would be problematic if the goal were to project the direction and extent of range shifts for each of the subunits separately (which is not our focus). Owing to the additive nature of the niche estimates, however, predicting which of the current populations across the entire species range will be lost is unproblematic. In other studies that apply units explicitly defined by a strong degree of isolation, e.g. GMYC species, modelling each unit separately would be appropriate and advisable.
It is recommended that the spatial scale of SDM be consistent with the information content of the data . This also applies in the present case. Scaling the grid size according to the number of effectively sampled grid cells is therefore useful, even if only for assessing the adequacy of spatial sampling.
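How the number of effectively sampled grid cells changes with grid size can be sketched as follows. This is only an illustration under simplifying assumptions (an unprojected longitude/latitude grid with square cells defined in arc minutes, and hypothetical site coordinates); it is not the scaling rule used for the worked examples.

```python
import math


def occupied_cells(sites, cell_arcmin):
    """Set of grid cells containing at least one sampling site.

    `sites` is an iterable of (longitude, latitude) pairs in decimal degrees;
    `cell_arcmin` is the cell edge length in arc minutes. Cells are identified
    by integer (column, row) indices on an unprojected lon/lat grid.
    """
    size_deg = cell_arcmin / 60.0
    return {
        (math.floor(lon / size_deg), math.floor(lat / size_deg))
        for lon, lat in sites
    }


# Hypothetical sampling sites (decimal degrees), two of them close together.
toy_sites = [(8.01, 47.01), (8.02, 47.02), (8.6, 47.4), (9.3, 46.2)]

for arcmin in (1, 10, 60):
    n_cells = len(occupied_cells(toy_sites, arcmin))
    print(arcmin, n_cells, n_cells / len(toy_sites))
```

The ratio of occupied cells to sampling sites falls as cells become coarser, because neighbouring sites merge into one cell. Such a count gives a quick impression of how much independent spatial information a data set provides at a given grain.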
It is well known that the quality of the projection depends strongly on the quality of the SDM, including known issues such as microrefugia below SDM resolution, where genetic variation is at least partially preserved and local adaptation to the new conditions is possible. These problems and pitfalls also pertain to the forecasting of intraspecific genetic diversity, but we will not discuss them here, because they have been discussed at length elsewhere .
Predicting the fate of intraspecific genetic variation raises additional issues that should be kept in mind when interpreting the results. For example, GCC may alter dispersal behaviour and in some cases provoke increased dispersal , which may result in a shift of intraspecific variation relative to present-day distributions without a change in the actual distribution range . Individuals from clades that were flagged as threatened may thus disperse to suitable areas if climatic pressure rises, preserving the genetic variation they carry. Conversely, generally increased dispersal may lead to the swamping of locally adapted populations by immigrants from more abundant populations . As a variation of this scenario, the shift of selective regimes associated with a changing climate may favour selection-driven gene flow of the respective functional alleles, despite the climate-driven loss of their populations of origin and of the neutral variation associated with those lost populations. The projections gained from the proposed approach therefore represent severe or even worst-case scenarios, which is not necessarily a drawback in nature conservation and follows the precautionary principle .
To demonstrate the performance of the proposed approach, we deliberately chose data sets that differ considerably in the number of sampling sites, the sampling depth at the sampling sites and the spatial distribution of sampling sites (Figures 2 and 3). In both worked examples, no complete loss of mitochondrial haplotype lineages at the chosen level was predicted under the chosen GCC scenario, despite significant projected losses of suitable species range. However, for S. mucronata it was only possible to make inferences at the highest clade level, while in T. villosus the second most coarse clade level could be used (Figure 5). This is in part due to the higher resolution in terms of the number of sampling sites and individuals, which likely led per se to a higher number of haplotypes compared to the more shallowly sampled S. mucronata data set. In part, however, the different resolution might also be explained by differences in biology: the low dispersal capacity of land snails usually leads to stronger population differentiation than can generally be expected from flying insects. However, studies of winged aquatic insects have shown species-specific patterns of genetic population structure and population differentiation, which suggest that dispersal capacity varies dramatically even among ecologically similar or closely related species [28, 69, 70]; this is also true for land snails e.g. . In T. villosus, the chosen 3-step clade level is already below the divergence level marking the two major glacial refugia and thus some potentially biologically relevant units (but see ). Whether the 3-step clades represent a Pleistocene substructure, and to what extent this still has biological significance, is not known. Whether the inferred 3-step clades in S. mucronata also mark a phylogeographic structure is unknown, because the data set was deemed unsuitable for such an analysis.
However, the large geographic overlap and the co-occurrence of only slightly disjunct clades at the same sites argue against a deeper biological significance of the inference clade level. In other species of Smicridea, population structure and population differentiation are more pronounced, even at smaller geographic scales [42, 70]. In Trochulus, highly divergent and reproductively isolated lineages may be restricted to single valleys [36, 72, 73]. In these species, losses of regional haplotypes or clades may thus have more direct biological significance.
The higher spatial coverage and the sampling depth of the T. villosus data set allowed for a higher prediction accuracy, because it was possible to find rarer occurrences. For example, the inference that none of the 3-step clades are threatened by extinction is based on two occurrences at a single sampling site in this species. The number of sites sampled and their spatial distribution also determined the spatial resolution for SDM. The projection grain in T. villosus was both absolutely (10.9 vs. 39.8 arc minutes) and relatively (1% vs. 7% of respective species range size) finer than in S. mucronata. The more thoroughly sampled T. villosus example thus gives much more confidence in the validity of the achieved results.
However, despite the substantial quantitative differences between the data sets, the presented projections are conservative in the sense that the approach inherently tends to underestimate rather than overestimate the spatial distribution of the respective clades. Intraspecific genetic diversity is thus probably even less threatened than suggested here. These inferences may well differ under different climate scenarios, but the aim here was to illustrate the approach rather than to analyse the data exhaustively.
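The core bookkeeping behind such inferences can be sketched in a few lines: a clade is flagged as projected lost only if none of the sites at which it was recorded remains suitable under the chosen scenario. The sketch below is a simplified illustration, not the analysis pipeline used here; the clade labels, site identifiers and the function name are hypothetical, and the set of suitable sites is assumed to come from an SDM projection computed elsewhere.

```python
def projected_clade_losses(clade_sites, suitable_sites):
    """Summarise, for each clade, how many of its recorded sites are projected
    to remain suitable.

    `clade_sites` maps a clade label to the set of site identifiers at which
    the clade was sampled; `suitable_sites` is the set of site identifiers
    projected to remain climatically suitable. A clade counts as lost only if
    no recorded site remains.
    """
    summary = {}
    for clade, sites in clade_sites.items():
        remaining = sites & suitable_sites
        summary[clade] = {
            "total": len(sites),
            "remaining": len(remaining),
            "lost": len(remaining) == 0,
        }
    return summary


# Hypothetical example data (not taken from either worked example).
toy_clades = {
    "clade_A": {"s1", "s2", "s3"},
    "clade_B": {"s4"},
    "clade_C": {"s2", "s5"},
}
toy_suitable = {"s1", "s2", "s5"}

for clade, info in projected_clade_losses(toy_clades, toy_suitable).items():
    print(clade, info)
```

Because a clade is only counted where it was actually sampled, unsampled occurrences can only reduce the projected losses, which is the sense in which the projections are conservative. The sketch also makes clear why a verdict may hinge on very few records: a single remaining site is enough to keep a clade from being flagged as lost.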