Tables and Figures Multilocus species trees and species delimitation in a temporal context : application to the water shrews of the genus Neomys

Bayesian phylogenetic analyses of cytochrome b Using BEAST version 1.8, the Markov chain was run for 50 million generations and 10% of the generations were discarded as burn-in. The program Tracer of the BEAST package was used to check that the effective sample sizes of all the parameters of interest were above 200 and convergence had been reached. TreeAnnotator of the same package was used to obtain the maximum clade credibility tree and the corresponding posterior probabilities of each clade.

distance between two individuals. A similar tree was obtained from distances calculated with formula 8.1 in Freedman et al. [1], where a more conservative estimate of the differences at each position is computed (not shown).

Structurama MCMC chain parameters
In each run, the Markov chain was run for 10 million generations, sampling every 100th cycle and with the initial 10,000 samples discarded as burn-in. Population assignment of each specimen was based on the mean partition or partition that minimizes the squared distance to all of the sampled partitions.

BEAST priors and MCMC chain parameters
Several mammalian fossil dates were used as hard bound minimum and soft bound maximum constraints in key nodes in order to calibrate the phylogenetic tree (Table S6). Specifically, we set lognormal prior distributions as follows: the offset was defined by the hard minimum, the mean in real space was adjusted so that the upper 95th percentile of the probability density distribution was coincident with the soft maximum, and the standard deviation parameter was set to 1 (Table S6). All calibrated nodes were older than 10 Myr, and therefore well above the time at which the difference between estimated gene tree and species tree divergences is minimal for nuclear genes [2]. A Yule speciation model was used as tree prior. 75 million generations were run and 10% of the generations were discarded as burn-in. The program Tracer of the BEAST package was used to check that the effective sample sizes of all the parameters of interest were above 200 and convergence had been reached.
For the soricid mitochondrial DNA analysis, the tree was calibrated using a set of fossil constraints available for soricids (Table S7), setting lognormal prior distributions as before.
All calibrated nodes were older than 3 Myr, and therefore well above the time at which the difference between estimated gene tree and species tree divergences is minimal for mitochondrial genes; this time is smaller than for nuclear genes due the reduced population size of mitochondrial genes [2]. To improve convergence, the priors of the substitution rate parameters of the GTR model and relative rate parameters of the codon positions were changed to uniform distribution between 0 and 100. 50 million generations were run, 10% of the generations were discarded as burn-in, and convergence was checked with Tracer.

*BEAST priors and MCMC chain parameters
For each partition, HKY was selected as the substitution model. This model was used to match the model available in the program IMa2, which was used in a subsequent step to estimate additional parameters. However, it was checked that the use of more complex substitution models did not affect the results (not shown). The corresponding ploidy type of each marker (nuclear or mitochondrial) was set. In addition, a strict molecular clock was used for all partitions, a Yule process was set as species tree prior and the population size model was set as piecewise constant. All analyses were run for 50 million generations, 10% samples were discarded as burn-in and convergence was checked as before. The maximum clade credibility tree was constructed using median node heights.

IMa2 priors and MCMC chain parameters
The HKY model was used as substitution model. Maximum split time priors were set to 8, population size priors to 15 and migration rate priors to 2. Similar results were obtained when setting exponential migration rate priors with mean = 1 (not shown). When using cytochrome b, the heredity scalar for this locus was set as 0.25. Heating parameters were set as: hfg, hn15, ha0.96 and hb0.9. The final analyses consisted of a total of 50,000 sampled genealogies after 100,000 burn-in steps. As summary statistics of the posterior distributions, the bin with the highest value (after smoothing when this value was available in the IM output) and 95% confidence intervals were taken.
In IMa2, absolute mutation rates are not sampled in the MCMC chain. Rather, mutation rate scalars (the relative values of mutation rates) are estimated. The geometric mean of the externally estimated mutation rate of all loci is then used to scale demographic parameters, including divergence time. Therefore, unlike in *BEAST, it is not possible to introduce the variability of the rates that had been previously calculated in the mammalian multilocus analysis. However, it is possible to set ranges on mutation rates. The ratios of these limits are used as limits on the ratios of the mutation rate scalars. In order to test the effect of these limits, we used as mutation rate ranges the 95% confidence intervals of the mutation rates estimated in the previous mammalian multilocus analysis. In the introns-only analysis, the means and 95% confidence limits of the divergence times estimated by IMa2 were very similar than in the main analysis. When cytochrome b was included, the results with mutation rate ranges were more altered. However, these estimations were very similar again when the upper range of the cytochrome b was increased (10 times the estimated upper limit) to account for the possibility that the mutation rate previously calculated was saturated, similarly as we did in *BEAST (not shown).

BPP priors, MCMC chain parameters and additional tests
Mutation rates were set to be variable among loci and relative rates were generated from a Dirichlet distribution. When using cytochrome b, the heredity scalar for this locus was set as 0.25. As priors for θ (population size parameter) and τ (age of the root) we initially used values estimated from IMa2, after scaling them to reflect mutations per site. Two different Gamma distributions were constructed for each of these parameters with α = 2 (fairly diffused) and α = 20 (more informative), respectively. The β of the Gamma distribution for θ and τ was obtained by dividing the α value by the corresponding mean of the parameter.
Other divergence time parameters were assigned a Dirichlet prior. Additionally, we ran BPP with θ and τ priors that were respectively one order of magnitude lower and higher than the initial ones. Each prior set was analyzed using the two described reversible-jump Markov Chain Monte Carlo algorithms (rjMCMC) using default options. Each analysis consisted of 20,000 samples taken after 2000 burn-in steps.      Figure S4. Bayesian relaxed clock tree reconstructed with cytochrome b sequences of soricids. Calibration nodes are shown with a white circle and the corresponding constraints are given in Table  S7. The Neomys fodiens branch from which the mutations rate was estimated is shown with a thicker line.