bModelTest: Bayesian phylogenetic site model averaging and model comparison
 Remco R. Bouckaert^{1, 2, 3}Email authorView ORCID ID profile and
 Alexei J. Drummond^{1, 2}
DOI: 10.1186/s1286201708906
© The Author(s) 2017
Received: 19 June 2016
Accepted: 19 January 2017
Published: 6 February 2017
Abstract
Background
Reconstructing phylogenies through Bayesian methods has many benefits, which include providing a mathematically sound framework, providing realistic estimates of uncertainty and being able to incorporate different sources of information based on formal principles. Bayesian phylogenetic analyses are popular for interpreting nucleotide sequence data, however for such studies one needs to specify a site model and associated substitution model. Often, the parameters of the site model is of no interest and an adhoc or additional likelihood based analysis is used to select a single site model.
Results
bModelTest allows for a Bayesian approach to inferring and marginalizing site models in a phylogenetic analysis. It is based on transdimensional Markov chain Monte Carlo (MCMC) proposals that allow switching between substitution models as well as estimating the posterior probability for gammadistributed rate heterogeneity, a proportion of invariable sites and unequal base frequencies. The model can be used with the full set of timereversible models on nucleotides, but we also introduce and demonstrate the use of two subsets of timereversible substitution models.
Conclusion
With the new method the site model can be inferred (and marginalized) during the MCMC analysis and does not need to be predetermined, as is now often the case in practice, by likelihoodbased methods. The method is implemented in the bModelTest package of the popular BEAST 2 software, which is open source, licensed under the GNU Lesser General Public License and allows joint site model and tree inference under a wide range of models.
Keywords
Model averaging Model selection Model comparison Statistical phylogenetics ModelTest Phylogenetic model averaging Phylogenetic model comparison Substitution model Site modelBackground
One of the choices that needs to be made when performing a Bayesian phylogenetic analysis is which site model to use. A common approach is to use a likelihoodbased method like ModelTest [1], jModelTest [2], or jModelTest2 [3] to determine the site model. The site model is comprised of (i) a substitution model defining the relative rates of different classes of substitutions and (ii) a model of rate heterogeneity across sites which may include a gamma distribution [4] and/or a proportion of invariable sites [5, 6]. The site model recommended by such likelihoodbased method is then often used in a subsequent Bayesian phylogenetic analysis. This analysis framework introduces a certain circularity, as the original model selection step requires a phylogeny, which is usually estimated by a simplistic approach. Also, by forcing the subsequent Bayesian phylogenetic analysis to condition on the selected site model, the uncertainty in the site model can’t be incorporated into the uncertainty in the phylogenetic posterior distribution. A more statistically rigorous and elegant method is to coestimate the site model and the phylogeny in a single Bayesian analysis, thus alleviating these issues.
Coestimation of the substitution model for a nucleotide alignment can be achieved by sampling all possible reversible models [7], or just a nested set of models [8], using either reversible jump MCMC or stochastic Bayesian variable selection [9]. The CATGTR model [10, 11] solves a related problem by providing a mixture model over sites that often fits better than using any single model for all sites. Wu et al. [12] use reversible jump for both substitution models and partitions and furthermore sample the use of gamma rate heterogeneity for each site category. However, since the method divides sites among a set of substitution models, it does not address invariable sites, and only considers a limited set of five (K80, F81, HKY85, TN93, and GTR) substitution models.
Here we introduce a method which combines model averaging over substitution models with model averaging of the parameters governing rate heterogeneity across sites using reversible jump. Whether one considers the method to be selecting the site model, or averaging over (marginalizing over) site models depends on which random variables are viewed as parameters of interest and which are viewed as nuisance parameters. If the phylogeny is viewed as the parameter of interest, then bModelTest provides estimates of the phylogeny averaged over site models. Alternatively if the site model is of interest, then bModelTest can be used to select the site model averaged over phylogenies. These are matters of postprocessing of the MCMC output, and it is also possible to consider the interaction of phylogeny and site models. For example one could construct phylogeny estimates conditional on different features of the site model from the results of a single MCMC analysis.
The method is implemented in the bModelTest package of BEAST 2 [13] with GUI support for BEAUti making it easy to use. It is open source and available under LGPL licence. Source code, installation instructions and documentation can be found at https://github.com/BEAST2Dev/bModelTest.
Implementation

The model number M,

A variable size rate parameter (depending on model number) R,

A binary variable to indicate whether 1 or k>1 nonzero rate categories should be used,

A shape parameter α, used for gamma rate heterogeneity when there are k>1 rate categories,

A binary variable to indicate whether or not a category for invariable sites should be used,

The proportion of invariable sites p _{ inv },
Rates r _{ ac }, r _{ ag }, r _{ at }, r _{ cg }, r _{ ct } and r _{ gt } are determined from the model number M and rate parameter R. Further, we restrict R such that the sum of the six rates \(\sum r_{..}\) equals 6 in order to ensure identifiability. This is implemented by starting each rate with value 1, and ensuring proposals keep the sum of rates in (see details on proposals below).
Prior
By default, bModelTest uses the flat Dirichlet prior on rates from [7]. From empirical studies [16, 17], we know that transition rates tend to be higher than transversion rates. It makes sense to encode this information in our prior and bModelTest allows for rates to get a different prior on transition rates (default log normal with mean 1 and standard deviation of 1.25 for the log rates) and transversion rates (default exponential with mean 1 for the rates).
An obvious choice for the prior on models is to use a uniform prior over all valid models. As Fig. 1 shows, there are many more models with 3 parameters than with 1. An alternative allowed in bModelTest is to use a uniform prior on the number of parameters in the model. In that case, Jukes Cantor and GTR get a prior probability of 1/6, since these are the only models with 0 and 5 degrees of freedom respectively. Depending on the model set, a much lower probability is assigned to each of the individual models such that the total prior probability summed over models with K parameters, p(K)=1/6 for K∈{0,1,2,3,4,5}.
For frequencies a Dirichlet(4,4,4,4) prior is used, reflecting our believe that frequencies over nucleotides tend to be fairly evenly distributed, but allowing a 2.2% chance for a frequency to be under 0.05. For p _{ inv } a Beta(4,1) prior on the interval (0,1) is used giving a mean of 0.2 and for α an exponential with a mean 1. These priors only affect the posterior when the respective binary indicator is 1.
MCMC proposals
Model merge/split proposal
For splitting (or merging) substitution models, suppose we start with a model M. To determine the proposed model M ^{′}, we randomly select one of the child (or parent) nodes in the graph (as shown in Fig. 1). This is in contrast to the approach of Huelsenbeck et al. [7], in which first a group is randomly selected, then a subgrouping is randomly created. For any set of substitution models organised in an adjacency graph our merge/split operator applies, making our graphbased method easier to generalise to other model sets (e.g. the one used in [24]). If there are no candidates to split (that is, model M=123,456 is GTR) the proposal returns the current state (this proposal is important to guarantee uniform sampling of models). Likewise, when attempting to merge model M=111,111, the current state is proposed (M ^{′}=111,111). Let r be the rate of the group to be split. We have to generate two rates r _{ i } and r _{ j } for the split into groups of size n _{ i } and n _{ j }. To ensure rates sum to 6, we select u uniformly from the interval (−n _{ i } r,n _{ j } r) and set r _{ i }=r+u/n _{ i } and r _{ j }=r−u/n _{ j }.
For a merge proposal, the rate of the merged group r from two split groups i and j with sizes n _{ i } and n _{ j }, as well as rates r _{ i } and r _{ j } is calculated as \(r=\frac {n_{i}r_{i}+n_{j}r_{j}}{n_{i}+n_{j}}\).
The Jacobian for splitting is \(\frac {n_{i}+n_{j}}{n_{i}n_{j}}\) and for merging it is \(\frac {n_{i}n_{j}}{n_{i}+n_{j}}\).
Rate exchange proposal
The rate exchange proposal randomly selects two groups, and exchanges a random amount such that the condition that all six rates sum to 6 is met. A random number is selected from the interval [ 0,δ] where δ is a tuning parameter of the proposal (δ is automatically optimized to achieve the desired acceptance probability for the data during the MCMC chain). Let n _{ i }, r _{ i }, n _{ j } and r _{ j } as before, then the new rates are r i′=r _{ i }−u and \(r_{j}'= r_{j} + u\frac {n_{i}}{n_{j}}\). The proposal fails when r i′<0.
The proposal ratio as well as the Jacobian are 1.
Birth/death proposal
Birth and death proposals set or unset the category count flag and sample a new value for α from the prior when the flag is set. The proposal ratio is d(α ^{′}) for birth and 1/d(α) for death where d(.) is the density used to sample from (by default an exponential density with a mean of 1).
Likewise for setting the indicator flag to include a proportion of invariable sites and sampling p _{ inv } from the prior. The Jacobian is 1 for all these proposals.
Scale proposal
For the α, we use the standard scale operator in BEAST 2 [13], adapted so it only samples if the category count flag is set for α. Likewise, for pInv this scale operator is used, but only if the indicator flag to include a proportion of invariable sites is set.
Results and discussion
Since implementation of the split/merge and rate exchange proposals is not straightforward, nor is derivation of the proposal ratio and Jacobian, unit tests were written to guarantee their correctness and lack of bias in proposals (available on https://github.com/BEAST2Dev/bModelTest).
To validate the method we performed a simulation study by drawing site models from the prior, then used these models to generate sequence data of 10K sites length on a tree (in Newick (A:0.2,(B:0.15,C:0.15):0.05)) with three taxa under a strict clock. The data was analysed using a Yule tree prior, a strict clock and bModelTest as site model with uniform prior over models and exponential with mean one for transversions and lognormal with mean one and variance 1.25 for transition rates. A hundred alignments were generated with gamma rate heterogeneity and a hundred without rate heterogeneity using a (Bouckaert, RR: BEASTShell – scripting for bayesian hierarchical clustering, submitted) script. Invariant sites can be generated in the process and are left in the alignment.
Coverage summary for simulation study
Site  Rate coverage  Mean  Subst. Model  
Freqs  model  AC  AG  AT  CG  CT  GT  rate  coverage 
Equal  plain  93  97  94  96  95  95  95  98 
Equal  +G  91  95  93  93  95  93  93.3  97 
Equal  +I  92  94  94  95  93  94  93.6  96 
Equal  +G+I  89  96  95  94  95  95  94  98 
Unequal  plain  96  95  96  97  93  96  95.5  96 
Unequal  +G  95  94  94  94  96  96  94.8  98 
Unequal  +I  89  94  95  95  93  95  93.5  93 
Unequal  +G+I  97  94  94  93  93  96  94.5  97 
Mean  94.25  94.25  94.75  94.75  93.75  95.75  94.6  96  
Site  Site model  Frequency  Frequency coverage  
Freqs  model  coverage  α  p _{ inv }  coverage  A  C  G  T 
Equal  plain  100  100  100  100  100  100  
Equal  +G  96  94  100  100  100  100  100  
Equal  +I  98  95  100  100  100  100  100  
Equal  +G+I  99  89  88  100  100  100  100  100 
Unequal  plain  100  100  92  95  97  96  
Unequal  +G  97  94  100  97  92  92  98  
Unequal  +I  98  92  100  95  94  94  89  
Unequal  +G+I  100  93  91  100  99  96  96  98 
Mean  98.75  93.50  91.50  100.00  97.38  97.88  97.13  97.38 
Coverage of rate estimates and frequencies are as expected, as shown in the table. Substitution model coverage is measured by first creating the 95% credible set of models for each simulation and then counting how often the model used to generate the data was part of the 95% credible set. The 95% credible set is the smallest set of models having total posterior probability ≥0.95. As Table 1 shows, model coverage is as expected (Subst. Model coverage column). The situation with gamma shape parameter estimates and proportion of invariable sites is not as straightforward as for the relative rates of the substitution process. The site model coverage can be measured in a similar fashion: the site model coverage column shows how often the 95% credible sets for the four different site models (plain, +G. +I and +G+I) contains the true model used to generate the data. The coverage is as expected. When looking at how well the shape parameter (α column in Table 1) and the proportion invariable sites (p _{ inv } column in the table) is estimated, we calculated the 95% HPD intervals for that part of the trace where the true site model was sampled. Coverage is as expected when only gamma rate heterogeneity is used, or when only a proportion of invariable sites is used, but when both are used an interaction between the two site model categories appears to slightly reduce the coverage of both parameters. In these experiments the coverage for the frequency estimates for the individual nucleotides was as expected. In summary, the statistical performance of the model is as expected for almost all parameters except for the case where gamma and a proportion of invariable sites are used due to their interaction as discussed further below.
To investigate robustness of the approach, we repeated the study with a log normal uncorrelated relaxed clock [26] with a gamma (α=30,β=0.005) prior over the standard deviation for the log normal distribution. Trees with 5 taxa were randomly sampled from a Yule prior with log normal distribution (the birth rate was drawn from a distribution with a mean of the rate of 5.5, and a standarddeviation of the lograte of 0.048) giving trees with mean height ≈0.25 and 95% HPD interval of 0.015 to 0.7. The study as outlined above was repeated, and results are summarised in Additional file 1: Table S1, which looks very similar to that of Table 1. So, we conclude that the model is not sensitive to small variation in molecular clock rates among branches.
The same study with 5 taxa was repeated with the substitution model fixed to HKY and GTR, but estimating the other parts of the model. Results are summarised in Additional file 1: Tables S2 and S3 respectively. Fixing the model to HKY results in severe degradation of accuracy in all parameter and model estimates. The lack of coverage of frequency estimates when the true model has equal frequencies suggests that lack of degrees of freedom in substitution model parameters is compensated by estimating frequencies instead of keeping them equal. So substitution model misspecification can result in considerable misspecification of the remainder of the model. Results when fixing the substitution model to GTR shows a table with results very similar to that of bModelTest, however the substitution model parameters have on average a 95% HPD interval of size 0.17 while that of bModelTest is only 0.13. The extra parameters that need to be estimated for GTR compared to bModelTest result in more uncertain estimates, and thus more uncertainty in the analysis.
To see the impact of the model set, the experiment was repeated with sampling from all 203 reversible models instead of using only the 30 transition/transversion split models. Results are shown in Additional file 1: Table S4, which do not differ substantially from Table 1. Further, to investigate the effect of the number of taxa and sequence length, the study was repeated with 16 taxa and sequence lengths 1K and 0.5K base pairs long under a relaxed clock as before. Results are summarised in Additional file 1: Tables S5 and S6 respectively. The tables do not show significant differences to Table 1 or degradation with decreasing sequence length, so the ability of our Bayesian method to correctly estimate the posterior distribution of substitution models and their parameters does not appear to depend substantially on sequence length or number of taxa.
Comparison with jModelTest
We ran jModelTest version 2.1.10 [3] on the sequence data used for the last simulation study with 5 taxa (using all reversible models, since only that set is the same for both jModelTest and bModelTest) and the two simulation studies using 16 taxa and compared the substitution model coverage (with settings BIC AIC f g 4 i s 203). For each dataset, we collected the top models according to the AIC and BIC criteria such that the cumulative weight exceeded 95% of the models as shown in the jModelTest output and registered whether the true model was contained in the resulting set. Results are summarised in Additional file 1: Table S7, which shows that both AIC and BIC do not cover the true model 95% of the time as would be desirable. For some combinations the coverage is close to the desirable value (89.4% for AIC with 5 taxa) and for some it is much lower (61.1% for BIC with 0.5K length sequences and 16 taxa). Coverage of both AIC and BIC appears to decrease with increasing number of taxa and decreasing sequence length, although we have not attempted a systematic study. In contrast bModelTest has a coverage of ∼95% for all scenarios. jModelTest uses a single maximum likelihood tree and it seems that increasing uncertainty in the true tree (by increasing the number of taxa or decreasing sequence length) results in an increasing chance of incorrect model weights from jModelTest. For BIC, we find substantially less coverage of jModelTest than the around 90% model coverage reported in a previous study [3]. This is probably because our data contains a larger amount of uncertainty due to shorter sequences and tree lengths. Another factor is that we use different priors. For example, we use a Beta(1,4) for the proportion of invariable sites, while the previous study [3] used a Beta(1,3) that was then truncated to the interval [0.2,0.8], thereby avoiding extreme values which might cause difficulties. To confirm this we produced simulated data more closely matched to previously published experiments (with 40 taxa, sequences of 2500 base pairs, models selected uniformly from the 11 named models, tree length with mean of 6.5, truncated prior for invariable sites, BIC criterion) and obtained a coverage of 93.8% for the 95% credible set and 89.5% coverage by the best fitting model, similar to the results in [3].
In practice, users of Bayesian phylogenetic packages only use the most highly weighted model returned by jModelTest. Additional file 1: Table S7 shows how often the best fitting model according to AIC and BIC matches the true model, which ranges from 73.9% for BIC on 5 taxa to 30.8% for AIC on 0.5K length sequences and 16 taxa, suggesting that the probability of model misspecification using this approach increases with phylogenetic uncertainty.
Implementation details
The calculation of the tree likelihood typically consumes the bulk (≫90%) of computational time. Note that for a category with invariable sites, the rate is zero, hence only sites that are invariant (allowing for missing data) contribute to the tree likelihood. The contribution is 1 for those sites for any tree and for any parameter setting, so by counting the number of invariant sites, the tree likelihood can be calculated in constant time. Switching between with and without gamma rate heterogeneity means switching between one and k rate categories, which requires k time as much calculation. Having two tree likelihood objects, one for each of these two scenarios, and a switch object that selects the one required allows use of the BEAST 2 updating mechanism [9] so that only the tree likelihood that needs updating is performing calculations. So, jModelTest and bModelTest can, but do not necessarily agree on the most appropriate model to use.
Conclusions
bModelTest is a BEAST 2 package which can be used in any analysis where trees are estimated based on nucleotide sequences, such as multispecies coalescent analysis [29, 30], various forms of phylogeographical analyses, sampled ancestor analysis [31], demographic reconstruction using coalescent [32], birth death skyline analysis [33], et cetera. The GUI support provided through BEAUti makes it easy to set up an analysis with the bModelTest site model: just select bModelTest instead of the default gamma site model from the combo box in the site model panel.
A promising direction for further research would be to incorporate efficient averaging over partitioning of the alignment [10–12] to the site model averaging approach described here.
bModelTest allows estimation of the site model using a full Bayesian approach, without the need to rely on nonBayesian tools for selecting the site model.
Availability and requirements
Project name: bModelTestProject home page: https://github.com/BEAST2Dev/bModelTest/Operating systems: Windos, OSX, Linux and any other OSProgramming language: JavaOther requirements: requires BEAST 2 (from http://beast2) Licence: LGPL.
Endnotes
^{1} Estimated shape parameters only take values of the shape parameter in account in the portion of the posterior sample where gamma rate heterogeneity indicator is 1.
^{2} The estimated proportion of invariable sites only take values of the parameter in account in the posterior sample where the invariant category was present.
Abbreviations
 BEAST:

Bayesian evolutionary analysis by sampling trees
 GTR:

general time reversible
 GUI:

graphical user interface
 HCV:

hepatitis C virus
 HPD:

highest probability density
 LGPL:

Lesser general public license
 MCMC:

Markov chain Monte Carlo
Declarations
Acknowledgements
Not applicable.
Funding
This work was funded by a Rutherford fellowship (http://www.royalsociety.org.nz/programmes/funds/rutherforddiscovery/) from the Royal Society of New Zealand awarded to Prof. Alexei Drummond.
Availability of data and material
See supplementary information for material for the simulation study as well as the primates and HCV analyses. The method is implemented in the bModelTest package of BEAST 2 available from http://beast2.orgunder LGPL license. Source code, installation instructions and documentation can be found at https://github.com/BEAST2Dev/bModelTest.
Authors’ contributions
RRB designed and developed software. AJD and RRB designed experiments. RRB executed experiments. AJD and RRB analyzed the data. AJD and RRB prepared the figures. AJD and RRB wrote the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Authors’ Affiliations
References
 Posada D, Crandall KA. Modeltest: testing the model of dna substitution. Bioinformatics. 1998; 14(9):817–8.View ArticlePubMedGoogle Scholar
 Posada D. jModelTest: phylogenetic model averaging. Mol Biol Evol. 2008; 25(7):1253–56.View ArticlePubMedGoogle Scholar
 Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012; 9(8):772–2.View ArticlePubMedPubMed CentralGoogle Scholar
 Yang Z. Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol. 1994; 39(3):306–14. doi:10.1007/BF00160154.View ArticlePubMedGoogle Scholar
 Gu X, Fu YX, Li WH. Maximum likelihood estimation of the heterogeneity of substitution rate among nucleotide sites. Mol Biol Evol. 1995; 12(4):546–7.PubMedGoogle Scholar
 Waddell P, Penny D. Evolutionary trees of apes and humans from DNA sequences In: Lock AJ, Peters CR, editors. Handbook of Symbolic Evolution. Oxford: Clarendon Press: 1996. p. 53–73.Google Scholar
 Huelsenbeck JP, Larget B, Alfaro ME. Bayesian phylogenetic model selection using reversible jump markov chain monte carlo. Mol Biol Evol. 2004; 21(6):1123–33. doi:10.1093/molbev/msh123.View ArticlePubMedGoogle Scholar
 Bouckaert RR, AlvaradoMora M, Rebello Pinho Ja. Evolutionary rates and hbv: issues of rate estimation with bayesian molecular methods. Antivir Ther. 2013; 18(3 Pt B):497–503.View ArticlePubMedGoogle Scholar
 Drummond AJ, Bouckaert RR. Bayesian evolutionary analysis with BEAST. Cambridge: Cambridge University Press; 2015.View ArticleGoogle Scholar
 Lartillot N, Lepage T, Blanquart S. Phylobayes 3: a bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics. 2009; 25(17):2286–288.View ArticlePubMedGoogle Scholar
 Lartillot N, Philippe H. A bayesian mixture model for acrosssite heterogeneities in the aminoacid replacement process. Mol Biol Evol. 2004; 21(6):1095–109.View ArticlePubMedGoogle Scholar
 Wu CH, Suchard MA, Drummond AJ. Bayesian selection of nucleotide substitution models and their site assignments. Mol Biol Evol. 2013; 30(3):669–88. doi:10.1093/molbev/mss258.View ArticlePubMedGoogle Scholar
 Bouckaert RR, Heled J, Kühnert D, Vaughan T, Wu CH, Xie D, Suchard MA, Rambaut A, Drummond AJ. BEAST 2: a software platform for bayesian evolutionary analysis. PLoS Comput Biol. 2014; 10(4):1003537. doi:10.1371/journal.pcbi.1003537.View ArticleGoogle Scholar
 Tavaré S. Some probabilistic and statistical problems in the analysis of dna sequences. Lect Math Life Sci. 1986; 17:57–86.Google Scholar
 Hasegawa M, Kishino H, Yano T. Dating the humanape splitting by a molecular clock of mitochondrial dna. J Mol Evol. 1985; 22:160–74.View ArticlePubMedGoogle Scholar
 Pereira L, Freitas F, Fernandes V, Pereira JB, Costa MD, Costa S, Máximo V, Macaulay V, Rocha R, Samuels DC. The diversity present in 5140 human mitochondrial genomes. Am J Hum Genet. 2009; 84(5):628–40. doi:10.1016/j.ajhg.2009.04.013.View ArticlePubMedPubMed CentralGoogle Scholar
 Rosenberg NA. The shapes of neutral gene genealogies in two species: probabilities of monophyly, paraphyly and polyphyly in a coalescent model. Evolution. 2003; 57(7):1465–77.View ArticlePubMedGoogle Scholar
 Jukes T, Cantor C. Evolution of protein molecules In: Munro HN, editor. Mammaliam Protein Metabolism. New York: Academic Press: 1969. p. 21–132.Google Scholar
 Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981; 17:368–76.View ArticlePubMedGoogle Scholar
 Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 1993; 10:512–26.PubMedGoogle Scholar
 Kimura M. Estimation of evolutionary distances between homologous nucleotide sequences. Proc Natl Acad Sci. 1981; 78(1):454–8.View ArticlePubMedPubMed CentralGoogle Scholar
 Posada D. Using MODELTEST and PAUP* to select a model of nucleotide substitution. Curr Protoc Bioinforma. 2003;6–5.
 Green PJ. Reversible jump markov chain monte carlo computation and bayesian model determination. Biometrika. 1995; 82:711–32.View ArticleGoogle Scholar
 Pagel M, Meade A. Bayesian analysis of correlated evolution of discrete characters by reversiblejump markov chain monte carlo. Am Nat. 2006; 167(6):808–25.PubMedGoogle Scholar
 Dawid AP. The wellcalibrated bayesian. J Am Stat Assoc. 1982; 77(379):605–10.View ArticleGoogle Scholar
 Drummond AJ, Ho SYW, Phillips MJ, Rambaut A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 2006; 4(5):88. doi:10.1371/journal.pbio.0040088.View ArticleGoogle Scholar
 Hayasaka K, Gojobori T, Horai S. Molecular phylogeny and evolution of primate mitochondrial dna. Mol Biol Evol. 1988; 5(6):626–44.PubMedGoogle Scholar
 Gray RR, Parker J, Lemey P, Salemi M, Katzourakis A, Pybus OG. The mode and tempo of hepatitis c virus evolution within and among hosts. BMC Evol Biol. 2011; 11(1):131.View ArticlePubMedPubMed CentralGoogle Scholar
 Heled J, Drummond AJ. Bayesian inference of species trees from multilocus data. Mol Biol Evol. 2010; 27(3):570–80. doi:10.1093/molbev/msp274.View ArticlePubMedGoogle Scholar
 Ogilvie HA, Heled J, Xie D, Drummond AJ. Computational performance and statistical accuracy of *beast and comparisons with other methods. Syst Biol. 2016; 65(3):381–96. doi:10.1093/sysbio/syv118.View ArticlePubMedPubMed CentralGoogle Scholar
 Gavryushkina A, Welch D, Stadler T, Drummond AJ. Bayesian inference of sampled ancestor trees for epidemiology and fossil calibration. PLoS Comput Biol. 2014; 10(12):1003919. doi:10.1371/journal.pcbi.1003919.View ArticleGoogle Scholar
 Heled J, Drummond AJ. Bayesian inference of population size history from multiple loci. BMC Evol Biol. 2008; 8:289. doi:10.1186/147121488289.View ArticlePubMedPubMed CentralGoogle Scholar
 Stadler T, Kühnert D, Bonhoeffer S, Drummond AJ. Birthdeath skyline plot reveals temporal changes of epidemic spread in hiv and hepatitis c virus (hcv). Proc Natl Acad Sci USA. 2013; 110(1):228–33. doi:10.1073/pnas.1207965110.View ArticlePubMedGoogle Scholar