Multiplex array capturing and sequencing
DNA was extracted from hair roots of 60 horses using NucleoSpin Tissue KIT after manufacturer instruction (Macherey-Nagel, Düren, Germany) (Additional file 1, Table S1). Horse samples were taken in correspondence with German animal protection law (Potsdam: 32/44456+11). Genomic DNA (100 μl, conc. 10 ng/μl) was sheared by sonication using a Bioruptor system to a fragment size around 150-250 bp. Next, barcoded Illumina sequencing libraries with a different barcode used for each sample were constructed from the fragmented DNA according to the protocol described in Meyer and Kircher . The 60 barcoded libraries were pooled in equimolar ratio and hybridized on a single 244K custom microarray (Agilent). The microarray was designed that overlapping 60-mer oligonucleotide probes targeting the whole mitochondrial genome were tiled every 15 nucleotides. The repetitive part of the control region (motif GTG CAC CT, pos. 16,129-16,360) was not targeted. Hybridization and sequencing preparations were performed as described in [21, 22]. After enrichment, the DNA library was sequenced on the Illumina/Solexa Genome Analyzer II platform (Illumina, San Diego, CA, USA).
Sequence data analysis
Sequencing runs were analyzed from raw images using the Illumina Genome Analyzer pipeline. Bases were called using Ibis (http://bioinf.eva.mpg.de/Ibis/, ) and reads with five or more positions with a PHRED-like quality score below 20 were discarded. Each read was sorted according to the sample specific barcode and the adapter sequence was trimmed. The reads of each sample were aligned against one published mitochondrial genome ([GenBank: X79547.1] ) using BWA v0.5.1 (http://bio-bwa.sourceforge.net/, ). The BAM alignment files were further processed with SAMtools v0.1.7 (http://samtools.sourceforge.net/, ) and alignment statistics including number of mapped reads and average coverage per position were determined. After duplicate removal, for each position in the alignment, the consensus base was called and several quality scores were calculated (i.e. Phred-scaled consensus quality, SNP quality, mapping quality; see http://samtools.sourceforge.net/pileup.shtml) by using the SAMtools "pileup -c" command. The final consensus base was called when the position had a consensus quality score of at least Q30 and a mapping quality score of at least Q20. Further, for base calls that were different to the reference sequence, a SNP quality score of at least Q30 and three-fold coverage in this position was required. Indels were not considered in the base calling process.
From the consensus sequences a multiple sequence alignment was obtained using ClustalW (http://www.ebi.ac.uk/Tools/clustalw2/). We further added all currently available complete mtDNA-genome sequences from horses we found on the NCBI GenBank (http://www.ncbi.nlm.nih.gov/genbank/) and from the wild ass (Equus asinus), the closest relative with a fully sequenced mtDNA genome (Additional file 1, Table S3). The repetitive part of the control region (pos. 16,129-16,360 referring to X79547.1 ) was masked with "N's" as we also discarded this region in the probe design of the array. Any nucleotide position in the multiple alignment that failed to have information for at least three samples of the alignment (4.47% of the samples) was deleted on the grounds that it was unlikely to represent homology shared across the alignment. A preliminary phylogenetic analysis showed that three of the GenBank-derived sequences ("jeju", "debao", and "zhongdian", respectively accession numbers [GenBank: AY584828.1], [GenBank: EU939445], and [GenBank: EF597512.1]; Additional file 1, Table S3) exhibited unusually long branches, and strong departure from the clocklike evolution of the rest of sequences (Additional file 1, Figure S1). The same pattern was observed on several different MrBayes runs with different parameters, as well as maximum likelihood runs with PHYML and RAxML. This behavior indicates contamination of these sequences by nuclear DNA (numts), or some other problem with these sequences; therefore, they were eliminated from the alignment and excluded from the remainder of the analysis. It might be possible that the long branches of the excluded sequences were due to some "real" effect, such as adaptation to high-altitude environments; however, only one of the three removed sequences, zhongdian, was derived from a study on mitochondrial adaptations in high-altitude Tibetan horse breeds , and the other two sequences derived from that study (deqin and naqu), although both from high-altitude locations above 3,000 m in China or Tibet, did not exhibit unusual branch lengths. Therefore, it was judged unlikely that the long branches of the excluded sequences were due to high-altitude adaptation or some similar effect.
The final alignment had 64 sequences and 16,419 nucleotide positions. Initial summary statistics were calculated in PAUP* 4.0 . Phylogenies were estimated using maximum parsimony (MP), maximum likelihood (ML), and Bayesian methods.
Phylogenetic analysis -- maximum parsimony
Parsimony analysis was conducted with TNT version 1.1 . and summary statistics including CI (consistency index; ) and RI (retention index; ) were calculated using the Stats.run script available online at the TNT wiki (http://tnt.insectmuseum.org/index.php/Scripts). The tree search was conducted with the "mult" command, using 100 random addition runs as starting points, each followed by branch swapping via TBR (tree bisection and regrafting). After calculating summary statistics, the collection of most-parsimonious trees was summarized using combinable components (Bremer consensus tree), strict consensus (Nelsen consensus tree), and majority-rule consensus . For each consensus tree, branch support was calculated using Bremer support [46, 47], also known as decay index , and non-parametric bootstrapping . Bremer support was calculating using the script KWBremer.run (provided by Kipling Will, personal communication), a modified version of Bremer.run available on the TNT scripts page. Bootstrapping was conducted with the "resample" command, using 100 bootstrap replicates.
Phylogenetic analysis - maximum likelihood
Initial ML analysis was conducted using the online RAxML server  at http://phylobench.vital-it.ch/raxml-bb/ using defaults, and adding the option to estimate proportion of invariant sites.
More detailed ML analysis was conducted with Modeltest [51, 52] and the PHYML  server online at http://mobyle.pasteur.fr/cgi-bin/portal.py?form=phyml. The standard Modeltest PAUP block was used to assess the likelihood of the sequence data as explained by a neighbor-joining (NJ) tree estimated from the data by PAUP and 56 different substitution models. The hierarchical likelihood ratio test (hLRT) selected HKY+I+G as the best model, and the Akaike Information Criterion (AIC) selected GTR+I+G. Three PHYML runs were conducted, the first using the specific parameters selected by Modeltest for HKY+I+G, the second using the specific parameters selected for GTR+I+G, and finally a run in which the GTR+I+G model was selected, but all parameters were estimated during the analysis. All runs were conducted with 100 bootstrap replicates, and majority-rule consensus trees with bootstrap branch support were calculated by PHYML.
Phylogenetic analysis - Bayesian
MrModelTest  was used to assess the likelihood of the 24 sequence evolution models available in MrBayes. Again, hLRT selected HKY+I+G and AIC selected GTR+I+G; however, the point of Bayesian analysis is to sample trees (topologies and branch lengths) as well as substitution model parameters from the joint posterior distribution of trees and models, so no specific sequence evolution model was specified for MrBayes beyond the generic GTR+I+G with all parameters to be estimated during the run. MrBayes [55, 56], available at http://mrbayes.csit.fsu.edu/, was used to conduct the phylogenetic analysis. Default parameters for estimation under a GTR+I+G model were used, with uniform priors set on the base frequencies and rate matrix, proportion of invariant sites, and topology. The prior on branch lengths was exponential with rate (alpha) = 10.0 and four categories were used to approximate gamma-distributed rate variation. Two independent runs were conducted of 1,000,000 generations each, with trees sampled every 1000 generations. The first 50% of each run was discarded as burnin, and the remaining 1000 saved trees were summarized using majority rule consensus. The standard deviation of split frequencies between the two runs stabilized at about 0.02, indicating that the runs had successfully converged and were sampling from the same posterior. There was some chance that estimating the full suite of parameters for a GTR+I+G model might be overly ambitious. Therefore, a second MrBayes analysis was performed using the same parameters, except with a maximally simple Jukes-Cantor (JC) model with no sequence evolution parameters estimated.
Divergence Time Estimation
Inspection of the ML and Bayesian consensus trees indicated approximately clocklike behavior. Therefore tests were conducted to see if the hypothesis of a global molecular clock would be rejected by the data. The first set of tests was conducted in PAUP. The consensus tree from the GTR+I+G MrBayes run was manually rooted using the wild ass as outgroup. It was loaded into PAUP and its likelihood was measured for sequence evolution models constrained, and not constrained, to a global clock. The likelihoods were then compared to test for statistically significant difference using a likelihood ratio test with 62 degrees of freedom (number of taxa - 2). The test was repeated using 3 different models of sequence evolution: the HKY+I+G model selected by Modeltest, the GTR+I+G model selected by MrModeltest, and the posterior mean parameters of the GTR+I+G analysis selected by MrBayes.
The global clock hypothesis was also tested using the somewhat different procedures in the baseml program in PAML . Here, the likelihood of the data with and without a global clock was estimated for the rooted Bayesian consensus tree using the GTR (termed "REV" in PAML) +I+G model where baseml estimates the optimal substitution model parameters for each analysis.
Following the decision that the assumption of a global clock was defensible, divergence times were estimated using r8s [58, 59] and BEAST . The primary goal of the analysis was to bracket the time of divergence of the horse breed mtDNA sequences; a completely thorough molecular dating exercise was not attempted here, as this would take a separate complex study at least involving the incorporation of many partial mtDNA sequences available from subfossil equines . Therefore, the divergence time of the horse/ass clade as estimated from the fossil record was used as the only constraint. Since bracketing the divergence time was the major goal, the maximum (3.5 mybp) and minimum (1.0 mybp) possible divergence times based on the fossil record  were used as the constraints. For a maximum-divergence-time r8s run the horse/ass split was fixed at 3.5 mybp, and for the minimum-divergence-time r8s run, it was fixed to 1.0 mybp. R8s was run using the default Langley-Fitch (molecular clock) method of estimating divergence times.
Divergence time estimation was also conducted using a strict global clock assumption in BEAST, in order to get a more reasonable sense of the variability in divergence times for horse lineages. It is admitted that the choice of prior used in this analysis is fairly subjective and thus the results are heuristic rather than firm conclusions. Utilizing the reasonable assumption that the true divergence time of horse and ass is more likely to be in the middle of the 3.5-1.0 mybp range than at the edges, the prior on the divergence time of horse and ass was set to be normally distributed with a mean of 2.25 mybp, and with the standard deviation set to 0.3125 my, so that "maximum" and "minimum" divergence times occurred 4 standard deviations above and below the mean. All other parameters were allowed to vary during BEAST's sampling routine, using default priors, except as follows: the substitution model was HKY+I+G with 4 gamma rate categories, estimated base frequencies, and uniform prior of the substitution rate sampling between 0 to 1. The convergence of the MCMC analysis was judged to be adequate after inspection of the run in Tracer. The first 10% of the BEAST MCMC run was discarded as burn-in, and the remaining samples were summarized using TreeAnnotator. The resulting ultrametric trees were displayed in FigTree (all programs available with BEAST at: http://beast.bio.ed.ac.uk/Main_Page).
The Bayesian skyline plot method implemented in BEAST was used to estimate past population dynamics through time from the 63 whole mtDNA horse sequences. A piecewise linear model and the HKY+I+G substitution model was chosen and the substitution rate (estimated in the divergence time analysis) was set by a normally distributed prior with a mean of 0.074 subst/pos/Mya and a standard deviation of 0.01 subst/pos/Mya. Each MCMC run was conducted on 10 million iterations and the first 10% of each run was discarded as burn-in. The results of three independent runs were verified in Tracer and combined with Treeannotator. BSP were drawn with Tracer using linear change mode and the combined tree file. The effective population size was estimated assuming a generation time of 10 years . The same parameters were used to run the MCMC for a constant size coalescence model. In order to compare the BSP and constant size model and to see if one model is favoured over the other the Bayes factor (BF) was calculated using the BF analysis option implemented in Tracer.