### Alignment and sequence analysis

Sequences were aligned in MUSCLE 3.6 [23] and manually adjusted in Se-Al 2.0a11 [23] before being collapsed into haplotypes in COLLAPSE 1.2 [24] treating gaps as 5^{th} state (see Additional file 4). Previously published control region sequences [15] were also included in the alignment (GenBank:U47061–U47076). The haplotype alignment was used in evolutionary analyses. Sequences were submitted to GenBank (GenBank:EF057069–EF057098). The Taiwanese serow *Nemorhaedus swinhoei* (GenBank:AY149639) was used as outgroup, as a closely related caprine [25].

We attempted to cluster homologous clone sequences, so as to differentiate true mitochondrial sequences from Numts, by applying two different classification schemes: (i) clones were categorized by each muskox individual regardless of the number of clones per individual, resembling to a multiple-partitions dataset; the inequality of the number of clones per individual was corrected by coding missing clones as partitions with missing data. (ii) clones were categorized by individual and by tissue type, e.g. hair (H) and blood (B), knowing from previous research [7] that hair may be more Numt-rich than blood. The homology assessment analysis was done in POY [26] using the -chromosome -n2reorder command.

In recent years, evidence has burgeoned for the occurrence of mitochondrial recombination in a wide variety of animal taxa [27, 28], and more specifically in members of the family Bovidae, such as *Cephalophus* spp. (duikers), *Ovis* spp. (sheep), *Tragelaphus* spp., and *Kobus* spp (see Table 1 in [28]). Recombination should not produce misleading interpretations of populations with limited or nonexistent gene flow; however, it can mimic the traces of population size expansion, underestimate divergence times, and mask ancient polymorphisms as recurring mutations [27, 29]. In the particular case of caprines, in which mtDNA recombination is an acknowledged issue [28], it should be routine to search for evidence of recombination before making evolutionary inferences.

In order to examine the possibility of detecting HVR recombinant sequences, we employed several recombination detection methods, such as Bootscan, Chimaera, Geneconv, MaxChi, RDP, and SiScan, as implemented in the program RDP2 2b08 [30], using the automated scan option, and 1,000 permutations and 1,000 bootstrap replicates where applicable.

We tried to detect changes in base composition across sequences by plotting the relative compositions of individual bases, purines, and pyrimidines across all haplotypes, using a sliding window of 50 bp and a step of 10 bp in Treefinder [31].

The substitution model suggested by MrModeltest 2.2 [32] in conjunction with PAUP* 4b10 [33] using Akaike's Information Criterion (AIC) [34] was the general time-reversible model [35–37] with substitution rates following a Γ-distribution (GTR+Γ_{4}) with shape parameter α = 0.3732.

Bayesian inference (BI) of phylogeny was performed for all haplotypes of the short alignment in MrBayes 3.1.2 [38, 39] on a 9-node dual G5 processor (2.0 GHz, 18 GB RAM) XServe cluster at AMNH, using the model suggested by MrModeltest (GTR+Γ, with a flat prior on base frequencies following a Dirichlet distribution and six rate categories). Two simultaneous analyses were run for 10^{7} generations each, on two separate occasions, starting from different random trees, and the resulting trees were saved every 1000 generations. One "cold" and three "heated" chains were run with heating scheme fine-tuning; temperature parameters ranging from 2 to 0.03 were used and the latter was chosen, as it was leading to better mixing. The heat applied to all chains except for the "cold" one, was 0.97, 0.94, and 0.92, respectively. Successful chain swaps were in the range of 33–75%. Following visual inspection in Tracer 1.3 [40], stationarity of the Ln-likelihood (i.e. the Ln-probability of the data given the parameter values) was reached before 10^{6} generations. Similarly, we examined the average standard deviation of split frequencies and confirmed it approached zero (0.003418), indicating a satisfactory run length. Subsequently, the first 1,000 trees were discarded as burn-in and a 50% majority-rule consensus tree [41] was built with 18,002 optimal trees. The proportion of resulting trees presenting a given clade, in other words, the posterior probability of this clade, represents clade support in a Bayesian phylogenetic analysis [42].

Phylogeny was also estimated in a maximum likelihood (ML) framework in Treefinder using the GTR+Γ_{4} model. Approximate bootstrap support was estimated by applying the Shimodaira-Hasegawa (SH) test with RELL approximation [43] to all local rearrangements around an edge on the topology (also referred to as edge support or LRSH) and 50,000 replicates. For every edge on the tree, all its adjacent branch nearest-neighbor interchanges are computed and edge lengths are re-estimated by fixing all other parameters, and finally the SH test is applied. LRSH support values are the complement of the worst *p* -values from the SH test. A support value of 99 means this clade is significant at the 0.01 level, a value of 95 at the 0.05 level, and so on.

We employed unweighted and weighted maximum parsimony (MP) in PAUP* with a heuristic search using 100 random addition sequence replicates and TBR branch swapping. Given the number of transitions and transversions as calculated in MacClade 4.08 [44] (88 transitions and 22 transversions) yielding a transition-transversion ratio of 4:1, we downweighted transitions four times using a step-matrix. In unweighted parsimony, gaps were treated as missing data and as a 5^{th} state. Clade support was provided with 500 bootstrap replicates [45] and strict consensus trees [46] were built using all equally most parsimonious trees.

We used a Pairwise Relative Rates Test [47] to detect sequences with significantly different evolutionary rates, as implemented in HyPhy 0.99b [48] with a GTR+Γ substitution model and a "Local" model. Given a predefined outgroup sequence, the maximum likelihood estimate (MLE) for each 3-taxon tree is calculated, followed by the MLE calculation for the 3-taxon trees with evolutionary rates along the ingroup sequences constrained to be equal. A likelihood-ratio test (LRT) is then performed to examine if rates are equal or independent. The desired precision (absolute error) in the calculation of the Ln-likelihood value was set at 0.001.