In phylogenetic studies, data matrices are assembled and analysed to infer evolutionary relationships among species or higher taxa. Depending on the study, character-state data or distance matrices may be used, and several different types of data may be available to estimate the phylogeny of a particular group [1]. An increasing number of phylogenomic studies are being published on data sets that include more than 100 genes [2–10]. Whereas character-state data (e.g., nucleotide sequences) are commonly used for parsimony, maximum likelihood or Bayesian analyses, distance methods can be selected as an alternative to decrease computing time when analysing large data sets, or can be used in comparative studies where the primary data are not available.

Different approaches have been proposed to analyse the growing amount of information that may originate from different sources. The total evidence approach [11], also called the character congruence approach [*sensu* [12]] or combined analysis [*sensu* [13]], combines different data sets in a single supermatrix [14–17]. The taxonomic congruence approach [*sensu* [12]], or consensus approach [13], analyses each matrix separately, and combines the resulting trees *a posteriori* using a consensus [18–22] or a supertree method [23–26]. The pros and cons of these competing approaches have been debated at length in the literature [7, 17, 21, 22, 27–32]. An intermediate approach, referred to as conditional data combination, consists of testing *a priori* the level of congruence of different data sets. Only the data sets that are considered statistically congruent, i.e., in phylogenetic agreement, are combined in a supermatrix. The remaining incongruent data sets are analysed separately [13, 19, 33–35].

The approach used often depends on the level of congruence or incongruence in the data. In phylogenetic analysis, "incongruence" can be defined as differences in phylogenetic trees. It is observed when different partitions, or data sets, sampled on the same taxa suggest different evolutionary histories [36]. However, incongruence may also arise when the data violate the assumptions of the phylogenetic method. Incongruence among data sets is fairly common and can be present to varying degrees [37]. Hence, statistical tests have been designed to detect the presence of incongruence and its magnitude [36]. In general, such incongruence tests are used to determine whether the topological differences observed could have arisen simply by chance [38]. The null hypothesis of most of these tests (H_{0}) is congruence, i.e., topologically identical trees, where any topological difference is the result of stochastic variation in the data sets [see [22], [38] for reviews]. The most commonly used test of this type is the Incongruence Length Difference test [ILD: [39]]. However, numerous problems are known to be associated with it. For example, type I error rates were shown to be well above the nominal significance level when data sets with great differences in substitution rates among sites were compared [40, 41]. Therefore, nominal significance levels of 0.01 or 0.001 have been suggested as more appropriate [36]. Also, power was low when short nucleotide sequences simulated on different tree structures were compared [41].

Numerous factors have been described to explain differences in phylogenetic trees obtained from the analysis of data sets containing the same species. A wide range of evolutionary processes may cause nucleotides at different sites to evolve differently, for example because of their codon positions or different functional constraints [42–44]. Also, various parts of the genome may have experienced different phylogenetic histories (e.g., mitochondrial vs. nuclear genes) and trees inferred from different data types (e.g., morphological or molecular data) may support different phylogenies [45]. Other evolutionary processes can explain incongruence between data sets: horizontal transfer, duplications, insertions or losses, incomplete lineage sorting, mobile elements, recombination, hybridization and introgression [see [37], [38] for an exhaustive list]. Furthermore, the use of an inappropriate method to analyse a given data set may lead to a spurious phylogeny, which can be erroneously incongruent to some extent with another phylogeny that has been correctly estimated [22, 33, 40]. Thus, given two data sets, one of which has parameters prone to long-branch attraction [46, 47], the choice of an inconsistent phylogenetic method to analyse both data sets may produce different trees. Incongruence due to systematic errors can be addressed by changing the evolutionary model or the phylogenetic method so that it conforms better to the data. However, incongruence resulting from genealogical discordance processes must be detected and handled in an appropriate way, e.g., by using phylogenetic network inference methods [see [48] for a review].
Thus, three main causes can be invoked to explain incongruence: 1) different phylogenetic trees may be inferred due to random sampling errors, 2) different trees may be produced due to the presence of systematic errors, leading to erroneous phylogenetic inference, or 3) real differences may exist between phylogenetic trees due to contrasting evolutionary histories [38].

Conversely, the term "congruence" is often used to describe data sets, characters or trees that correspond to identical (or compatible) relationships among taxa [49]. However, many authors use a definition of congruence that is looser than the previously described *identical* topology and that incorporates varying degrees of topological similarity. For example, taxonomic congruence, as defined by [12], is the *degree* to which different classifications of the same taxa support the same groupings. Since the pioneering study of [12], different measures and indices have been proposed to quantify the level of congruence [see review by [18]]. Conditional data combination often relies on such indices to determine the degree of congruence and on statistical tests to determine whether or not the data sets should be combined [13, 18, 33].

As described above, the terms "congruence" and "incongruence" can have a more or less strict meaning with regard to the level of similarity. The definitions used in this paper are in concordance with the test of congruence among distance matrices (CADM). CADM was introduced by [50] and is applicable to two or more matrices. The null hypothesis of the test (H_{0}) is the complete incongruence of all trees (two or more), which corresponds to phylogenies with different topologies and/or very different branch lengths. Hence, the method can also account for branch lengths [as in [51]]. For two matrices (or two trees), the alternative hypothesis (H_{1}) is that the inferred trees are partially or completely congruent. When more than two matrices (or trees) are tested, H_{1} postulates that *at least* two trees in the group are partially or completely congruent. It is then possible to test for specific pairs. In this paper, *incongruence* refers to phylogenetic trees with different topologies, which suggests completely distinct evolutionary histories. In contrast, *congruence* refers to two or more identical trees with an underlying identical evolutionary history (i.e., *complete congruence* or topological identity) or to two or more phylogenetic trees with a partial degree of similarity in their evolutionary relationships (i.e., *partial congruence*). The level of congruence can be measured by the test statistic, which ranges from 0 to 1.

More specifically, given two or more data sets (e.g., different genes) studied on the same species, a concordance statistic [Kendall's *W* statistic: [52], [53]] is calculated among the distance matrices corresponding to the gene sequences or to the trees. The observed value is then tested against a distribution of values obtained by permutation to estimate the probability of obtaining a value at least as large under the null hypothesis. CADM is an extension of the Mantel test of matrix correspondence and can be used to test the null hypothesis of complete incongruence of the distance matrices (corresponding to all data sets or trees under study). As a complement to the p-value, the *W* statistic provides an estimate of the degree of congruence of two or more matrices on a scale between 0 (no congruence) and 1 (complete congruence). Note that when trees have identical topologies but different branch lengths, a statistical conclusion of partial congruence or even incongruence may be reached, depending on how strongly the differences among the distance matrices alter the relative rankings of the distances. The test allows users to detect these two cases; thus, both topological and phylogenetic congruence can be tested with CADM (see the Methods section).
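The calculation described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function names (`upper_triangle`, `average_ranks`, `kendall_w`, `cadm_global`) are hypothetical, *W* is computed without a correction for tied ranks, and the permutation scheme (the taxa of every matrix are permuted independently, as in a Mantel-type test) is an assumption about how the reference distribution may be built.

```python
import random
from itertools import combinations


def upper_triangle(matrix, order):
    """Unfold the upper triangle of a symmetric distance matrix,
    reading the taxa in the sequence given by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i, j in combinations(range(n), 2)]


def average_ranks(values):
    """Rank values from 1 to n; tied values receive their average rank."""
    indexed = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(indexed):
        end = start
        while (end + 1 < len(indexed)
               and values[indexed[end + 1]] == values[indexed[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in indexed[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def kendall_w(rank_rows):
    """Kendall's W for m rank vectors of length n (no tie correction)."""
    m, n = len(rank_rows), len(rank_rows[0])
    totals = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def cadm_global(matrices, n_perm=999, seed=0):
    """Return (W, permutational p-value) for a list of distance matrices
    computed on the same taxa, listed in the same order."""
    rng = random.Random(seed)
    identity = list(range(len(matrices[0])))
    w_obs = kendall_w(
        [average_ranks(upper_triangle(d, identity)) for d in matrices]
    )
    n_ge = 1  # the observed value is counted as one outcome under H0
    for _ in range(n_perm):
        rows = []
        for d in matrices:
            order = identity[:]
            rng.shuffle(order)  # assumption: each matrix permuted independently
            rows.append(average_ranks(upper_triangle(d, order)))
        if kendall_w(rows) >= w_obs - 1e-12:  # tolerance for rounding error
            n_ge += 1
    return w_obs, n_ge / (n_perm + 1)
```

In this sketch, two identical matrices yield *W* = 1, whereas two rank vectors in exactly opposite order yield *W* = 0. Because the distances are replaced by ranks within each matrix, only differences that change the relative rankings of the distances affect the statistic.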

*A posteriori* tests can be used to identify which data sets are congruent and to estimate their level of congruence. When the level of congruence among all distance matrices is low, researchers can decide to analyse the matrices separately or in subgroups. However, rejection of independence of the gene trees does not imply that inference of a tree from the combined data set is appropriate. In the case of partial congruence, phylogenetic network methods [48, 54], instead of traditional tree reconstruction methods, can be more appropriate to combine congruent data sets into a single analysis. Indeed, in these cases, evolutionary relationships may be better depicted as reticulated relationships. Additional tests or studies can also be performed to determine the causes underlying partial congruence [e.g., [55]]. Thus, with its null hypothesis (H_{0}) of complete incongruence of all trees, CADM differs from most other available phylogenetic tests of congruence/incongruence, which assume a common evolutionary history (H_{0}: congruence) and test the alternative hypothesis of different histories among the data sets (H_{1}: incongruence).

Previously published simulations have shown that the global and *a posteriori* CADM tests have a correct type I error rate and good power when applied to dissimilarity matrices computed from independently generated raw data [50]. Identical results were obtained in simulations involving ultrametric distance matrices [56]. CADM has also been successfully used to detect congruence among phylogenetic trees obtained from different gene sequences [57]. In this paper, we expand on previous CADM simulations to assess the performance of the test when it is applied to phylogenetic trees. Specifically, the type I error rate and power of the global and *a posteriori* CADM tests were assessed using distance matrices obtained from nucleotide sequences simulated on additive trees under various phylogenetic conditions.