Multigenic phylogeny and analysis of tree incongruences in Triticeae (Poaceae)

Abstract

Background

Introgressive events (e.g., hybridization, gene flow, horizontal gene transfer) and incomplete lineage sorting of ancestral polymorphisms are a challenge for phylogenetic analyses since different genes may exhibit conflicting genealogical histories. Grasses of the Triticeae tribe provide a particularly striking example of incongruence among gene trees. Previous phylogenies, mostly inferred with one gene, are in conflict for several taxon positions. Therefore, obtaining a resolved picture of relationships among genera and species of this tribe has been a challenging task. Here, we obtain the most comprehensive molecular dataset to date in Triticeae, including one chloroplastic and 26 nuclear genes. We aim to test whether it is possible to infer phylogenetic relationships in the face of (potentially) large-scale introgressive events and/or incomplete lineage sorting; to identify parts of the evolutionary history that have not evolved in a tree-like manner; and to decipher the biological causes of gene-tree conflicts in this tribe.

Results

We obtain resolved phylogenetic hypotheses using the supermatrix and Bayesian Concordance Factors (BCF) approaches despite numerous incongruences among gene trees. These phylogenies suggest the existence of 4-5 major clades within Triticeae, with Psathyrostachys and Hordeum being the deepest genera. In addition, we construct a multigenic network that highlights parts of the Triticeae history that have not evolved in a tree-like manner. Dasypyrum, Heteranthelium and genera of clade V, grouping Secale, Taeniatherum, Triticum and Aegilops, have evolved in a reticulated manner. Their relationships are thus better represented by the multigenic network than by the supermatrix or BCF trees. Notably, we demonstrate that gene-tree incongruences increase with genetic distance and are greater in telomeric than in centromeric genes. Together, our results suggest that recombination is the main factor decoupling gene trees from multigenic trees.

Conclusions

Our study is the first to propose a comprehensive, multigenic phylogeny of Triticeae. It clarifies several aspects of the relationships among genera and species of this tribe, and pinpoints biological groups with likely reticulate evolution. Importantly, this study extends previous results obtained in Drosophila by demonstrating that recombination can exacerbate gene-tree conflicts in phylogenetic reconstructions.

Background

When reconstructing the phylogeny of a biological group it is implicitly assumed that species split in a tree-like manner and that all characters (e.g., all genes in the genome) reveal the same genealogical history that has occurred in each lineage after the split from a common ancestor. When these two assumptions are met phylogenetic trees inferred from one or a few genes can be used as proxies of the species tree. However, recent studies have shown that trees inferred from different genes may conflict with each other and that violation of these assumptions is more common than previously thought [1–10].

Incongruence may appear among gene trees for various reasons. If the genes used to infer the phylogenetic relationships among genera and species are sampled from introgressed portions of the genome produced by hybridization, gene flow or horizontal gene transfer, the trees obtained likely reflect the history of the introgression rather than the history of lineage splitting [11, 12]. The genealogical histories of individual genes may also be misleading due to retention and stochastic sorting of ancestral polymorphisms caused by incomplete lineage sorting. This is especially likely when the effective population size of a given lineage is large with respect to the time elapsed since divergence [13–15]. In this case, genetic drift is unlikely to have brought alleles to fixation before subsequent divergence [1, 6]. Finally, gene duplication followed by gene loss may lead to incongruence because paralogous gene copies are incorrectly inferred to be orthologous [16].

Whatever their origin, incongruences among gene trees require careful attention for several reasons. First, they affect the interpretation of morphological and molecular patterns of evolution. Second, they maintain extensive instability in taxonomy. Third, they complicate the choice of wild taxa as sources of novel genes in breeding programs (e.g., genes conferring resistance to pathogens, tolerance to salt, low temperatures and drought). Finally, uncertainty in phylogenetic relationships may lead to inadequate conservation decisions (e.g., the protection of particular species or habitats).

In prokaryotes, some authors argue that numerous hybridizations and gene transfers preclude the possibility and the meaning of a tree-like representation of a species history [17, 18]. In plants too it has been argued that, in some cases, reticulate evolution is more appropriate than a tree-like description [19]. In contrast, other authors argue that despite incongruences it is possible to reconstruct phylogenies and tree-like histories [20, 21]. Among angiosperms, Triticeae grasses provide a particularly striking example of incongruence among gene trees, suggesting reticulate evolution [22]. This tribe comprises species of major economic importance, including wheat, barley and rye. In recent years, attempts to resolve the phylogeny of the group, based on analyses of single-copy nuclear genes [23–26], highly repetitive nuclear DNA [27], internal transcribed spacers [28], and chloroplastic genes [29–31], have failed to yield a consensus definition of clades. Current evidence suggests that different portions of the nuclear and chloroplastic genomes have different genealogical histories. Because published trees are in conflict for almost all taxon positions, we do not know whether the historical relationships among the genera and species of this tribe can be resolved in a tree-like manner and, if so, what the true phylogenetic relationships are. In this paper, we use the most comprehensive molecular dataset to date in Triticeae, including 27 gene fragments, with the aim of (i) reconstructing a multigenic phylogeny of this tribe, (ii) quantifying tree incongruences, and (iii) exploring possible factors affecting incongruence, including the frequency of recombination, chromosomal location and evolutionary rate.

Methods

Species Studied and Loci Sampled

Nineteen diploid species, spanning 13 genera of Triticeae, were analyzed. These species were selected because they belong to most phylogenetic clades recognized so far [22, 26, 29] and represent most of the diversity of diploid genera (68% according to [22] and [32]), life styles (annual and perennial), mating systems (self-compatible and self-incompatible), and geographic origin (Europe, Middle East, Asia, North America and Australia). One or two accessions per species were obtained from the United States Department of Agriculture (USDA), National Plant Germplasm System (available at http://www.ars-grin.gov/npgs/index.html), making a total of 32 accessions (Table 1). Although Bromus is supposed to be the closest outgroup of Triticeae [26, 33, 34], to simplify primer design we preferred to use Brachypodium distachyon, a more distant species for which the complete genome is available [35]. As the ingroup topology may depend upon the choice of a single outgroup, Zea mays and Oryza sativa were also incorporated as additional, more distant outgroups. The choice of distant outgroups may increase the number of homoplasies. However, owing to the selective constraints likely acting on the coding sequences we used (see below), it is likely they have been affected by low substitutional saturation and hence by a low homoplasy level.

Table 1 Species names, accession numbers in the USDA database, and geographic origin of sampled Triticeae

Orthologous coding sequences (cDNA) of one gene fragment from the chloroplast (MATK) and 26 nuclear gene fragments located on three different chromosomes (out of the seven chromosomes representative of Triticeae) were sequenced for each accession (Table 2; GenBank: HM539308-HM540073). Sequences of B. distachyon were obtained from the US Department of Energy Joint Genome Institute (http://www.jgi.doe.gov/). Sequences of Z. mays and O. sativa were obtained from GenBank.

Table 2 Relevant phylogenetic and genomic parameters for sequenced loci

RNA Extraction, cDNA Synthesis, PCR Amplification and Sequencing

Total RNA was extracted from 100 mg of young leaves using the RNeasy Plant Mini Kit (Qiagen). cDNA was synthesized from this RNA using oligo-dT primers with the Reverse Transcription System kit (Promega), following the manufacturer's protocol. For each gene fragment a pair of primers was designed in conserved regions identified from alignments of barley and wheat ESTs (Additional file 1, Table S1). PCR amplification was performed on cDNA and amplification products were purified with the AMPure kit (Agencourt). Sanger sequencing was performed on amplicons with the same primers used in the PCR amplification. The BigDye Terminator v3.1 Cycle Sequencing Kit (Applied BioSystems) was used with 1.0 μl of BigDye v3.1 enzymatic reaction mix. The reactions were purified with the CleanSEQ kit (Agencourt) and separated on a 3130xl Genetic Analyzer (Applied BioSystems).

Orthology Determination and Location of Loci on the Triticeae Genome

Because no genome of a Triticeae species has been sequenced yet, the orthology of the 27 sequenced gene fragments was established indirectly. Two observations strongly suggest the uniqueness of our sequences in the genome. First, all the sequenced loci are present in single copy in rice and B. distachyon, two species whose complete genomes are available. Second, according to EST data, they are expressed in one copy in diploid barley and in three copies in hexaploid wheat, which suggests the existence of a single copy per genome. The probability that our sequences coexist with paralogs is therefore limited.

From the 26 sequenced nuclear loci, 21 were derived from the rice chromosome 1, known to be collinear with the wheat chromosome 3B [36–38]. We assumed that chromosomal locations are mostly conserved across Triticeae [36–40] and used the location of rice orthologs as a proxy of their chromosomal position. Moreover, we checked that using B. distachyon as reference did not alter the estimated physical position. The relative distance to the centromere of each gene fragment was then computed assuming that the chromosome fraction separating them from the centromere was proportional in rice chromosome 1 and Triticeae chromosome 3 [36, 38]. The centromere is located around 17 Mb from the telomere of the short arm in rice and 388 Mb in wheat [36, 41].

Wheat has a strong recombination gradient [41, 42], like other Triticeae species [39, 43, 44], which fits a positive exponential function from centromeres to telomeres [45]. The 21 loci located on chromosome 3 were thus suitable candidates to study the impact of recombination intensity in gene evolution. These loci were classified as centromeric (physical distance < 70% of chromosome arm), for which recombination is low, and telomeric (physical distance > 70% of chromosome arm), which concentrate most recombination events [45]. Due to the strong non-linear relationship between the physical and genetic map, the genetic distance along the chromosome was approximated according to reference [45]. Akhunov et al. [45] estimated the centimorgan per mega base (cM/Mb) ratio as a function of the percent of chromosome arm. To obtain the genetic distance in cM, we integrated the equation given in figure 1 in [45] and multiplied the result by the arm length:

(1)

where L is the length of the chromosome arm (388 Mb for the short arm and 437 Mb for the long arm; [41]) and x is the relative distance to the centromere. To follow the evolution of recombination along the chromosome, positive (respectively negative) distances were assigned to the long (respectively short) chromosome arm.
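The integration step can be illustrated with a minimal sketch. It assumes an exponential cM/Mb curve of the form r(u) = A·exp(B·u), where u is the relative distance to the centromere (0 to 1). The coefficients `A` and `B` below are hypothetical placeholders, not the fitted values from reference [45]; the function names are also illustrative. The arm lengths and the 70% threshold are those quoted in the text.

```python
import math

# Hypothetical coefficients of an exponential recombination curve
# r(u) = A * exp(B * u), in cM/Mb. These are placeholders, NOT the
# values fitted in reference [45].
A = 0.01
B = 5.0

# Arm lengths (Mb) quoted in the text for wheat chromosome 3B.
ARM_LENGTH_MB = {"short": 388.0, "long": 437.0}


def genetic_distance_cm(x, arm):
    """Signed genetic distance (cM) at relative distance x from the centromere.

    The rate curve is integrated analytically from 0 to x and multiplied
    by the arm length. Long-arm positions are positive and short-arm
    positions negative, following the sign convention in the text.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be a fraction of the arm between 0 and 1")
    integral = (A / B) * (math.exp(B * x) - 1.0)
    distance = ARM_LENGTH_MB[arm] * integral
    return distance if arm == "long" else -distance


def region(x, threshold=0.70):
    """Classify a locus as centromeric or telomeric from its relative position."""
    return "telomeric" if x > threshold else "centromeric"
```

With any increasing curve of this kind, most of the genetic distance accumulates in the distal part of each arm, which is why a fixed physical threshold separates low- and high-recombination loci.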

In addition to the 21 nuclear loci located on chromosome 3, two loci corresponding to the hardness gene (PinA and PinB; [46]), one gene fragment corresponding to a eukaryotic initiation factor involved in translational regulation (eIFiso4E), and two gene fragments involved in the carotenoid biosynthetic pathway (CRTISO and PSY2; [36, 47]), were sequenced. Positions of loci PinA and PinB were obtained from published data [46]. Positions of eIFiso4E and CRTISO were inferred from synteny with rice. The position of PSY2 is undetermined.

Individual Gene Trees

Raw sequence data were aligned with the Staden Package [48] and the resulting alignments were manually corrected. Alignments for individual loci were analyzed using maximum likelihood (ML) and Bayesian approaches. ML analyses were conducted using the best-fitting model of sequence evolution, selected with the Akaike Information Criterion (AIC) in ModelTest 3.7 [49] (Table 3). PAUP* 4.0b10 [50] was used to obtain the highest-likelihood phylogenetic trees (heuristic search with neighbor-joining starting tree, tree bisection-reconnection branch swapping and 100 bootstrap replicates). Bayesian analyses were performed with MrBayes 3.1.2 [51, 52] with the following priors: Dirichlet priors (1,1,1,1) for base frequencies and (1,1,1,1,1) for General Time Reversible (GTR) parameters scaled to the G-T rate, uniform (0.05, 50) and (0, 1) priors for the gamma (Γ) shape parameter and the proportion of invariable sites (I), respectively, and an exponential (10.0) prior for branch lengths. Metropolis-coupled Markov Chain Monte Carlo analyses (MCMCMC) were run with random starting trees and five simultaneous, sequentially heated independent chains. Analyses were run for 10,000,000 generations. We used the BPCOMP program implemented in PhyloBayes 2.3c [53] to check the convergence of the chains (i.e., the maximum difference [maxdiff] between posterior probabilities attached to the same clade as evidenced by independent chains is < 0.10). A burn-in was discarded after identifying the stationary phase.

Table 3 Best-fitting model of sequence evolution for each locus

Multigenic Trees and Network

We obtained a multigenic, supermatrix tree by concatenating alignments of all 27 loci (24,652 bp). ML analysis of this supermatrix was performed in the same way as individual locus alignments using a GTR+Γ+I model. Bayesian inference was performed by partitioning the concatenate alignment on the basis of individual loci using a GTR+Γ+I model. Chains were run for 10,000,000 generations.

Multigenic, Bayesian Concordance Factors (BCF) were estimated using BUCKy 1.3.1 [54]. BCF estimates the degree of conflict of individual gene trees and accounts for all biological processes resulting in different phylogenies (e.g., introgression, incomplete lineage sorting). We summarized information from five MCMCMC chains obtained in individual locus analyses with MrBayes and removed 50% of the samples from each chain as burn-in. Then, BCF were estimated using six a priori levels of discordance among loci (α = 0.1, 0.5, 1, 5, 10 and 100) and 1,000,000 generations in each run.

Supermatrix and BCF analyses provide powerful means of using the evidence from all characters in the final estimation of the phylogenetic tree [55]. However, they implicitly assume that species split in a tree-like manner, which would not be the case when hybridization and/or lineage sorting have played an important role in the history of a group, as seems to be the case in Triticeae. To identify regions of the phylogeny of Triticeae that have not evolved in a tree-like manner, we constructed a multigenic network summarizing information conveyed by individual gene trees. The 27 gene trees were modified using the PhySIC_IST preprocess of source trees [56]. This preprocess aims at reducing source tree conflicts by eliminating a topological resolution when it is significantly less frequent in source trees than an alternative conflicting resolution. We applied a correction threshold of 0.9 to only keep strongly supported incongruences. Then, a network displaying all clades present in at least one among the modified gene trees was computed using the Cass algorithm [57] implemented in Dendroscope 2 [58], inputted with the Z-closure of the modified trees [59].

Incongruence Quantification

The level of incongruence among individual gene trees and the two multigenic trees was first assessed by Shimodaira and Hasegawa tests [60]. The Shimodaira and Hasegawa test, based on sequence alignments, was used to compare majority-rule consensus tree topologies obtained with PAUP* for individual genes and the topologies of the supermatrix and BUCKy trees. Polytomies were randomly resolved by bipartitions using the multi2di function implemented in the APE package [61] of R 2.9.1 [62]. This was done because polytomies were strongly penalized in the log-likelihood score. Indeed, if polytomies are left unresolved it is not possible to determine whether significance of Shimodaira and Hasegawa tests was due to the fact that an alternative topology was more (or less) likely than a given tested topology or simply because it was more (respectively less) resolved. In all cases, when polytomies were kept the supermatrix and BUCKy trees resulted in higher log-likelihoods, consistent with the fact they are fully resolved. Shimodaira and Hasegawa tests were run using a GTR+Γ+I model in the BASEML program implemented in PAML 4.1 [63].
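The random resolution of polytomies can be sketched as follows. This is an illustrative stand-in written in Python, not the APE `multi2di` implementation: trees are represented as nested tuples of taxon names (an assumption of this sketch), and any node with more than two children is resolved by repeatedly joining two randomly chosen children.

```python
import random


def resolve_polytomies(tree, rng):
    """Return a binary version of `tree`, resolving polytomies at random.

    A tree is either a taxon name (str) or a tuple of subtrees.
    """
    if isinstance(tree, str):
        return tree
    children = [resolve_polytomies(child, rng) for child in tree]
    # Join two random children until only two remain at this node.
    while len(children) > 2:
        i, j = sorted(rng.sample(range(len(children)), 2))
        right = children.pop(j)  # pop the larger index first
        left = children.pop(i)
        children.append((left, right))
    return children[0] if len(children) == 1 else tuple(children)


def leaves(tree):
    """List the taxon names found in a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def is_binary(tree):
    """True if every internal node has exactly two children."""
    if isinstance(tree, str):
        return True
    return len(tree) == 2 and all(is_binary(child) for child in tree)
```

Because the resolution is random, the added branches carry no phylogenetic signal; they only make the compared topologies equally resolved.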

In addition, we used the χ2 test of the PhySIC_IST preprocess [56] to identify triplets of leaves observed in the multigenic trees that were strongly rejected by the 27 bootstrap gene-tree collections. A strong rejection was defined as follows: denoting R_s the set of triplets of a multigenic tree (supermatrix or BUCKy), and R_b the set of triplets of the 2,700 bootstrap gene trees (100 per locus), a triplet of R_s was said to be strongly rejected if it contradicted at least one triplet of R_b and failed the χ2 test described in [56] with a threshold of 0.9. Using this procedure we counted the number of strongly rejected triplets to which each taxon belongs.

To quantify the degree of incongruence between individual gene trees and the two multigenic trees, we defined a triplet-based distance between a given multigenic tree (T_s) and the forest (F_j) of 100 bootstrap trees obtained for locus j. Put simply, the triplet distance represents the percentage of triplets that are resolved differently by a multigenic tree (supermatrix or BUCKy) and a given gene tree. In order to separate the signal of this locus from potential stochastic errors, we focused on triplets that appeared in more than 50% of the trees in F_j. This threshold has the advantage of keeping one and only one resolution per group of three species. Setting the threshold at 60% does not qualitatively alter our results (results not shown).

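We denoted by n_same(T_s, F_j) the number of retained triplets of F_j that had the same resolution as T_s, and by n_diff(T_s, F_j) the number of retained triplets of F_j with a different resolution. We defined the distance between the tree T_s and the forest F_j, denoted d(T_s, F_j), as the triplet fit dissimilarity (1 minus the triplet fit similarity [64]) between the triplet set of T_s and the retained triplets of F_j:

$$ d(T_s, F_j) = 1 - \frac{n_{same}(T_s, F_j)}{n_{same}(T_s, F_j) + n_{diff}(T_s, F_j)} = \frac{n_{diff}(T_s, F_j)}{n_{same}(T_s, F_j) + n_{diff}(T_s, F_j)} \tag{2} $$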
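The triplet distance between a multigenic tree and a bootstrap forest can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the code used in the study: rooted trees are encoded as nested tuples of taxon names, each resolved triplet ab|c is stored as the set {a, b, c} mapped to its outgroup c, and retained triplets that are unresolved in the reference tree are skipped. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def leaf_set(tree):
    """Set of taxon names below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def triplets(tree):
    """Map each resolved triplet {a, b, c} to its outgroup taxon (ab|c -> c)."""
    taxa = leaf_set(tree)
    found = {}

    def visit(node):
        if isinstance(node, str):
            return
        child_sets = [leaf_set(child) for child in node]
        outside = taxa - frozenset().union(*child_sets)
        # a and b meet at this node; any taxon outside it is their outgroup.
        for set_a, set_b in combinations(child_sets, 2):
            for a in set_a:
                for b in set_b:
                    for c in outside:
                        found[frozenset((a, b, c))] = c
        for child in node:
            visit(child)

    visit(tree)
    return found


def retained_triplets(forest, threshold=0.5):
    """Keep resolutions found in more than `threshold` of the forest trees."""
    counts = Counter()
    for tree in forest:
        for item in triplets(tree).items():
            counts[item] += 1
    cutoff = threshold * len(forest)
    return {trip: out for (trip, out), n in counts.items() if n > cutoff}


def triplet_distance(reference_tree, forest, threshold=0.5):
    """Fraction of retained forest triplets resolved differently in the reference."""
    reference = triplets(reference_tree)
    same = diff = 0
    for trip, out in retained_triplets(forest, threshold).items():
        if trip not in reference:
            continue  # assumption: ignore triplets unresolved in the reference
        if reference[trip] == out:
            same += 1
        else:
            diff += 1
    return diff / (same + diff) if same + diff else float("nan")


# Toy example with four hypothetical taxa and a forest of three trees.
t_s = ((("A", "B"), "C"), "D")
alt = ((("A", "C"), "B"), "D")
print(triplet_distance(t_s, [alt, alt, t_s]))  # 0.25
```

In the toy example, only the triplet {A, B, C} is resolved differently by the majority of the forest, so one of four retained triplets conflicts with the reference tree.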

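Using similar procedures, we computed the triplet distance between all pairs of the 21 individual loci located on chromosome 3. We defined a triplet-based distance between each pair of forests F_i and F_j, where F_i and F_j were, respectively, the forests of 100 bootstrap trees obtained for loci i and j. As above, we focused on triplets that appeared in more than 50% of the trees of each forest in order to eliminate potential stochastic errors. Denoting by n_same(F_i, F_j) and n_diff(F_i, F_j) the numbers of triplets retained in both forests that had the same and a different resolution, respectively, the distance d(F_i, F_j) between F_i and F_j is defined as:

$$ d(F_i, F_j) = \frac{n_{diff}(F_i, F_j)}{n_{same}(F_i, F_j) + n_{diff}(F_i, F_j)} \tag{3} $$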

In this way, we obtained a symmetric distance matrix (M) with 21 rows and 21 columns, where each entry M_ij contained the triplet distance between loci i and j. This matrix was used in the analysis of gene-tree incongruence and recombination (see below).
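A minimal sketch of how such a pairwise matrix could be assembled is given below. It assumes that the retained triplets of each locus are already available as a dictionary mapping a set of three taxa to the outgroup taxon of the retained resolution, and that only triplets retained in both forests are compared. The data layout and the function names are assumptions of this sketch.

```python
def forest_distance(retained_i, retained_j):
    """Fraction of triplets retained in both forests that are resolved differently."""
    shared = retained_i.keys() & retained_j.keys()
    if not shared:
        return float("nan")
    diff = sum(1 for trip in shared if retained_i[trip] != retained_j[trip])
    return diff / len(shared)


def distance_matrix(retained_by_locus):
    """Build the symmetric locus-by-locus triplet distance matrix.

    Returns the sorted locus names and a list-of-lists matrix whose
    entry [i][j] is the distance between loci i and j.
    """
    loci = sorted(retained_by_locus)
    n = len(loci)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = forest_distance(retained_by_locus[loci[i]],
                                retained_by_locus[loci[j]])
            matrix[i][j] = matrix[j][i] = d
    return loci, matrix
```

Filling only the upper triangle and mirroring it guarantees symmetry and leaves zeros on the diagonal.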

Analyses of Patterns of Incongruence

In order to understand the origin of incongruences, we correlated triplet distances between individual loci and the multigenic trees (d(T_s, F_j) in equation 2) to relevant phylogenetic parameters, including alignment length, average evolutionary rate (estimated with the super-distance matrix method [65]), and shape parameter α of the gamma distribution (obtained in ML analyses of individual loci). We also tested if incongruences were positively correlated with recombination by using the 21 loci located on chromosome 3. This correlation is expected whatever the origin of tree incongruence. Indeed, following interspecific hybridization, recombination is necessary for genes of one species to introgress into the genome of the other species. Alternatively, because the effective population size is expected to be smaller in low than in high recombining regions [66, 67], coalescence is expected to be quicker and lineage sorting more complete when recombination is low. We thus tested if the triplet distance was lower in centromeric than in telomeric regions by fitting a quadratic regression of d(T_s, F_j) on the genetic distance. We performed the same analyses on the aforementioned phylogenetic parameters because recombination could affect incongruences indirectly through these parameters (e.g., higher evolutionary rates in high recombining regions).

In addition, we tested whether the distribution of incongruences differed significantly between centromeric and telomeric loci located on chromosome 3. To this end, we estimated the triplet distance per pair of loci by distinguishing chromosome arms (short, long) and regions (centromere, telomere). Note that we did not mix loci located on different arms. Then, we obtained the difference in medians of the two distributions. To test whether this difference was statistically significant, we performed 10,000 replicates by permuting loci on each arm and recalculated the difference in medians at each permutation. The median difference observed with the actual dataset was compared with those observed in the permuted datasets.
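The permutation procedure can be sketched as follows. This is a simplified illustration, not the Mathematica analysis used in the study: it assumes a precomputed pairwise distance matrix, compares pairs of loci from the same arm and the same region, shuffles the region labels among loci within each arm, and reports a one-sided P value. The one-sided choice and all names are assumptions of this sketch.

```python
import random
from itertools import combinations
from statistics import median


def median_difference(dist, arm, region):
    """Median telomeric pair distance minus median centromeric pair distance.

    Only pairs of loci on the same arm and in the same region are used.
    """
    pools = {"centromeric": [], "telomeric": []}
    for i, j in combinations(range(len(arm)), 2):
        if arm[i] == arm[j] and region[i] == region[j]:
            pools[region[i]].append(dist[i][j])
    return median(pools["telomeric"]) - median(pools["centromeric"])


def permutation_test(dist, arm, region, n_perm=10000, seed=0):
    """Permute region labels within each arm and compare median differences."""
    rng = random.Random(seed)
    observed = median_difference(dist, arm, region)
    hits = 0
    for _ in range(n_perm):
        shuffled = list(region)
        for a in sorted(set(arm)):
            idx = [k for k, name in enumerate(arm) if name == a]
            labels = [region[k] for k in idx]
            rng.shuffle(labels)  # loci never move from one arm to the other
            for k, label in zip(idx, labels):
                shuffled[k] = label
        if median_difference(dist, arm, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Shuffling labels within arms keeps the number of centromeric and telomeric loci per arm constant, so every permuted dataset yields the same number of pairs as the observed one.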

Finally, closely linked loci are more likely to share a common genealogical history than distant loci [13]. To test this hypothesis we constructed a matrix of genetic distance between pairs of loci for the 21 genes located on chromosome 3. We correlated this matrix with the matrix of pairwise incongruences (M_ij) and tested the significance of the correlation by performing 10,000 permutations of locus locations on each chromosome arm (avoiding permutation from one arm to another).
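A matrix correlation test of this kind can be sketched as follows. The sketch assumes signed genetic positions in cM for each locus, takes the absolute difference in position as the genetic distance between two loci, uses Spearman's rank correlation over all pairs, and permutes positions among loci within each arm. Whether pairs spanning both arms were included and whether the test was one-sided are not stated in the text, so both choices here are assumptions.

```python
import random
from itertools import combinations


def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def matrix_correlation(positions, incongruence):
    """Spearman's rho between pairwise genetic distance and incongruence."""
    pairs = list(combinations(range(len(positions)), 2))
    genetic = [abs(positions[i] - positions[j]) for i, j in pairs]
    conflict = [incongruence[i][j] for i, j in pairs]
    return pearson(ranks(genetic), ranks(conflict))


def permutation_test(positions, arm, incongruence, n_perm=10000, seed=0):
    """Permute locus positions within each arm and recompute rho each time."""
    rng = random.Random(seed)
    observed = matrix_correlation(positions, incongruence)
    hits = 0
    for _ in range(n_perm):
        permuted = list(positions)
        for a in sorted(set(arm)):
            idx = [k for k, name in enumerate(arm) if name == a]
            values = [positions[k] for k in idx]
            rng.shuffle(values)
            for k, value in zip(idx, values):
                permuted[k] = value
        if matrix_correlation(permuted, incongruence) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Permuting positions rather than matrix cells preserves the dependence among entries that share a locus, which is the reason a standard correlation test is not appropriate for distance matrices.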

Statistical analyses were performed with R 2.9.1 [62] and analysis of the distribution of incongruences in centromeric and telomeric loci was performed with Mathematica [68].

Results

Numerous Incongruences among Individual Gene Trees

The best models describing the evolution of individual loci are presented in Table 3 and the corresponding trees in Additional files 2-28, Figures S1-S27. Phylogenetic reconstructions using individual loci produce variable topologies. Often, relationships among genera and species are incongruent among individual loci. The positions of Pseudoroegneria and Hordeum are not stable among individual gene trees: in some cases, Pseudoroegneria branches in basal positions (e.g., LOC_Os01g01790, LOC_Os01g09300, LOC_Os01g24680, LOC_Os01g55530, LOC_Os01g56630, LOC_Os01g62900, eIFiso4E), whereas in other cases it branches within more recently diverging clades (e.g., LOC_Os01g11070, LOC_Os01g13200, LOC_Os01g39310, LOC_Os01g53720, LOC_Os01g61720, LOC_Os01g68770, LOC_Os01g73790, CRTISO, PinA, PSY2, MATK). Likewise, Hordeum, a genus thought to be one of the deepest among Triticeae [29, 30, 33], sometimes branches into quite terminal positions (LOC_Os01g01790, LOC_Os01g11070, LOC_Os01g21160, LOC_Os01g24680, LOC_Os01g37560, LOC_Os01g48720, LOC_Os01g53720, LOC_Os01g55530, LOC_Os01g56630, LOC_Os01g68770, LOC_Os01g70670, LOC_Os01g73790, PSY2, eIFiso4E, and CRTISO). Several other odd relationships involving different taxa are displayed by individual gene trees (Additional files 2-28, Figures S1-S27). In general, individual gene trees have shorter internal branches than terminal branches (i.e., low treeness). In addition, support values (bootstrap values and posterior probabilities) of deeper nodes are weaker than those of more recent nodes. Similar observations were made in previous studies [22].

Multigenic Analyses: a more Resolved Picture

The supermatrix tree obtained with the concatenation of all loci (~25 Kb) provides a much more resolved picture than individual gene trees. ML and Bayesian analyses are consistent and produce very similar trees. According to these trees, we distinguished 5 to 7 clades depending on posterior probability or bootstrap supporting values (Figure 1). The first divergent group within Triticeae is Psathyrostachys (clade I), followed by Hordeum (clade IIA) and Pseudoroegneria (clade IIB). The internal branches are quite short compared with the terminal branches, suggesting that cladogenesis occurred in rapid succession. Two well-supported clades diverge at this point. The first is formed by Australopyrum (clade IIIA), Henrardia and Eremopyrum bonaepartis (clade IIIB), and Agropyron and E. triticeum (clade IIIC). The second consists of Dasypyrum and Heteranthelium (clade IV), on the one hand, and Secale, Taeniatherum, Triticum and Aegilops (clade V), on the other hand.

Figure 1

Supermatrix phylogeny of Triticeae. Phylogenetic tree inferred with the concatenation of 27 loci (~25 Kb). Bootstrap values are given in percentage. Posterior probability is maximal (100%) for all nodes except one (indicated in brackets). Note that branch lengths of the outgroups are divided by 10 (dotted lines) in order to zoom in on Triticeae.

BUCKy retrieves a unique topology irrespective of the different a priori levels of incongruence (α varying from 0.1 to 100), although mean sample-wide concordance factors (i.e., the proportion of the dataset that supports a bipartition) diminish with increasing α (Figure 2, Table 4). There is agreement between the estimated sample-wide and the extrapolated genome-wide concordance factors (i.e., the proportion of the whole genome that agrees with a given bipartition) and both concordance factors are higher in terminal than in deeper branches. The BCF tree is congruent with the supermatrix tree in several respects. First, Psathyrostachys (clade I) and then Hordeum (clade IIA) are the first divergent genera within Triticeae. Second, clade V is retrieved although branching within this clade changes relative to the supermatrix tree: Secale and Taeniatherum branch together, T. monococcum branches sister to Ae. tauschii, and Ae. speltoides and Ae. longissima group together. Third, monophyly of clade III is confirmed in this analysis although with alternative branching: Henrardia and E. bonaepartis (clade IIIB) are the first divergent taxa, and Australopyrum (clade IIIA) branches sister to Agropyron and E. triticeum (clade IIIC). However, a major discrepancy between this tree and the supermatrix tree is worth noting: Pseudoroegneria does not group with Hordeum but sister to Dasypyrum. Consequently, Heteranthelium branches at the base of clade V and these two new inferred clades (Pseudoroegneria-Dasypyrum and Heteranthelium-clade V) are closely related to each other. Despite differences among the supermatrix and BUCKy trees, the resolution and support gained with multigenic approaches compared with single-locus analyses are remarkable. Differences among trees are mainly due to uncertainty in the position of Pseudoroegneria.

Figure 2

Primary concordance tree of Triticeae. Phylogenetic tree inferred with BUCKy. Splits are presented in branches. Concordance factors for splits are presented in Table 4. Clades named as in Figure 1.

Table 4 Primary concordance factors

The multigenic network displays most of the relationships present in the supermatrix and BUCKy trees (Figure 3). In addition, it points to the less resolved parts of the phylogeny that mainly correspond to nodes with low support. Psathyrostachys and Hordeum are the deepest genera of Triticeae. Their divergence occurs in a tree-like manner as do relationships within clade III. Note that the topology of clade III is the same between the multigenic network and the BUCKy tree. Uncertainties for inter-clade relationships mainly involve Dasypyrum and Heteranthelium but only few alternatives are proposed by the network analysis. Finally, branching of most species in clade V is quite variable and this instability is better taken into account by the network than by any of the multigenic trees (supermatrix or BUCKy). Overall, the network analysis reveals a general tree-like divergence history of the Triticeae with local episodes of reticulate evolution.

Figure 3

Multigenic network of Triticeae. Network obtained from the 27 individual gene trees modified with PhySIC_IST [56] using a correction threshold of 0.9 (see details in Methods).

Patterns of Incongruence among Trees

One of the most puzzling results obtained in this study is the numerous incongruences among individual gene trees. In most cases, Shimodaira and Hasegawa tests confirm that, regarding a given locus alignment, the corresponding gene tree has a significantly higher log-likelihood than that of other individual gene trees. However, in most cases, differences between individual gene trees and the two multigenic trees (supermatrix and BUCKy) are not statistically significant (Table 5). This suggests that the splitting histories of species lineages depicted by the supermatrix or BUCKy trees are reasonable compromises of individual gene tree scenarios.

Table 5 Shimodaira and Hasegawa tests among individual gene trees and the two multigenic trees (supermatrix and BUCKy)

In order to quantify topological incongruences, we estimated triplet distances among individual gene trees, as well as between each individual gene tree and the two multigenic trees, as described in Incongruence Quantification in the Methods section. The average triplet distance between individual gene trees (in absolute value) is 0.53 ± 0.14 (mean ± SD; range: 0.10-0.88). The average triplet distance between individual gene trees and the supermatrix tree is 0.21 ± 0.10 (0.08-0.50) and the average triplet distance between individual gene trees and the BUCKy tree is 0.25 ± 0.09 (0.07-0.43). In addition, we counted, for each accession, the number of triplets that are observed in individual gene trees and strongly rejected by the two multigenic trees. Excepting the two Psathyrostachys accessions, all other taxa are involved in several strongly rejected triplets (Table 6). Pseudoroegneria is often the most incongruent genus (13% or 24% of all incongruences according to the BUCKy or supermatrix tree, respectively). Whereas Aegilops/Triticum species are not especially incongruent with the supermatrix tree (16%), they are highly incongruent with the BUCKy tree (35%). Conversely, Hordeum is involved in many incongruent triplets when the supermatrix tree is used (20%) but is only involved in as few as 1% of incongruent triplets when the BUCKy tree is considered. Results are similar when we remove the outgroups (Zea, Oryza and Brachypodium) and re-root the trees with Psathyrostachys (results not shown). This demonstrates that incongruences are not due to the difficult positioning of the root of the trees.

Table 6 Number of incongruent strongly rejected triplets per accession
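
As an illustration of the triplet distance used above, the following minimal sketch computes the fraction of three-leaf subsets that two rooted trees resolve differently. It is not the pipeline used in this study: the nested-tuple tree encoding, the function names and the toy trees with leaves A to D are assumptions made for the example, and a triplet that is resolved in one tree but unresolved in the other is counted here as a difference.

```python
from itertools import combinations


def leaf_paths(tree, path=(), out=None):
    """Map each leaf name to the tuple of child indices leading to it from the root."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = path
    else:
        for i, child in enumerate(tree):
            leaf_paths(child, path + (i,), out)
    return out


def lca_depth(p, q):
    """Depth of the last common ancestor, i.e. the length of the shared path prefix."""
    depth = 0
    for x, y in zip(p, q):
        if x != y:
            break
        depth += 1
    return depth


def triplet_outgroup(paths, a, b, c):
    """Return the leaf that branches off first in the triplet, or None if unresolved."""
    d_ab = lca_depth(paths[a], paths[b])
    d_ac = lca_depth(paths[a], paths[c])
    d_bc = lca_depth(paths[b], paths[c])
    if d_ab > d_ac and d_ab > d_bc:
        return c
    if d_ac > d_ab and d_ac > d_bc:
        return b
    if d_bc > d_ab and d_bc > d_ac:
        return a
    return None  # the three leaves descend from a polytomy


def triplet_distance(tree1, tree2):
    """Fraction of shared three-leaf subsets resolved differently by the two trees."""
    p1, p2 = leaf_paths(tree1), leaf_paths(tree2)
    shared = sorted(set(p1) & set(p2))
    triples = list(combinations(shared, 3))
    if not triples:
        raise ValueError("the trees must share at least three leaves")
    differing = sum(
        triplet_outgroup(p1, *t) != triplet_outgroup(p2, *t) for t in triples
    )
    return differing / len(triples)


# Toy rooted trees (hypothetical, not trees from this study).
gene_tree = ((("A", "B"), "C"), "D")
multigenic_tree = ((("A", "C"), "B"), "D")
print(triplet_distance(gene_tree, multigenic_tree))
```

In this toy case only the triplet {A, B, C} is resolved differently, so one of the four possible triplets is incongruent.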

The Effect of Recombination on Incongruences

We performed several tests to understand the origin of incongruences among gene trees. First, we tested whether variation in incongruence could be explained by the nature of the phylogenetic signal. After correlating the triplet distance between individual gene trees and the two multigenic trees (supermatrix and BUCKy) with relevant phylogenetic parameters per locus, we detected only one significant relationship: a positive correlation between the average evolutionary rate and the triplet distance separating individual gene trees from the supermatrix tree (Spearman's rho = 0.42, P = 0.03). The corresponding correlation with the BUCKy tree was not significant (rho = 0.21, P = 0.28). Next, we investigated the effect of recombination. Recombination does not significantly affect any phylogenetic parameter (P > 0.5 in all correlations; results not shown). In contrast, it affects incongruences in three ways. First, incongruences are significantly greater in telomeric than in centromeric loci (respective medians = 0.511 and 0.429, P = 0.028; Figure 4A). Second, loci located close together on the chromosome tend to have more similar genealogical histories than distant loci (Figure 4B; rho = 0.22, P = 0.026). Third, although the statistical support is weak (possibly because of the small number of genes we sampled), incongruences between individual gene trees and the two multigenic trees tend to increase with the genetic distance between pairs of loci, that is, with the likelihood of recombination (P = 0.09 for the supermatrix tree; P = 0.10 for the BUCKy tree; in both cases we removed one potential outlier; Figure 5). Note that qualitatively similar patterns are observed when using a more restrictive cut-off to retain incongruences (triplets that appeared in more than 60% of the 100 bootstrap trees, rather than 50%), although statistical significance is lost (results not shown). Again, this is possibly due to the limited number of genes we sampled.
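
The rank correlations reported above can be illustrated with a short sketch of Spearman's rho, computed as the Pearson correlation of average ranks. This is only an illustration under stated assumptions: the function names are hypothetical, the input values are made up for the example and are not data from this study, and no P value is computed.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustration: genetic distance between locus pairs (cM) versus
# the triplet distance between their gene trees.
genetic_distance = [1.0, 2.0, 3.0, 4.0, 5.0]
pair_triplet_distance = [0.42, 0.40, 0.55, 0.50, 0.61]
print(spearman_rho(genetic_distance, pair_triplet_distance))
```

A rank-based coefficient is a natural choice here because triplet distances are bounded between 0 and 1 and need not vary linearly with genetic distance.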

Figure 4

Effect of recombination on incongruences. A. Incongruence level of loci located in centromeres (filled bars) and telomeres (open bars). B. Correlation between the triplet distance and genetic distance between pairs of loci. Only loci located on chromosome 3 are depicted.

Figure 5

Effect of recombination on incongruences. Relationship between the triplet distance between individual gene trees and the two multigenic trees (supermatrix tree in A; BUCKy tree in B) as a function of the genetic distance between genes located on chromosome 3. The triplet distance between individual gene trees and the multigenic trees is the percentage of triplets of accessions that were resolved differently by a multigenic tree and a given gene tree. Solid line: best fit using all points; dashed line: best fit without a potential outlier (filled point). The genetic distance is connected to the chromosomal position according to the schematic diagram presented in C (red point: centromere; dark blue: centromeric regions; light blue: telomeric regions). D. Degree of incongruence among pairs of loci relative to the genetic distance on chromosome 3. Colors represent the degree of incongruence (white: no incongruence; red: strongest incongruence).

Discussion

Multigenic Phylogeny of Triticeae

To date, morphological and molecular analyses have failed to infer a reliable phylogeny of the Triticeae. Many previous phylogenetic reconstructions were based on a limited number of genes, in most cases only one (see references above). The many conflicts among published trees, combined with poor resolution of branching among genera and species, prevented a clear picture of the relationships among members of this tribe from emerging. Moreover, it was not possible to conclude whether reticulate evolution is the dominant rule in this tribe, so that reconstructing a resolved phylogeny is hopeless, or whether multigenic approaches could solve the problem. Thanks to the largest dataset used so far in this tribe (27 genes), we show that combining information from several loci located on different chromosomes and in different cellular compartments (nucleus and chloroplast) enables the identification of major clades. Although the branching position of some groups remains uncertain (e.g., Pseudoroegneria), the two multigenic trees and the multigenic network enabled us to resolve most parts of the Triticeae phylogeny.

Some incongruences persist in our analyses. They are represented in the phylogeny by low support values (bootstrap, posterior support and concordance factors) and are summarized within the multigenic network. Figures 1, 2 and 3 show that Psathyrostachys branches sister to the remaining Triticeae, followed by the sequential branching of Hordeum. Some previously published phylogenies recognized the early divergence of Psathyrostachys and Hordeum, including nuclear [23, 27, 33] and chloroplastic DNA analyses [29, 30], although many other studies disagreed [24, 26, 28, 32, 33]. Here, Psathyrostachys is involved in no incongruence (Table 6) and the branch leading to the rest of the tribe is among the longest internal branches. This clearly indicates that this genus is the sister group of all other Triticeae.

Our dataset does not allow us to resolve the position of Pseudoroegneria. No previous study has raised the possibility that it branches with Hordeum, as proposed by the supermatrix tree, and only one study suggested that Pseudoroegneria could group with Dasypyrum, as proposed by the BUCKy tree, although support for this relationship was low [69]. Other studies proposed that Pseudoroegneria could branch sister to Taeniatherum and/or Australopyrum [23, 24, 33], Heteranthelium [26] or Aegilops [32]. In other cases its branching pattern was unstable [22, 29] and it was even considered paraphyletic [30, 70]. Consistent with the difficult positioning of this genus, the supermatrix tree groups it with Hordeum with rather weak bootstrap support (0.69) but maximal posterior probability (1.00), in conflict with the BUCKy tree. More strikingly, the three Pseudoroegneria accessions are, in general, involved in more incongruent triplets than other accessions (Table 6). This could be due to a strong capacity for introgression during the divergence of this group and/or a large ancestral population size. In agreement with the first hypothesis, Pseudoroegneria and Hordeum hybridized and were at the origin of the species-rich allotetraploid genus Elymus [23, 30]. If recent interspecific introgression were responsible for the incongruence pattern we observed, it would likely proceed via polyploids because all genera investigated here are currently intersterile [71–74]. Polyploids could serve as bridges for gene exchange between diploid species [22]. Alternatively, if rapid radiation followed by incomplete lineage sorting were the main factor contributing to incongruence, one would expect deep branches to be shorter and less supported than external branches. This is basically what we observe in individual gene trees and the supermatrix tree. The relatively strong support obtained in the supermatrix analysis is due to the accumulation of phylogenetic signal when concatenating all genes. Hence, it could be that both factors, hybridization and incomplete lineage sorting of ancestral polymorphism, contributed to the pattern of incongruence and the unstable positioning of Pseudoroegneria.

The phylogenetic positions of all other genera and species varied in previous studies and no consensus emerged. In the present study we find strong support for clades III and V, although the branching order varies depending on the multigenic tree. Overall, our results suggest rapid radiation following, or concomitant with, the divergence of clade III (grouping Agropyron, Australopyrum, Eremopyrum and Henrardia).

Incongruence and Recombination

Our results provide strong evidence of incongruence among individual gene trees, revealing a complex biological reality in which different portions of the genome have their own evolutionary histories. We analyze the pattern of incongruence and pinpoint the role of recombination and gene location in shaping it. We show that physically close loci are more likely to share a common history than distant loci. More interestingly, loci located in centromeric regions tend to be more congruent with one another than loci located in telomeric regions. A similar correlation was found in Drosophila at the kilobase scale, the scale of linkage disequilibrium in this group [13]. In contrast, the mosaics of conflicting genealogies observed in Oryza (rice) were randomly distributed across the genome [75]. It may seem surprising that the correlation we observe holds at the scale of the whole chromosome 3 (~1 Gb; [38]). Several non-exclusive reasons could explain this pattern. First, the recombination gradient along chromosomes is very steep in all Triticeae, including wheat [42, 43, 45], rye [76] and Aegilops speltoides [44]. For instance, along chromosome 3B in bread wheat (Triticum aestivum), the cM/Mb ratio spans about two orders of magnitude (from 0.01 to 0.85; [41]). Accordingly, in bread and durum wheat (T. aestivum and T. durum, respectively), linkage disequilibrium decays slowly over several cM [77]. Despite the impressive chromosome size, linkage disequilibrium could be high in centromeric regions because recombination is strongly reduced (see [78] for a study on chromosome 3B). However, the level of linkage disequilibrium is low in barley [79]. This discrepancy highlights the need for studying linkage disequilibrium patterns in Triticeae in more detail. Second, centromeric genes may have a lower local effective size than telomeric genes because of hitchhiking effects due to the lack of recombination [66, 67]. In agreement with this prediction, the level of diversity correlates positively with the proxy of recombination in Aegilops: the RFLP polymorphism is 1.5 to 25 times higher in telomeric than in centromeric regions [80]. Consequently, ancient polymorphisms would be less completely sorted in genes located in regions of high recombination than in regions of low recombination. Finally, recombination could play an important role in introgressive events between species. For instance, genes located in regions of high recombination would introgress more easily than genes located in regions of low recombination.

There is no straightforward way to distinguish whether the overall pattern of incongruence in Triticeae is produced by incomplete lineage sorting or a form of introgression (e.g., gene flow proceeding via polyploids). Methods that enable introgression and incomplete lineage sorting to be distinguished [81] require the estimation of population sizes and divergence times for all branches of the species phylogeny. This information cannot be obtained in Triticeae without making strong assumptions. More knowledge about population parameters and divergence times is necessary to distinguish between these two sources of tree incongruence in this tribe.

Conclusions

Our study makes two contributions to research in Triticeae in particular, and to the broader phylogenetic community in general. First, we show that in spite of strong tree conflicts, not all clades of Triticeae are affected by introgression and/or incomplete lineage sorting. Notably, Psathyrostachys, Hordeum and genera in clade III (including Agropyron, Australopyrum, Eremopyrum and Henrardia) diverged in a tree-like manner, a result that was not supported by previous studies. Because the evolution of Pseudoroegneria and genera in clades IV (Dasypyrum and Heteranthelium) and V (Secale, Taeniatherum, Triticum and Aegilops) is more reticulated than in other clades, the multigenic network better reflects their phylogenetic history than do the supermatrix or BUCKy trees. Second, we show that recombination could be an important evolutionary force in exacerbating the level of incongruence among gene trees. It would be worthwhile to estimate the recombination frequency of genes used in future phylogenetic studies in order to assess the generality of the pattern previously observed in Drosophila and now evidenced in Triticeae.

References

  1. Pamilo P, Nei M: Relationships between gene trees and species trees. Molecular Biology and Evolution. 1988, 5 (5): 568-583.

  2. Maddison WP: Gene trees in species trees. Systematic Biology. 1997, 46: 523-536. 10.1093/sysbio/46.3.523.

  3. Hudson RR, Coyne JA: Mathematical consequences of the genealogical species concept. Evolution. 2002, 56: 1557-1565.

  4. Ane C, Sanderson MJ: Missing the forest for the trees: phylogenetic compression and its implications for inferring complex evolutionary histories. Systematic Biology. 2005, 54 (1): 146-157. 10.1080/10635150590905984.

  5. Degnan JH, Rosenberg NA: Discordance of species trees with their most likely gene trees. PLoS Genetics. 2006, 2: e68. 10.1371/journal.pgen.0020068.

  6. Maddison WP, Knowles LL: Inferring phylogeny despite incomplete lineage sorting. Systematic Biology. 2006, 55: 21-30. 10.1080/10635150500354928.

  7. Edwards SV: Is a new and general theory of molecular systematics emerging?. Evolution. 2009, 63 (1): 1-19. 10.1111/j.1558-5646.2008.00549.x.

  8. Kubatko LS, Degnan JH: Inconsistency of phylogenetic estimates from concatenated data under coalescence. Systematic Biology. 2007, 56 (1): 17-24. 10.1080/10635150601146041.

  9. Degnan JH, Rosenberg NA: Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends in Ecology & Evolution. 2009, 24 (6): 332-340. 10.1016/j.tree.2009.01.009.

  10. Knowles LL: Estimating species trees: methods of phylogenetic analysis when there is incongruence across genes. Systematic Biology. 2009, 58 (5): 463-467. 10.1093/sysbio/syp061.

  11. Rieseberg LH, Baird SJE, Gardner KA: Hybridization, introgression, and linkage evolution. Plant Molecular Biology. 2000, 42: 205-224. 10.1023/A:1006340407546.

  12. Eckert AJ, Carstens BC: Does gene flow destroy phylogenetic signal? The performance of three methods for estimating species phylogenies in the presence of gene flow. Molecular Phylogenetics and Evolution. 2008, 49 (3): 832-842. 10.1016/j.ympev.2008.09.008.

  13. Pollard DA, Iyer VN, Moses AM, Eisen MB: Widespread discordance of gene trees with species tree in Drosophila: evidence for incomplete lineage sorting. PLoS Genetics. 2006, 2 (10): 1634-1647.

  14. Whitfield JB, Lockhart PJ: Deciphering ancient rapid radiations. Trends in Ecology and Evolution. 2007, 22: 258-265. 10.1016/j.tree.2007.01.012.

  15. Dutheil JY, Ganapathy G, Hobolth A, Mailund T, Uyenoyama MK, Schierup MH: Ancestral population genomics: the coalescent hidden Markov model approach. Genetics. 2009, 183 (1): 259-274. 10.1534/genetics.109.103010.

  16. Page RDM, Charleston MA: From gene to organismal phylogeny: reconciled trees and the gene tree species tree problem. Molecular Phylogenetics and Evolution. 1997, 7: 231-240. 10.1006/mpev.1996.0390.

  17. Doolittle WF, Bapteste E: Pattern pluralism and the Tree of Life hypothesis. Proceedings of the National Academy of Sciences of the United States of America. 2007, 104 (7): 2043-2049. 10.1073/pnas.0610699104.

  18. Bapteste E, Susko E, Leigh J, Ruiz-Trillo I, Bucknam J, Doolittle WF: Alternative methods for concatenation of core genes indicate a lack of resolution in deep nodes of the prokaryotic phylogeny. Molecular Biology and Evolution. 2008, 25 (1): 83-91.

  19. Linder CR, Rieseberg LH: Reconstructing patterns of reticulate evolution in plants. American Journal of Botany. 2004, 91 (10): 1700-1708. 10.3732/ajb.91.10.1700.

  20. Galtier N: A model of horizontal gene transfer and the bacterial phylogeny problem. Systematic Biology. 2007, 56 (4): 633-642. 10.1080/10635150701546231.

  21. Galtier N, Daubin V: Dealing with incongruence in phylogenomic analyses. Philosophical Transactions of the Royal Society B: Biological Sciences. 2008, 363 (1512): 4023-4029. 10.1098/rstb.2008.0144.

  22. Kellogg EA, Appels R, Mason-Gamer RJ: When genes tell different stories: the diploid genera of Triticeae (Gramineae). Systematic Botany. 1996, 21 (3): 321-347. 10.2307/2419662.

  23. Mason-Gamer RJ: Origin of North American Elymus (Poaceae: Triticeae) allotetraploids based on granule-bound starch synthase gene sequences. Systematic Botany. 2001, 26 (4): 757-768.

  24. Petersen G, Seberg O: Molecular evolution and phylogenetic application of DMC1. Molecular Phylogenetics and Evolution. 2002, 22 (1): 43-50. 10.1006/mpev.2001.1011.

  25. Helfgott DM, Mason-Gamer RJ: The evolution of North American Elymus (Triticeae, Poaceae) allotetraploids: evidence from phosphoenolpyruvate carboxylase gene sequences. Systematic Botany. 2004, 29 (4): 850-861. 10.1600/0363644042451017.

  26. Mason-Gamer RJ: The β-amylase genes of grasses and a phylogenetic analysis of the Triticeae (Poaceae). American Journal of Botany. 2005, 92 (6): 1045-1058. 10.3732/ajb.92.6.1045.

  27. Kellogg EA, Appels R: Intraspecific and interspecific variation in 5S RNA genes are decoupled in diploid wheat relatives. Genetics. 1995, 140: 325-343.

  28. Hsiao C, Chatterton NJ, Asay KH, Jensen KB: Phylogenetic relationships of the monogenomic species of the wheat tribe Triticeae (Poaceae), inferred from nuclear rDNA (internal transcribed spacer) sequences. Genome. 1995, 38: 211-223. 10.1139/g95-026.

  29. Petersen G, Seberg O: Phylogenetic analysis of the Triticeae (Poaceae) based on rpoA sequence data. Molecular Phylogenetics and Evolution. 1997, 7 (2): 217-230. 10.1006/mpev.1996.0389.

  30. Mason-Gamer RJ, Orme NL, Anderson CM: Phylogenetic analysis of North American Elymus and the monogenomic Triticeae (Poaceae) using three chloroplast DNA data sets. Genome. 2002, 45: 991-1002. 10.1139/g02-065.

  31. Yamane K, Kawahara T: Intra- and interspecific phylogenetic relationships among diploid Triticum-Aegilops species (Poaceae) based on base-pair substitutions, indels, and microsatellites in chloroplast noncoding sequences. American Journal of Botany. 2005, 92 (11): 1887-1898. 10.3732/ajb.92.11.1887.

  32. Seberg O, Frederiksen S: A phylogenetic analysis of the monogenomic Triticeae (Poaceae) based on morphology. Botanical Journal of the Linnean Society. 2001, 136 (1): 75-97. 10.1111/j.1095-8339.2001.tb00557.x.

  33. Petersen G, Seberg O, Yde M, Berthelsen K: Phylogenetic relationships of Triticum and Aegilops and evidence for the origin of the A, B, and D genomes of common wheat (Triticum aestivum). Molecular Phylogenetics and Evolution. 2006, 39: 70-82. 10.1016/j.ympev.2006.01.023.

  34. Bouchenak-Khelladi Y, Salamin N, Savolainen V, Forest F, Bank M, Chase MW, Hodkinson TR: Large multi-gene phylogenetic trees of the grasses (Poaceae): progress towards complete tribal and generic level sampling. Molecular Phylogenetics and Evolution. 2008, 47: 488-505. 10.1016/j.ympev.2008.01.035.

  35. International Brachypodium Initiative: Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature. 2010, 463: 763-768. 10.1038/nature08747.

  36. Sorrells ME, La Rota M, Bermudez-Kandianis CE, Greene RA, Kantety R, Munkvold JD, Miftahudin, Mahmoud A, Ma X, Gustafson PJ, et al: Comparative DNA sequence analysis of wheat and rice genomes. Genome Research. 2003, 13: 1818-1827.

  37. Munkvold JD, Greene RA, Bermudez-Kandianis CE, La Rota CM, Edwards H, Sorrells SF, Dake T, Benscher D, Kantety R, Linkiewicz AM, et al: Group 3 chromosome bin maps of wheat and their relationship to rice chromosome 1. Genetics. 2004, 168 (2): 639-650. 10.1534/genetics.104.034819.

  38. Paux E, Sourdille P, Salse J, Saintenac C, Choulet F, Leroy P, Korol A, Michalak M, Kianian S, Spielmeyer W, et al: A physical map of the 1-gigabase bread wheat chromosome 3B. Science. 2008, 322 (5898): 101-104. 10.1126/science.1161847.

  39. Dubcovsky J, Luo MC, Zhong GY, Bransteitter R, Desai A, Kilian A, Kleinhofs A, Dvorak J: Genetic map of diploid wheat, Triticum monococcum L, and its comparison with maps of Hordeum vulgare L. Genetics. 1996, 143 (2): 983-999.

  40. Rota M, Sorrells ME: Comparative DNA sequence analysis of mapped wheat ESTs reveals the complexity of genome relationships between rice and wheat. Functional and Integrative Genomics. 2004, 4: 34-46. 10.1007/s10142-003-0098-2.

  41. Saintenac C, Falque M, Martin OC, Paux E, Feuillet C, Sourdille P: Detailed recombination studies along chromosome 3B provide new insights on crossover distribution in wheat (Triticum aestivum L.). Genetics. 2009, 181: 393-403.

  42. Akhunov ED, Akhunova AR, Linkiewicz AM, Dubcovsky J, Hummel D, Lazo G, Chao SM, Anderson OD, David J, Qi LL, et al: Synteny perturbations between wheat homoeologous chromosomes caused by locus duplications and deletions correlate with recombination rates. Proceedings of the National Academy of Sciences of the United States of America. 2003, 100 (19): 10836-10841. 10.1073/pnas.1934431100.

  43. Luo MC, Yang ZL, Kota RS, Dvorak J: Recombination of chromosomes 3A (m) and 5A (m) of Triticum monococcum with homeologous chromosomes 3A and 5A of wheat: the distribution of recombination across chromosomes. Genetics. 2000, 154 (3): 1301-1308.

  44. Luo MC, Deal KR, Yang ZL, Dvorak J: Comparative genetic maps reveal extreme crossover localization in the Aegilops speltoides chromosomes. Theoretical and Applied Genetics. 2005, 111: 1098-1106. 10.1007/s00122-005-0035-y.

  45. Akhunov ED, Goodyear AW, Geng S, Qi LL, Echalier B, Gill BS, Miftahudin, Gustafson JP, Lazo G, Chao SM, et al: The organization and rate of evolution of wheat genomes are correlated with recombination rates along chromosome arms. Genome Research. 2003, 13 (5): 753-763. 10.1101/gr.808603.

  46. Chantret N, Salse J, Sabot F, Rahman S, Bellec A, Laubin B, Dubois I, Dossat C, Sourdille P, Joudrier P, et al: Molecular basis of evolutionary events that shaped the hardness locus in diploid and polyploid wheat species (Triticum and Aegilops). Plant Cell. 2005, 17: 1033-1045. 10.1105/tpc.104.029181.

  47. Cenci A, Somma S, Chantret N, Dubcovsky J, Blanco A: PCR identification of durum wheat BAC clones containing genes coding for carotenoid biosynthesis enzymes and their chromosome localization. Genome. 2004, 47: 911-917. 10.1139/g04-033.

  48. Staden R, Judge DP, Bonfield JK: The Staden package, 1998. Methods in Molecular Biology. 2000, 132: 115-130.

  49. Posada D, Crandall KA: Modeltest: testing the model of DNA substitution. Bioinformatics. 1998, 14 (9): 817-818. 10.1093/bioinformatics/14.9.817.

  50. Swofford DL: PAUP*: Phylogenetic analysis using parsimony (*and other methods), version 4.0b10. 2003, Sunderland, Massachusetts: Sinauer Associates

  51. Huelsenbeck JP, Ronquist F: MrBayes: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.

  52. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19: 1572-1574. 10.1093/bioinformatics/btg180.

  53. Lartillot N, Philippe H: A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular Biology and Evolution. 2004, 21 (6): 1095-1109. 10.1093/molbev/msh112.

  54. Ane C, Larget B, Baum DA, Smith SD, Rokas A: Bayesian estimation of concordance among gene trees. Molecular Biology and Evolution. 2007, 24 (2): 412-426.

  55. De Queiroz A, Gatesy J: The supermatrix approach to systematics. Trends in Ecology & Evolution. 2006, 22 (1): 34-41.

  56. Scornavacca C, Berry V, Lefort V, Douzery EJP, Ranwez V: PhySIC_IST: cleaning source trees to infer more informative supertrees. BMC Bioinformatics. 2008, 9 (413): 1-17.

  57. van Iersel L, Kelk S, Rupp R, Huson D: Phylogenetic networks do not need to be complex: using fewer reticulations to represent conflicting clusters. Proceedings of the 18th Annual International Conference on Intelligent Systems for Molecular Biology (ISMB): 2010. 2010, Boston, MA: International Society for Computational Biology

  58. Huson DH, Richter DC, Rausch C, Dezulian T, Franz M, Rupp R: Dendroscope: an interactive viewer for large phylogenetic trees. BMC Bioinformatics. 2007, 8 (1): 460. 10.1186/1471-2105-8-460.

  59. Huson DH, Dezulian T, Kloepper T, Steel MA: Phylogenetic super-networks from partial trees. IEEE/ACM Transactions in Computational Biology and Bioinformatics. 2004, 1 (4): 151-158. 10.1109/TCBB.2004.44.

  60. Shimodaira H, Hasegawa M: Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Molecular Biology and Evolution. 1999, 16: 1114-1116.

  61. Paradis E, Claude J, Strimmer K: APE: analyses of phylogenetics and evolution in R language. Bioinformatics. 2004, 20: 289-290. 10.1093/bioinformatics/btg412.

  62. R Development Core Team: R: A language and environment for statistical computing. 2009, Vienna, Austria: R Foundation for Statistical Computing

  63. Yang Z: PAML 4: a program package for phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution. 2007, 24: 1586-1591. 10.1093/molbev/msm088.

  64. Page RD: Modified mincut supertrees. Proceedings of the Second International Workshop on Algorithms in Bioinformatics. 2002, London: Springer-Verlag, 537-552.

  65. Criscuolo A, Berry V, Douzery EJP, Gascuel O: SDM: a fast distance-based approach for (super) tree building in phylogenomics. Systematic Biology. 2006, 55 (5): 740-755. 10.1080/10635150600969872.

  66. Presgraves DC: Recombination enhances protein adaptation in Drosophila melanogaster. Current Biology. 2005, 15: 1651-1656. 10.1016/j.cub.2005.07.065.

  67. Charlesworth B: Effective population size and patterns of molecular evolution and variation. Nature Reviews Genetics. 2009, 10: 195-205.

  68. Wolfram S: The Mathematica Book. 1996, Cambridge, UK: Cambridge University Press

  69. Mason-Gamer RJ, Kellogg EA: Chloroplast DNA analysis of the monogenomic Triticeae: phylogenetic implications and genome-specific markers. Methods of Genome Analysis in Plants. Edited by: Jauhar PP. 1996, Boca Raton, Florida: CRC Press, 301-325.

  70. Mason-Gamer RJ: Allohexaploidy, introgression, and the complex phylogenetic history of Elymus repens (Poaceae). Molecular Phylogenetics and Evolution. 2008, 47: 598-611. 10.1016/j.ympev.2008.02.008.

  71. Wang RRC: An assessment of genome analysis based on chromosome pairing in hybrids of perennial Triticeae. Genome. 1989, 32: 179-189. 10.1139/g89-427.

  72. Fernández-Calvin B, Orellana J: Relationship between pairing frequencies and genome affinity estimations in Aegilops ovata × Triticum aestivum hybrid plants. Heredity. 1992, 68: 165-172. 10.1038/hdy.1992.25.

  73. Wang RRC: Genome relationships in the perennial Triticeae based on diploid hybrids and beyond. Hereditas. 1992, 116: 133-136.

  74. Waines JG, Barnhart D: Biosystematic research in Aegilops and Triticum. Hereditas. 1992, 116: 207-212.

  75. Zou XH, Zhang F-M, Zhang J-G, Zang L-L, Tang L, Wang J, Sang T, Ge S: Analysis of 142 genes resolves the rapid diversification of the rice genus. Genome Biology. 2008, 9: R49. 10.1186/gb-2008-9-3-r49.

  76. Lukaszewski AJ, Curtis CA: Physical distribution of recombination in B-genome chromosomes of tetraploid wheat. Theoretical and Applied Genetics. 1993, 84: 121-127.

  77. Somers DJ, Banks T, Depauw R, Fox S, Clarke J, Pozniak C, McCartney C: Genome-wide linkage disequilibrium analysis in bread wheat and durum wheat. Genome. 2007, 50: 557-567. 10.1139/G07-031.

  78. Horvath A, Didier A, Koenig J, Exbrayat F, Charmet G, Balfourier F: Analysis of diversity and linkage disequilibrium along chromosome 3B of bread wheat (Triticum aestivum L.). Theoretical and Applied Genetics. 2009, 119: 1523-1537. 10.1007/s00122-009-1153-8.

  79. Morrell PL, Toleno DM, Lundy KE, Clegg MT: Low levels of linkage disequilibrium in wild barley (Hordeum vulgare ssp. spontaneum) despite high rates of self-fertilization. Proceedings of the National Academy of Sciences of the United States of America. 2005, 102: 2442-2447. 10.1073/pnas.0409804102.

  80. Dvorak J, Luo MC, Yang ZL: Restriction fragment length polymorphism and divergence in the genomic regions of high and low recombination in self-fertilizing and cross-fertilizing Aegilops species. Genetics. 1998, 148: 423-434.

  81. Joly S, McLenachan PA, Lockhart PJ: A statistical approach for distinguishing hybridization and incomplete lineage sorting. American Naturalist. 2009, 174 (2): E54-E70. 10.1086/600082.

Acknowledgements

JSE and AC were funded by the Institut National de la Recherche Agronomique, postdoctoral grant program. This work was supported by the Institut National de la Recherche Agronomique (Tritipol initiative) and Agence Nationale de la Recherche (Exegese program ANR-05-BLANC-0258-01 to JD, and PhylAriane project ANR-08-EMER-011 to VR). This publication is contribution N°ISEM 2011-073 of the Institut des Sciences de l'Evolution de Montpellier (UMR 5554 - CNRS).

Author information

Corresponding author

Correspondence to Juan S Escobar.

Additional information

Authors' contributions

JSE performed analyses, discussed results and wrote the paper. CS performed analyses, discussed results and wrote the paper. AC designed experiments and obtained sequence data. CG obtained sequence data. SS designed experiments and obtained sequence data. EJPD discussed results and wrote the paper. VR performed analyses, discussed results and wrote the paper. SG coordinated the project, performed analyses, discussed results and wrote the paper. JD coordinated the project and discussed results. All authors read and approved the final version of the paper.

Electronic supplementary material

12862_2011_1788_MOESM1_ESM.PDF

Additional file 1: Primers used in the PCR amplification of each locus. Table S1 with the sequence of each primer used during PCR amplification. (PDF 94 KB)

12862_2011_1788_MOESM2_ESM.PDF

Additional file 2: Phylogenetic tree inferred with LOC_Os01g01790 sequences. Figure S1 showing the phylogenetic tree inferred with locus LOC_Os01g01790. (PDF 7 KB)

12862_2011_1788_MOESM3_ESM.PDF

Additional file 3: Phylogenetic tree inferred with LOC_Os01g09300 sequences. Figure S2 showing the phylogenetic tree inferred with locus LOC_Os01g09300. (PDF 7 KB)

12862_2011_1788_MOESM4_ESM.PDF

Additional file 4: Phylogenetic tree inferred with LOC_Os01g11070 sequences. Figure S3 showing the phylogenetic tree inferred with locus LOC_Os01g11070. (PDF 7 KB)

12862_2011_1788_MOESM5_ESM.PDF

Additional file 5: Phylogenetic tree inferred with LOC_Os01g13200 sequences. Figure S4 showing the phylogenetic tree inferred with locus LOC_Os01g13200. (PDF 7 KB)

12862_2011_1788_MOESM6_ESM.PDF

Additional file 6: Phylogenetic tree inferred with LOC_Os01g19470 sequences. Figure S5 showing the phylogenetic tree inferred with locus LOC_Os01g19470. (PDF 7 KB)

12862_2011_1788_MOESM7_ESM.PDF

Additional file 7: Phylogenetic tree inferred with LOC_Os01g21160 sequences. Figure S6 showing the phylogenetic tree inferred with locus LOC_Os01g21160. (PDF 7 KB)

12862_2011_1788_MOESM8_ESM.PDF

Additional file 8: Phylogenetic tree inferred with LOC_Os01g24680 sequences. Figure S7 showing the phylogenetic tree inferred with locus LOC_Os01g24680. (PDF 7 KB)

12862_2011_1788_MOESM9_ESM.PDF

Additional file 9: Phylogenetic tree inferred with LOC_Os01g37560 sequences. Figure S8 showing the phylogenetic tree inferred with locus LOC_Os01g37560. (PDF 6 KB)

12862_2011_1788_MOESM10_ESM.PDF

Additional file 10: Phylogenetic tree inferred with LOC_Os01g39310 sequences. Figure S9 showing the phylogenetic tree inferred with locus LOC_Os01g39310. (PDF 6 KB)

12862_2011_1788_MOESM11_ESM.PDF

Additional file 11: Phylogenetic tree inferred with LOC_Os01g48720 sequences. Figure S10 showing the phylogenetic tree inferred with locus LOC_Os01g48720. (PDF 7 KB)

12862_2011_1788_MOESM12_ESM.PDF

Additional file 12: Phylogenetic tree inferred with LOC_Os01g53720 sequences. Figure S11 showing the phylogenetic tree inferred with locus LOC_Os01g53720. (PDF 7 KB)

12862_2011_1788_MOESM13_ESM.PDF

Additional file 13: Phylogenetic tree inferred with LOC_Os01g55530 sequences. Figure S12 showing the phylogenetic tree inferred with locus LOC_Os01g55530. (PDF 7 KB)

12862_2011_1788_MOESM14_ESM.PDF

Additional file 14: Phylogenetic tree inferred with LOC_Os01g56630 sequences. Figure S13 showing the phylogenetic tree inferred with locus LOC_Os01g56630. (PDF 7 KB)

12862_2011_1788_MOESM15_ESM.PDF

Additional file 15: Phylogenetic tree inferred with LOC_Os01g60230 sequences. Figure S14 showing the phylogenetic tree inferred with locus LOC_Os01g60230. (PDF 6 KB)

12862_2011_1788_MOESM16_ESM.PDF

Additional file 16: Phylogenetic tree inferred with LOC_Os01g61720 sequences. Figure S15 showing the phylogenetic tree inferred with locus LOC_Os01g61720. (PDF 6 KB)

12862_2011_1788_MOESM17_ESM.PDF

Additional file 17: Phylogenetic tree inferred with LOC_Os01g62900 sequences. Figure S16 showing the phylogenetic tree inferred with locus LOC_Os01g62900. (PDF 7 KB)

12862_2011_1788_MOESM18_ESM.PDF

Additional file 18: Phylogenetic tree inferred with LOC_Os01g67220 sequences. Figure S17 showing the phylogenetic tree inferred with locus LOC_Os01g67220. (PDF 6 KB)

12862_2011_1788_MOESM19_ESM.PDF

Additional file 19: Phylogenetic tree inferred with LOC_Os01g68770 sequences. Figure S18 showing the phylogenetic tree inferred with locus LOC_Os01g68770. (PDF 7 KB)

12862_2011_1788_MOESM20_ESM.PDF

Additional file 20: Phylogenetic tree inferred with LOC_Os01g70670 sequences. Figure S19 showing the phylogenetic tree inferred with locus LOC_Os01g70670. (PDF 7 KB)

12862_2011_1788_MOESM21_ESM.PDF

Additional file 21: Phylogenetic tree inferred with LOC_Os01g72220 sequences. Figure S20 showing the phylogenetic tree inferred with locus LOC_Os01g72220. (PDF 7 KB)

12862_2011_1788_MOESM22_ESM.PDF

Additional file 22: Phylogenetic tree inferred with LOC_Os01g73790 sequences. Figure S21 showing the phylogenetic tree inferred with locus LOC_Os01g73790. (PDF 7 KB)

12862_2011_1788_MOESM23_ESM.PDF

Additional file 23: Phylogenetic tree inferred with eIFiso4E sequences. Figure S22 showing the phylogenetic tree inferred with locus eIFiso4E. (PDF 7 KB)

12862_2011_1788_MOESM24_ESM.PDF

Additional file 24: Phylogenetic tree inferred with CRTISO sequences. Figure S23 showing the phylogenetic tree inferred with locus CRTISO. (PDF 7 KB)

12862_2011_1788_MOESM25_ESM.PDF

Additional file 25: Phylogenetic tree inferred with PinA sequences. Figure S24 showing the phylogenetic tree inferred with locus PinA. (PDF 6 KB)

12862_2011_1788_MOESM26_ESM.PDF

Additional file 26: Phylogenetic tree inferred with PinB sequences. Figure S25 showing the phylogenetic tree inferred with locus PinB. (PDF 6 KB)

12862_2011_1788_MOESM27_ESM.PDF

Additional file 27: Phylogenetic tree inferred with PSY2 sequences. Figure S26 showing the phylogenetic tree inferred with locus PSY2. (PDF 6 KB)

12862_2011_1788_MOESM28_ESM.PDF

Additional file 28: Phylogenetic tree inferred with MATK sequences. Figure S27 showing the phylogenetic tree inferred with locus MATK. (PDF 7 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Escobar, J.S., Scornavacca, C., Cenci, A. et al. Multigenic phylogeny and analysis of tree incongruences in Triticeae (Poaceae). BMC Evol Biol 11, 181 (2011). https://doi.org/10.1186/1471-2148-11-181

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2148-11-181

Keywords