Phylogenomic evidence for ancient recombination between plastid genomes of the Cupressus-Juniperus-Xanthocyparis complex (Cupressaceae)
BMC Evolutionary Biology volume 18, Article number: 137 (2018)
Phylogenetic relationships among Eastern Hemisphere cypresses, Western Hemisphere cypresses, junipers, and their closest relatives are controversial, and generic delimitations have been in flux for the past decade. To address relationships and attempt to produce a more robust classification, we sequenced 11 new plastid genomes (plastomes) from the five variously described genera in this complex (Callitropsis, Cupressus, Hesperocyparis, Juniperus, and Xanthocyparis) and compared them with additional plastomes from diverse members of Cupressaceae.
Phylogenetic analysis of protein-coding genes recovered a topology in which Juniperus is sister to Cupressus, whereas a tree based on whole plastomes indicated that the Callitropsis-Hesperocyparis-Xanthocyparis (CaHX) clade is sister to Cupressus. A sliding window analysis of site-specific phylogenetic support identified a ~ 15 kb region, spanning the genes ycf1 and ycf2, which harbored an anomalous signal relative to the rest of the genome. After excluding these genes, trees based on the remainder of the genes and genome consistently recovered a topology grouping the CaHX clade and Cupressus with strong bootstrap support. In contrast, trees based on the ycf1 and ycf2 region strongly supported a sister relationship between Cupressus and Juniperus.
These results demonstrate that standard phylogenomic analyses can result in strongly supported but conflicting trees. We suggest that the conflicting plastomic signals result from an ancient introgression event involving ycf1 and ycf2 that occurred in an ancestor of this species complex. The introgression event was facilitated by plastomic recombination in an ancestral heteroplasmic individual carrying distinct plastid haplotypes, offering further evidence that recombination occurs between plastomes. Finally, we provide strong support for previous proposals to recognize five genera in this species complex: Callitropsis, Cupressus, Hesperocyparis, Juniperus, and Xanthocyparis.
The discovery in northern Vietnam of a new conifer species, Xanthocyparis vietnamensis Farjon & T. H. Nguyên [1, 2], has caused taxonomic upheaval within the Cupressaceae. Based on distinctive morphological traits, this conifer was initially placed in a new genus (Xanthocyparis) and inferred to be closely related to Callitropsis nootkatensis (D. Don) Oersted ex D. P. Little. Ca. nootkatensis is another taxonomically controversial species that has been variously classified into Chamaecyparis, Callitropsis, Cupressus, and Xanthocyparis [3, 4]. How these two species relate to one another and to other Cupressaceae conifers has been a topic of ongoing taxonomic debate, driven by a paucity of distinguishing morphological characteristics [1, 3, 5] as well as incongruence among molecular phylogenetic analyses [4, 6,7,8,9,10,11,12,13,14]. From a broader perspective, the phylogenetic positions of X. vietnamensis and Ca. nootkatensis impinge on a larger taxonomic debate regarding the treatment of Western Hemisphere cypresses (hereafter Hesperocyparis) and Eastern Hemisphere cypresses (hereafter Cupressus) [8, 9, 15], and affect phylogeographic interpretations of migration patterns among flora spanning the Eastern and Western Hemispheres [9, 10].
Phylogenetic relationships among the (up to) five recognized genera (Callitropsis, Cupressus, Hesperocyparis, Juniperus, Xanthocyparis) of this (hereafter CaCuHJX) complex of Cupressaceae species are unresolved. Early phylogenetic studies based primarily on the internal transcribed spacer region of the nuclear ribosomal DNA cluster have generally recovered a tree in which X. vietnamensis and Ca. nootkatensis form a clade that is sister to Hesperocyparis, which together are more closely related to Juniperus than to Cupressus [4, 6,7,8, 13]. In contrast, chloroplast markers, while generally providing less resolution, have tended to recover (or at least be consistent with) a grouping of Ca. nootkatensis and Hesperocyparis, which are successively sister to X. vietnamensis, then Cupressus, and finally Juniperus [4, 7,8,9,10,11, 13, 14]. Analyses using nuclear or mitochondrial protein-coding genes [7, 12], or the fastest-evolving sites in the plastid genome, have recovered a third topology, in which Juniperus and Cupressus form one monophyletic group while Ca. nootkatensis, X. vietnamensis, and Hesperocyparis form a second monophyletic group with less certain resolution.
Collectively, all of the aforementioned studies agree that Hesperocyparis is more closely related to Ca. nootkatensis and X. vietnamensis than to Cupressus or Juniperus, although the precise relationships among these five genera are as yet unclear. Intriguingly, these previous studies also suggest fundamental incongruence between and within the plastid and nuclear genomes. To stabilize the classification of these five genera, and to explore the source of conflicting intraplastomic signals, we sequenced 11 plastomes and compared them with 10 existing plastomes from all five genera. Through extensive phylogenetic comparisons, we present a robust phylogeny of the five genera and identify the genes ycf1 and ycf2 as the major source of intraplastomic phylogenetic conflict. By integrating recent discoveries on organelle inheritance, we highlight potential effects of genetic leakage and ancient recombination on phylogenomic analysis.
General features of newly sequenced Cupressaceae plastomes
We sequenced complete plastomes from 11 species spanning five genera of Cupressaceae, including Callitropsis (Ca. nootkatensis), Cupressus (Cu. sempervirens, Cu. tonkinensis, Cu. torulosa), Hesperocyparis (H. arizonica, H. benthamii, H. glabra, H. lindleyi, H. lusitanica), Juniperus (J. communis) and Xanthocyparis (X. vietnamensis). Genomes are very similar in size (127–129 kb) and content, with nearly identical proportions of guanine plus cytosine (G + C = 34.6–34.9%) and an identical set of 82 protein-coding genes, 4 ribosomal RNAs, 33 transfer RNAs and 18 introns (Table 1). Pairwise alignment of entire plastome sequences demonstrated a high level of intra- and intergeneric similarity (Fig. 1). Notably, the plastomes from Cupressus and the CaHX clade are in all cases more similar to one another (93.6–95.5% identity) than they are to Juniperus plastomes (90.5–93.0%).
Different plastid phylogenomic approaches construct strongly conflicting trees
To examine the phylogenetic relationships among Callitropsis, Cupressus, Hesperocyparis, Juniperus, and Xanthocyparis, we performed plastid phylogenomic analyses using two common approaches: 1) a concatenated alignment of all 82 protein-coding genes, and 2) a whole plastome alignment. The trees resulting from analysis of both data sets were largely congruent, particularly with respect to the relationships among species within Juniperus, within Cupressus, and among genera within the CaHX clade (Fig. 2).
However, there was strong conflict for relationships among the Cupressus, Juniperus and CaHX clades (Fig. 2). The tree constructed from the 82-gene alignment indicated a sister relationship between Juniperus and Cupressus with strong (96%) bootstrap support (Fig. 2a). In contrast, the whole plastome alignment resulted in a tree that united the CaHX and Cupressus clades with strong (92%) bootstrap support (Fig. 2b). For each data set, the Approximately Unbiased and Shimodaira-Hasegawa alternative topology tests significantly rejected (p < 0.05) the topology recovered by the other data set. Thus, two standard phylogenomic approaches produced strongly supported but incongruent trees.
ycf1 and ycf2 have a distinct phylogenetic signal relative to the rest of the plastome
To investigate the source of phylogenetic incongruence within the plastome, we calculated the likelihood of each site in the whole plastome alignment for the two competing topologies. By taking the difference in the log of the site-likelihood values for the two tree topologies, we identified those sites that provided the strongest preference for one or the other topology. Sites providing strong preference for the Juniperus + Cupressus topology were mostly clustered within the 31 kb to 47 kb segment in the whole plastome alignment, whereas sites providing strong support for the CaHX + Cupressus topology were more evenly spread throughout the data set (Fig. 3a). Sliding window analysis provided clear evidence that this 31 kb to 47 kb genomic segment favored the Juniperus + Cupressus relationship, whereas the remainder of the genome provided greater support for the CaHX + Cupressus relationship (Fig. 3b). This anomalous region of the alignment corresponds to a segment of the plastome containing the entirety of the genes ycf1, trnL-CAA, ycf2, and trnI-CAU and a portion of the ccsA gene (Fig. 3c).
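The per-site comparison described above can be sketched in a few lines. This is a minimal illustration, assuming the per-site log-likelihoods for each topology have already been exported as two equal-length lists; the function names and the threshold of 0.5 log-likelihood units are hypothetical and are not values taken from this study.

```python
def site_lnl_differences(lnl_topology_a, lnl_topology_b):
    """Return per-site log-likelihood differences (topology A minus topology B)."""
    if len(lnl_topology_a) != len(lnl_topology_b):
        raise ValueError("both topologies must be scored on the same alignment")
    return [a - b for a, b in zip(lnl_topology_a, lnl_topology_b)]


def strongly_informative_sites(differences, threshold=0.5):
    """Return 0-based site indices that strongly favor topology A or topology B.

    The threshold is an illustrative assumption, not a value from the study.
    """
    favors_a = [i for i, d in enumerate(differences) if d >= threshold]
    favors_b = [i for i, d in enumerate(differences) if d <= -threshold]
    return favors_a, favors_b


# Toy data: four alignment sites scored under two competing topologies.
lnl_a = [-2.0, -3.5, -1.0, -4.0]
lnl_b = [-2.1, -2.5, -1.0, -5.0]
diffs = site_lnl_differences(lnl_a, lnl_b)
favors_a, favors_b = strongly_informative_sites(diffs)
```

Positive differences indicate sites that fit topology A better, negative differences indicate sites that fit topology B better, and values near zero are uninformative for this comparison.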
To verify that ycf1 and ycf2 have an anomalous phylogenetic signal, we reevaluated the concatenated gene alignment after separating the ycf1 + ycf2 genes from the remaining 80 genes (Fig. 4a). We also reexamined the whole-plastome analyses with the ycf1 + ycf2 genomic region separated from the remainder of the genome (Fig. 4b). Results of both analyses were fully consistent. The ycf1 + ycf2 gene and genomic segment data sets provided strong support for Cupressus + Juniperus as sister taxa, while the rest of the genes and genome produced trees with strong support for CaHX + Cupressus (Fig. 4a and b).
The ycf1 and ycf2 genes are known to be fast evolving, with substantial levels of positive selection and numerous indels [16,17,18]. In Cupressaceae, ycf1 and ycf2 are also relatively faster evolving, as demonstrated by the generally 2- to 3-fold longer branch lengths in the trees of ycf1 + ycf2 relative to the remaining genes (Fig. 4a) and genomic regions (Fig. 4b), and by the larger number of gap-containing columns in the untrimmed ycf1 + ycf2 gene alignment (22.2% of 16,614 positions) compared with the untrimmed 80-gene alignment (5.50% of 62,853 positions). Despite the faster relative rate of evolution, no substitutional saturation was detected (See Additional file 1: Table S1) in the Gblocks-trimmed ycf1 + ycf2 data sets based on an entropy test of substitution saturation [19, 20].
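The proportion of gap-containing columns quoted above is a simple column-wise count. A minimal sketch follows, assuming the alignment is available as a list of equal-length strings with "-" as the gap character; the function name is hypothetical.

```python
def gap_column_fraction(aligned_sequences, gap_char="-"):
    """Return the fraction of alignment columns that contain at least one gap."""
    if not aligned_sequences or not aligned_sequences[0]:
        raise ValueError("alignment must contain at least one non-empty sequence")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    # zip(*rows) iterates over the alignment column by column.
    gap_columns = sum(1 for column in zip(*aligned_sequences) if gap_char in column)
    return gap_columns / length


# Toy alignment: columns 1 and 3 (0-based) each contain a gap.
toy = ["ATG-CA", "ATGGCA", "A-GGCA"]
fraction = gap_column_fraction(toy)
```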
We also confirmed that the different selection pressures and rates of evolution at 3rd codon positions compared with 1st and 2nd codon positions had no effect on the recovered topology. Indeed, regardless of codon partitioning scheme (all codon positions, 1st + 2nd positions only, or 3rd positions only), the ycf1 + ycf2 gene data sets recovered Cupressus + Juniperus with moderate to strong support, while the 80 gene data set recovered Cupressus + CaHX with moderate to strong support (Fig. 4a). Finally, given the large number of indels in the ycf1 and ycf2 alignments, we examined the effect of gap treatment during alignment filtering of the genome data sets. Regardless of Gblocks settings, the ycf1 + ycf2 genomic segment recovered Cupressus + Juniperus with strong support, while the remainder of the genome recovered Cupressus + CaHX with strong support (Fig. 4b).
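The codon partitioning schemes compared above amount to slicing an in-frame alignment by codon position. The sketch below assumes a codon alignment stored as a dictionary of taxon name to in-frame sequence; the function and taxon names are hypothetical.

```python
def codon_partitions(codon_alignment):
    """Split in-frame aligned sequences into 1st+2nd and 3rd codon positions."""
    first_second = {}
    third = {}
    for name, seq in codon_alignment.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name}: length is not a multiple of three")
        # Keep the first two bases of every codon.
        first_second[name] = "".join(seq[i:i + 2] for i in range(0, len(seq), 3))
        # Keep every third base, starting at index 2.
        third[name] = seq[2::3]
    return first_second, third


# Toy example: codons ATG GCC --- TAA for a single hypothetical taxon.
pos12, pos3 = codon_partitions({"taxonA": "ATGGCC---TAA"})
```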
Structural features of Cupressaceae plastomes
Cupressaceae plastomes lack the large inverted repeat (IR) that is a diagnostic feature of most other land plant plastomes. Instead, they contain a much smaller (~ 260 bp) IR that duplicates the trnQ gene [21,22,23]. The two copies of the trnQ-IR flank a ~ 36 kb segment of the plastome, and collinearity analysis indicated that IR recombination has led to the inversion of this genomic segment in the newly sequenced J. communis plastome (Fig. 5a). This inverted region was previously defined as the “B” arrangement to contrast with the non-inverted “A” arrangement that is present in most Cupressaceae species, although several other Cupressaceae species were also shown to have a plastome in this “B” arrangement.
Analysis of mapped read pairs (Fig. 5b) verified that nearly all read pairs that span the trnQ-IR (814/834) supported the “B” arrangement in J. communis. However, 2.4% (20/834) of these J. communis read pairs instead supported the existence of the “A” arrangement, demonstrating that the “A” arrangement exists at a substoichiometric level relative to the predominant “B” arrangement within the sampled J. communis individual. By contrast, the H. lindleyi and H. lusitanica plastomes exist primarily in the “A” arrangement in the sampled individuals, with a small proportion (< 1%) of reads supporting the presence of the “B” arrangement at a substoichiometric level. The coexistence of predominant and substoichiometric forms of the plastome was previously reported [21, 24] for other Cupressaceae species (Fig. 5b, shown in red).
Previous studies have disagreed on the inferred phylogenetic relationships among major lineages of the CaCuHJX clade, which comprises Eastern Hemisphere cypresses (Cupressus), Western Hemisphere cypresses (Hesperocyparis), junipers (Juniperus), and the taxonomically enigmatic species X. vietnamensis and Ca. nootkatensis. Their relationships have remained contentious due in part to phylogenetic incongruence between nuclear and plastid data as well as intragenomic conflict among loci within the plastid and nuclear genomes. In this study, 21 complete plastomes (11 newly generated) from species in the CaCuHJX complex were used to reexamine phylogenetic relationships among genera and to evaluate the distribution of conflicting phylogenetic signals across the plastome. Our whole-plastome analyses offer substantially more informative characters than previous analyses using a small number of loci [4, 6,7,8,9,10,11,12,13] and more than twice the number of ingroup taxa compared with the only other plastome-based phylogenetic study.
Our results demonstrate that different phylogenomic approaches can produce strongly supported but conflicting phylogenetic hypotheses (Fig. 2). In this case, we showed that the phylogenetic conflict comes from a ~ 15 kb region of the plastome (spanning ycf1 and ycf2) that exhibits a phylogenetic signal incongruent with the rest of the plastome (Figs. 3 and 4). Phylogenetic incongruence of one or few loci within the plastid genome has been reported in other lineages of seed plants, including Sileneae, Citrus, Pinus and Picea, with the incongruence also spanning a region containing the ycf1 and ycf2 genes for the latter two genera in Pinaceae. An important question arising from these analyses is why some plastid loci may have distinct evolutionary signals. Below we discuss the potential causes and taxonomic significance of these findings.
Unique characteristics of ycf1 and ycf2 do not explain phylogenetic incongruence
There is no doubt that ycf1 and ycf2 exhibit higher rates of sequence and indel evolution compared with most plastid genes. Both the Pinus and Picea studies [16, 18] identified several sites of the ycf1 and ycf2 genes under positive selection. However, pervasive positive selection is not likely to be a determining factor for the conflicting phylogenetic trees in the CaCuHJX complex. First, the codon partitioning results argue strongly against any confounding phylogenetic effects stemming from differences in substitution rate or selection pressure at different codon positions (Fig. 4a). Second, while the ycf1 and ycf2 genes are mutational hotspots for the accumulation of indels, analysis of data sets that either excluded all gaps (strict filtering) or allowed gaps when present in < 50% of taxa (relaxed filtering) recovered the same tree, which was still incongruent with signals of the rest of the genome (Fig. 4b).
Finally, the effect of substitutional saturation can be ruled out because individual branch lengths in all trees are very short at this low taxonomic level (Fig. 4) and no substitutional saturation was detected by an entropy test [19, 20] implemented in DAMBE. Note that the previous plastome-based study of the CaCuHJX complex did report substitutional saturation for nine plastid genes (including ycf1 and ycf2); in that study, untrimmed alignments were apparently used for the entropy analysis, given that we can approximately replicate their results when using our own untrimmed alignment of the ycf1 + ycf2 gene data set (See Additional file 1: Table S1). However, given the high indel rate in the ycf1 and ycf2 genes, alignment filtration using programs such as Gblocks is a necessity to avoid spurious results in phylogenetic analysis, and this would also apply to entropy tests, which aim to assess the suitability of a data set for phylogenetic analysis. Moreover, the DAMBE software warns against including gaps and unresolved characters in the alignment due to the potential for false positives.
A biological basis for phylogenetic incongruence
If phylogenetic artifacts due to the unique properties of the ycf1 and ycf2 genes can be excluded, then biological factors may be the more likely source of phylogenetic incongruence. To explain the intragenomic conflict within the plastomes of the CaCuHJX clade, we propose that the anomalous signal resulted from an ancient introgression event involving the ycf1 and ycf2 genes. This event would require several evolutionary processes to occur: 1) ancient hybridization or incomplete lineage sorting to establish an ancestral population having two plastid haplotypes with distinct evolutionary ancestry, 2) creation of a heteroplasmic individual containing both plastid haplotypes via at least occasional biparental inheritance, and 3) recombination between the two plastid haplotypes.
Hybridization is a common phenomenon in plant evolution that can confound phylogenetic analyses, particularly when using cytoplasmic loci, and even more so if recombination among distinct plastid haplotypes has occurred. In conifers, hybridization has resulted in chloroplast capture, nuclear introgression, and phylogenetic incongruence between the nuclear and plastid genomes [18, 29, 30]. Thus, it is plausible that members of the CaCuHJX complex may have experienced some level of reticulate evolution. In fact, long-distance dispersal of seed cones has been well documented for many Juniperus species [9, 10], and ancient hybridization has been previously suggested to explain phylogenetic incongruence between the nuclear and plastid genomes in the CaCuHJX clade. Incomplete lineage sorting could also be an explanation for coexisting plastome haplotypes in a population, although this mechanism has received less attention in the plastome literature [31, 32].
Once distinct plastome haplotypes were established in a population (via ancient hybridization or incomplete lineage sorting), some level of biparental inheritance could have created a heteroplasmic state, which could then have facilitated recombination between plastomes from different species, resulting in the introgression of foreign ycf1 and ycf2 genes. Frequent reversals of uniparental inheritance (maternal-to-paternal and vice versa) have been found for both mitochondrial and chloroplast genomes, and genetic leakage has been observed in many Cupressaceae species (See Additional file 1: Table S2) and other seed plants [34,35,36]. Heteroplasmy and recombination could neatly explain the anomalous phylogenetic signal that is confined to the ~ 15 kb region of the plastome, regardless of the fast-evolving properties of the two ycf genes.
The anomalous grouping of Juniperus and Cupressus in the ycf1 + ycf2 analyses suggests that the ancient introgression of the ycf1 and ycf2 genomic segment occurred between these two lineages. The crown group ages for Cupressus and Juniperus have been dated to ~ 30 and ~ 40 million years, respectively, while the crown group age for the entire CaCuHJX clade was estimated to be ~ 60 million years. These dates suggest that the ancient hybridization and recombination event probably occurred 40–60 million years ago, subsequent to the initial diversification of the CaCuHJX clade but prior to the diversification of the Cupressus and Juniperus lineages. However, the direction of ycf1 + ycf2 introgression (from Cupressus to Juniperus or from Juniperus to Cupressus) cannot be determined from the available data.
Taxonomic implications of phylogenetic results
Except for the intragenomic conflict observed in our plastomic data regarding the relationships among the Cupressus, Juniperus, and CaHX clades, phylogenetic results are otherwise largely congruent in the trees based on protein-coding genes and complete plastomes. Importantly, all data sets but one from this study strongly support a sister group relationship between Callitropsis and Hesperocyparis within the CaHX clade (Fig. 2; Fig. 4), which is generally consistent with previous studies using at least 10 kb of sequence data [9,10,11, 13, 14]. The lone contrasting data set (ycf1 + ycf2 genomic data) instead supports a sister group relationship between Ca. nootkatensis and X. vietnamensis (Fig. 4b, left), which has also been observed in a minority of previous studies, primarily based on nuclear internal transcribed spacer data [4, 7, 8]. Nevertheless, the weight of evidence from this study and others indicates that Ca. nootkatensis and X. vietnamensis are not sister taxa; thus, the previous suggestion to classify both species into separate monotypic genera appears well justified.
Finally, alternative suggestions to treat the entire CaHX clade as a single genus Callitropsis, or to maintain a more broadly defined Cupressus sensu lato (s.l.) that includes the CaHX clade, are problematic. The maintenance of Cupressus s.l. is problematic due to uncertainty in the placement of Juniperus. Notably, a paraphyletic Cupressus s.l. is consistently recovered in the few studies that have utilized nuclear or mitochondrial protein-coding genes [7, 8, 12, 13] as well as a minority of plastid analyses from this (Fig. 2; Fig. 4) and other studies; more nuclear and mitochondrial data are required to explore this issue further. Furthermore, while the CaHX clade is clearly monophyletic in this and many previous studies, there are a variety of morphological characters that distinguish Hesperocyparis from Ca. nootkatensis and X. vietnamensis, arguing against circumscribing all three genera into a single, more broadly defined genus. Collectively, while there is still room for debate on the precise relationships among species in the CaCuHJX clade of Cupressaceae, the weight of evidence strongly favors recognition of five separate genera: Callitropsis, Cupressus, Hesperocyparis, Juniperus, and Xanthocyparis.
Our results provide further evidence that standard phylogenomic analyses can produce strongly supported but conflicting trees, implying that phylogenomic analyses should be performed in multiple ways with different data partitioning schemes to unmask potential signals of conflict. In our study, we showed that the conflicting phylogenetic signal was localized to the ycf1 and ycf2 region of the genome, which we suggest was due to introgression of this region in an ancestor of this species complex. This hypothesis implies that plastomic recombination must have occurred between distinct haplotypes that coexisted in an ancestral heteroplasmic individual. Finally, after exclusion of the introgressed ycf1 and ycf2 genes from the data sets, our analyses recovered a robust phylogeny of the five genera and provided strong evidence in support of previous proposals to recognize five distinct genera in this species complex: Callitropsis, Cupressus, Hesperocyparis, Juniperus, and Xanthocyparis.
Sample collection and DNA sequencing
Leaf samples (50 mg each) from mature trees (Ca. nootkatensis, Cu. sempervirens, H. arizonica, H. benthamii, H. glabra, H. lindleyi, H. lusitanica, and J. communis) were collected on roadsides in common areas of public land. Leaf samples (50 mg each) from the remaining species (Cu. tonkinensis, Cu. torulosa, and X. vietnamensis) were collected from seedlings grown by Keith Rushforth (UK) in his garden from seeds collected by him. Thus, no samples were subject to institutional, national or international guidelines for collection. DNAs were extracted according to procedures described previously and sequenced on the Illumina HiSeq 2500 platform at BGI (Shenzhen, China) or the Illumina MiSeq system at the Center for Genomics and Bioinformatics at Indiana University (Bloomington, IN). Details of collection sites, voucher numbers, and sequencing results are provided (See Additional file 1: Table S3).
Plastome assembly and annotation
Plastomes were assembled using an established procedure [21, 37, 38]. For each species, a draft sequence was assembled from raw reads using Velvet version 1.2.03 with pairwise combinations of different Kmer values (61, 71, 81, 91, 101) and expected coverage values (50, 100, 200, 500, 1000), and a final consensus sequence was generated from at least three independent assemblies. Genes were initially annotated using DOGMA, followed by manual correction of start and stop codons based on comparison to homologs from other Cupressaceae plastomes.
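The pairwise combinations of assembly settings listed above form a 5 × 5 grid. The sketch below only enumerates that grid from the values given in the text; it does not run the assembler, and the variable names are hypothetical.

```python
from itertools import product

# Values taken from the text above.
KMER_VALUES = (61, 71, 81, 91, 101)
EXPECTED_COVERAGES = (50, 100, 200, 500, 1000)

# Every (k-mer, expected coverage) pair is one candidate assembly run.
parameter_grid = list(product(KMER_VALUES, EXPECTED_COVERAGES))

for kmer, expected_coverage in parameter_grid:
    # A real pipeline would launch one assembly per pair here.
    print(f"k-mer={kmer}\texpected coverage={expected_coverage}")
```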
Gene and whole genome alignments
A total of 82 plastid protein-coding genes were extracted from the 11 genomes newly sequenced in this study plus additional species of Cupressaceae (See Additional file 1: Table S4). For each gene, a codon-based alignment was generated by aligning amino acid sequences with MUSCLE and reverse translating the alignments into nucleotide sequences using PAL2NAL. A concatenated plastid data matrix was built with FASconCAT version 1.0. The aligned 82-gene data set was 79,479 bp in length.
Whole plastome sequence alignments were also constructed from the 11 genomes newly sequenced in this study plus additional species of Cupressaceae (See Additional file 1: Table S4). First, a collinearity plot was generated with the progressiveMAUVE algorithm using full genome sequences. When necessary, genomes were adjusted to start on the rbcL gene to ensure a consistent starting point for this plot. Next, whole genome alignments were performed with MAFFT version 7.245 using the fftnsi setting. To facilitate this whole plastome alignment, the orientation of an inverted segment in some Cupressaceae plastomes (mediated by a small trnQ-containing inverted repeat element termed trnQ-IR) was manually reverted such that all examined genomes were globally collinear. Plastomes from more distant outgroups were more highly rearranged and were thus excluded from the whole plastome alignments. The aligned plastome data set was 144,492 bp in length.
The aligned gene and genome data sets were trimmed using Gblocks version 0.91b with default strict settings (b1 = 13, b2 = 21, b3 = 8, b4 = 10, b5 = none) or with more relaxed settings (b1 = 13, b2 = 13, b3 = 8, b4 = 5, b5 = half). The final 82-gene data set was trimmed in codon mode (t = c) to 74,772 bp (relaxed) or 71,871 bp (strict), while the whole plastome data set was trimmed in DNA mode (t = d) to 126,645 bp (relaxed) or 113,387 bp (strict).
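The reverse-translation step can be illustrated with a short sketch that threads an unaligned coding sequence onto its gapped protein alignment row. This is not the PAL2NAL implementation; it assumes "-" as the gap character and a coding sequence without a stop codon, and the function name is hypothetical.

```python
def protein_to_codon_alignment(aligned_protein, cds):
    """Thread an unaligned coding sequence onto a gapped protein alignment row.

    Assumes the coding sequence has no stop codon and that each residue in
    the protein row corresponds to exactly one codon.
    """
    residue_count = sum(1 for aa in aligned_protein if aa != "-")
    if len(cds) != 3 * residue_count:
        raise ValueError("coding sequence length does not match residue count")
    codons = []
    position = 0
    for aa in aligned_protein:
        if aa == "-":
            # A gap in the protein alignment becomes a three-base gap.
            codons.append("---")
        else:
            codons.append(cds[position:position + 3])
            position += 3
    return "".join(codons)


# Toy example: protein row M-AK with coding sequence ATG GCT AAA.
codon_row = protein_to_codon_alignment("M-AK", "ATGGCTAAA")
```

Because gaps are inserted in whole-codon units, the reading frame is preserved, which is what allows the subsequent codon-mode trimming and codon position partitioning.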
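The two manual adjustments described above, resetting the start of a circular genome and reverting an inverted segment, can be sketched as simple string operations. The coordinates in this sketch are hypothetical; in practice they would come from the rbcL annotation and from the positions of the two trnQ-IR copies.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def rotate_circular(genome, start):
    """Rotate a circular sequence so that 0-based index `start` becomes position 0."""
    if not genome:
        raise ValueError("genome must not be empty")
    start %= len(genome)
    return genome[start:] + genome[:start]


def revert_inversion(genome, seg_start, seg_end):
    """Reverse-complement the half-open interval [seg_start, seg_end)."""
    if not 0 <= seg_start <= seg_end <= len(genome):
        raise ValueError("segment coordinates are out of range")
    segment = genome[seg_start:seg_end]
    return genome[:seg_start] + reverse_complement(segment) + genome[seg_end:]


# Toy examples with hypothetical coordinates.
rotated = rotate_circular("AAACCGGTTT", 3)
reverted = revert_inversion("AAACCGTTT", 3, 6)
```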
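The difference between the strict and relaxed gap treatments (b5 = none versus b5 = half) can be illustrated with a simplified column filter. This sketch reproduces only the gap rule: Gblocks additionally applies conservation and block-length criteria (b1 to b4) that are not modeled here, and the function name and mode labels are hypothetical.

```python
def filter_gap_columns(aligned_sequences, mode="strict"):
    """Keep alignment columns according to a simplified gap rule.

    strict:  drop every column that contains a gap.
    relaxed: keep columns in which fewer than half of the sequences have a gap.
    """
    if mode not in ("strict", "relaxed"):
        raise ValueError("mode must be 'strict' or 'relaxed'")
    n_sequences = len(aligned_sequences)
    kept_columns = []
    for column in zip(*aligned_sequences):
        gap_count = column.count("-")
        if mode == "strict":
            keep = gap_count == 0
        else:
            keep = gap_count < n_sequences / 2
        if keep:
            kept_columns.append(column)
    if not kept_columns:
        return [""] * n_sequences
    # Transpose the kept columns back into one string per sequence.
    return ["".join(chars) for chars in zip(*kept_columns)]


toy = ["AT-G", "A--G", "ATCG", "ATCG"]
strict_result = filter_gap_columns(toy, mode="strict")
relaxed_result = filter_gap_columns(toy, mode="relaxed")
```

In the toy alignment, the column with a single gap survives relaxed filtering but not strict filtering, while the column in which half of the sequences have a gap is removed under both rules.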
Phylogenetic analysis and alternative topology tests
Phylogenetic analyses were performed using the maximum likelihood approach in PhyML version 3.0 under the GTR + G + I model with 100 bootstrap replicates. The shape of the gamma distribution of rate variation, proportion of invariant sites, and substitution rate parameters were estimated during the analysis. Two competing phylogenetic hypotheses of the relationships among Callitropsis, Cupressus, Hesperocyparis, Juniperus and Xanthocyparis were examined using the Shimodaira-Hasegawa test and the Approximately Unbiased test, as implemented in CONSEL. One topology forced Cupressus to be sister to Juniperus, while the second topology forced Cupressus as sister to the CaHX clade.
Assessment of phylogenetic incongruence in the plastome
To assess levels of substitutional saturation in the data sets, saturation tests were performed on untrimmed and trimmed data sets using an entropy test based on an index of substitution saturation [19, 20] as implemented in DAMBE version 6.4.110. To examine phylogenetic signals among genomic regions, log-likelihoods for each site in the whole genome alignment were calculated on the two major topologies: Cupressus sister to Juniperus versus Cupressus sister to CaHX. Site likelihoods for each topology were reported in PhyML, and then the difference in log-likelihoods at each site was plotted along the genome. A sliding window analysis was performed (window size = 5000, step size = 100) that summed the difference in site likelihoods in order to show localized variation in likelihoods across 5 kb segments of the alignment.
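The sliding window summation can be sketched as follows, using the window and step sizes stated above as defaults. This is a minimal illustration rather than the script used in the study: the function name is hypothetical, and the handling of the alignment end (partial windows are dropped here) is an assumption.

```python
def sliding_window_sums(values, window_size=5000, step_size=100):
    """Sum per-site values over overlapping windows.

    Returns a list of (window_start_index, window_sum) pairs. Windows that
    would extend past the end of `values` are dropped (an assumption).
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive")
    # Prefix sums make each window sum a constant-time lookup.
    prefix = [0.0]
    for value in values:
        prefix.append(prefix[-1] + value)
    results = []
    for start in range(0, len(values) - window_size + 1, step_size):
        end = start + window_size
        results.append((start, prefix[end] - prefix[start]))
    return results


# Toy example with a small window so the arithmetic is easy to follow.
toy_differences = [1.0, -2.0, 3.0, 0.5, -1.0, 2.0]
windows = sliding_window_sums(toy_differences, window_size=3, step_size=2)
```

A run of windows with sums of the same sign, as in the 31 kb to 47 kb segment, indicates a contiguous region whose sites collectively favor one topology over the other.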
- CaCuHJX:
Callitropsis + Cupressus + Hesperocyparis + Juniperus + Xanthocyparis clade
- CaHX:
Callitropsis + Hesperocyparis + Xanthocyparis clade
- G + C:
Guanine plus cytosine
- s.l.:
Sensu lato
Farjon A, Hiep NT, Harder D, Loc PK, Averyanov L. A new genus and species in Cupressaceae (Coniferales) from northern Vietnam, Xanthocyparis vietnamensis. Novon. 2002;12:179–89.
Averyanov LV, Nguyen TH, Harder D, Phan KL. The history of discovery and natural habitats of Xanthocyparis vietnamensis (Cupressaceae). Turczaninowia. 2002;5:31–9.
Debreczy Z, Musial K, Price RA, Rácz I. Relationships and nomenclatural status of the nootka cypress (Callitropsis nootkatensis, Cupressaceae). Phytologia. 2009;91:140–59.
Little DP, Schwarzbach AE, Adams RP, Hsieh CF. The circumscription and phylogenetic relationships of Callitropsis and the newly described genus Xanthocyparis (Cupressaceae). Am J Bot. 2004;91:1872–81.
Farjon A. A monograph of Cupressaceae and Sciadopitys. Kew: Royal Botanic Gardens; 2005.
Xiang Q, Li J. Derivation of Xanthocyparis and Juniperus from within Cupressus: evidence from sequences of nrDNA internal transcribed spacer region. Harvard Pap Bot. 2005;9:375–82.
Little DP. Evolution and circumscription of the true cypresses (Cupressaceae: Cupressus). Syst Bot. 2006;31:461–80.
Adams RP, Bartel JA, Price RA. A new genus, Hesperocyparis, for the cypresses of the western hemisphere (Cupressaceae). Phytologia. 2009;91:160–85.
Mao K, Hao G, Liu J, Adams RP, Milne RI. Diversification and biogeography of Juniperus (Cupressaceae): variable diversification rates and multiple intercontinental dispersals. New Phytol. 2010;188:254–72.
Mao K, Milne RI, Zhang L, Peng Y, Liu J, Thomas P, Mill RR, Renner SS. Distribution of living Cupressaceae reflects the breakup of Pangea. Proc Natl Acad Sci U S A. 2012;109:7793–8.
Terry RG, Bartel JA, Adams RP. Phylogenetic relationships among the New World cypresses (Hesperocyparis; Cupressaceae): evidence from noncoding chloroplast DNA sequences. Plant Syst Evol. 2012;298:1987–2000.
Yang ZY, Ran JH, Wang XQ. Three genome-based phylogeny of Cupressaceae s.l.: further evidence for the evolution of gymnosperms and southern hemisphere biogeography. Mol Phylogenet Evol. 2012;64:452–70.
Terry RG, Adams RP. A molecular re-examination of phylogenetic relationships among Juniperus, Cupressus, and the Hesperocyparis-Callitropsis-Xanthocyparis clades of Cupressaceae. Phytologia. 2015;97:66–74.
Qu XJ, Jin JJ, Chaw SM, Li DZ, Yi TS. Multiple measures could alleviate long-branch attraction in phylogenomic reconstruction of Cupressoideae (Cupressaceae). Sci Rep. 2017;7:41005.
Christenhusz MJ, Reveal JL, Farjon A, Gardner MF, Mill RR, Chase MW. A new classification and linear sequence of extant gymnosperms. Phytotaxa. 2011;19:55–70.
Parks M, Cronn R, Liston A. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biol. 2009;7:84.
Dong W, Xu C, Li C, Sun J, Zuo Y, Shi S, Cheng T, Guo J, Zhou S. ycf1, the most promising plastid DNA barcode of land plants. Sci Rep. 2015;5:8348.
Sullivan AR, Schiffthaler B, Thompson SL, Street NR, Wang XR. Interspecific plastome recombination reflects ancient reticulate evolution in Picea (Pinaceae). Mol Biol Evol. 2017;34:1689–701.
Xia X, Xie Z, Salemi M, Chen L, Wang Y. An index of substitution saturation and its application. Mol Phylogenet Evol. 2003;26:1–7.
Xia X, Lemey P. Assessing substitution saturation with DAMBE. In: Lemey P, Salemi M, Vandamme A-M, editors. The phylogenetic handbook: a practical approach to DNA and protein phylogeny, vol. 2. New York: Cambridge University Press; 2009. p. 611–26.
Guo WH, Grewe F, Cobo-Clark A, Fan WS, Duan ZL, Adams RP, Schwarzbach AE, Mower JP. Predominant and substoichiometric isomers of the plastid genome coexist within Juniperus plants and have shifted multiple times during cupressophyte evolution. Genome Biol Evol. 2014;6:580–90.
Hirao T, Watanabe A, Kurita M, Kondo T, Takata K. Complete nucleotide sequence of the Cryptomeria japonica D. Don chloroplast genome and comparative chloroplast genomics: diversified genomic structure of coniferous species. BMC Plant Biol. 2008;8:70.
Yi X, Gao L, Wang B, Su Y-J, Wang T. The complete chloroplast genome sequence of Cephalotaxus oliveri (Cephalotaxaceae): evolutionary comparison of Cephalotaxus chloroplast DNAs and insights into the loss of inverted repeat copies in gymnosperms. Genome Biol Evol. 2013;5:688–98.
Qu XJ, Wu CS, Chaw SM, Yi TS. Insights into the existence of isomeric plastomes in Cupressoideae (Cupressaceae). Genome Biol Evol. 2017;9:1110–9.
Erixon P, Oxelman B. Reticulate or tree-like chloroplast DNA evolution in Sileneae (Caryophyllaceae)? Mol Phylogenet Evol. 2008;48:313–25.
Carbonell-Caballero J, Alonso R, Ibanez V, Terol J, Talon M, Dopazo J. A phylogenetic analysis of 34 chloroplast genomes elucidates the relationships between wild and domestic species within the genus Citrus. Mol Biol Evol. 2015;32:2015–35.
Rieseberg LH, Soltis D. Phylogenetic consequences of cytoplasmic gene flow in plants. Evol Trend Plant. 1991;5:65–84.
Wolfe AD, Randle CP. Recombination, heteroplasmy, haplotype polymorphism, and paralogy in plastid genes: implications for plant molecular systematics. Syst Bot. 2004;29:1011–20.
Xiang QP, Wei R, Shao YZ, Yang ZY, Wang XQ, Zhang XC. Phylogenetic relationships, possible ancient hybridization, and biogeographic history of Abies (Pinaceae) based on data from nuclear, plastid, and mitochondrial genomes. Mol Phylogenet Evol. 2015;82 Pt A:1–14.
Peng D, Wang XQ. Reticulate evolution in Thuja inferred from multiple gene sequences: implications for the study of biogeographical disjunction between eastern Asia and North America. Mol Phylogenet Evol. 2008;47:1190–202.
Willyard A, Cronn R, Liston A. Reticulate evolution and incomplete lineage sorting among the ponderosa pines. Mol Phylogenet Evol. 2009;52:498–511.
Zhou Y, Duvaux L, Ren G, Zhang L, Savolainen O, Liu J. Importance of incomplete lineage sorting and introgression in the origin of shared genetic variation between two closely related pines with overlapping distributions. Heredity. 2017;118:211–20.
Whittle CA, Johnston MO. Male-driven evolution of mitochondrial and chloroplastidial DNA sequences in plants. Mol Biol Evol. 2002;19:938–49.
Wagner DB, Dong J, Carlson MR, Yanchuk AD. Paternal leakage of mitochondrial DNA in Pinus. Theor Appl Genet. 1991;82:510–4.
Havey M. Predominant paternal transmission of the mitochondrial genome in cucumber. J Hered. 1997;88:232–5.
Weihe A, Apitz J, Pohlheim F, Salinas-Hartwig A, Borner T. Biparental inheritance of plastidial and mitochondrial DNA and hybrid variegation in Pelargonium. Mol Gen Genomics. 2009;282:587–93.
Grewe F, Guo WH, Gubbels EA, Hansen AK, Mower JP. Complete plastid genomes from Ophioglossum californicum, Psilotum nudum, and Equisetum hyemale reveal an ancestral land plant genome structure and resolve the position of Equisetales among monilophytes. BMC Evol Biol. 2013;13:8.
Zhu AD, Guo WH, Jain K, Mower JP. Unprecedented heterogeneity in the synonymous substitution rate within a plant genome. Mol Biol Evol. 2014;31:1228–36.
Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.
Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20:3252–5.
Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.
Suyama M, Torrents D, Bork P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006;34:W609–12.
Kuck P, Meusemann K. FASconCAT: convenient handling of data matrices. Mol Phylogenet Evol. 2010;56:1115–8.
Darling AE, Mau B, Perna NT. progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5:e11147.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.
Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010;59:307–21.
Shimodaira H, Hasegawa M. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics. 2001;17:1246–7.
Xia X. DAMBE6: new tools for microbial genomics, phylogenetics, and molecular evolution. J Hered. 2017;108:431–7.
The authors gratefully acknowledge Wenhu Guo and Felix Grewe for assistance with initial phylogenetic analyses, and Gaven Nelson for assistance with initial genome annotations.
This research was supported in part by funding from Baylor University (award BU 0324512 to R.P.A.) and by a scholarship from the Chinese Scholarship Council (to W.F.). None of the funding bodies had any role in the design or implementation of this project or in the writing of the manuscript.
Availability of data and materials
The annotated plastome sequences generated during the current study are available in the GenBank repository under accession numbers KP099642–KP099645 and MH121046–MH121052.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhu, A., Fan, W., Adams, R.P. et al. Phylogenomic evidence for ancient recombination between plastid genomes of the Cupressus-Juniperus-Xanthocyparis complex (Cupressaceae). BMC Evol Biol 18, 137 (2018) doi:10.1186/s12862-018-1258-2
- Plastid genome
- Phylogenetic incongruence