Annelid phylogeny and the status of Sipuncula and Echiura
© Struck et al. 2007
Received: 02 October 2006
Accepted: 05 April 2007
Published: 05 April 2007
Annelida comprises an ancient and ecologically important animal phylum with over 16,500 described species, and its members are the dominant macrofauna of the deep sea. Traditionally, two major groups are distinguished: Clitellata (including earthworms and leeches) and "Polychaeta" (mostly marine worms). Recent analyses of molecular data suggest that Annelida may include other taxa once considered separate phyla (i.e., Echiura and Sipuncula) and that Clitellata are derived annelids, thus rendering "Polychaeta" paraphyletic; however, this contradicts classification schemes of annelids developed from recent analyses of morphological characters. Given that deep-level evolutionary relationships of Annelida are poorly understood, we have analyzed comprehensive datasets based on nuclear and mitochondrial genes, and have applied rigorous testing of alternative hypotheses so that we can move towards the robust reconstruction of annelid history needed to interpret animal body plan evolution.
Sipuncula, Echiura, Siboglinidae, and Clitellata are all nested within polychaete annelids according to phylogenetic analyses of three nuclear genes (18S rRNA, 28S rRNA, EF1α; 4552 nucleotide positions analyzed) for 81 taxa, and 11 nuclear and mitochondrial genes for 10 taxa (additional: 12S rRNA, 16S rRNA, ATP8, COX1-3, CYTB, NAD6; 11,454 nucleotide positions analyzed). For the first time, these findings are substantiated using approximately unbiased tests and non-scaled bootstrap probability tests that compare alternative hypotheses. For echiurans, the polychaete group Capitellidae is corroborated as the sister taxon; while the exact placement of Sipuncula within Annelida is still uncertain, our analyses suggest an affiliation with terebellimorphs. Siboglinids are in a clade with other sabellimorphs, and clitellates fall within a polychaete clade with aeolosomatids as their possible sister group. None of our analyses support the major polychaete clades reflected in the current classification scheme of annelids, and hypothesis testing significantly rejects monophyly of Scolecida, Palpata, Canalipalpata, and Aciculata.
Using multiple genes and explicit hypothesis testing, we show that Echiura, Siboglinidae, and Clitellata are derived annelids with polychaete sister taxa, and that Sipuncula should be included within annelids. The traditional composition of Annelida greatly underestimates the morphological diversity of this group, and inclusion of Sipuncula and Echiura implies that patterns of segmentation within annelids have been evolutionarily labile. Relationships within Annelida based on our analyses of multiple genes challenge the current classification scheme, and some alternative hypotheses are provided.
Annelids are found throughout the world's terrestrial, aquatic, and marine habitats, and are the most abundant component of the macrofauna in the deep sea. As one of three major animal groups with segmentation, annelids are critical in any investigation of body plan evolution; we need to understand the composition and branching history of groups within the annelid radiation if we are to progress towards elucidation of the last common ancestor of bilaterians and evolution of segmentation. Surprisingly, annelid evolution is poorly understood. To rectify this situation, data sets of multiple genes are being used to evaluate diversity and relationships of major annelid clades.
Annelida is part of the lophotrochozoan radiation that includes Mollusca, Brachiopoda, Bryozoa, Nemertea and Sipuncula. Annelids are traditionally classified as "Polychaeta" and Clitellata, but three taxa historically assigned phylum status have been hypothesized to also fall within Annelida: Sipuncula, Echiura, and Siboglinidae (previously known as Pogonophora and Vestimentifera). Sipunculan origins are the most controversial of these three; some morphological data suggest molluscan affiliations, but a growing body of data indicates annelid affinities for this group of unsegmented marine worms [3–7].
Both molecular and morphological data support Echiura and Siboglinidae as annelids [7–14]. Echiura possess annelid-like features such as the ultrastructure of cuticle and chaetae, the development of the mesoderm, and the structure and position of the blood vessels; additionally, their larval nervous system indicates a possible segmented ancestry even though overt segmentation is lacking in adults. Recent 18S rRNA analyses support a sister relationship of Echiura and capitellid polychaetes [12, 16, 17], while an analysis of five gene regions (18S, D1 & D9-10 of 28S, histone H3, snU2 RNA, COX1) by Colgan et al. supports a sister group relationship with Terebelliformia, indicating that the former may reflect a single-gene artifact. Recent analyses of four gene regions (18S rRNA, D1 of 28S rRNA, histone H3, 16S rRNA) are inconclusive; one analysis favors a sister group relationship to Capitellidae, the other a relationship to Protodrilus purpureus and Pectinariidae.
The ultrastructure of the uncini (short, apically toothed chaetae) implies a closer relationship of Siboglinidae with the annelid taxa Terebelliformia and Sabellida, and Rouse and Fauchald (R&F) presented a cladistic analysis of morphological characters that corroborates this. The placement of siboglinids with annelids is also supported, albeit weakly, by several molecular studies (mostly based on 18S rRNA) [e.g., [5–8, 12, 19–22]]. None of these hypotheses has been explicitly tested using a rigorous statistical framework.
Analyses of morphological and molecular data support clitellate monophyly and have provided robust hypotheses of relationships within this taxon [23–25]. However, monophyly of "Polychaeta" lacks support and resolving polychaete annelid relationships has been difficult [26, 27]. Although some authors regard "Polychaeta" as sister to Clitellata, increasing evidence based on molecular and morphological data suggests they are a paraphyletic grade including clitellates [e.g., [3, 8, 12, 20, 23, 28–30]].
Polychaete annelids have been classified into approximately 80 families, generally supported as monophyletic, but arrangement of these into well-supported more-inclusive nodes is wanting. R&F's analyses provide the most taxonomically inclusive hypotheses of polychaete relationships to date. Based on morphological cladistic analyses of polychaete families, they proposed a monophyletic "Polychaeta" consisting of two major clades, Scolecida and Palpata, the latter divided into Canalipalpata and Aciculata. While these analyses provide objective and testable hypotheses of polychaete relationships, and thus significantly moved the field forward, many aspects of annelid phylogeny are still poorly understood. Monophyly of Scolecida, Palpata and Canalipalpata has been questioned by some morphologists [30, 32, 33], and uncertainties of character scoring, homology assessment, and difficulty in rooting of the annelid tree make resolution of annelid relationships using morphology unlikely [27, 30, 34, 35]. Molecular analyses to date show no evidence to support the classification scheme developed from the R&F analyses, but most molecular studies have been based on single genes [see [16, 20, 26, 27]]. In a recent study that included the most comprehensive annelid taxon sampling to date, incomplete character data for the four genes analyzed may explain the lack of resolution; of the 217 taxa, only 52 had data for all genes, and over 50 were represented by data for a single gene only. Despite the doubts raised, neither molecular nor morphological studies have yet convincingly eliminated the possibility that these clades are monophyletic.
To address these major outstanding issues of annelid inclusiveness and phylogeny, we reconstruct relationships of major annelid and lophotrochozoan taxa using two data sets. One data set is built on ~6.5 kb of sequence from three nuclear genes [nuclear small and large ribosomal subunits (18S rRNA and 28S rRNA), and elongation factor-1α (EF1α); referred to as the Nuc data set] for 81 taxa that span 45 traditional polychaete families as well as Siboglinidae, Echiura, Clitellata, and Sipuncula and nine outgroup OTUs to address both annelid inclusiveness and phylogeny. The other data set comprises the three nuclear genes and eight mitochondrial genes (~13.4 kb) for a restricted taxon sampling of 10 operational taxonomic units (OTUs) to address annelid inclusiveness with respect to Sipuncula, Echiura and Siboglinidae [referred to as the NucMt data set]. These are the two largest data sets for annelids explored to date in terms of the number of characters for a full range of taxa, and it is our hope that the taxonomic representation of these datasets will continue to grow in order to provide a more holistic comparison to morphological hypotheses. In addition to analyses using maximum likelihood (ML) and Bayesian inference (BI), a priori hypotheses about annelid relationships were explicitly tested against the best trees using approximately unbiased (AU) and non-scaled bootstrap probability (NP) tests.
Table 1. Results of the approximately unbiased (AU) and the non-scaled bootstrap probability (NP) tests of the Nuc and NucMt data sets. [Table body not recovered. Surviving column headings: Difference to best trees; Difference to best tree. Surviving row labels: Sipuncula & Mollusca; Echiura sister to Terebelliformia; Scolecida (including Echiura).]
The hypothesis of a Sipuncula/Mollusca relationship is rejected by AU and NP tests for NucMt, which corroborates some previous reports [3, 5, 6, 36]. These results agree with previous studies suggesting that the molluscan cross organization of micromeres during spiral cleavage should not be regarded as synapomorphic for mollusks and sipunculans [2, 37]. Wanninger et al. discussed the possession of a ventral median nerve as additional morphological support for a sipunculid-annelid relationship, but these nerves are also observed in other taxa like Gnathostomulida [39–41]. The position of Sipuncula within the annelid radiation varies among analyzed data sets (Figs. 1, 2, Additional files 1, 2, 3, 4) and we were not able to significantly discriminate via the AU or NP test whether sipunculans were sister to or derived within Annelida (Table 1). However, in all of the resulting 81-OTU trees (Fig. 1, Additional files 1, 2, 3) sipunculans were nested within annelids, and never placed as a sister taxon, consistent with the NucMt results and some previous studies [6, 7]. Interestingly, the lining of nephridial podocytes is a morphological feature shared by Sipuncula and Terebelliformia that bolsters the close, albeit weak, association between these two groups in the 81-taxon tree (Fig. 1).
Echiuran inclusion within the annelids, first suggested by EF1α data, was significantly supported in our analyses (Figs. 1 & 2, Table 1). Moreover, all of our analyses of individual genes or combined data with 81 OTUs placed echiurans as sister to Capitellidae [18S rRNA (BS: 95; PP: 1.00), 28S rRNA (BS: 100; PP: 1.00, both excluding Capitella), EF1α (BS < 50; PP: 0.56), and Nuc (BS = 99 excluding Capitella; PP = 1.00); note that Capitella 28S rRNA may have long-branch issues], confirming earlier suggestions [12, 16–18, 20] and in sharp contrast to recent analyses of several gene regions placing echiurans with Terebelliformia taxa. Indeed, our explicit hypothesis testing significantly rejects such a close relationship (Table 1). In the Nuc analyses, Echiura/Capitellidae is sister to a scolecidan clade of Maldanidae and Arenicolidae (Fig. 1); this result is echoed in the NucMt data, which lacks a capitellid but places the echiuran and maldanid as sister taxa (Fig. 2). Due to this placement of Echiura as a derived annelid group, interpretations of nerve development in echiurans as vestigial segmentation are fully warranted.
The long-held hypothesis that Siboglinidae (including vent worms, pogonophorans and the recently discovered bone-eating worms Osedax) are within Annelida [e.g., [8, 10–12, 14, 21]] is supported with significant statistical rigor (Table 1). Our results corroborate previous molecular studies supporting a nested position of Siboglinidae within Annelida (Figs. 1, 2, Additional files 1, 2, 3, 4) [8, 12, 14]. Furthermore, the Nuc data set and 28S rRNA partition (Fig. 1, Additional file 1) show a close, albeit weak, relationship between Siboglinidae and Sabellida, specifically Oweniidae, as suggested by analyses of morphological data [10, 11], some molecular data and combined analyses of both. The ultrastructure of the uncini and the possession of only one pair of nephridia with dorsal pores in the first segment are possible synapomorphic features. Morphological characters supporting the sister group relationship of Oweniidae and Siboglinidae are neuropodial chaetae in posterior segments emerging straight from the body wall and an intraepidermal nerve cord. The close relationship to Clitellata as shown by the NucMt data set (Fig. 2) is not substantiated by morphological data.
In all analyses, Clitellata is derived from polychaetous ancestors; it does not form a major basal branch of annelids (Figs. 1, 2, Additional files 1, 2, 3, 4). This placement is significantly supported by hypothesis testing (Table 1), and corroborates a series of previous studies [see for review [20, 23, 27]]. In all analyses with 81 OTUs except those of the EF1α partition, the small, mainly freshwater meiofaunal aeolosomatid Aeolosoma sp. is sister to Clitellata; however, the support for this relationship is weak (Fig. 1, Additional files 1 & 2). Previous authors have discussed Aeolosoma as either a highly derived clitellate [45, 46], as sister to or the most basal clitellate (depending on which morphological characters are used to define Clitellata) [47–52], or as not closely related to Clitellata [20, 28, 53–55]. Our findings strengthen the conclusions of previous but more limited studies [47–52], and offer further evidence of a freshwater origin of Clitellata. However, additional sampling of terrestrial and freshwater polychaetes would be needed before the relationship between Aeolosoma and Clitellata can be considered resolved.
While the relationships of polychaetous annelids are still poorly understood, the four major polychaete clades proposed by Rouse and Fauchald are all significantly rejected by our hypothesis tests (Table 1). These results provide rigorous support for previous ad hoc arguments based on morphological and molecular data [7, 16, 18, 20, 26, 27, 30, 32, 33]. Due to the strong affiliation of echiurans with capitellids, we opted to provisionally include echiurans within Scolecida during hypothesis testing to understand whether that group was natural regardless of echiuran position. Rejection of Scolecida, Palpata and Canalipalpata is not surprising due to dispersion of their taxa throughout the tree (Fig. 1) and given that support of these groups by morphological data is weak [30, 32, 33]. For example, the presence of palps is doubted in some Palpata group members (Ampharetidae, Pectinariidae and Terebellidae), and palp nerves but no protruding palps have been shown in some scolecids (Scalibregmatidae and Paraonidae).
Aciculata was the least problematic of the R&F taxa, in that the placement of only two taxa precluded its monophyly on the tree. For the Nuc data, rejection of Aciculata by AU and NP tests is due to the placement of Orbinia within Aciculata (orbiniids are considered scolecids) and the placement of amphinomids at the base of the tree (Fig. 1). These results are in contrast to two recent multi-gene analyses where the Aciculata taxa Eunicida and Phyllodocida are dispersed throughout the trees. Interestingly, supportive chaetae of Orbiniidae may be homologous with the acicula of Aciculata, and Amphinomida has been regarded as basal based on the tetraneurous organization of the nervous system [57, 58].
As in previous studies encompassing polychaete taxa [e.g., [7, 12, 20]], nodal support above the family level is weak (BS < 50; PP < 0.95). Nonetheless, some taxa recovered by R&F's analyses are also supported in our analyses (Fig. 1). For example, the traditional Capitellida consisting of Capitellidae, Arenicolidae and Maldanidae is recovered, but includes Echiura; Terebelliformia and Aphroditiformia are also recovered; and a clade encompassing all Sabellida and Spionida except Chaetopteridae appears in the tree. In contrast to Rousset et al. and Colgan et al., monophyly of Annelida including Echiura and Sipuncula is also revealed by all combined analyses (Figs. 1, 2, Additional file 4). Thus, with an increasing number of characters our understanding of annelid phylogeny and inclusiveness is progressing, even using ribosomal RNA genes. A similar effect has been observed for protostomes, lophotrochozoans and eunicidans as well [4, 59, 60].
Segmentation has traditionally been thought to be a conserved morphological character complex supporting an Arthropoda plus Annelida clade, "Articulata"; an implicit assumption has been that homoplasy of segmentation is unlikely. Molecular phylogenetic analyses favor Lophotrochozoa and Ecdysozoa and reject Articulata with steadily increasing support, which implies convergence of their body plans and specifically of the complex "segmentation". Interestingly, recent studies of the central nervous systems of Annelida and Arthropoda reveal higher variability than previously expected, and a typical rope-ladder-like central nervous system, a key feature of the complex "segmentation", cannot be found [see [56, 61]]. Our results provide statistical support for trees that indicate a high plasticity in the evolution of this fundamental body plan character in annelids: the placement of unsegmented Echiura and Sipuncula within Annelida implies that these two groups have independently lost segmentation, and members of Siboglinidae possess highly modified segmentation that is only obvious in their posterior opisthosoma.
The molecular phylogenetic analyses presented here corroborate previous studies in suggesting a very different view of annelid evolution than is traditionally accepted; the tests of alternative hypotheses provide statistical support for our conclusions. Sipuncula is most likely a derived annelid or the annelid sister group. Echiura is the sister group to Capitellidae as revealed by three nuclear markers, and its exclusion from Annelida is significantly rejected. Accepting Annelida as including Echiura and Sipuncula suggests not only that segmentation is more evolutionarily labile than previously assumed, but also that other characters distinctive of annelids, such as complex chaetae composed of β-chitin and arranged in four groups, have been reduced or secondarily lost in some of these derived taxa.
Hypothesis testing clearly rejects the exclusion of Clitellata and Siboglinidae from "Polychaeta", and monophyly of Scolecida, Palpata, Canalipalpata or Aciculata. Furthermore, some higher level annelid taxa are suggested by our analyses, i.e., Aphroditiformia, Terebelliformia, Sabellida/Spionida excluding Chaetopteridae, Capitellida including Echiura, Eunicida/Phyllodocida including Orbiniidae. However, we need to progress further in the resolution of annelid phylogeny, and we recognize that data from additional genes and for more complete taxonomic representation of annelids may robustly resolve the basal nodes in the annelid tree and elucidate the precise positions of Sipuncula, Clitellata, and Siboglinidae.
Additional file 5 lists taxa and genes employed, and GenBank accession numbers. Upon collection, samples were preserved in > 70% non-denatured ethanol or RNAlater (Invitrogen), or frozen at -80°C. Procurement of 18S rRNA and 28S rRNA sequence data followed Struck et al. For EF1α, total RNA was isolated using RNAwiz™ (Ambion) and reverse transcribed using SuperScript™ II (Invitrogen). Amplification of EF1α followed McHugh with an additional primer (JH16R: 5'-KNRAANKNYTCNACRCACA-3') using touchdown PCR and a second round of amplification. Further amplification details can be found in Supplementary Data. The TOPO TA Cloning Kit for Sequencing (Invitrogen) was used to clone most EF1α products. An ABI Prism 310 Genetic Analyzer and Big Dye Terminator v.3.1 (Applied Biosystems) were used in sequencing.
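The additional primer JH16R is written with IUPAC ambiguity codes, so it stands for a pool of concrete oligonucleotides rather than a single sequence. The following minimal Python sketch counts how many sequences that pool contains. The function names and the `limit` parameter are hypothetical and are not part of the published protocol; only the primer string comes from the text.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequence as given in the text (5' to 3').
PRIMER_JH16R = "KNRAANKNYTCNACRCACA"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for symbol in primer:
        count *= len(IUPAC[symbol])
    return count


def expand(primer, limit=10000):
    """List every concrete sequence; refuse if the pool exceeds `limit`."""
    if degeneracy(primer) > limit:
        raise ValueError("primer pool too large to enumerate")
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in primer))]


if __name__ == "__main__":
    print(len(PRIMER_JH16R), "nt;", degeneracy(PRIMER_JH16R), "sequences")
```

The 19-nucleotide primer contains nine ambiguous positions (four N, two K, two R, one Y), which is why the pool is large relative to a non-degenerate primer.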
A data set (NucMt) consisting of eight mitochondrial genes (COX1-3, CYTB, NAD6, ATP8, 12S and partial 16S), as well as three nuclear genes (18S rRNA, 28S rRNA and EF1α), was analyzed using the 7 available ingroup mitochondrial genomes plus 3 outgroups (2 mollusks and 1 brachiopod). Five mollusks, two brachiopods, a nemertean and a platyhelminth were outgroups for individual genes and the concatenated (Nuc) data sets. Sequences were aligned with CLUSTAL W using default settings and corrected by hand. Ambiguous positions were excluded from the subsequent analysis. The alignments (Accession #S1766; Nuc matrix #M3221 and NucMt data set #M3220 & #M3219) are available at TREEBASE. To assess phylogenetic signal, regions within genes were analyzed using a procedure modified from Jördens et al. Saturated positions were removed. χ² tests did not reject homogeneity of base frequencies across taxa [see Additional file 6]. Appropriate ML models for each of the 5 data sets [see Additional file 7] were determined with Modeltest V 3.06 [63, 64]. Topologies were reconstructed in PAUP*4.0b using heuristic searches, tree-bisection-reconnection (TBR) branch swapping, 10 random taxon additions and the selected model parameters. For the NucMt data set, nodal support was estimated by 100 BS replicates with 10 random taxon additions and TBR branch swapping. For the 81-OTU data sets, however, ML bootstrapping using heuristic searches was not feasible because of the computational time required. Therefore, each 81-OTU data set was analyzed using 10,000 BS replicates, neighbor-joining searches and ML settings for the parameter specification.
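The χ² test of base-frequency homogeneity mentioned above can be illustrated with a standard contingency-table calculation (taxa as rows, the four bases as columns). The sketch below is an assumption about the general form of such a test, not a reproduction of the software or settings used in the study; the function name and the toy sequences are hypothetical, and the conversion of the statistic to a p-value is left out.

```python
from collections import Counter

BASES = "ACGT"


def base_frequency_chi2(sequences):
    """Chi-square statistic and degrees of freedom for homogeneity of
    base composition across taxa.

    `sequences` maps a taxon name to its aligned sequence; gaps and
    ambiguity codes are ignored.
    """
    counts = {
        name: Counter(b for b in seq.upper() if b in BASES)
        for name, seq in sequences.items()
    }
    row_totals = {name: sum(c[b] for b in BASES) for name, c in counts.items()}
    col_totals = {b: sum(c[b] for c in counts.values()) for b in BASES}
    grand_total = sum(row_totals.values())

    chi2 = 0.0
    for name, c in counts.items():
        for b in BASES:
            expected = row_totals[name] * col_totals[b] / grand_total
            if expected > 0:  # skip bases absent from every taxon
                chi2 += (c[b] - expected) ** 2 / expected
    df = (len(sequences) - 1) * (len(BASES) - 1)
    return chi2, df


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = {"taxonA": "ACGTACGTAC", "taxonB": "ACGTTCGTAC", "taxonC": "ACGAACGTAC"}
    print(base_frequency_chi2(toy))
```

A statistic that is small relative to its degrees of freedom is consistent with homogeneous base frequencies, which is the outcome reported for these data sets.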
For BI, MrModeltest 1.1b determined prior probability distributions of individual parameters of nucleotide substitution models for each gene partition in MrBayes 3.1. The mixed amino acid substitution model option was chosen for each protein-coding gene partition when analyzing amino acid sequences in the NucMt data. Partitions were unlinked. Analyses employed two runs with three heated and one cold chain started simultaneously for either 1 × 10⁷ generations (18S rRNA and 28S rRNA), 5 × 10⁶ generations (EF1α and Nuc), or 1 × 10⁶ generations (NucMt, protein-coding genes coded as either nucleotides or amino acids), with trees being sampled every 250 generations. Based on convergence of likelihood scores, burn-in trees were discarded (18S rRNA: 6,760 trees; 28S rRNA: 28,000; EF1α: 10,000; Nuc: 8,000; NucMt, both: 201) and posterior probabilities were determined from the remaining trees.
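The sampling scheme above fixes how many trees each analysis yields and how many remain after burn-in. The short Python sketch below works through that arithmetic. It assumes that the generation counts read as powers of ten (10⁷, 5 × 10⁶, 10⁶), that the burn-in figures apply per run, and that the starting tree at generation 0 is not counted; none of these points is stated explicitly in the text, and the function names are hypothetical.

```python
# (generations, burn-in trees) per analysis, as reported in the text.
ANALYSES = {
    "18S rRNA": (10_000_000, 6_760),
    "28S rRNA": (10_000_000, 28_000),
    "EF1a": (5_000_000, 10_000),
    "Nuc": (5_000_000, 8_000),
    "NucMt": (1_000_000, 201),
}
SAMPLE_EVERY = 250  # trees were sampled every 250 generations


def sampled_trees(generations, sample_every=SAMPLE_EVERY):
    """Trees sampled in one run (assumption: starting tree not counted)."""
    return generations // sample_every


def summarize(analyses=ANALYSES):
    """Sampled, discarded and retained tree counts for each analysis."""
    summary = {}
    for name, (generations, burnin) in analyses.items():
        total = sampled_trees(generations)
        summary[name] = {
            "sampled": total,
            "burnin": burnin,
            "retained": total - burnin,
            "burnin_fraction": burnin / total,
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize().items():
        print(f"{name:9s} sampled={row['sampled']:6d} "
              f"retained={row['retained']:6d} "
              f"burn-in={row['burnin_fraction']:.1%}")
```

Under these assumptions the burn-in ranges from about 5% of sampled trees (NucMt) to 70% (28S rRNA), which reflects how differently the likelihood scores of the individual analyses converged.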
Significance tests using both the AU and the NP test of CONSEL [68, 69] were performed under the ML criterion for the Nuc and NucMt data sets for several a priori hypotheses against the best trees. The following hypotheses were tested for the Nuc and NucMt data sets: 1) Sipuncula is not an annelid, 2) Sipuncula and Mollusca are closely related, 3) Echiura is not an annelid, 4) Siboglinidae is not an annelid, 5) Clitellata is not a subtaxon of polychaetes, and 6) monophyly of Scolecida including Echiura. Additionally, for the Nuc data set we tested a sister group relationship of Echiura and Terebelliformia as well as monophyly of the other three major polychaete clades (Palpata, Aciculata, Canalipalpata). To obtain the best result for each a priori hypothesis, the analyses were constrained by allowing only trees congruent with that hypothesis. Due to strong support for a Capitellidae/Echiura sister group relationship with 18S rRNA data sets, the taxon assemblage for Scolecida was changed from that of R&F to include Echiura, biasing the test in favor of supporting Scolecida. Furthermore, each clade obtained in the NucMt nucleotide-only analyses was compared against the best alternative hypothesis not congruent with the clade by means of AU tests, resulting in the 1 − p values of Fig. 2. This approach is similar to decay indices in parsimony analyses [70, 71], but yields actual significance values; p < 0.05 shows a significant difference between the two alternatives. Thus, to be read on the same scale as BS and PP values, 1 − p has to be greater than 0.95 for a clade to be significantly supported.
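Both tests compare candidate trees by resampling site-wise log-likelihoods. As a rough illustration of the underlying idea, the Python sketch below estimates a plain bootstrap probability of tree selection by resampling sites with replacement (the RELL approach) and counting how often each tree has the highest summed log-likelihood. This is a simplified sketch under stated assumptions, not the CONSEL implementation: it omits the multiscale resampling that the AU test relies on, and the function name, seed and toy log-likelihood values are hypothetical.

```python
import random


def rell_bootstrap_probabilities(site_lnl, replicates=1000, seed=0):
    """Bootstrap probability that each tree is the ML tree.

    `site_lnl[t][i]` is the log-likelihood of alignment site i under
    tree t. Sites are resampled with replacement; the site values are
    reused rather than re-optimized (RELL).
    """
    rng = random.Random(seed)
    n_trees = len(site_lnl)
    n_sites = len(site_lnl[0])
    wins = [0] * n_trees
    for _ in range(replicates):
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        totals = [sum(tree[i] for i in sample) for tree in site_lnl]
        best = max(range(n_trees), key=lambda t: totals[t])
        wins[best] += 1
    return [w / replicates for w in wins]


if __name__ == "__main__":
    # Hypothetical toy values: the unconstrained tree fits 40 of 50
    # sites better than the constrained tree, and 10 sites worse.
    unconstrained = [-1.0] * 50
    constrained = [-1.2] * 40 + [-0.9] * 10
    print(rell_bootstrap_probabilities([unconstrained, constrained]))
```

In this framing, a constrained tree representing an a priori hypothesis is rejected when its probability falls below 0.05, which corresponds to the 1 − p > 0.95 threshold used for clade support.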
ATP8: ATP synthase subunit 8
COX1: cytochrome c oxidase subunit I
COX2: cytochrome c oxidase subunit II
COX3: cytochrome c oxidase subunit III
NAD6: NADH dehydrogenase subunit 6
NP: non-scaled bootstrap probability
Nuc: nuclear genes only
NucMt: nuclear and mitochondrial genes
OTU: operational taxonomic unit
R&F: Rouse & Fauchald
The crews of the R/V Point Sur and R/V Oceanus were most helpful in obtaining samples. For both cruises, we also acknowledge the help of the scientific crews (which are too numerous to list here). Friday Harbor Laboratories, University of Washington, is also acknowledged for its support. Computational assistance on GUMP (Genomics Using Multiple Processors) at Auburn University was kindly provided by Scott Santos. We also thank three anonymous reviewers for contributions to the paper. This work was supported by the USA National Science Foundation (NSF) WormNet grant (EAR-0120646) to K.M.H. and D.M.H., NSF OCE-0425060 to K.M.H. and T.H.S., a National Underwater Research Program (NURP) grant to K.M.H. and Lisa Levin, and by the grant DFG-STR-683/3-1 from the Deutsche Forschungsgemeinschaft to T.H.S. This work is AU Marine Biology Program contribution #20.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.