Phylogenomic analyses of Crassiclitellata support major Northern and Southern Hemisphere clades and a Pangaean origin for earthworms
© The Author(s). 2017
Received: 22 August 2016
Accepted: 18 May 2017
Published: 30 May 2017
The Erratum to this article has been published in BMC Evolutionary Biology 2017 17:204
Earthworms (Crassiclitellata) are a diverse group of annelids of substantial ecological and economic importance. Earthworms are primarily terrestrial infaunal animals, and as such are probably rather poor natural dispersers. Therefore, the near global distribution of earthworms reflects an old and likely complex evolutionary history. Despite a long-standing interest in Crassiclitellata, relationships among and within major clades remain unresolved.
In this study, we evaluate crassiclitellate phylogenetic relationships using 38 new transcriptomes in combination with publicly available transcriptome data. Our data include representatives of nearly all extant earthworm families and a representative of Moniligastridae, another terrestrial annelid group thought to be closely related to Crassiclitellata. We use a series of differentially filtered data matrices and analyses to examine the effects of data partitioning, missing data, compositional and branch-length heterogeneity, and outgroup inclusion.
Results and discussion
We recover a consistent, strongly supported ingroup topology irrespective of differences in methodology. The topology supports two major earthworm clades, each of which consists of a Northern Hemisphere subclade and a Southern Hemisphere subclade. Divergence time analysis results are concordant with the hypothesis that these north-south splits are the result of the breakup of the supercontinent Pangaea.
These results support several recently proposed revisions to the classical understanding of earthworm phylogeny, reveal two major clades that seem to reflect Pangaean distributions, and raise new questions about earthworm evolutionary relationships.
“The plough is one of the most ancient and most valuable of man's inventions; but long before he existed the land was in fact regularly ploughed, and still continues to be thus ploughed by earth-worms. It may be doubted whether there are many other animals which have played so important a part in the history of the world, as have these lowly organised creatures.”
Charles Darwin, The formation of vegetable mould through the action of worms, with observations on their habits, p. 313
Earthworms (Crassiclitellata) constitute a diverse group of primarily terrestrial, burrowing annelids comprising 6000+ extant species in 18 families and found on all continents except Antarctica. Most earthworm species live in soil, but some live in decaying logs, leaf litter, stream mud and riverbanks, as well as arboreal (e.g., epiphytic root masses) and even marine littoral habitats. Charles Darwin famously extolled the importance of earthworms as terrestrial ecosystem engineers, churning and aerating the soil with their burrows as well as burying and processing large fragments of organic matter and making their nutrients available to plants. Large-scale engineering by earthworms has recently been documented in South America  and may occur elsewhere. Even apart from their direct agricultural importance as soil processors, earthworms have a substantial economic impact—epigeic (leaf litter/compost-dwelling) species are used to process food waste (vermiculture), larger species are sold as bait for fish, and some earthworm species are considered delicacies and are sold for human consumption. Earthworms are prey items for many other species, including planarians, leeches, mollusks, insects, amphibians, lizards, snakes, birds and mammals, and thus serve as a crucial link in numerous terrestrial food webs. Many earthworms are considered invasive; approximately one-third of all earthworm species in North America are introduced from Europe and Asia [3, 4]. As invasive earthworms spread in recently glaciated and otherwise earthworm-free forests in North America, they affect many microbial, plant and invertebrate species that have come to rely on large amounts of undisturbed leaf material .
Widespread distribution and limited dispersal abilities make earthworms a promising model of historical biogeographic patterns at a global scale. Indeed, speculation about earthworm biogeography has a long history, perhaps unusually attractive to history-of-science enthusiasts. Early ideas about earthworm distributions relied on dubious land bridge hypotheses. Apocryphal lore has it that J.W. Michaelsen (a great Clitellata taxonomist of the late 19th and early 20th centuries) and Alfred Wegener were office neighbors in Hamburg, Germany for a time. Michaelsen cited Wegener’s hypothesis of continental drift as providing considerable explanatory power for the distributions of earthworms, and named an amphi-Atlantic genus after him (Wegeneriella Michaelsen 1933). Despite Michaelsen’s contribution, speculation about land bridges continued to pervade the earthworm biogeographic literature.
Earthworms have a very poor fossil record, and specialists have long disagreed about directions of character evolution within the group. Early earthworm phylogenies were highly intuitive and shed little light on earthworm historical biogeography. Earthworm phylogenetic understanding has progressed slowly since these initial attempts. The few applications of cladistic analysis, such as Jamieson’s (1988) morphological study, yielded mixed conclusions, and the first use of molecular data overturned many of the morphology-based hypotheses. James and Davidson included broader gene sampling (16S, 18S and 28S ribosomal RNA genes) and taxon sampling of Crassiclitellata and several outgroups and were able to reinterpret many morphological changes defining the families of crassiclitellates, proposing new hypotheses of morphological evolution and rehabilitating older ones.
Although James and Davidson  clarified many aspects of earthworm phylogeny, relationships among several major groups remain poorly supported. Fortunately, the advent of low-cost, high-throughput sequencing methods has revolutionized the study of higher-level relationships across the tree of life, allowing researchers to bring dozens to thousands of genes to bear on previously intractable questions. To test previous hypotheses of relationships among earthworms and provide a robust framework for historical biogeographic inference and studies of character evolution, we generated transcriptomic data from representatives of nearly all major extant lineages of Crassiclitellata and performed a series of analyses to infer relationships among the major lineages of earthworms.
Collection locality, museum location of voucher specimen, museum catalog number, SRA project number, number of Illumina reads, number of Trinity contigs and number of HaMStR ortholog groups represented for each of the thirty-seven transcriptomes generated in this study
# HaMStR Orthologs
Argentina, Tierra del Fuego, Ushuaia (coll. E. Lapied)
Gabon, edge of Lac Vembo, Gamba complex, (coll. S James 18 May 2008)
France, Jargeau, Loiret Department, (coll. M. Koken)
Hungary (coll. C. Csuzdi)
Sweden, Södermanland, Vingåker, Valltrand, indoor compost, 59.0864 N, 16.0544 E (coll. E. Boräng, 1 Jan 2012)
in EtOH CE13942
Dichogaster sp. (green tree worm) Benhamiidae
Brazil, Amazonas, near Manaus, Reserva Campina (coll. S. James, S. Coral, 2 Feb 2012)
France, Guadeloupe, Basse Terre (colls. S. James, F. Gamiette Feb 2013)
France, Guadeloupe, Chutes Carbet, Basse Terre (colls. S. James, F. Gamiette Feb 2013)
USA, Tonganoxie, Kansas (coll. S. James? May 2013)
Sweden, Södermanland, Vingåker, Valltrand, indoor compost, 59.0864 N, 16.0544 E (coll. E. Boräng, 1 Jan 2012)
in EtOH CE13945
Brazil, Sao Paulo, bait shop (coll. S. James, 7 Nov 2010)
Brazil, Assistencia, São Paulo, Fazenda Sta Rosa (coll. S. James, 9 Nov 2012)
France, Aveyron, L'Hospitalet du Larzac (coll. S James, 1 Mar 2011)
South Africa, Western Cape, Stellenbosch (colls S. James, D. Plisko, 27 Aug 2011)
Brazil, Amazonas, near Manaus, Reserva Ducke (colls. S. James, S. Coral 1 Feb 2012)
Brazil, Parana, Campina Grande do Sul, Caratuva peak trail (coll. S. James, M. Bartz, 17 Oct 2010)
France, Aveyron, L'Hospitalet du Larzac (coll. S James, 1 Mar 2011)
SRA PRJNA196484*,Spain, El Molar, 40°44′22.9″N, 3°33′53.1″W
Kerriona sp. Graciosa1
Brazil, Parana, Graciosa Road (coll. S. James, 4 Nov 2010)
USA, Sideling Hill Wildlife Mgmt. Area, Washington County, Maryland. (colls. S. James, M. Callaham, May 2013)
Madagascar, Antsirabe, 19°46'38.60"S 47°06'41.69"E
USA, Louisiana, Washington Parish (coll S. James, M. Callaham, M. Damoff, C. Erseus, 17 Jan 2011)
New Zealand, Kelly’s Creek (coll. T. Buckley)
South Africa, Western Cape, Tokai Swamp (colls. S. James and D. Plisko, 29 Aug 2011)
South Africa, Northern Cape, Nieuwoudtville (colls. S. James, D. Plisko 5 Sep 2011)
South Africa, Western Cape, Table Mountain (coll. James, Meassey, Plisko, 26 Aug 2011)
Place Kabary 2 sp.
Madagascar, Place Kabary, Antsiranana, 12°16'58.27''S 49°17'25.94''E
USA, Cedar Point, Alabama (colls. S. James, C. Erséus 17 January 2011)
Brazil, Amazonas, Reserve Ducke (colls. S. James, S. Coral, 3 Feb 2012)
France, Midi-Pyrénées, Ariège, Malegoude (coll. S. James, 2 Mar 2011)
USA, Iowa, Des Moines River, at Douds (coll. S. James 12 May 2012)
Brazil, Rio Grande do Sul, Santo Cristo (coll. G. Steffen 09 Sep 2009)
France, Aveyron, Montpellier (colls. S. James, M. Bouche, 1 Mar 2011)
France, Midi-Pyrénées, Ariège, Cazavet, L'Estelas Cave, in water, 43.000 N, 1.010 E (coll. M.C. des Chatelliers, P. Martin & N. Giani, 24 May 2011) (topotype)
in EtOH CE13924
Western Australia, 25.5 km S of Busselton, Rapids Conservation Park, Margaret River (coll. C. Erséus, 16 Sep 2012)
Sweden, Västergötland, Göteborg, seeping groundwater at Göteborg Botanical Garden (Vitsippsdalen), 57.6813 N, 11.9562 E (C. Erséus & A. Achurra, 29 Mar 2011)
SMNH 161294 in EtOH CE11200
Brazil, Amazonas, Reserva Ducke (colls. S. James, S. Coral, 3 Feb 2012) (topotype)
NCSM 000000 in EtOH CE14372
Sweden, Västergötland, Göteborg, Guldheden, spring S of Dr Fries Torg, 57.6827 N, 11.9707 E (coll. M. Svensson, 8 Nov 2011)
SMNH 161295 slide CE13679
Propappus volki Propappidae
Sweden, Blekinge, Ronneby, Väby, Bräkneån River, sand in rapids, 56.1792 N, 15.1052 E (C. Erséus, B. Williams & S. Martinsson, 31 May 2013) (topotype)
SMNH 161296 slide CE18375
The crassiclitellate samples represented all extant crassiclitellate families but one (Biwadrilidae) and at least 28 genera. Transcriptomes for thirty-one crassiclitellate taxa and all six outgroup taxa were generated as part of this study, and two additional crassiclitellate transcriptomes were assembled as described below from data in the Sequence Read Archive (SRA; http://www.ncbi.nlm.nih.gov/sra/) for Hormogaster elisae (PRJNA196484) and Eisenia andrei (DRX021555). A transcriptome was also generated for a representative of Moniligastridae (Drawida sp.). Voucher specimens are deposited at the North Carolina Museum of Natural Sciences (NCSM), the Swedish Museum of Natural History (SMNH) and the Western Australian Museum (WAM) (Table 1).
Total RNA was extracted from RNAlater®-preserved samples using the Ambion RNAqueous®-Micro Total RNA Isolation kit. First-strand cDNA was constructed using the SMART® cDNA Library Construction Kit (Clontech Laboratories, Inc.), replacing the included 3′ primer with the Cap-TRSA-CV oligo . We amplified double-stranded cDNA using the Advantage® 2 PCR Kit (Clontech Laboratories, Inc.). To minimize the risk of contamination, extractions and cDNA construction were performed in small batches of four tissue samples or fewer, and the workstation and tools were cleaned with bleach between each set of extractions. Where possible, we avoided sampling the external body surface and the gut to limit the potential for contamination from epibionts and gut contents (e.g., prey items and microorganisms).
Non-normalized cDNA libraries were sent to the HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA for library preparation and 2 × 100-bp paired-end sequencing on an Illumina HiSeq 2000. Approximately one-sixth of a lane was used for each taxon.
Sequence assembly and processing
Raw paired-end (PE) Illumina reads were digitally normalized using khmer (normalize-by-median.py -C 30 -k 20 -N 4 -x 2.5e9) and assembled using the October 5, 2012 release of Trinity. We used TransDecoder (http://transdecoder.github.io) to find open reading frames and translate nucleotide sequences into amino acid sequences that were at least 100 amino acids in length.
Translated data for all 40 taxa were searched against the Lophotrochozoa pHMMs in HaMStR v.13.2.3  using Helobdella robusta as reference species. We set HaMStR to output all sequences that fulfilled the reciprocity requirement and then used a custom script to generate FASTA-formatted files for each orthogroup that included all sequences and deleted duplicated contigs. Each orthogroup was then aligned with MAFFT (L-INS-i) .
One of the major difficulties in phylogenomic analysis, particularly when dealing with transcriptome data, is orthology assessment. Most animals harbor paralogous copies of many genes, but standard molecular phylogenetic analyses assume that data sampled from each taxon for each locus are orthologs. Failure to distinguish orthologs from paralogs can cause major problems in phylogenetic inference. Given this, we used a tree-based approach to remove likely paralogs from our alignments. We inferred a maximum-likelihood (ML) tree for each aligned orthogroup with FastTreeMP (under the -slow and -gamma settings), and used PhyloTreePruner to screen each of the resulting trees. In PhyloTreePruner, nodes on each ML tree with SH-like local support values <0.7 were collapsed into polytomies, and the largest subtree was retained where each taxon was represented by either no sequences or only one sequence, unless all sequences for a given taxon formed part of a clade or part of the same polytomy (in which case, all were retained). Sequences falling outside this maximally inclusive subtree were assumed to be paralogs and were deleted from the data set. If multiple in-paralogs were initially retained, all but the longest sequence were subsequently deleted by PhyloTreePruner. This returned an alignment for each orthogroup that included (at most) a single, putatively orthologous sequence for each taxon. PhyloTreePruner was used to retain only orthogroups found in at least 25% (10 taxa), 50% (20 taxa), 75% (30 taxa) and 100% (40 taxa) of transcriptomes. All loci were subsequently realigned with MAFFT (L-INS-i). FASconCAT v1.0.pl was then used to concatenate orthogroups. The ProteinModelSelection.pl script (https://github.com/stamatak/standard-RAxML/blob/master/usefulScripts/ProteinModelSelection.pl) was used to find the best-fitting amino-acid substitution model for each orthogroup (for downstream analyses using TreSpEx; see below) and for each concatenated data matrix.
We chose not to use any automated alignment filtering methods (e.g., GBlocks ), due to concerns about their efficacy in improving phylogenetic inference .
Distantly related outgroups may be problematic for phylogenomic inference. We used two approaches to explore the effect of outgroup sampling on estimates of ingroup relationships. First, we deleted Lumbriculus variegatus and Propappus volki (the two most distant outgroups in terms of summed branch length to the base of Crassiclitellata across analyses) and “?Haplotaxidae sp.” (a conspicuously long outgroup branch) from the set of transcriptomes prior to processing with the approach outlined above, leaving a total of 37 taxa. Following the approach outlined above, we used PhyloTreePruner to only retain orthogroups found in at least 25, 50 and 75% of the taxa (in this case, 10, 19 and 28 taxa, respectively). Second, we deleted only “?Haplotaxidae sp.” from the original set of transcriptomes, leaving a total of 39 taxa. For this data set, we processed the transcriptomes as described above, but used PhyloTreePruner to only retain orthogroups found in ≥75% of the taxa (i.e., 30 taxa). To assess the influence of sites with high percentages of gaps/missing data on our inferences, we produced two concatenated “no ?Haplotaxidae sp.” 75% data matrices. For one, we did no additional filtering. For the other, we used TrimAl v1.2 to remove all sites comprising >50% gaps from each individual orthogroup alignment prior to concatenation and model testing. Amounts of missing data per taxon were calculated using TREE-PUZZLE 5.3 for all matrices.
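The taxon-occupancy thresholds and the gap-based site filter described above can be illustrated with a short Python sketch. This is not the pipeline used in the study (which relied on PhyloTreePruner and TrimAl); the function names min_taxa and drop_gappy_sites, the dictionary-based alignment representation and the toy alignment are assumptions made purely for illustration, and only "-" is counted as a gap here.

```python
import math


def min_taxa(n_taxa, fraction):
    """Smallest whole number of taxa that satisfies an occupancy fraction."""
    return math.ceil(n_taxa * fraction)


def drop_gappy_sites(alignment, max_gap_fraction=0.5, gap_char="-"):
    """Remove alignment columns in which more than max_gap_fraction of rows are gaps.

    alignment: dict mapping taxon name -> aligned sequence (all the same length).
    Returns a new dict with the offending columns removed.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    keep = []
    for col in range(length):
        gaps = sum(1 for name in names if alignment[name][col] == gap_char)
        # A column is dropped only if it is strictly more than 50% gaps.
        if gaps / len(names) <= max_gap_fraction:
            keep.append(col)
    return {name: "".join(alignment[name][c] for c in keep) for name in names}


# Thresholds for the 37-taxon and 39-taxon data sets described in the text.
thresholds_37 = [min_taxa(37, f) for f in (0.25, 0.50, 0.75)]
threshold_39 = min_taxa(39, 0.75)

# Hypothetical toy alignment; the third column is entirely gaps.
toy = {"taxonA": "AC-G", "taxonB": "A--G", "taxonC": "AT-G"}
filtered = drop_gappy_sites(toy)
```

Rounding up reproduces the taxon counts given in the text (10, 19 and 28 of 37 taxa; 30 of 39 taxa). A column with exactly 50% gaps is kept, matching the ">50% gaps" criterion.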
All data matrices, ML tree files, custom scripts and supplementary figures are available via the Dryad Digital Repository (http://datadryad.org/resource/doi:10.5061/dryad.n7n71).
Long-branch effects and compositional heterogeneity
Characteristics of all data matrices analyzed in this study
# Parsimony-informative Characters
Unpartitioned/partitioned original data
TreSpEx + BaCoCa filtered
Unpartitioned/partitioned original data
TreSpEx + BaCoCa filtered
Unpartitioned/partitioned original data
TreSpEx + BaCoCa filtered
No ?Haplotaxidae sp., all sites
No ?Haplotaxidae sp., no gappy sites
Unpartitioned/partitioned original data
Maximum Likelihood (ML) analyses
Maximum-likelihood (ML) analyses were conducted with RAxML versions 8.1.24 and 8.2.3 on CIPRES with 1000 rapid bootstrap replicates, using the following options: -f a -x <random number seed for rapid bootstrapping; unique for each analysis> -p <random number seed for initial parsimony inferences; unique for each analysis> -# 1000 -m PROTGAMMA<amino acid model> -s <inputfile> -n <outputfile> (Table 2). For partitioned analyses, best-fitting amino acid substitution models were inferred for each locus and applied to each locus in RAxML by adding “-q <partitionfile>” to the command listed above. Identical random number seeds for rapid bootstrapping and parsimony inferences were used for the two “no ?Haplotaxidae sp.” 75% matrices (one that was not cleaned with TrimAl and one from which sites with >50% gaps were removed, both filtered with TreSpEx and BaCoCa) to allow a direct comparison of tree topologies for these two matrices.
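As an illustration of how the options listed above fit together, the following Python sketch assembles the argument list for one run. The helper name raxml_command, the executable name "raxmlHPC", the file names, the seeds and the LG model are all assumptions chosen for the example; the analyses in this study were run on CIPRES, not through a script like this.

```python
def raxml_command(alignment, run_name, aa_model, boot_seed, pars_seed,
                  n_boot=1000, partition_file=None, binary="raxmlHPC"):
    """Build a RAxML argument list for rapid bootstrapping plus an ML search."""
    cmd = [
        binary,
        "-f", "a",                      # rapid bootstrap + best-scoring ML tree
        "-x", str(boot_seed),           # seed for rapid bootstrapping
        "-p", str(pars_seed),           # seed for initial parsimony inferences
        "-#", str(n_boot),              # number of bootstrap replicates
        "-m", "PROTGAMMA" + aa_model,   # e.g. PROTGAMMALG
        "-s", alignment,
        "-n", run_name,
    ]
    if partition_file is not None:
        # Partitioned analysis: per-locus models are read from this file.
        cmd += ["-q", partition_file]
    return cmd


# Hypothetical file names and seeds, for illustration only.
unpartitioned = raxml_command("matrix75.phy", "run75", "LG", 12345, 54321)
partitioned = raxml_command("matrix75.phy", "run75_part", "LG", 12345, 54321,
                            partition_file="partitions.txt")
print(" ".join(partitioned))
```

Passing the same two seeds to both calls mirrors the approach described above for the two “no ?Haplotaxidae sp.” matrices, where identical seeds were used so that topologies could be compared directly.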
We used SuperQ v.1.1  to visualize topological conflict among loci for the 25, 50 and 75% unfiltered data sets. SuperQ rescales the partial, unrooted ML gene trees for each data set to produce comparable branch lengths, decomposes the trees into weighted quartet trees and employs the QNet algorithm to produce a split network from the quartet trees. We used the Gurobi optimizer to calculate initial split weights and optimize the weights under the “balanced” objective function. We used SplitsTree v.4.14.4  to visualize the resulting networks.
Bayesian Inference (BI) analyses
Site-heterogeneous Bayesian Inference (BI) analyses of the 25, 50, and 75% data sets and for the two filtered “no ?Haplotaxidae sp.” 75% matrices (one that was not cleaned with TrimAl and one from which sites with >50% gaps were removed) were conducted with PhyloBayes-MPI v1.5a  under the CAT-GTR model with two independent chains and gamma-distributed rates on CIPRES. Analyses were allowed to run for up to 168 h (the CIPRES limit), constant sites were removed, and four categories were used for the discrete gamma distribution. Convergence checks were conducted automatically every 1800 s and analyses were terminated early if after a burn-in of 500 cycles, the minimum effective size exceeded 50, and the “maxdiff” value between chains was less than 0.1. For runs that terminated due to reaching the time limit, convergence of parameter estimates and topologies across chains was assessed by evaluating the basecomp and tracecomp files produced by PhyloBayes and via visual inspection of trace files in Tracer v1.6 .
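The early-termination rule described above amounts to a simple conjunction of three conditions. A minimal Python sketch follows, assuming the summary statistics (number of cycles completed, minimum effective size and maxdiff) have already been extracted from the PhyloBayes convergence reports; the function name and its arguments are hypothetical and are not part of PhyloBayes or CIPRES.

```python
def should_terminate(cycles, min_effective_size, maxdiff,
                     burnin=500, effsize_threshold=50, maxdiff_threshold=0.1):
    """Return True if a pair of chains meets the stopping rule described in the text."""
    past_burnin = cycles > burnin
    well_sampled = min_effective_size > effsize_threshold
    chains_agree = maxdiff < maxdiff_threshold
    return past_burnin and well_sampled and chains_agree


# Hypothetical values for two checks made 1800 s apart.
early_check = should_terminate(cycles=900, min_effective_size=22, maxdiff=0.31)
later_check = should_terminate(cycles=4200, min_effective_size=187, maxdiff=0.04)
```

Runs that never satisfy all three conditions continue until the time limit and are then assessed as described above.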
Tree topologies recovered in our analyses contradicted previous hypotheses regarding the monophyly of Dichogaster (see below). The Shimodaira-Hasegawa and approximately unbiased tests [46, 47] are often used to evaluate particular topological hypotheses (including at least one hypothesis chosen a posteriori), but these tests are actually designed to evaluate whether all topologies in a plausible set of topologies are equally good explanations of the data, rather than to compare specific alternative topologies . Fortunately, the parametric bootstrapping (SOWH) test [48, 49] and Bayesian topology tests  are both appropriate in this context.
We used SOWHAT  to perform SOWH tests to test Dichogaster monophyly. SOWH tests require two ML analyses—an unconstrained analysis and an analysis in which the topology is constrained to match a particular alternative hypothesis. The difference in likelihoods between the trees resulting from each analysis (δ) constitutes the test statistic for the SOWH test. The ML topology and branch lengths from the constrained analysis are then used to simulate a large number of data sets using the model parameter estimates for the constrained ML topology and original data. We provided SOWHAT with a Dichogaster monophyly constraint (forcing monophyly of the three Dichogaster transcriptomes) in Newick format and a reduced data set in which three distant/long-branch outgroup taxa (Propappus volki, Lumbriculus variegatus and ?Haplotaxidae sp.) were removed, retaining only orthogroups found in at least 28 of the transcriptomes, emulating the 75% data set described above. SOWHAT called Seq-Gen 1.3.2  to simulate 100 data sets and RAxML 8.2.8  to infer topologies for each simulated data set in an unconstrained and constrained ML analysis. SOWHAT calculates confidence intervals around a SOWH test p-value after addition of each replicate to determine if the sample size of the test was adequate.
For Bayesian topology tests, we used the posterior sample of trees generated in the PhyloBayes CAT-GTR analysis of the 75% data set to estimate posterior model odds for alternative topological hypotheses, following suggestions by Bergsten et al. . We calculated posterior model odds by dividing the frequency of trees in the post burn-in sample of trees that support one hypothesis (e.g., Dichogaster is not monophyletic) by the frequency of trees that support the alternative hypothesis (e.g., Dichogaster is monophyletic; all three Dichogaster transcriptomes form a clade).
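The posterior model odds described above reduce to a ratio of tree counts in the post burn-in sample. A minimal Python sketch is given below; the function name is hypothetical, and returning infinity when no sampled tree supports the alternative hypothesis is one possible convention, adopted here as an assumption.

```python
import math


def posterior_model_odds(n_trees_h1, n_trees_h2):
    """Odds of hypothesis 1 over hypothesis 2 from counts of sampled trees."""
    if n_trees_h1 < 0 or n_trees_h2 < 0:
        raise ValueError("tree counts cannot be negative")
    if n_trees_h1 == 0 and n_trees_h2 == 0:
        raise ValueError("no sampled trees support either hypothesis")
    if n_trees_h2 == 0:
        # No sampled tree supports hypothesis 2: the odds are unbounded.
        return math.inf
    return n_trees_h1 / n_trees_h2


# Hypothetical sample of 1000 trees, 950 of which support hypothesis 1.
odds = posterior_model_odds(950, 50)
```

Because the two frequencies share the same denominator (the number of sampled trees), dividing the raw counts gives the same result as dividing the frequencies.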
Divergence time estimation
Unfortunately, the dearth of fossils that can be attributed to earthworms [53, 54] presents a challenge for estimating divergence times, but there are some relevant fossils as well as some previous dating studies on earthworms. Putative earthworm trace fossils (burrows or casts) have been recovered from the Triassic , with possible body fossils in the Paleocene . Possible clitellate body fossils have been recovered from Permian deposits , and fossil leech cocoons are known from the late Triassic . Finally, a molecular study of hormogastrid earthworms (calibrated using the separation of the Corso-Sardinian microplate from continental Europe) suggests that they radiated in the Late Cretaceous ; if this is correct, the common ancestor of all crassiclitellates must have arisen much earlier.
These fossils and inferences give us a set of calibration points that we can use to estimate dates for key divergences within our phylogenies. We performed dating analyses for three data matrices: the unfiltered 75% data set (including ?Haplotaxidae sp.) and two versions of the 75% data matrix that did not include ?Haplotaxidae sp. (one with all sites and the other with sites containing >50% gaps removed, both filtered with TreSpEx and BaCoCa as described above) in PhyloBayes 3.3f . In each case, we used the CAT-GTR PhyloBayes majority rule consensus tree for each data matrix as a fixed topology. We ran four independent chains for each data set, sampling every ten cycles, under the CAT-GTR substitution model with gamma-distributed rates, a lognormal autocorrelated relaxed clock model and a uniform prior on divergence times.
We used three calibration points/ranges in our analyses: the oldest known leech cocoon fossil (201 Mya), the divergence of Hormogastridae (67–97 Mya) and a minimum age estimate for crown-group Annelida of 520 Mya (based on the earliest known, probably stem-group, polychaetes from the Sirius Passet deposit of North Greenland; [61–64]). Though we did not include leeches in our analyses, previous studies have supported a sister-group relationship between leeches and their allies (branchiobdellidans and Acanthobdella) and Lumbriculidae [13, 18], providing a minimum age for divergence of the Lumbriculidae + Hirudinea clade and Crassiclitellata based on the earliest fossil cocoons attributable to leeches. We used 67 Mya as a minimum age and 97 Mya as a maximum age for the deepest divergence within Hormogastridae as represented in our data matrices (the node subtending Hemigastrodrilus monicae and Vignysa popi/Hormogaster elisae; a recent phylogenomic study of Hormogastridae corroborates this pattern of relationships). Finally, we argue that a minimum age of crown-group Annelida (520 Mya) is suitable as a maximum age constraint for the root of our phylogeny, because no evidence of clitellates is known prior to the Permian, and the root of our phylogeny is deeply nested within Clitellata, which is itself deeply nested within the annelid crown group.
The calibration for the divergence between Lumbriculidae and Hirudinea (201 Mya) was treated as a hard lower bound (minimum age), with the upper bound modeled as a truncated Cauchy distribution (p = 0.1 and c = 1). We placed uniform priors of 67–97 Mya and 201–520 Mya on the Hormogastridae divergence and the root node, respectively. Convergence was assessed with effective sample sizes and visual inspection of parameter traces in Tracer v1.6. To assess whether the priors conditional on our calibrations match our intended prior distributions, we ran PhyloBayes under the prior and our calibrations using the F81 model without rate variation across sites (these model parameters do not factor into the prior over divergence times) and visually inspected the results.
We focused on divergence times for two nodes in our phylogeny that separated Northern and Southern Hemisphere subclades: 1) a node separating Kynotus pittarelli (Madagascar) and a clade comprising Sparganophilus sp. and Komarekiona eatoni (both found in eastern North America) and 2) a node separating a Northern Hemisphere clade comprising Lutodrilus (North America) and Lumbricoidea (Criodrilidae, Hormogastridae, Lumbricidae) (Europe and Asia) and a primarily Southern Hemisphere clade comprising representatives of Almidae, Acanthodrilidae, Eudrilidae, Glossoscolecidae, Megascolecidae, Microchaetidae and Ocnerodrilidae (Africa, Australia, New Zealand and South America). We hypothesized that these divergences may be due to vicariance during the breakup of Pangaea starting in the late Triassic to early Jurassic (~200–185 Mya) [66, 67]; divergence time estimation using molecular data allows a test of this hypothesis.
Ideally, we would also infer dates using another Bayesian method such as BEAST, but preliminary analyses suggested that the computational demands of inferring divergence times for our data in this manner would be prohibitive.
Transcriptomes for thirty-one crassiclitellates, one moniligastrid, and six outgroup taxa were generated as part of this study (Table 1; Additional file 1: Figure S1) and are available from the SRA at NCBI under BioProject accession number PRJNA362879. We added publicly available transcriptome data for two additional crassiclitellates—Hormogaster elisae (PRJNA196484) and Eisenia andrei (PRJDB3115)—for a total of forty transcriptomes. Four gene sets from these transcriptomes were analyzed, reflecting different levels of gene occupancy.
Percent missing data per taxon across all matrices analyzed in this study, calculated using TREE-PUZZLE 5.3
Dichogaster sp. (green tree worm)
Eisenia andrei SRA
Kerriona sp. Graciosa1
Place Kabary 2 sp.
Within Crassiclitellata, several clades were consistently recovered, including a clade comprising Kynotidae (Madagascar), Sparganophilidae and Komarekionidae (both found in the southern and eastern United States) as sister to the rest of Crassiclitellata. All analyses revealed a deep split between two major clades, one with a largely Northern Hemisphere (Laurasian) distribution (Lumbricoidea sensu James and Davidson; Lumbricidae, Hormogastridae, Criodrilidae and Lutodrilus multivesiculatus) and the other with a primarily Southern Hemisphere (Gondwanan) distribution (Microchaetidae, Rhinodrilidae, Almidae, Glossoscolecidae, Eudrilidae and Megascolecoidea sensu James and Davidson) (note that Rhinodrilidae was mistakenly given the name Pontoscolecidae in James and Davidson; this error was subsequently corrected).
Data partitioning, deleting long-branch outgroup taxa (e.g., ?Haplotaxidae sp.) and removal of loci that showed variable rates of change in different lineages or signs of compositional heterogeneity had little impact on trees resulting from analyses of the 25, 50, and 75% data sets. Relationships recovered in analyses of the 100% data set were poorly supported, likely reflecting the very small size of this data set (7 loci, <2000 amino acids; Table 2), and will not be discussed in detail. Across the 25, 50, and 75% data sets, relationships differed in only two ingroup clades. In one of these cases (relationships among Maoridrilus, Parachilota and Acanthodrilidae sp.), bootstrap support for any particular resolution was low across all analyses (Fig. 2). In the other, support for Hormogastridae (represented here by Hemigastrodrilus, Vignysa and Hormogaster) and a Scherotheca + Eisenia pairing increased as the number of genes (and amount of missing data) increased (Fig. 2). Bootstrap support for both of these clades was >90% across all data sets from which one or more distant/long-branch outgroup taxa were deleted. With the exception of these cases, all ingroup relationships were identical across all trees, whether or not ?Haplotaxidae sp. was included, sites containing >50% gaps were deleted, the data were filtered with TreSpEx and BaCoCa or analyzed with a site-heterogeneous model in PhyloBayes.
Outgroups and basal relationships
Despite the high stability and levels of bootstrap support for relationships within the ingroup across analyses, positions of some outgroup taxa and the lone representative of Moniligastridae (Drawida) varied among data sets and analyses (Fig. 2). Based on previous analyses , we expected Propappus volki to be a suitable distant outgroup to root our phylogeny, with Lumbriculidae and Haplotaxidae forming successively closer outgroups to Crassiclitellata. However, most analyses failed to recover this pattern of relationships. Partitioned ML analyses of the 25% through 75% data matrices supported a paraphyletic Crassiclitellata and Metagynophora (due to the inclusion of Pelodrilus sp., an alleged haplotaxid) as well as a doubly paraphyletic Haplotaxidae (due to the inclusion of both Metagynophora and Lumbriculus variegatus) (Fig. 2). Recovery of topologies in which Lumbriculus variegatus is more closely related to Metagynophora than Haplotaxis gordioides is strains credulity; Lumbriculus is a member of Lumbriculidae, a clitellate group that previous molecular and morphological phylogenetic reconstructions (18S data; [13, 18]) suggest is more closely related to leeches (Hirudinida) than to haplotaxids or crassiclitellates.
Our outgroup sampling was designed to test crassiclitellate monophyly and root Crassiclitellata, not to infer deep-level relationships among major clitellate taxa. As such, unexpected relationships among outgroups may not be surprising, but failure to recover Crassiclitellata is of greater concern—in some trees (e.g., based on partitioned ML analyses of the 25 and 50% data sets), Drawida (Moniligastridae) was found to be nested within Crassiclitellata, usually as sister to a clade comprising all earthworms except Komarekiona, Kynotus and Sparganophilus (Fig. 2). Removal of outlier loci detected by TreSpEx and BaCoCa did not consistently recover expected relationships among the outgroup taxa, nor did it consistently yield a monophyletic Crassiclitellata across data sets (Fig. 2b).
Elimination of potentially problematic loci is one way to explore the impact of systematic bias and possibly improve inferences; elimination of potentially problematic taxa is another. Inclusion of distant outgroups can perturb phylogenomic analyses, particularly with respect to basal ingroup relationships. Cursory visual inspection of our trees revealed that one of the haplotaxids in our data sets (?Haplotaxidae sp.) is a rather long-branch taxon, and this could be confounding our results. To test this, we eliminated ?Haplotaxidae sp. alone, or the two (putatively) most distant outgroup taxa in our data sets (Propappus and Lumbriculus) and ?Haplotaxidae sp. Unfortunately, despite the seemingly positive impact of outgroup deletion on inference of some ingroup relationships (see above), analyses of these matrices failed to clarify basal crassiclitellate relationships, usually yielding trees in which either Pelodrilus sp. or Drawida sp. was weakly supported as sister to the Komarekiona + Kynotus + Sparganophilus clade (Fig. 2).
By contrast, PhyloBayes analysis of the unfiltered 75% data set recovered both a monophyletic Crassiclitellata and a monophyletic Metagynophora, though the posterior probability of Crassiclitellata was low (0.61) (Fig. 1). Assuming Crassiclitellata and Metagynophora are, indeed, monophyletic, our PhyloBayes results suggest that accounting for site-specific substitution processes, if computationally feasible, rather than simply partitioning by gene, can yield improved inferences.
In the SOWH test of Dichogaster monophyly, the observed δ test statistic was 4148.083, and Dichogaster monophyly was rejected (p-value <0.01, 95% confidence interval = 0.03621669–0). No trees in the post burn-in sample of 2578 trees from PhyloBayes include a monophyletic Dichogaster, making the posterior model odds in favor of a non-monophyletic Dichogaster infinite.
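The Bayesian topology test described above comes down to counting trees: the posterior probability of a hypothesis is estimated by the fraction of post burn-in trees consistent with it, and the posterior odds are the ratio of supporting to non-supporting trees. The Python sketch below illustrates that arithmetic with the counts reported above (0 of 2578 sampled trees with a monophyletic Dichogaster). The function name `posterior_odds` is hypothetical and is not part of PhyloBayes or any other package used in this study.

```python
import math


def posterior_odds(n_supporting: int, n_total: int) -> float:
    """Estimate posterior odds for a hypothesis from a sample of trees.

    n_supporting: number of sampled trees consistent with the hypothesis
    n_total: total number of post burn-in trees in the sample
    """
    if n_total <= 0 or not 0 <= n_supporting <= n_total:
        raise ValueError("counts must satisfy 0 <= n_supporting <= n_total, n_total > 0")
    n_against = n_total - n_supporting
    if n_against == 0:
        # No sampled tree contradicts the hypothesis, so the estimated odds are unbounded.
        return math.inf
    return n_supporting / n_against


# Counts taken from the text: 2578 post burn-in trees, none with a monophyletic Dichogaster.
n_trees = 2578
n_monophyletic = 0

# Odds in favor of NON-monophyly: trees without a monophyletic Dichogaster vs. trees with one.
odds_non_monophyly = posterior_odds(n_trees - n_monophyletic, n_trees)
print(odds_non_monophyly)
```

An infinite estimate reflects the finite sample rather than a literal certainty: with 2578 trees, the sample can only indicate that the posterior probability of monophyly is very small.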
Table 4 PhyloBayes divergence time estimates (mean ± standard error) in millions of years ago for three data matrices for two key nodes in the earthworm radiation: the node separating Kynotus from Sparganophilus + Komarekiona (node 1, Fig. 1) and the node separating the Northern Hemisphere clade comprising Lutodrilus and Lumbricoidea and the clade comprising Southern Hemisphere representatives of several families (node 2, Fig. 1)

| Data matrix | Node 1 (mean ± SE) | Node 2 (mean ± SE) |
| --- | --- | --- |
| 75% with ?Haplotaxidae sp., unfiltered | 164.527 ± 22.2868 | 161.104 ± 21.0178 |
| no ?Haplotaxidae sp., filtered, all sites | 186.1521 ± 23.2339 | 186.0185 ± 22.18 |
| no ?Haplotaxidae sp., filtered, no >50% gaps | 178.1059 ± 21.9198 | 177.6679 ± 21.1014 |
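As a quick check on how closely the two focal nodes agree, the following Python sketch computes the absolute difference between the node 1 and node 2 mean estimates listed above for each data matrix. The dictionary keys and the helper name `node_gap` are illustrative assumptions; only the numeric values come from the estimates above.

```python
# Mean divergence time estimates (Mya) copied from the values above: (node 1, node 2).
estimates = {
    "75% with ?Haplotaxidae sp., unfiltered": (164.527, 161.104),
    "no ?Haplotaxidae sp., filtered, all sites": (186.1521, 186.0185),
    "no ?Haplotaxidae sp., filtered, no >50% gaps": (178.1059, 177.6679),
}


def node_gap(node1_mya: float, node2_mya: float) -> float:
    """Absolute difference (in My) between two mean divergence time estimates."""
    return abs(node1_mya - node2_mya)


gaps = {name: node_gap(*pair) for name, pair in estimates.items()}

for name, gap in gaps.items():
    print(f"{name}: node 1 and node 2 means differ by {gap:.3f} My")
```

For both matrices without ?Haplotaxidae sp., the difference is well under one million years, whereas the unfiltered 75% matrix gives a gap of roughly 3.4 My.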
A second important revision to classical earthworm thinking supported by the current study is the placement of the monotypic families Lutodrilidae and Criodrilidae as successive sister taxa to the Hormogastridae + Lumbricidae clade. Both Lutodrilidae and Criodrilidae are aquatic, as is Almidae, and these three families show strong morphological similarities of body form (quadrangular tail segments), color (dusky gray with blue-green in the head segments) and clitellum length (extraordinarily long, tens of segments rather than the usual 3 to 10 or so seen in most terrestrial earthworms). The finding that the two closest relatives of the clade comprising the predominant earthworms of Europe (Lumbricidae and Hormogastridae) are aquatic suggests a possible aquatic ancestor for European earthworms. Typically, aquatic earthworms lack dorsal pores, but most members of Lumbricidae have them, as do members of the crown clade Megascolecoidea (represented here by representatives of Dichogaster, Maoridrilus, Parachilota and Pontodrilus, along with a thus-far-unidentified acanthodrilid from Madagascar, Place Kabary 2 sp.). Microchaetidae through Eudrilidae and Ocnerodrilidae (represented here by Kerriona) lack dorsal pores, with rare exceptions in the last family, indicating that dorsal pores probably evolved independently at least twice.
In the current study, placement of the only member of Ocnerodrilidae, Kerriona sp. Graciosa 1, as sister to a clade composed of Dichogaster saliens (Benhamiidae) and the acanthodriline Place Kabary 2 sp. is unusual and, if validated with a larger sampling of Ocnerodrilidae, would be a major change in the systematics of Megascolecoidea. Traditionally Ocnerodrilidae is considered to be close to acanthodriline earthworms (Acanthodrilidae, Benhamiidae, and “Octochaetidae”), because they share similar male reproductive apparatuses composed of prostate glands associated with the male gonopores (cf. [21–23]). They are also morphologically similar in a few respects to the African Eudrilidae.
The status of Dichogaster is uncertain from the present results, perhaps largely due to the inclusion of New World species, which have not been included in previous phylogenetic efforts. Two of the three sampled Dichogaster species, from Guadeloupe (French West Indies) and from an arboreal epiphyte root mass north of Manaus, Brazil (Dichogaster green tree worm), are clearly separated from D. saliens, historically endemic to Africa, and both a SOWH test and a Bayesian topology test strongly reject Dichogaster monophyly. The latter species has previously been included in a highly supported African and south Pacific Dichogaster clade, within the also highly supported Benhamiinae. Morphologically, the New and Old World Dichogaster species share many derived characters, but differ on a few points. The geographic distribution of the genus (equatorial Africa, north Neotropics, northern South America, South Pacific) remains enigmatic in the absence of a well resolved and more broadly sampled phylogeny of Benhamiinae.
The classically defined Glossoscolecidae was separated into Rhinodrilidae (“Pontoscolecidae”) and a restricted Glossoscolecidae based on a weakly supported node in the topology recovered in . That node had Almidae intervening between the two families. Our results confirm that node with strong support, suggesting that Almidae is probably secondarily aquatic given that Glossoscolecidae and Rhinodrilidae are predominantly terrestrial. We hypothesize that the common ancestor of Almidae and Rhinodrilidae occurred at a time when paleocontinents made possible the occupation of South American, African and Asian landmasses; South America would seem to be the most probable area of origin for Almidae.
The current study confirmed relationships within Lumbricoidea put forth by , and resolved an outstanding conflict about Hormogastridae, which was found to be monophyletic in  but paraphyletic or unresolved due to the placement of Hemigastrodrilus in . Although analyses of the 75% data set support paraphyly of Hormogastridae, analyses of the 25 and 50% data sets, as well as all “deleted outgroup” data sets, return a monophyletic Hormogastridae (Fig. 2).
Despite the consistent topological patterns seen across all analyses, supernetwork visualization revealed high levels of interlocus conflict (Fig. 3). Some regions of high incongruence—e.g., near the base of Crassiclitellata—are unsurprising, given that concatenated analyses of different data sets recover different relationships in this region of the tree. However, the networks also show a higher level of conflict among loci along the backbone of the Southern Hemisphere subclade than in the Northern Hemisphere group. The reasons for this are unclear, but more taxa were sampled from the Southern Hemisphere clade, and branches in this group on both the network (Fig. 3) and, less obviously, on the PhyloBayes tree (Fig. 1) are generally longer.
No known earthworm fossils exist. Although several ichnofossils have been attributed to earthworm-like organisms, these traces provide little or no concrete information about the clade membership of the author of any hole, burrow, fecal material or other fossilized biostructure made by an elongated soft-bodied invertebrate. However, we can make some inferences about the age of earthworm clades based on the biology and distributions of extant earthworm species and the results of our dating analyses. First, transoceanic movement of adult crassiclitellates seems unlikely except for a few cases where species have become salt-water tolerant inhabitants of marine littoral zones (e.g., Pontodrilus litoralis). Transoceanic dispersal of earthworms is nonetheless a possibility over geological time scales: such dispersal events have been inferred for other subterranean terrestrial animals (e.g., amphisbaenians), earthworm cocoons may be dispersed via rafting or by birds, and earthworms are known from many islands. Second, current earthworm distributions show a high degree of congruence with post-Pangaean continental movements [6, 70]. Third, current earthworm distributions generally show high degrees of local endemicity in topographically complex landscapes, and even in non-complex areas in some lowland tropical forests.
Relative ages of New Zealand earthworm clades are comparable to those of continental earthworm faunas, and multiple sister-group relationships span large distributional gaps (e.g., New Zealand-Madagascar and trans-Pacific relationships between Australia and North and Central America). Lumbricidae (Eurasia, North America) has been estimated to be about 125 million years old using biogeographic calibrations, while the split between Lumbricoidea and earthworm families on the branch leading to Megascolecoidea was previously estimated at about 200 MYA, the Triassic-Jurassic boundary, coinciding with the separation of Laurasia from Gondwana. The latter split is also present in our trees, with comparable taxon sampling. Our date estimates for this node are somewhat more recent (ranging from 161 to 186 mya, depending on the data matrix; Table 4), but the standard errors on these estimates (±21 to ±23 my) are substantial. However, deletion of the long-branch outgroup taxon ?Haplotaxidae sp. yielded earlier divergence times for this node (~178 mya with >50% gap sites excluded, ~186 mya with all sites) that are more concordant with the breakup of Pangaea.
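To show what "substantial" standard errors mean in practice, the Python sketch below converts the node 2 estimates into rough 95% intervals (mean ± 1.96 × SE) and checks whether each interval includes the ~200 MYA date cited above. The normal approximation, the constant names and the helper `approx_interval` are assumptions made for illustration; these are not the credibility intervals produced by PhyloBayes.

```python
from typing import Dict, Tuple

Z_95 = 1.96            # multiplier for an approximate 95% interval (normal approximation, an assumption)
REFERENCE_MYA = 200.0  # previously estimated date for the Laurasia-Gondwana split, from the text

# Node 2 estimates (mean, SE) in Mya, copied from the divergence time table.
node2: Dict[str, Tuple[float, float]] = {
    "75% with ?Haplotaxidae sp., unfiltered": (161.104, 21.0178),
    "no ?Haplotaxidae sp., filtered, all sites": (186.0185, 22.18),
    "no ?Haplotaxidae sp., filtered, no >50% gaps": (177.6679, 21.1014),
}


def approx_interval(mean: float, se: float, z: float = Z_95) -> Tuple[float, float]:
    """Return (lower, upper) bounds of mean ± z * se."""
    return (mean - z * se, mean + z * se)


includes_reference = {}
for name, (mean, se) in node2.items():
    low, high = approx_interval(mean, se)
    includes_reference[name] = low <= REFERENCE_MYA <= high
    print(f"{name}: {low:.1f} to {high:.1f} Mya, includes 200 Mya: {includes_reference[name]}")
```

Under this approximation all three intervals span 200 MYA, which is consistent with the caution expressed above: the estimates are compatible with a Pangaean breakup but are too imprecise to test it strongly.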
Recovery of a sister-group relationship between a Laurasian clade and a Gondwanan clade is not unprecedented; similar patterns have been seen in crayfish (Astacoidea and Parastacoidea), dragonflies (Petaluridae), stoneflies (Arctoperlaria and Antarctoperlaria), mayflies (Ephemerelloidea) and squeak beetles (Hygrobiidae). Within Lumbricidae, the split between European Eisenia and a North American clade containing Eisenoides and others (not sampled in this study) may be consistent with the final separation of the two continents at ~72 MYA.
The clade containing Komarekiona, Sparganophilidae, Kynotidae and Biwadrilidae (the latter not sampled in this study) also shows some sign of a northern continent / southern continent split, which also suggests a Pangaean distribution. As above, our divergence time estimate for the split between Kynotus and Komarekiona + Sparganophilus using the 75% data set (165 ± 22 mya) is more recent than we might expect if the Pangaean hypothesis were true. Once again, however, estimates based on the “no ?Haplotaxidae sp.” data matrices (178 and 186 mya) are deeper in time, and the congruence in divergence time estimates for the two focal nodes is noteworthy (for the “no ?Haplotaxidae sp.” data matrices, the mean estimates are within one million years of each other). These are very small families, two from North America and one each from Madagascar and Japan, respectively. There is little evidence to lead us to a hypothesis about the geographic location of the ancestor of Crassiclitellata. Moniligastridae, the sister group of Crassiclitellata represented in this study by Drawida, is now only found in South and East Asia. It is extremely diverse in India, but less so elsewhere in Asia as far east as Borneo and Mindoro Island, Philippines (James, unpublished data), an Asian crustal fragment. Based on this distribution, the Moniligastridae–Crassiclitellata divergence would seem most likely to have occurred in a southern landmass.
Our dating analyses seem to be broadly consistent with the hypothesis that the two major “north-south” divergences within Crassiclitellata were caused by the breakup of Pangaea, but they do not constitute a particularly strong test. Additional data and more thorough dating analyses will be required to provide a more rigorous test of the Pangaean breakup hypothesis.
There remain several unanswered questions about the evolutionary history of Clitellata. For example, we do not yet have a clear picture of the sister group to Crassiclitellata, nor do we have robust support for crassiclitellate monophyly using the data presented here. The shared presence of a multi-layered clitellum remains the strongest evidence for crassiclitellate monophyly, but the possibility of multiple origins of this trait cannot be disregarded. Ongoing phylogenomic work on Clitellata as a whole should shed substantial light on this question.
This study clarifies earthworm phylogeny and evolution, supporting several recently proposed revisions to our understanding of earthworm relationships and resolving others, most notably including 1) placement of Kynotidae (Madagascar) with a group containing the North American taxa Komarekionidae and Sparganophilidae, 2) a clade comprising Lutodrilidae, Criodrilidae, Hormogastridae and Lumbricidae, 3) Dichogaster paraphyly, 4) affirmation of a restricted Glossoscolecidae and 5) Hormogastridae monophyly. Recovery of two major clades, each consisting of a Northern Hemisphere subclade and a Southern Hemisphere subclade, suggested a major role for vicariance (specifically, the breakup of Pangaea during the Mesozoic) in earthworm phylogeny and biogeography. Divergence time estimation provided additional support for this hypothesis, dating the north-south splits within each major clade to ~161–186 Mya.
We would like to thank Marie Bartz, Marcel Bouché, George Brown, Csaba Csuzdi, Marcel Koken, Danuta Plisko, Malalatiana Razafindrakoto, Bruce Snyder, Gerusa Steffen, Michel Creuzé des Châtelliers, Marcus Svensson and Egil Boräng for providing specimens. We also thank Pamela Brannock and Damien Waits for laboratory and bioinformatics assistance, and Associate Editor Henner Brinkmann for his thoughtful and helpful comments on the manuscript. This is Molette Biology Laboratory contribution 63 and Auburn University Marine Biology Program contribution 157.
This work was supported by the U.S. National Science Foundation WormNet II (Assembling the Annelid Tree of Life) grant (DEB-1036516 to FEA, DEB-1036537 to KMH and DEB-1136604 to SWJ).
Availability of data and materials
The datasets generated during and analyzed in this study, custom scripts used in the phylogenomics pipeline, and phylogenetic trees are available in the Dryad repository (http://datadryad.org/resource/doi:10.5061/dryad.n7n71).
SWJ, BWW, CE and FEA designed the study; SWJ, BWW, and CE collected the data; FEA, SRS and K. Halanych coordinated the analyses; FEA and K. Horn performed the analyses; FEA, BWW and SWJ wrote the manuscript; FEA, BWW, CE, SRS, SWJ and K. Halanych contributed to editing the manuscript. All authors gave final approval for publication.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Darwin C. The formation of vegetable mould, through the action of worms, with observations on their habits. London: J. Murray; 1892.
- Cunha L, Brown GG, Stanton DWG, Da Silva E, Hansel FA, Jorge G, et al. Soil animals and pedogenesis: the role of earthworms in anthropogenic soils. Soil Sci. 2016;181:110–25.
- Hendrix PF, Callaham MA, Drake JM, Huang C-Y, James SW, Snyder BA, et al. Pandora’s box contained bait: the global problem of introduced earthworms. Annu Rev Ecol Evol Syst. 2008;39:593–613.
- Blakemore RJ. American earthworms (Oligochaeta) from North of Rio Grande—a species checklist. A series of searchable texts on earthworm biodiversity, ecology and systematics from various regions of the world, 2nd edn. COE Soil Ecology Research Group, Yokohama National University, Japan. 2006;1–16.
- Bohlen PJ, Scheu S, Hale CM, McLean MA, Migge S, Groffman PM, et al. Non-native invasive earthworms as agents of change in northern temperate forests. Front Ecol Environ. 2004;2:427–35.
- James SW. Planetary processes and their interactions with earthworm distributions and ecology. Earthworm Ecol. 2nd ed. Boca Raton: CRC Press; 2004. p. 53–62.
- Michaelsen W. Das Tierreich Vol, 10, Oligochaeta. Friedländer Sohn, Berlin. Pp. XXIX. 1900;575.
- Michaelsen W. Die Oligochäten Surinames. Mit Erörterung der verwandtschaftlichen und Geogr. Beziehungen der Octochätinen.-Tijdschr. Nederl. Dierk Ver 1933;3:112–30.
- Wegener A. The origin of continents and oceans (Translated from the 4th revision of the German edition by John Biram). New York: Dover Publications; 1929.
- Erséus C. Phylogeny of oligochaetous Clitellata. Hydrobiologia. 2005;535–536:357–72.
- Jamieson BGM. On the phylogeny and higher classification of the Oligochaeta. Cladistics. 1988;4:367–401.
- Jamieson BGM, Tillier S, Tillier A, Justine J-L, Ling E, James S, et al. Phylogeny of the Megascolecidae and Crassiclitellata (Annelida, Oligochaeta): combined versus partitioned analysis using nuclear (28S) and mitochondrial (12S, 16S) rDNA. Zoosystema. 2002;24:707–34.
- James SW, Davidson SK. Molecular phylogeny of earthworms (Annelida: Crassiclitellata) based on 28S, 18S and 16S gene sequences. Invertebr Syst. 2012;26:213.
- Brinkhurst RO. The position of the Haplotaxidae in the evolution of oligochaete annelids. Hydrobiologia. 1984;115:25–36.
- Brinkhurst RO. A taxonomic analysis of the Haplotaxidae (Annelida, Oligochaeta). Can J Zool. 1988;66:2243–52.
- Martínez-Ansemil E, Creuzé Des Châtelliers M, Martin P, Sambugar B. The Parvidrilidae - a diversified groundwater family: description of six new species from southern Europe, and clues for its phylogenetic position within Clitellata (Annelida). Zool J Linnean Soc. 2012;166:530–58.
- Brinkhurst RO. Retrospect and prospect: reflections on forty years of study of aquatic oligochaetes. Hydrobiologia. 1999;406:9–19.
- Siddall ME, Apakupakul K, Burreson EM, Coates KA, Erséus C, Gelder SR, et al. Validating Livanow: molecular data agree that leeches, branchiobdellidans, and Acanthobdella peledina form a monophyletic group of oligochaetes. Mol Phylogenet Evol. 2001;21:346–51.
- Meyer E, Aglyamova GV, Wang S, Buchanan-Carter J, Abrego D, Colbourne JK, et al. Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx. BMC Genomics. 2009;10:219.
- Crusoe MR, Alameldin HF, Awad S, Boucher E, Caldwell A, Cartwright R, et al. The khmer software package: enabling efficient nucleotide sequence analysis. F1000Res. 2015;4:900.
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.
- Ebersberger I, Strauss S, von Haeseler A. HaMStR: Profile hidden markov model based search for orthologs in ESTs. BMC Evol Biol. 2009;9:157.
- Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–8.
- Struck TH. The impact of paralogy on phylogenomic studies - a case study on annelid relationships. PLoS One. 2013;8:e62892.
- Price MN, Dehal PS, Arkin AP. FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5:e9490.
- Kocot KM, Citarella MR, Moroz LL, Halanych KM. PhyloTreePruner: a phylogenetic tree-based approach for selection of orthologous sequences for phylogenomics. Evol Bioinformatics Online. 2013;9:429–35.
- Kück P, Meusemann K. FASconCAT: convenient handling of data matrices. Mol Phylogenet Evol. 2010;56:1115–8.
- Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.
- Tan G, Muffato M, Ledergerber C, Herrero J, Goldman N, Gil M, et al. Current methods for automated filtering of multiple sequence alignments frequently worsen single-gene phylogenetic inference. Syst Biol. 2015;64:778–91.
- Pisani D, Pett W, Dohrmann M, Feuda R, Rota-Stabelli O, Philippe H, et al. Genomic data do not support comb jellies as the sister group to all other animals. Proc Natl Acad Sci. 2015;112:15402–7.
- Capella-Gutierrez S, Silla-Martinez JM, Gabaldon T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 2009;25:1972–3.
- Schmidt HA, Strimmer K, Vingron M, Von Haeseler A. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 2002. p. 502–4.
- Felsenstein J. Cases in which parsimony or compatibility methods will be positively misleading. Syst Zool. 1978;27:401–10.
- Hendy MD, Penny D. A framework for the quantitative study of evolutionary trees. Syst Zool. 1989;38:297–309.
- Foster PG, Hickey DA. Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J Mol Evol. 1999;48:284–90.
- Saccone C, Lanave C, Pesole G, Preparata G. Influence of base composition on quantitative estimates of gene evolution. Methods Enzymol. 1990;183:570–83.
- Struck TH. TreSpEx—Detection of misleading signal in phylogenetic reconstructions based on tree information. Evol Bioinforma. 2014;10:51.
- Kück P, Struck TH. BaCoCa--a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions. Mol Phylogenet Evol. 2014;70:94–8.
- Zhong M, Hansen B, Nesnidal M, Golombek A, Halanych KM, Struck TH. Detecting the symplesiomorphy trap: a multigene phylogenetic analysis of terebelliform annelids. BMC Evol Biol. 2011;11:369.
- Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–3.
- Miller M, Pfeiffer W, Schwartz T. Creating the CIPRES Science Gateway for inference of large phylogenetic trees. In: Gatew Comput. Environ. Work. (GCE), 2010; 2010. p. 1–8. IEEE.
- Grunewald S, Spillner A, Bastkowski S, Bogershausen A, Moulton V. SuperQ: computing supernetworks from quartets. IEEE/ACM Trans Comput Biol Bioinforma. 2013;10:151–60.
- Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006;23:254–67.
- Lartillot N, Rodrigue N, Stubbs D, Richer J. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst Biol. 2013;62:611–5.
- Rambaut A, Drummond AJ. Tracer 2009. Available from: http://tree.bio.ed.ac.uk/software/tracer/.
- Shimodaira H, Hasegawa M. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol. 1999;16:1114–6.
- Shimodaira H. An approximately unbiased test of phylogenetic tree selection. Syst Biol. 2002;51:492–508.
- Goldman N, Anderson JP, Rodrigo AG. Likelihood-based tests of topologies in phylogenetics. Syst Biol. 2000;49:652–70.
- Swofford DL, Olsen GJ, Waddell PJ, Hillis DM. Phylogenetic inference. In: Hillis DM, Moritz C and Mable BK, editors. Molecular Systematics, 2nd ed. Sunderland: Sinauer Associates; 1996. p. 407–514.
- Bergsten J, Nilsson AN, Ronquist F. Bayesian tests of topology hypotheses with an example from diving beetles. Syst Biol. 2013;62:660–73.
- Church SH, Ryan JF, Dunn CW. Automation and evaluation of the SOWH test with SOWHAT. Syst Biol. 2015;64:1048–58.
- Rambaut A, Grassly NC. Seq-Gen: an application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Comput Appl Biosci. 1997;13:235–8.
- Chin K, Pearson D, Ekdale AA, Bambach R, Nichols D, Brown J, et al. Fossil worm burrows reveal very early terrestrial animal activity and shed light on trophic resources after the End-Cretaceous mass extinction. Butler RJ, editor. PLoS One. 2013;8:e70920.
- Humphreys GS. Evolution of terrestrial burrowing invertebrates. In: Roach IC, editor. Adv. regolith Proc. CRC LEME Reg. Regolith Symp. 2003. CRC LEME Canberra; 2003. p. 211–5.
- Retallack GJ. Triassic palaeosols in the upper narrabeen group of New South Wales. Part I: features of the palaeosols. J Geol Soc Aust. 1976;23:383–99.
- Hazen BM. A fossil earthworm (?) from the Paleocene of Wyoming. J Paleontol. 1937;11:250.
- Morris SC, Pickerill RK, Harland TL. A possible annelid from the Trenton Limestone (Ordovician) of Quebec, with a review of fossil oligochaetes and other annulate worms. Can J Earth Sci. 1982;19:2150–7.
- Manum SB, Bose MN, Sawyer RT. Clitellate cocoons in freshwater deposits since the Triassic. Zool Scr. 1991;20:347–66.
- Novo M, Almodóvar A, Fernández R, Giribet G, Díaz Cosín DJ. Understanding the biogeography of a group of earthworms in the Mediterranean basin—The phylogenetic puzzle of Hormogastridae (Clitellata: Oligochaeta). Mol Phylogenet Evol. 2011;61:125–35.
- Lartillot N, Lepage T, Blanquart S. PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics. 2009;25:2286–8.
- Parry L. Fossil focus: annelids. Palaeontol Online. 2014;4:1–8.
- Parry L, Tanner A, Vinther J. The origin of annelids. Smith A, editor. Palaeontology. 2014;57:1091–103.
- Vinther J, Eibye-Jacobsen D, Harper DAT, Fauchald K, Rouse G, Hints O, et al. An early Cambrian stem polychaete with pygidial cirri. Biol Lett. 2011;7:929–32.
- Morris SC, Peel JS. The earliest annelids: lower Cambrian polychaetes from the Sirius Passet Lagerstätte, Peary Land, North Greenland. Acta Palaeontol Pol. 2008;53:137–48.
- Novo M, Fernández R, Andrade SCS, Marchán DF, Cunha L, Díaz Cosín DJ. Phylogenomic analyses of a Mediterranean earthworm family (Annelida: Hormogastridae). Mol Phylogenet Evol. 2016;94:473–8.
- Frizon de Lamotte D, Fourdan B, Leleu S, Leparmentier F, de Clarens P. Style of rifting and the stages of Pangea breakup. Tectonics. 2015;34:1009–29.
- Veevers JJ. Gondwanaland from 650–500 Ma assembly through 320 Ma merger in Pangea to 185–100 Ma breakup: supercontinental tectonics via stratigraphy and radiometric dating. Earth Science Rev. 2004;68:1–132.
- Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7:214.
- James SW. Re-erection of Rhinodrilidae Benham, 1890, a senior synonym of Pontoscolecidae James, 2012 (Annelida: Clitellata). Zootaxa. 2012;3540:67–8.
- Omodeo P. Evolution and biogeography of megadriles (Annelida, Clitellata). Ital J Zool. 2000;67:179–201.
- Fragoso C, Rojas P. A new ocnerodrilid earthworm genus from Southeastern Mexico (Annelida: Oligochaeta), with a key for the genera of Ocnerodrilidae. Megadrilogica. 2009;13:141–52.
- Csuzdi C. A monograph of the Paleotropical Benhamiinae earthworms (Annelida: Oligochaeta, Acanthodrilidae). Budapest: Hungarian Natural History Museum; Systematic Zoology Research Group of the Hungarian Academy of Sciences; 2010.
- Longrich NR, Vinther J, Pyron RA, Pisani D, Gauthier JA, Pianka E, et al. Biogeography of worm lizards (Amphisbaenia) driven by end-Cretaceous mass extinction. Proc Biol Sci. 2015;282:20143034.
- Decaëns T, Porco D, James SW, Brown GG, Chassany V, Dubs F, et al. DNA barcoding reveals diversity patterns of earthworm communities in remote tropical forests of French Guiana. Soil Biol Biochem. 2015;92:171–83.
- Buckley TR, James S, Allwood J, Bartlam S, Howitt R, Prada D. Phylogenetic analysis of New Zealand earthworms (Oligochaeta: Megascolecidae) reveals ancient clades and cryptic taxonomic diversity. Mol Phylogenet Evol. 2011;58:85–96.
- Domínguez J, Aira M, Breinholt JW, Stojanovic M, James SW, Pérez-Losada M. Underground evolution: new roots for the old tree of lumbricid earthworms. Mol Phylogenet Evol. 2015;83:7–19.
- Blakey R. Deep Time Maps; http://deeptimemaps.com. 2017.
- Crandall KA, Harris DJ, Fetzner JW. The monophyletic origin of freshwater crayfish estimated from nuclear and mitochondrial DNA sequences. Proc Biol Sci. 2000;267:1679–86.
- Ware JL, Beatty CD, Sánchez Herrera M, Valley S, Johnson J, Kerst C, et al. The petaltail dragonflies (Odonata: Petaluridae): Mesozoic habitat specialists that survive to the modern day. Ali J, editor. J Biogeogr. 2014;41:1291–300.
- McCulloch GA, Wallis GP, Waters JM. A time-calibrated phylogeny of southern hemisphere stoneflies: testing for Gondwanan origins. Mol Phylogenet Evol. 2016;96:150–60.
- McCafferty WP, Wang T-Q. Phylogenetic systematics of the major lineages of pannote mayflies (Ephemeroptera: Pannota). Trans Am Entomol Soc. 2000:9–101.
- Hawlitschek O, Hendrich L, Balke M. Molecular phylogeny of the squeak beetles, a family with disjunct Palearctic-Australian range. Mol Phylogenet Evol. 2012;62:550–4.
- Gates GE. Burmese Earthworms: an introduction to the systematics and biology of megadrile oligochaetes with special reference to Southeast Asia. Trans Am Philos Soc. 1972;62:1–326.
- Novo M, Riesgo A, Fernández-Guerra A, Giribet G. Pheromone evolution, reproductive genes, and comparative transcriptomics in Mediterranean earthworms (Annelida, Oligochaeta, Hormogastridae). Mol Biol Evol. 2013;30:1614–29.