Integrated shotgun sequencing and bioinformatics pipeline allows ultra-fast mitogenome recovery and confirms substantial gene rearrangements in Australian freshwater crayfishes
© Gan et al.; licensee BioMed Central Ltd. 2014
Received: 1 November 2013
Accepted: 29 January 2014
Published: 3 February 2014
Although it is possible to recover the complete mitogenome directly from shotgun sequencing data, currently reported methods and pipelines are still relatively time consuming and costly. Using a sample of the Australian freshwater crayfish Engaeus lengana, we demonstrate that it is possible to achieve a three-day turnaround time (four hours of hands-on time) from tissue sample to NCBI-ready submission file by integrating the MiSeq sequencing platform, the Nextera sample preparation protocol, the MITObim assembly algorithm and the MITOS annotation pipeline.
The complete mitochondrial genome of the parastacid freshwater crayfish, Engaeus lengana, was recovered by modest shotgun sequencing (1.2 gigabases) using the Illumina MiSeq benchtop sequencing platform. Genome assembly using the MITObim mitogenome assembler recovered the mitochondrial genome as a single contig with a 97-fold mean coverage (min. = 17; max. = 138). The mitogenome consists of 15,934 base pairs and contains the typical 37 mitochondrial genes and a non-coding AT-rich region. The genome arrangement is similar to that of the only other published parastacid mitogenome, from the Australian genus Cherax.
We infer that the gene order arrangement found in Cherax destructor is common to Australian crayfish and may be a derived feature of the southern hemisphere family Parastacidae. Further, we report what is, to our knowledge, the simplest and fastest protocol for the recovery and assembly of complete mitochondrial genomes using the MiSeq benchtop sequencer.
Keywords: Freshwater crayfish; Mitochondrial genome characterization; Bench top sequencing; Engaeus; Parastacidae
Sequencing of mitochondrial genomes has become an important endeavour for providing molecular resources for population genetic and phylogeographic studies [1–3]. With the rising number of sequenced mitogenomes there has also been increasing interest in using this information for phylogenetic studies [4–8]. Further, as more full mitochondrial genomes are sequenced, interesting patterns of mitochondrial gene order are emerging that demand explanation and are themselves an additional source of phylogenetic signal [7, 8]. Until recent years, the recovery of whole mitochondrial genomes (mitogenomes) has been inefficient, with most approaches utilising either cloning or long range PCR (which may or may not be successful), followed by a series of Sanger sequencing reactions [4, 9–11]. More recently, however, the increasing power of Next Generation Sequencing (NGS) has allowed the amplification-free sequencing of whole mtDNA genomes [12, 13]. This latter approach has been demonstrated using modest shotgun sequencing on platforms such as the Illumina HiSeq, Illumina Genome Analyzer IIx (GA IIx) and Roche 454 [14–17]. In Miller et al.’s study, a mitogenome was recovered to 800× coverage from 1/16th of a plate using 454. Further, Berman et al. showed that modest NGS outputs designed to identify microsatellite loci can also be used to recover whole mitochondrial genomes. Although these platforms represent a significant improvement over conventional Sanger sequencing, the sample preparation and sequencing protocols are still relatively time consuming and laborious. Additionally, for the HiSeq platform, a run can take up to 10 days. Coupled with its very large data output (600 Gb), this makes the HiSeq platform impractical for the sequencing of mitogenomes.
Although the data output is lower (90 Gb) for the GA IIx platform, operating this system can be a major technical challenge because it is less automated than the HiSeq, which increases the chance of human error and run failure. The sequencing chemistry of Roche 454 is acknowledged to be susceptible to homopolymer errors, which poses a potential threat to the accuracy of the mitogenome assembly. Further, its high running cost and low data output make it less cost-effective for mitogenome sequencing. The development of the MiSeq benchtop sequencer and the timely introduction of MITObim, software with low computational demands that assembles mitochondrial genomes using a novel baiting and iterative mapping approach, serve as an impetus for the growth in NGS-based mitogenome assemblies. Although a successful mitogenome assembly using the MiSeq benchtop sequencer has been demonstrated recently, the library preparation steps for sequencing on the MiSeq were not covered in sufficient detail, nor was MITObim implemented in the assembly pipeline. Using a sample of the Australian freshwater crayfish Engaeus lengana, we contribute to the growing interest in mtDNA genome sequencing by providing a detailed protocol for the fastest recovery, assembly and annotation of a mitogenome using the MiSeq personal genome sequencer, the MITObim software and the MITOS annotation web service.
Australia has a diverse and distinctive freshwater crayfish fauna despite the continent’s aridity. Among the most intriguing Australian crayfishes are the land yabbies of the genus Engaeus, which can complete their life cycle without access to surface water [20, 21]. While the understanding of the evolution of Engaeus and other crayfishes has benefited from access to molecular data, only one full mitogenome is available for the southern hemisphere crayfish, from the genus Cherax. Major mitogenome rearrangements were identified in the species Cherax destructor compared to what is considered the pan-crustacean plan. A recent study indicates that northern hemisphere crayfish also have profound mitochondrial gene order rearrangements [22, 23], albeit different to the rearrangements identified in Cherax. Kim et al., in their study of the marine lobster, Homarus americanus, emphasised the need for the sequencing of more mitogenomes from the superfamily Astacidae.
The purpose of this study is two-fold. First, we demonstrate the simplest protocol, to our knowledge, for the recovery of whole mitochondrial genomes directly from shotgun sequencing reads using the MiSeq platform. This protocol requires only 50 ng of DNA extracted from a single ethanol-preserved specimen, without the need for mtDNA enrichment. Second, we use the Australian freshwater crayfish E. lengana to investigate further mitochondrial genome evolution in parastacid freshwater crayfishes, building on the work of Miller et al., who sequenced the C. destructor mitogenome.
Mitogenome assembly, coverage and composition
[Table: The mitochondrial genome organization of E. lengana. Surviving labels: No. of nt (bp); Putative initiation/termination codon; Proteins and codons.]
[Table: Base composition of protein coding genes in the E. lengana mitogenome. Surviving labels: (A + T); (G + C); Genes encoded on α-strand; Genes encoded on β-strand.]
The time taken from sample extraction to submission-ready full mitogenome sequence was only three days, and hands-on time was less than four hours. This makes our workflow one of the most efficient thus far reported [11, 16, 19, 22, 25]. The low DNA input requirement and the simplicity of this protocol are particularly advantageous for new users, as they minimize the risk of failure and reduce the amount of sample needed. The MiSeq data output per run (4.5-7.5 Gb) will enable sequencing of multiple samples via barcoding. Given that the coverage generated in this study (97-fold) is more than twice that required to confidently deduce mitogenomes using the MiSeq, this means that $100 mitogenomes are now obtainable. According to the Animal Genome Size Database (http://www.genomesize.com), crayfish genome sizes range from 3.5 to 6 Gb. Therefore, the percentage of reads mapping to the mitochondrial genome (0.1%) in E. lengana can be considered to be on the high side for enrichment-free shotgun sequencing. This may also reflect the use of muscle tissue, which has a high proportion of mitochondria. Additional pre-sequencing sample processing, including mitochondrial enrichment, would further reduce the per-sample sequencing cost by allowing even more samples to be processed per run. Using this pipeline, we were also able to recover complete mitogenomes from very different organisms including the Australian Macquarie Perch, Macquaria australasica [EMBL:HG799088] and the Viet Nam Snout Otter Clam, Lutraria rhynchaena [EMBL:HG799089]. If further studies demonstrate wide utility of our methodology across a broad range of animal taxa, there exists the possibility of up-scaling DNA barcoding from a small fragment of the mitochondrial COI gene to the level of the entire mitogenome.
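The multiplexing argument above can be illustrated with a back-of-envelope calculation. The following is a minimal sketch, not part of the published workflow: it assumes that the fraction of reads mapping to the mitogenome equals the fraction of sequenced bases, that samples are pooled equally, and that a target coverage of 40-fold is acceptable (a hypothetical value chosen only for illustration). The function names are likewise illustrative.

```python
import math


def expected_mito_coverage(total_bases, mito_fraction, mito_length_bp):
    """Expected mean fold-coverage of the mitogenome from shotgun data."""
    return total_bases * mito_fraction / mito_length_bp


def samples_per_run(run_output_bases, mito_fraction, mito_length_bp, target_coverage):
    """Number of equally pooled samples one run could carry at the target coverage."""
    # Total shotgun bases one sample needs to reach the target mitogenome coverage.
    bases_needed = target_coverage * mito_length_bp / mito_fraction
    return math.floor(run_output_bases / bases_needed)


# Values quoted in the text: 1.2 Gb of data, ~0.1% mitochondrial reads,
# and a 15,934 bp mitogenome.
coverage = expected_mito_coverage(1.2e9, 0.001, 15934)
print(f"expected mean coverage: {coverage:.0f}-fold")

TARGET_COVERAGE = 40  # hypothetical threshold, not taken from the paper
for run_output in (4.5e9, 7.5e9):  # MiSeq output range quoted in the text
    n = samples_per_run(run_output, 0.001, 15934, TARGET_COVERAGE)
    print(f"{run_output / 1e9:.1f} Gb run: about {n} samples")
```

With the rounded 0.1% figure the estimate comes out somewhat below the 97-fold coverage actually observed, which is expected given the rounding; the sketch is meant to show the scaling, not to reproduce the reported value.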
For situations where there is no close relative to provide a “bait” sequence for MITObim, we demonstrated that a de novo assembly followed by BLAST search against the conserved mitochondrial genes (such as 16S rRNA) was effective for identifying mitogenome fragment(s) from the generated sequences. These can then in turn be used as the bait for a MITObim assembly.
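The contig-selection step described above can be sketched as follows. This is an illustrative example only, under the assumption that the de novo contigs were used as BLAST queries against a small set of conserved mitochondrial genes and that the results were saved in the standard 12-column tabular format (BLAST+ `-outfmt 6`). The thresholds, the function name and the contig and reference names in the example are hypothetical.

```python
import csv
import io

# Column order of the default BLAST+ tabular output (-outfmt 6).
BLAST6_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def candidate_bait_contigs(blast6_text, max_evalue=1e-10, min_length=200):
    """Return contigs with a convincing hit to a mitochondrial gene.

    Each result is (contig_id, (best_subject_id, best_bitscore)), sorted by
    decreasing bitscore. The thresholds are illustrative defaults.
    """
    best = {}
    reader = csv.DictReader(
        io.StringIO(blast6_text.strip()),
        fieldnames=BLAST6_FIELDS,
        delimiter="\t",
    )
    for row in reader:
        # Discard weak or very short alignments.
        if float(row["evalue"]) > max_evalue or int(row["length"]) < min_length:
            continue
        contig, score = row["qseqid"], float(row["bitscore"])
        # Keep only the highest-scoring hit per contig.
        if contig not in best or score > best[contig][1]:
            best[contig] = (row["sseqid"], score)
    return sorted(best.items(), key=lambda item: item[1][1], reverse=True)


# Hypothetical BLAST output for two contigs.
example = "\n".join([
    "contig_12\t16S_rRNA_ref\t88.5\t950\t100\t9\t1200\t2149\t1\t950\t0.0\t1100",
    "contig_12\tCOI_ref\t85.0\t1400\t200\t10\t5000\t6399\t1\t1400\t0.0\t1500",
    "contig_7\t16S_rRNA_ref\t92.0\t60\t5\t0\t10\t69\t300\t359\t1e-05\t80.5",
])
for contig_id, (subject_id, bitscore) in candidate_bait_contigs(example):
    print(contig_id, subject_id, bitscore)
```

Contigs passing the filter would then be extracted from the assembly and supplied to MITObim as the bait sequence.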
This study brings the number of freshwater crayfish mitogenomes to five, comprising three from the northern hemisphere superfamily Astacoidea (Procambarus and Cambaroides) and now two from the southern hemisphere superfamily Parastacidae (Engaeus and Cherax). The representatives of each superfamily have substantial gene order rearrangements in comparison to each other and to their nearest relative from the marine clawed lobsters, Homarus americanus (Superfamily: Nephropoidea). The latter has what is deemed to be the more primitive pan-crustacean gene order. It is now apparent that Australian crayfish, and perhaps parastacid crayfish more generally, have one of the most elaborate gene order rearrangements so far discovered in the Arthropoda. It is possible that the translocation and inversion of the RNA genes is a distinctive feature of Australian crayfish. Future studies of freshwater crayfishes will likely benefit from the phylogenetic signal provided in mitochondrial gene order rearrangements [11, 27, 28].
In conclusion, this study demonstrates that benchtop sequencers can be used for fast and relatively inexpensive generation of mtDNA sequences using shotgun sequencing without mitochondrial enrichment. We also show that MITObim is effective at recovering mitogenomes from raw benchtop sequencer output. Previously reported significant gene order rearrangements in Australian crayfish are confirmed. The further sequencing of mitogenomes of southern hemisphere crayfish (Parastacidae) and additional representatives of the northern hemisphere families (Cambaridae and Astacidae) will undoubtedly contribute to our phylogenetic knowledge of this significant group of crustaceans. Lastly, we predict that improvements to our workflow and increased output from benchtop sequencers will further reduce the cost of reconstructing mitogenomes to much less than $100 per mitogenome.
Genomic DNA extraction
Approximately 40 mg of tail muscle tissue was dissected from an ethanol-preserved specimen of E. lengana collected from northern Tasmania (41.00877° S, 144.66869° E). Total genomic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen, Germany) following the manufacturer’s instructions with one minor modification: EB rather than AE buffer was used to avoid possible interference of EDTA with the Nextera enzyme.
Molecular procedures and sequencing
The purified genomic DNA was quantified with Qubit HS (Invitrogen, USA) and normalized to 2 ng/μL. The normalized DNA was processed using Nextera-based library preparation (Illumina, USA) following the manufacturer’s instructions. Quantification and size estimation of the library was performed on a Bioanalyzer 2100 High Sensitivity DNA chip (Agilent, USA). Next, the library was normalized to 2 nM and sequenced on the MiSeq Benchtop Sequencer (2 × 250 bp paired-end reads) (Illumina, USA).
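The two normalization steps above (to 2 ng/μL for the input DNA and to 2 nM for the library) involve simple arithmetic that can be sketched as follows. This is an illustrative calculation, not the authors' script. It assumes the commonly used approximation of 660 g/mol per base pair of double-stranded DNA, and the concentration, fragment size and volume in the example are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA library concentration (ng/uL) to molarity (nM).

    Uses the approximation of 660 g/mol per base pair of dsDNA.
    """
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1 * V1 = C2 * V2.

    Concentrations must share one unit; volumes are in the unit of final_volume.
    """
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical library: 2.64 ng/uL (Qubit) with an 800 bp mean size (Bioanalyzer).
molarity = library_molarity_nM(2.64, 800)
stock_ul, diluent_ul = dilution_volumes(molarity, 2.0, 20.0)
print(f"library: {molarity:.2f} nM")
print(f"for 20 uL at 2 nM: {stock_ul:.1f} uL library + {diluent_ul:.1f} uL buffer")
```

The same `dilution_volumes` helper applies to normalizing genomic DNA to 2 ng/μL, since only the ratio of concentrations matters.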
Mitogenome assembly and annotation
Mitogenome visualization and linear comparison
The circular mitogenome of E. lengana was visualized with Blast Ring Image Generator (BRIG). Mapping statistics were generated by mapping the raw reads against the assembled mitogenome using Bowtie2. The SAM file output from Bowtie2 was subsequently used to visualize the mapping coverage via BRIG. Linear mitogenome comparison of E. lengana, C. destructor, Cambaroides similis and H. americanus (Figure 2) was performed using Easyfig 2.1 (BLASTn, default settings).
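Coverage statistics of the kind reported for the assembly (mean, minimum and maximum depth) can be derived from SAM alignments. The following is a minimal, pure-Python sketch of that calculation and is not the procedure used by the authors, who visualized coverage in BRIG. It assumes a single reference sequence, counts only aligned bases (CIGAR operations M, = and X) towards depth, and does not handle reads that wrap around the end of the circular genome. The function names are illustrative.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
ALIGNED_OPS = set("M=X")        # operations that place a read base on the reference
REF_CONSUMING_OPS = set("MDN=X")  # operations that advance the reference position


def depth_from_sam(sam_lines, ref_length):
    """Per-base depth for a single reference from SAM alignment lines."""
    depth = [0] * ref_length
    for line in sam_lines:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # not a complete alignment record
            continue
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":  # unmapped read
            continue
        ref_pos = pos - 1  # SAM positions are 1-based
        for count, op in CIGAR_RE.findall(cigar):
            length = int(count)
            if op in ALIGNED_OPS:
                for i in range(ref_pos, min(ref_pos + length, ref_length)):
                    depth[i] += 1
            if op in REF_CONSUMING_OPS:
                ref_pos += length
    return depth


def summarize_depth(depth):
    """Mean, minimum and maximum depth across the reference."""
    return {"mean": sum(depth) / len(depth), "min": min(depth), "max": max(depth)}


# Tiny hypothetical example on a 10 bp reference.
demo_sam = [
    "@SQ\tSN:mito\tLN:10",
    "r1\t0\tmito\t1\t42\t5M\t*\t0\t0\tACGTA\tIIIII",
    "r2\t0\tmito\t3\t42\t2M1D3M\t*\t0\t0\tGTCGA\tIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTT\tIIIII",
]
print(summarize_depth(depth_from_sam(demo_sam, 10)))
```

For real data, the lines would be read from the Bowtie2 SAM file and `ref_length` set to the assembled mitogenome length.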
Funding for this study was provided by the Monash University Malaysia (MUM) Tropical Medicine and Biology Multidisciplinary Platform and the work was conducted in the MUM Genomics Facility. We are especially grateful for the advice and assistance of Dr Christoph Hahn in the use of the MITObim software.
- Avise JC: Phylogeography: The History and Formation of Species. 2000, Cambridge, MA: Harvard University Press
- Hillis DM, Mable BK, Larson A, Davis SK, Zimmer EA: Molecular Systematics. 1996, Sunderland, MA: Sinauer Associates
- Santini F, Sorenson L, Alfaro ME: A new multi-locus timescale reveals the evolutionary basis of diversity patterns in triggerfishes and filefishes (Balistidae, Monacanthidae; Tetraodontiformes). Mol Phylogenet Evol. 2013, 69 (1): 165-176.
- Miller AD, Nguyen TT, Burridge CP, Austin CM: Complete mitochondrial DNA sequence of the Australian freshwater crayfish, Cherax destructor (Crustacea: Decapoda: Parastacidae): a novel gene order revealed. Gene. 2004, 331: 65-72.
- Masta SE, Longhorn SJ, Boore JL: Arachnid relationships based on mitochondrial genomes: asymmetric nucleotide and amino acid bias affects phylogenetic analyses. Mol Phylogenet Evol. 2009, 50 (1): 117-128.
- Liu Y, Cui Z: Complete mitochondrial genome of the Chinese spiny lobster Panulirus stimpsoni (Crustacea: Decapoda): genome characterization and phylogenetic considerations. Mol Biol Rep. 2011, 38 (1): 403-410.
- Perseke M, Golombek A, Schlegel M, Struck TH: The impact of mitochondrial genome analyses on the understanding of deuterostome phylogeny. Mol Phylogenet Evol. 2013, 66 (3): 898-905.
- Simon S, Hadrys H: A comparative analysis of complete mitochondrial genomes among Hexapoda. Mol Phylogenet Evol. 2013, 15 (13): 00153-00155.
- Cameron SL, Sullivan J, Song H, Miller KB, Whiting MF: A mitochondrial genome phylogeny of the Neuropterida (lace-wings, alderflies and snakeflies) and their relationship to the other holometabolous insect orders. Zool Scr. 2009, 38 (6): 575-590.
- Prosdocimi F, de Carvalho DC, de Almeida RN, Beheregaray LB: The complete mitochondrial genome of two recently derived species of the fish genus Nannoperca (Perciformes, Percichthyidae). Mol Biol Rep. 2012, 39 (3): 2767-2772.
- Poulsen J, Byrkjedal I, Willassen E, Rees D, Takeshima H, Satoh T, Shinohara G, Nishida M, Miya M: Mitogenomic sequences and evidence from unique gene rearrangements corroborate evolutionary relationships of myctophiformes (Neoteleostei). BMC Evol Biol. 2013, 13 (1): 111-
- Cui Z, Liu Y, Li CP, You F, Chu KH: The complete mitochondrial genome of the large yellow croaker, Larimichthys crocea (Perciformes, Sciaenidae): unusual features of its control region and the phylogenetic position of the Sciaenidae. Gene. 2009, 432 (1–2): 33-43.
- Jex AR, Hall RS, Littlewood DT, Gasser RB: An integrated pipeline for next-generation sequencing and annotation of mitochondrial genomes. Nucleic Acids Res. 2010, 38 (2): 522-533.
- Hahn C, Bachmann L, Chevreux B: Reconstructing mitochondrial genomes directly from genomic next-generation sequencing reads–a baiting and iterative mapping approach. Nucleic Acids Res. 2013, 41 (13): 9-
- Iorizzo M, Senalik D, Szklarczyk M, Grzebelus D, Spooner D, Simon P: De novo assembly of the carrot mitochondrial genome using next generation sequencing of whole genomic DNA provides first evidence of DNA transfer into an angiosperm plastid genome. BMC Plant Biol. 2012, 12 (61): 1471-2229.
- Miller AD, Good RT, Coleman RA, Lancaster ML, Weeks AR: Microsatellite loci and the complete mitochondrial DNA sequence characterized through next generation sequencing and de novo genome assembly for the critically endangered orange-bellied parrot, Neophema chrysogaster. Mol Biol Rep. 2013, 40 (1): 35-42.
- Groenenberg DSJ, Pirovano W, Gittenberger E, Schilthuizen M: The complete mitogenome of Cylindrus obtusus (Helicidae, Ariantinae) using Illumina next generation sequencing. BMC Genomics. 2012, 13: 114-
- Berman M, Miller AD, Austin CM: Characterisation of the complete mitochondrial genome and 13 microsatellite loci through next-generation sequencing for the New Caledonian spider-ant Leptomyrmex pallens. Mol Biol Rep. 2013, In press
- Williams ST, Foster PG, Littlewood DTJ: The complete mitochondrial genome of a turbinid vetigastropod from MiSeq Illumina sequencing of genomic DNA and steps towards a resolved gastropod phylogeny. Gene. 2014, 533: 38-47.
- Horwitz P, Richardson AM: An ecological classification of the burrows of Australian freshwater crayfish. Aust J Mar Freshwat Res. 1986, 37: 237-242.
- Schultz MB, Smith SA, Horwitz P, Richardson AM, Crandall KA, Austin CM: Evolution underground: a molecular phylogenetic investigation of Australian burrowing freshwater crayfish (Decapoda: Parastacidae) with particular focus on Engaeus Erichson. Mol Phylogenet Evol. 2009, 50 (3): 580-598.
- Kim S, Lee SH, Park MH, Choi HG, Park JK, Min GS: The complete mitochondrial genome of the American lobster, Homarus americanus (Crustacea, Decapoda). Mitochondrial DNA. 2011, 22 (3): 47-49.
- Kim S, Park M-H, Jung J-H, Ahn D-H, Sultana T, Kim S, Park J-K, Choi H-G, Min G-S: The mitochondrial genomes of Cambaroides similis and Procambarus clarkii (Decapoda: Astacidea: Cambaridae): the phylogenetic implications for Reptantia. Zool Scr. 2012, 41 (3): 281-292.
- Ojala D, Montoya J, Attardi G: tRNA punctuation model of RNA processing in human mitochondria. Nature. 1981, 290 (5806): 470-474.
- Bagatharia S, Joshi M, Pandya R, Pandit A, Patel R, Desai S, Sharma A, Panchal O, Jasmani F, Saxena A: Complete mitogenome of asiatic lion resolves phylogenetic status within Panthera. BMC Genomics. 2013, 14 (1): 572-
- Hebert PDN, Stoekle MY, Zemlak TS, Francis CM: Identification of birds through DNA barcodes. PLoS Biol. 2004, 2: e312-
- Kilpert F, Held C, Podsiadlowski L: Multiple rearrangements in mitochondrial genomes of Isopoda and phylogenetic implications. Mol Phylogenet Evol. 2012, 64 (1): 106-117.
- Okajima Y, Kumazawa Y: Mitochondrial genomes of acrodont lizards: timing of gene rearrangements and phylogenetic and biogeographic implications. BMC Evol Biol. 2010, 10 (1): 141-
- Bernt M, Donath A, Juhling F, Externbrink F, Florentz C, Fritzsch G, Putz J, Middendorf M, Stadler PF: MITOS: Improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2012, 7 (12): 00332-00336.
- Alikhan NF, Petty NK, Ben Zakour NL, Beatson SA: BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons. BMC Genomics. 2011, 12 (402): 1471-2164.
- Langmead B, Salzberg SL: Fast gapped-read alignment with Bowtie 2. Nat Meth. 2012, 9 (4): 357-359.
- Sullivan MJ, Petty NK, Beatson SA: Easyfig: a genome comparison visualizer. Bioinformatics. 2011, 27 (7): 1009-1010.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.