- Research article
- Open Access
A high density of ancient spliceosomal introns in oxymonad excavates
© Slamovits and Keeling; licensee BioMed Central Ltd. 2006
- Received: 26 January 2006
- Accepted: 25 April 2006
- Published: 25 April 2006
Certain eukaryotic genomes, such as those of the amitochondriate parasites Giardia and Trichomonas, have very low intron densities, so low that canonical spliceosomal introns have only recently been discovered through genome sequencing. These organisms were formerly thought to be ancient eukaryotes that diverged before introns originated, or at least before they became common. Now, however, they are thought to be members of a supergroup known as excavates, whose members generally appear to have low densities of canonical introns. Here we have used environmental expressed sequence tag (EST) sequencing to identify 17 genes from the uncultivable oxymonad Streblomastix strix, to survey intron densities in this most poorly studied excavate group.
We find that Streblomastix genes contain an unexpectedly high intron density of about 1.1 introns per gene. Moreover, over 50% of these are at positions shared between a broad spectrum of eukaryotes, suggesting they are very ancient introns, potentially present in the last common ancestor of eukaryotes.
The Streblomastix data show that the genome of the ancestor of excavates likely contained many introns and the subsequent evolution of introns has proceeded very differently in different excavate lineages: in Streblomastix there has been much stasis while in Trichomonas and Giardia most introns have been lost.
- Splice Site
- Branch Point
- Intron Position
- Intron Gain
One of the prominent features that distinguishes eukaryotic genomes from those of prokaryotes is the presence of spliceosomal introns. Introns are intervening sequences that are removed from expressed RNAs, in the case of spliceosomal introns through a series of transesterifications mediated by a large ribonucleoprotein complex called the spliceosome. Spliceosomal introns are known only from eukaryotic nuclear genomes, and were the subject of intense controversy over their potential role in early gene origins and evolution, the so-called introns-early versus introns-late debate [2–4]. One of the interesting features of intron evolution that came to light during this debate was the large range in intron density. At one extreme, introns appeared to be lacking in several protist lineages that were, at the time, thought to be the earliest-branching eukaryotes. These lineages included diplomonads (e.g., Giardia) and parabasalia (e.g., Trichomonas).
The early-branching status of these organisms has since been undermined by a variety of data, and now diplomonads and parabasalia are thought to be part of a large assemblage of protists called excavates, which also includes trypanosomes, euglenids, and a number of parasitic and free-living flagellate or amoeboflagellate lineages. However, despite the accumulation of a considerable quantity of molecular data from both Giardia and Trichomonas, as well as the identification of proteins involved in splicing in Trichomonas, evidence for introns in their genomes remained intriguingly elusive. Indeed, only recently were introns finally characterized in these organisms [7–9], and they remain extremely rare. Only three introns have been found in G. intestinalis among thousands of known genes [8, 9], and forty-one introns were identified in the T. vaginalis genome after exhaustive searches. Information from excavates other than Trichomonas and Giardia is scarce, but overall there seems to be a generally low density of introns (with the possible exception of jakobid flagellates, based on one family of genes). Moreover, other instances of non-canonical introns and splicing are known in excavates [11–13], as are systems where splicing machinery is put to a slightly different use, such as trans-splicing [14–16].
One of the excavate groups about which we know very little is the oxymonads. Oxymonads are anaerobic flagellates found almost exclusively in association with animals, many in the guts of termites and wood-eating roaches. This is the only group of amitochondriates for which secondary loss of mitochondria has not yet been demonstrated, but they are closely related to the flagellate Trimastix, which has a vestigial organelle, so a primary lack of mitochondria in oxymonads is unlikely. Most oxymonads are not available in culture because they live in complex communities with other protists and prokaryotes. As a result, there are few molecular data available from any oxymonad, and no introns have been identified. The oxymonad Streblomastix strix is a symbiont of the dampwood termite Zootermopsis angusticollis from the North American Pacific coastal region. This species has a number of unusual morphological characters, including a peculiar long slender cell shape with deep longitudinal vanes, which is apparently maintained by intimate association with epibiotic bacteria. So far, many copies of four genes (alpha-tubulin, beta-tubulin, HSP90, and elongation factor-1 alpha) have been characterized from S. strix, and the complete absence of introns from all sequences (a total of 19,888 bp) suggests the oxymonads might share the low intron densities apparently common to excavates. Here, we have used the recent documentation of a rare non-canonical genetic code in Streblomastix to identify 17 oxymonad genes from an environmental expressed sequence tag (EST) pool from the hindgut of Zootermopsis. The genomic DNA sequence for each mRNA was determined and we found, in contrast to other amitochondriate protists and the limited data previously available for Streblomastix, a relatively high density of canonical spliceosomal introns.
Moreover, a large proportion of these introns are shared in position with other distantly related eukaryotes, suggesting that they are ancient intron positions retained in oxymonads but lost in other excavates such as Giardia and Trichomonas.
Identification of oxymonad sequences from ESTs
A total of 5,337 ESTs from a Z. angusticollis termite hindgut cDNA library were sequenced and found to form 2,595 clusters of unique sequences. Overall, the sample was dominated by sequences of parabasalian origin (transcripts encoding parabasalian actin and actin-related proteins alone represented 32% of all ESTs). Moreover, there are few oxymonad sequences known outside this sample, so Streblomastix cDNAs could not be identified based on similarity to known genes (only 2 ESTs, corresponding to known Streblomastix alpha- and beta-tubulin sequences, were identified by BLASTX searches). Accordingly, we used the presence of a rare non-canonical genetic code in Streblomastix as a filter to identify at least those genes where non-canonical codons were sampled. In Streblomastix, TAA and TAG encode glutamine (Q) rather than stop as in the universal code, so all clusters were compared to public databases using BLASTX and examined individually for in-frame stop codons, in particular at positions normally encoding glutamine. No other protist known to exist in Z. angusticollis has been shown to possess a non-canonical genetic code. The other prominent protists in this insect are parabasalia, which are not known to deviate from the universal genetic code and whose sequences are also easy to identify with BLASTX searches given their high similarity to T. vaginalis genomic sequences.
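The codon-based filter described above can be illustrated with a short Python sketch. It is a simplified stand-in, not the procedure used in the study, which relied on BLASTX comparisons and manual inspection. The function names and the example sequence are hypothetical, the reading frame is assumed to be known, and TGA is assumed to remain a stop codon because the text only mentions TAA and TAG.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Variant described in the text: TAA and TAG encode glutamine (Q).
# Assumption: TGA is left as a stop, as in the standard code.
STREBLOMASTIX = dict(STANDARD, TAA="Q", TAG="Q")


def codons(seq, frame=0):
    """Split a nucleotide sequence into complete codons starting at `frame`."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def translate(seq, table, frame=0):
    """Translate with the given codon table; unknown codons become 'X'."""
    return "".join(table.get(c, "X") for c in codons(seq, frame))


def internal_taa_tag(seq, frame=0):
    """Return codon indices of TAA/TAG codons that are not the final codon."""
    body = codons(seq, frame)[:-1]  # the last codon may be a genuine terminator
    return [i for i, c in enumerate(body) if c in ("TAA", "TAG")]


example = "ATGCAATAAGGTTAGTTTTGA"  # made-up sequence, not from the study
print(translate(example, STANDARD))       # MQ*G*F*
print(translate(example, STREBLOMASTIX))  # MQQGQF*
print(internal_taa_tag(example))          # [2, 4]
```

A cluster with one or more internal TAA or TAG codons in an otherwise open frame would be a candidate for the variant code; confirming that those positions align with glutamine in homologues still requires a database comparison.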
Table 1. S. strix genes identified in this study. Streblomastix genes recovered from the Z. angusticollis hindgut RNA sample. For incomplete sequences, the number of missing amino acids was estimated from homologues from Giardia and/or Trichomonas. UPF1 shows extensive size variation among eukaryotic lineages (between approximately 800 and 1600 amino acids), so it is difficult to determine how much sequence this fragment is lacking. ND: not determined.
Cathepsin B (1)
Cathepsin B (2)
NAD-dependent glutamate dehydrogenase
Pyruvate phosphate dikinase (1)
Pyruvate phosphate dikinase (2)
Nuclear transport factor 2
Conserved hypothetical protein
Introns in Streblomastix genes
Genomic DNA sequences were obtained for all Streblomastix coding regions identified from the cDNA library. Despite the fact that many alleles and loci representing four proteins were previously found to contain no introns, we found that most of the genes encoding these transcripts were interrupted by introns. In total, we found 21 introns in our sample of 17 genes, with individual genes having as many as 5 introns (Table 1). Including the previously known intronless EF-1 alpha and HSP90 genes (alpha- and beta-tubulin are included in our sample), the overall density is 1.1 introns per gene. However, this is likely to be an underestimate, since some of our sequences are truncated and could contain further introns, and there is a bias favouring genes that are more often intronless (e.g., HSP90). This density is less than that observed in the relatively intron-rich mammals and plants, but comparable to many other eukaryotic genomes, and certainly much higher than in Giardia and Trichomonas, where only 3 and 41 introns, respectively, have been detected despite very large quantities of genomic data [7, 9].
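The reported density can be reproduced with a few lines of Python. This is one reading of the numbers in the paragraph above: the 21 introns are divided by the 17 sampled genes plus the two previously known intronless genes (EF-1 alpha and HSP90). The variable names are illustrative.

```python
introns_found = 21        # introns in the 17 genes sampled here
genes_in_sample = 17      # includes alpha- and beta-tubulin
extra_intronless = 2      # EF-1 alpha and HSP90, known from earlier work

# Density = introns / genes, counting the two extra intronless genes.
density = introns_found / (genes_in_sample + extra_intronless)
print(f"{density:.1f} introns per gene")  # 1.1 introns per gene
```

Excluding the two extra genes gives 21/17, about 1.2 introns per gene, so including the intronless genes lowers the estimate slightly.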
Table 2. Basic features of the S. strix introns. Characteristics of the 21 introns found in the 17 Streblomastix genes analysed. Size and base composition are shown. GC% mRNA shows the base composition of the coding sequence (excluding introns).
Cathepsin B (1)
Cathepsin B (2)
Pyruvate phosphate dikinase (1)
Nuclear transport factor 2
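The quantities named in the caption (intron size, intron GC%, and GC% of the coding sequence with introns removed) can be computed as in the following Python sketch. The function names, the coordinate convention (0-based, end-exclusive), and the toy sequence are assumptions made for illustration and are not taken from the study.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def intron_features(genomic, introns):
    """Summarise introns given as (start, end) pairs, 0-based and end-exclusive.

    Returns one dict per intron with its size, its GC%, and the GC% of the
    gene after all introns are removed (the "GC% mRNA" value).
    """
    introns = sorted(introns)
    exons, prev = [], 0
    for start, end in introns:
        exons.append(genomic[prev:start])  # sequence between introns
        prev = end
    exons.append(genomic[prev:])
    gc_mrna = gc_percent("".join(exons))
    return [
        {"size": end - start,
         "gc_intron": gc_percent(genomic[start:end]),
         "gc_mrna": gc_mrna}
        for start, end in introns
    ]


# Toy gene: two GC-rich exons around an AT-rich GT...AG intron.
toy = "GGCC" + "GTAAAAAAAG" + "GCGC"
print(intron_features(toy, [(4, 14)]))
```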
Conservation of intron positions in oxymonad genes
The present sampling of protein-coding gene sequences from Streblomastix suggests that oxymonad genomes contain a relatively large number of canonical spliceosomal introns, many of which are at ancient conserved positions. This is in contrast to the better-studied excavate genomes, such as those of kinetoplastids, Giardia and Trichomonas, where canonical spliceosomal introns are either rare or have been co-opted in specific ways, such as the spliced leaders in Euglenozoa. The fact that many Streblomastix introns are ancient shows that the genome of the ancestor of these organisms, and indeed probably of all extant eukaryotes, contained many introns and that the intron-poor state found in Giardia and Trichomonas is more likely independently derived.
cDNA library construction and EST sequencing
Termites were collected from a rotten log in Point Grey, Vancouver, Canada. The whole hindgut content of about 60 individuals of Zootermopsis angusticollis from a single colony was collected and total RNA was extracted using TRIZOL (Invitrogen). A directionally cloned cDNA library was constructed (Amplicon Express) and 5,337 clones were sequenced from the 5' end. ESTs were trimmed for vector and quality, and assembled into clusters using PEPdb (http://amoebidia.bcm.umontreal.ca/public/pepdb/agrm.php).
Identification and genomic characterisation of Streblomastix genes
Streblomastix sequences were recovered from EST data by identifying protein-coding sequences containing in-frame TAA and TAG codons, which are stop codons in the universal code. Putative stop-codon-containing mRNAs were re-sequenced on both strands. In cases where cDNA clones were truncated, the sequences were extended by means of 3' and 5' RACE (Ambion) using total termite hindgut RNA. The genomic sequence for each mRNA was PCR-amplified from genomic DNA purified from the termite hindgut content, using specific primers corresponding to the ends of each complete or partial cDNA. All PCR products were cloned using TOPO and sequenced on both strands. GenBank accession numbers for new sequences are DQ363664, DQ363665, DQ363666, DQ363667, DQ363668, DQ363669, DQ363670, DQ363671, DQ363672, DQ363673, DQ363674, DQ363675, DQ363676, DQ363677, DQ363678 and DQ363679.
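The text does not say how intron boundaries were located once cDNA and genomic sequences were in hand. One simple way to do it is sketched below in Python, under strong assumptions: the two sequences are identical apart from the introns, and every intron has canonical GT...AG borders. The function name and the parameters `min_intron`, `anchor` and `max_back` are hypothetical, and real data with sequencing errors or allelic differences would call for a proper spliced-alignment tool.

```python
def find_introns(cdna, genomic, min_intron=10, anchor=8, max_back=3):
    """Locate GT...AG introns by exact comparison of a cDNA with its gene.

    Returns (start, end) genomic coordinates, 0-based and end-exclusive.
    Assumes max_back < anchor and that the sequences differ only by introns.
    """
    cdna, genomic = cdna.upper(), genomic.upper()
    introns = []
    i = j = 0  # i indexes the cDNA, j the genomic sequence
    while i < len(cdna):
        # Extend the exact match as far as it goes.
        while i < len(cdna) and j < len(genomic) and cdna[i] == genomic[j]:
            i += 1
            j += 1
        if i == len(cdna):
            break
        hit = None
        # The greedy match can run a few bases into the intron, so slide back.
        for back in range(0, min(max_back, i) + 1):
            d, ci = j - back, i - back      # candidate donor site, cDNA index
            if genomic[d:d + 2] != "GT":
                continue
            tail = cdna[ci:ci + anchor]     # bases that must follow the intron
            for e in range(d + min_intron, len(genomic) - len(tail) + 1):
                if genomic[e - 2:e] == "AG" and genomic.startswith(tail, e):
                    hit = (d, e, ci)
                    break
            if hit:
                break
        if hit is None:
            raise ValueError(f"no GT...AG intron explains the mismatch at cDNA position {i}")
        d, e, ci = hit
        introns.append((d, e))
        i, j = ci, e                        # resume matching after the intron
    return introns


# Made-up example: the second exon starts with G, like the intron, so the
# greedy match overshoots the true boundary by one base and must slide back.
exon1, intron, exon2 = "ATGGCTAAGCTTGACC", "GTAAGTTTTTCTAACTTTTTTTAG", "GGATCCGAATTCTGA"
print(find_introns(exon1 + exon2, exon1 + intron + exon2))
```

The coordinates returned here could then be compared across homologous genes to ask whether an intron position is shared between lineages.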
This work was supported by a grant from the Natural Sciences and Engineering Research Council of Canada, and EST sequencing was supported by the Protist EST Program through Genome Canada/Genome Atlantic. We thank A. de Koning for help isolating termite gut RNA, and N. Fast for critical reading of the manuscript. PJK is a Fellow of the Canadian Institute for Advanced Research and a New Investigator of the Canadian Institutes for Health Research and the Michael Smith Foundation for Health Research.
- Maniatis T, Reed R: The role of small nuclear ribonucleoprotein particles in pre-mRNA splicing. Nature. 1987, 325 (6106): 673-678. 10.1038/325673a0.
- Gilbert W: Why genes in pieces?. Nature. 1978, 271 (5645): 501. 10.1038/271501a0.
- Logsdon JM: The recent origins of spliceosomal introns revisited. Curr Opin Genet Dev. 1998, 8 (6): 637-648. 10.1016/S0959-437X(98)80031-2.
- Zhaxybayeva O, Gogarten JP: Spliceosomal introns: new insights into their evolution. Curr Biol. 2003, 13 (19): R764-766. 10.1016/j.cub.2003.09.017.
- Simpson AG: Cytoskeletal organization, phylogenetic affinities and systematics in the contentious taxon Excavata (Eukaryota). Int J Syst Evol Microbiol. 2003, 53 (Pt 6): 1759-1777. 10.1099/ijs.0.02578-0.
- Fast NM, Doolittle WF: Trichomonas vaginalis possesses a gene encoding the essential spliceosomal component, PRP8. Mol Biochem Parasitol. 1999, 99 (2): 275-278. 10.1016/S0166-6851(99)00017-1.
- Vanacova S, Yan W, Carlton JM, Johnson PJ: Spliceosomal introns in the deep-branching eukaryote Trichomonas vaginalis. Proc Natl Acad Sci U S A. 2005, 102 (12): 4430-4435. 10.1073/pnas.0407500102.
- Nixon JE, Wang A, Morrison HG, McArthur AG, Sogin ML, Loftus BJ, Samuelson J: A spliceosomal intron in Giardia lamblia. Proc Natl Acad Sci U S A. 2002, 99 (6): 3701-3705. 10.1073/pnas.042700299.
- Russell AG, Shutt TE, Watkins RF, Gray MW: An ancient spliceosomal intron in the ribosomal protein L7a gene (Rpl7a) of Giardia lamblia. BMC Evol Biol. 2005, 5: 45. 10.1186/1471-2148-5-45.
- Archibald JM, O'Kelly CJ, Doolittle WF: The chaperonin genes of jakobid and jakobid-like flagellates: implications for eukaryotic evolution. Mol Biol Evol. 2002, 19 (4): 422-431.
- Muchhal US, Schwartzbach SD: Characterization of the unique intron-exon junctions of Euglena gene(s) encoding the polyprotein precursor to the light-harvesting chlorophyll a/b binding protein of photosystem II. Nucleic Acids Res. 1994, 22 (25): 5737-5744.
- Breckenridge DG, Watanabe Y, Greenwood SJ, Gray MW, Schnare MN: U1 small nuclear RNA and spliceosomal introns in Euglena gracilis. Proc Natl Acad Sci U S A. 1999, 96 (3): 852-856. 10.1073/pnas.96.3.852.
- Canaday J, Tessier LH, Imbault P, Paulus F: Analysis of Euglena gracilis alpha-, beta- and gamma-tubulin genes: introns and pre-mRNA maturation. Mol Genet Genomics. 2001, 265 (1): 153-160. 10.1007/s004380000403.
- Muchhal US, Schwartzbach SD: Characterization of a Euglena gene encoding a polyprotein precursor to the light-harvesting chlorophyll a/b-binding protein of photosystem II. Plant Mol Biol. 1992, 18 (2): 287-299. 10.1007/BF00034956.
- Tessier LH, Chan RL, Keller M, Weil JH, Imbault P: The Euglena gracilis rbcS gene contains introns with unusual borders. FEBS Lett. 1992, 304 (2–3): 252-255. 10.1016/0014-5793(92)80631-P.
- Tessier LH, Paulus F, Keller M, Vial C, Imbault P: Structure and expression of Euglena gracilis nuclear rbcS genes encoding the small subunits of the ribulose 1, 5-bisphosphate carboxylase/oxygenase: a novel splicing process for unusual intervening sequences?. J Mol Biol. 1995, 245 (1): 22-33.
- Brugerolle G, Lee JJ: Order Oxymonadida. The illustrated guide to the protozoa. Edited by: Lee JJ, Leedale GF, Bradbury P. 2000, Lawrence, KA: Society of Protozoologists, 1186-1195. 2nd edition.
- Keeling PJ, Leander BS: Characterisation of a non-canonical genetic code in the oxymonad Streblomastix strix. J Mol Biol. 2003, 326 (5): 1337-1349. 10.1016/S0022-2836(03)00057-3.
- Leander BS, Keeling PJ: Symbiotic innovation in the oxymonad Streblomastix strix. J Eukaryot Microbiol. 2004, 51 (3): 291-300. 10.1111/j.1550-7408.2004.tb00569.x.
- Moriya S, Tanaka K, Ohkuma M, Sugano S, Kudo T: Diversification of the microtubule system in the early stage of eukaryote evolution: elongation factor 1 alpha and alpha-tubulin protein phylogeny of termite symbiotic oxymonad and hypermastigote protists. J Mol Evol. 2001, 52 (1): 6-16.
- Hampl V, Horner DS, Dyal P, Kulda J, Flegr J, Foster PG, Embley TM: Inference of the phylogenetic position of oxymonads based on nine genes: support for metamonada and excavata. Mol Biol Evol. 2005, 22 (12): 2508-2518. 10.1093/molbev/msi245.
- Slamovits CH, Keeling PJ: Pyruvate-phosphate dikinase of oxymonads and parabasalia and the evolution of pyrophosphate-dependent glycolysis in anaerobic eukaryotes. Eukaryot Cell. 2006, 5 (1): 148-154. 10.1128/EC.5.1.148-154.2006.
- Baker KE, Parker R: Nonsense-mediated mRNA decay: terminating erroneous gene expression. Curr Opin Cell Biol. 2004, 16 (3): 293-299. 10.1016/j.ceb.2004.03.003.
- Culbertson MR, Leeds PF: Looking at mRNA decay pathways through the window of molecular evolution. Curr Opin Genet Dev. 2003, 13 (2): 207-214. 10.1016/S0959-437X(03)00014-5.
- Jensen TH, Boulay J, Rosbash M, Libri D: The DECD box putative ATPase Sub2p is an early mRNA export factor. Curr Biol. 2001, 11 (21): 1711-1715. 10.1016/S0960-9822(01)00529-2.
- Linder P, Stutz F: mRNA export: travelling with DEAD box proteins. Curr Biol. 2001, 11 (23): R961-963. 10.1016/S0960-9822(01)00574-7.
- Patel AA, Steitz JA: Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol. 2003, 4 (12): 960-970. 10.1038/nrm1259.
- Reed R, Maniatis T: The role of the mammalian branchpoint sequence in pre-mRNA splicing. Genes Dev. 1988, 2 (10): 1268-1276.
- Lin RJ, Newman AJ, Cheng SC, Abelson J: Yeast mRNA splicing in vitro. J Biol Chem. 1985, 260 (27): 14780-14792.
- Lorkovic ZJ, Wieczorek Kirk DA, Lambermon MH, Filipowicz W: Pre-mRNA splicing in higher plants. Trends Plant Sci. 2000, 5 (4): 160-167. 10.1016/S1360-1385(00)01595-8.
- Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV: Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution. Curr Biol. 2003, 13 (17): 1512-1517. 10.1016/S0960-9822(03)00558-X.
- Yoshihama M, Nakao A, Nguyen HD, Kenmochi N: Analysis of Ribosomal Protein Gene Structures: Implications for Intron Evolution. PLoS Genet. 2006, 2 (3): e25. 10.1371/journal.pgen.0020025.
- Sverdlov AV, Rogozin IB, Babenko VN, Koonin EV: Conservation versus parallel gains in intron evolution. Nucleic Acids Res. 2005, 33 (6): 1741-1748. 10.1093/nar/gki316.
- Roy SW, Gilbert W: Rates of intron loss and gain: implications for early eukaryotic evolution. Proc Natl Acad Sci U S A. 2005, 102 (16): 5773-5778. 10.1073/pnas.0500383102.
- Rogozin IB, Sverdlov AV, Babenko VN, Koonin EV: Analysis of evolution of exon-intron structure of eukaryotic genes. Brief Bioinform. 2005, 6 (2): 118-134. 10.1093/bib/6.2.118.
- Nguyen HD, Yoshihama M, Kenmochi N: New maximum likelihood estimators for eukaryotic intron evolution. PLoS Comput Biol. 2005, 1 (7): e79. 10.1371/journal.pcbi.0010079.
- Simpson AG, Radek R, Dacks JB, O'Kelly CJ: How oxymonads lost their groove: an ultrastructural comparison of Monocercomonoides and excavate taxa. J Eukaryot Microbiol. 2002, 49 (3): 239-248. 10.1111/j.1550-7408.2002.tb00529.x.
- Dacks JB, Silberman JD, Simpson AG, Moriya S, Kudo T, Ohkuma M, Redfield RJ: Oxymonads are closely related to the excavate taxon Trimastix. Mol Biol Evol. 2001, 18 (6): 1034-1044.
- Cavalier-Smith T: The excavate protozoan phyla Metamonada Grasse emend. (Anaeromonadea, Parabasalia, Carpediemonas, Eopharyngia) and Loukozoa emend. (Jakobea, Malawimonas): their evolutionary affinities and new higher taxa. Int J Syst Evol Microbiol. 2003, 53 (Pt 6): 1741-1758. 10.1099/ijs.0.02548-0.
- Douglas S, Zauner S, Fraunholz M, Beaton M, Penny S, Deng LT, Wu X, Reith M, Cavalier-Smith T, Maier UG: The highly reduced genome of an enslaved algal nucleus. Nature. 2001, 410 (6832): 1091-1096. 10.1038/35074092.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.